Knowledge Transfer with Low-Quality Data: A Feature Extraction Issue

Cited by: 39

Authors:

Quanz, Brian [1]

Huan, Jun [1]

Mishra, Meenakshi [1]

Affiliations:

[1] Univ Kansas, Informat & Telecommun Technol Ctr, Dept Elect Engn & Comp Sci, Lawrence, KS 66045 USA

Source:

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 2012 / Vol. 24 / No. 10

Funding:

US National Science Foundation;

Keywords:

Knowledge transfer; transfer learning; feature extraction; sparse coding; low-quality data; ADAPTATION;

DOI:

10.1109/TKDE.2012.75

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Effectively utilizing readily available auxiliary data to improve predictive performance on new modeling tasks is a key problem in data mining. In this research, the goal is to transfer knowledge between sources of data, particularly when ground-truth information for the new modeling task is scarce or is expensive to collect where leveraging any auxiliary sources of data becomes a necessity. Toward seamless knowledge transfer among tasks, effective representation of the data is a critical but yet not fully explored research area for the data engineer and data miner. Here, we present a technique based on the idea of sparse coding, which essentially attempts to find an embedding for the data by assigning feature values based on subspace cluster membership. We modify the idea of sparse coding by focusing the identification of shared clusters between data when source and target data may have different distributions. In our paper, we point out cases where a direct application of sparse coding will lead to a failure of knowledge transfer. We then present the details of our extension to sparse coding, by incorporating distribution distance estimates for the embedded data, and show that the proposed algorithm can overcome the shortcomings of the sparse coding algorithm on synthetic data and achieve improved predictive performance on a real world chemical toxicity transfer learning task.

Pages: 1789 - 1802

Page count: 14