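National Natural Science Foundation of China (61571233, 61271082)
National Basic Research Program of China (973 Program) (2011CB302903)
Major Program of Natural Science Research of Jiangsu Higher Education Institutions (14KJA510003)
Key Research and Development Program of Jiangsu Province (BE2015700)
PAPD and CICAEET of Nanjing University of Information Science and Technology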
[1] Jiang J, Zhai C X. Instance weighting for domain adaptation in NLP. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, 2007. 264--271
Figure 1
Multi-instance multi-label transfer learning framework (TR-MIML), consisting of a stage that re-weights the data samples from the source domain and a stage that constructs the classification model
Figure 2
(Color online) Effect of transfer learning on protein function prediction for Rattus norvegicus, using five source species with different phylogenetic relationships to it
Output: $\widehat{h}$
1. for each source-domain bag $X_i^S$ do
2. $f_i^S = {\rm miFV}(X_i^S)$;
3. end for
4. for each target-domain bag $X_i^T$ do
5. $f_i^T = {\rm miFV}(X_i^T)$;
6. end for
7. Compute $\beta$ by solving (2);
8. Learn the classifier $\widehat{h}$ by solving (5).
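The flow of the algorithm (embed every bag, re-weight the source samples, then train a weighted classifier) can be illustrated with a small sketch. It is not the paper's implementation, and every function name in it is hypothetical. Three substitutions are made because miFV, Eq. (2), and Eq. (5) are not reproduced in this section: mean pooling stands in for miFV, a logistic-regression domain discriminator (a common density-ratio estimate) stands in for solving (2), and a weighted logistic regression on a single binary label stands in for solving (5).

```python
import math


def embed_bag(bag):
    """Map a bag (list of equal-length instance vectors) to one fixed-length vector.

    ASSUMPTION: mean pooling is a stand-in for miFV, which is not reproduced here.
    """
    dim = len(bag[0])
    return [sum(inst[d] for inst in bag) / len(bag) for d in range(dim)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, sample_weight=None, lr=0.1, epochs=500):
    """Weighted logistic regression trained by batch gradient descent."""
    n, dim = len(X), len(X[0])
    sw = list(sample_weight) if sample_weight is not None else [1.0] * n
    total = sum(sw)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi, si in zip(X, y, sw):
            # Weighted residual of the current prediction for this sample.
            err = si * (sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi)
            for d in range(dim):
                grad_w[d] += err * xi[d]
            grad_b += err
        w = [wj - lr * g / total for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b


def predict_proba(model, x):
    """Probability of the positive class under a (w, b) model."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def source_weights(F_src, F_tgt):
    """Estimate beta_i ~ p_T(f_i) / p_S(f_i) for every source feature vector.

    ASSUMPTION: a source-vs-target discriminator replaces Eq. (2).
    """
    disc = fit_logistic(F_src + F_tgt, [0.0] * len(F_src) + [1.0] * len(F_tgt))
    prior = len(F_src) / len(F_tgt)  # corrects for unequal domain sizes
    betas = []
    for f in F_src:
        p = min(max(predict_proba(disc, f), 1e-6), 1.0 - 1e-6)
        betas.append(prior * p / (1.0 - p))
    return betas


def train_tr_classifier(src_bags, y_src, tgt_bags, y_tgt):
    """Run the three stages on binary labels; returns (model, betas)."""
    F_src = [embed_bag(bag) for bag in src_bags]  # steps 1-3
    F_tgt = [embed_bag(bag) for bag in tgt_bags]  # steps 4-6
    betas = source_weights(F_src, F_tgt)          # step 7 (stand-in)
    model = fit_logistic(                         # step 8 (stand-in)
        F_src + F_tgt,
        list(y_src) + list(y_tgt),
        sample_weight=betas + [1.0] * len(F_tgt),
    )
    return model, betas
```

In this sketch, source bags whose pooled features resemble the target domain receive larger weights, so they contribute more to the final classifier. A multi-label setting such as GO term prediction would need one such output per label or a genuinely multi-label learner; that extension is left out here.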
Species | Proteins | GO terms | Domains per protein (mean$\pm$std.) | GO terms per protein (mean$\pm$std.) |
Geobacter sulfurreducens | 379 | 320 | 3.20$\pm$1.21 | 3.14$\pm$3.33 |
Azotobacter vinelandii | 407 | 340 | 3.07$\pm$1.16 | 4.00$\pm$6.97 |
Mus musculus | 11676 | 3065 | 2.76$\pm$1.84 | 44.64$\pm$50.27 |
Rattus norvegicus | 5991 | 2600 | 2.53$\pm$1.71 | 39.51$\pm$44.78 |
Homo sapiens | 13773 | 3311 | 2.98$\pm$4.30 | 55.81$\pm$126.63 |
Arabidopsis thaliana | 8986 | 1811 | 2.02$\pm$1.46 | 27.68$\pm$70.37 |
Saccharomyces cerevisiae | 3509 | 1566 | 1.86$\pm$1.36 | 15.89$\pm$11.52 |
TR-MIMLfast | ||||||
MIMLfast | 0.44$\pm~$0.02 | 4.76$\pm~$0.03 | 0.20$\pm~$0.01 | 0.71$\pm~$0.03 | 0.43$\pm~$0.01 | |
Mus musculus | TR-MIMLNN | 0.56$\pm~$0.01 | 4.28$\pm~$0.20 | 0.21$\pm~$0.01 | 0.58$\pm~$0.02 | 0.36$\pm~$0.03 |
MIMLNN | 0.54$\pm~$0.01 | 4.51$\pm~$0.19 | 0.24$\pm~$0.01 | 0.58$\pm~$0.01 | 0.37$\pm~$0.01 | |
TR-MIMLSVM | 0.18$\pm~$0.0 | 0.37$\pm~$0.09 | ||||
MIMLSVM | 0.44$\pm~$0.01 | 4.62$\pm~$0.05 | 0.19$\pm~$0.02 | 0.67$\pm~$0.01 | 0.40$\pm~$0.01 | |
TR-MIMLfast | ||||||
MIMLfast | 0.43$\pm~$0.05 | 5.22$\pm~$0.07 | 0.23$\pm~$0.03 | 0.75$\pm~$0.08 | 0.42$\pm~$0.06 | |
Rattus norvegicus | TR-MIMLNN | 0.16$\pm~$0.02 | 0.38$\pm~$0.01 | |||
MIMLNN | 0.48$\pm~$0.01 | 4.74$\pm~$0.09 | 0.19$\pm~$0.00 | 0.66$\pm~$0.03 | 0.41$\pm~$0.01 | |
TR-MIMLSVM | 0.53$\pm~$0.03 | 0.17$\pm~$0.01 | 0.39$\pm~$0.01 | |||
MIMLSVM | 0.52$\pm~$0.01 | 5.10$\pm~$0.08 | 0.17$\pm~$0.01 | 0.66$\pm~$0.01 | 0.35$\pm~$0.02 | |
TR-MIMLfast | 0.53$\pm~$0.04 | 4.42$\pm~$0.14 | 0.62$\pm~$0.05 | 0.35$\pm~$0.01 | ||
MIMLfast | 0.50$\pm~$0.02 | 4.50$\pm~$0.14 | 0.22$\pm~$0.02 | 0.66$\pm~$0.04 | 0.35$\pm~$0.02 | |
Saccharomyces cerevisiae | TR-MIMLNN | 0.53$\pm~$0.01 | 4.58$\pm~$0.09 | 0.14$\pm~$0.04 | 0.61$\pm~$0.09 | |
MIMLNN | 0.52$\pm~$0.02 | 4.72$\pm~$0.11 | 0.16$\pm~$0.01 | 0.62$\pm~$0.02 | 0.41$\pm~$0.02 | |
TR-MIMLSVM | 0.54$\pm~$0.02 | 4.61$\pm~$0.07 | 0.17$\pm~$0.01 | 0.57$\pm~$0.01 | 0.40$\pm~$0.03 | |
MIMLSVM | 0.53$\pm~$0.01 | 4.72$\pm~$0.04 | 0.18$\pm~$0.01 | 0.60$\pm~$0.01 | 0.43$\pm~$0.01 |
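a) Bold indicates that the result with transfer learning is significantly better than the result without transfer learning (paired-sample t-test at the 95% confidence level).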
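The paired-sample t-test named in the footnote can be sketched with the Python standard library. The per-run scores in the usage note are invented purely for illustration and are not taken from the tables. The sketch also assumes a two-sided test, which the footnote does not specify, and the function names are hypothetical.

```python
import math
import statistics

# Two-sided critical values of Student's t at alpha = 0.05,
# indexed by degrees of freedom (number of pairs minus one).
T_CRIT_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("paired differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


def significantly_different(a, b):
    """Return (is_significant, t) for a two-sided test at the 95% level."""
    t = paired_t_statistic(a, b)
    return abs(t) > T_CRIT_05[len(a) - 1], t
```

For example, with made-up scores from five paired runs, `significantly_different([0.56, 0.55, 0.57, 0.56, 0.58], [0.54, 0.54, 0.55, 0.53, 0.55])` reports a significant difference, because the t statistic exceeds the critical value 2.776 for four degrees of freedom. For metrics where lower is better, the sign of `t` shows which method wins.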
TR-MIMLfast | 4.30$\pm~$0.40 | 0.34$\pm~$0.02 | ||||
MIMLfast | 0.49$\pm~$0.03 | 4.55$\pm~$0.15 | 0.21$\pm~$0.02 | 0.69$\pm~$0.05 | 0.38$\pm~$0.02 | |
Mus musculus | TR-MIMLNN | 0.52$\pm~$0.00 | 4.35$\pm~$0.0 | 0.68$\pm~$0.02 | 0.39$\pm~$0.01 | |
MIMLNN | 0.48$\pm~$0.01 | 4.79$\pm~$0.03 | 0.27$\pm~$0.01 | 0.65$\pm~$0.00 | 0.41$\pm~$0.00 | |
TR-MIMLSVM | 0.50$\pm~$0.0 | 4.55$\pm~$0.12 | 0.63$\pm~$0.01 | 0.41$\pm~$0.02 | ||
MIMLSVM | 0.47$\pm~$0.01 | 4.67$\pm~$0.12 | 0.28$\pm~$0.01 | 0.64$\pm~$0.02 | 0.44$\pm~$0.02 | |
TR-MIMLfast | 4.64$\pm~$0.23 | 0.64$\pm~$0.02 | 0.39$\pm~$0.04 | |||
MIMLfast | 0.49$\pm~$0.02 | 5.03$\pm~$0.25 | 0.25$\pm~$0.00 | 0.67$\pm~$0.03 | 0.40$\pm~$0.02 | |
Rattus norvegicus | TR-MIMLNN | 0.50$\pm~$0.03 | 4.93$\pm~$0.20 | 0.20$\pm~$0.01 | ||
MIMLNN | 0.46$\pm~$0.01 | 5.36$\pm~$0.13 | 0.23$\pm~$0.01 | 0.71$\pm~$0.02 | 0.46$\pm~$0.02 | |
TR-MIMLSVM | 0.52$\pm~$0.01 | 4.73$\pm~$0.05 | 0.21$\pm~$0.02 | |||
MIMLSVM | 0.49$\pm~$0.00 | 4.82$\pm~$0.03 | 0.25$\pm~$0.01 | 0.70$\pm~$0.01 | 0.46$\pm~$0.01 | |
TR-MIMLfast | 4.52$\pm~$0.10 | 0.37$\pm~$0.02 | ||||
MIMLfast | 0.55$\pm~$0.03 | 4.87$\pm~$0.28 | 0.24$\pm~$0.02 | 0.60$\pm~$0.03 | 0.35$\pm~$0.02 | |
Saccharomyces cerevisiae | TR-MIMLNN | 5.02$\pm~$0.10 | 0.19$\pm~$0.04 | |||
MIMLNN | 0.51$\pm~$0.00 | 5.49$\pm~$0.10 | 0.19$\pm~$0.01 | 0.64$\pm~$0.00 | 0.46$\pm~$0.00 | |
TR-MIMLSVM | 0.52$\pm~$0.01 | 4.77$\pm~$0.03 | 0.19$\pm~$0.01 | 0.62$\pm~$0.03 | ||
MIMLSVM | 0.49$\pm~$0.01 | 4.75$\pm~$0.00 | 0.20$\pm~$0.01 | 0.66$\pm~$0.01 | 0.46$\pm~$0.02 |
a) The best result on each evaluation metric is shown in bold.
Mus musculus | TrAdaBoost | |||||
AdaBoost | 0.47$\pm~$0.01 | 4.51$\pm~$0.04 | 0.28$\pm~$0.01 | 0.69$\pm~$0.01 | 0.41$\pm~$0.01 | |
DALR | 4.24$\pm~$0.02 | 0.26$\pm~$0.00 | ||||
LR | 0.35$\pm~$0.03 | 4.32$\pm~$0.05 | 0.27$\pm~$0.02 | 0.82$\pm~$0.01 | 0.45$\pm~$0.02 | |
Rattus norvegicus | TrAdaBoost | 4.23$\pm~$0.02 | 0.27$\pm~$0.01 | |||
AdaBoost | 0.36$\pm~$0.02 | 4.31$\pm~$0.03 | 0.28$\pm~$0.03 | 0.83$\pm~$0.03 | 0.45$\pm~$0.02 | |
DALR | 4.11$\pm~$0.03 | 0.9$\pm~$0.01 | 0.46$\pm~$0.04 | |||
LR | 0.32$\pm~$0.01 | 4.41$\pm~$0.05 | 0.30$\pm~$0.01 | 0.88$\pm~$0.00 | 0.54$\pm~$0.01 | |
Saccharomyces cerevisiae | TrAdaBoost | 4.11$\pm~$0.02 | 0.28$\pm~$0.01 | |||
AdaBoost | 0.32$\pm~$0.01 | 4.41$\pm~$0.05 | 0.30$\pm~$0.01 | 0.88$\pm~$0.01 | 0.54$\pm~$0.01 | |
DALR | 4.35$\pm~$0.01 | 0.28$\pm~$0.02 | 0.61$\pm~$0.00 | 0.32$\pm~$0.01 | ||
LR | 0.47$\pm~$0.02 | 4.48$\pm~$0.00 | 0.33$\pm~$0.01 | 0.67$\pm~$0.01 | 0.34$\pm~$0.00 |
a) The best result on each evaluation metric is shown in bold.
Mus musculus | TrAdaBoost | 0.52$\pm~$0.00 | 0.40$\pm~$0.00 | |||
AdaBoost | 0.50$\pm~$0.00 | 4.77$\pm~$0.03 | 0.36$\pm~$0.00 | 0.69$\pm~$0.00 | 0.42$\pm~$0.00 | |
DALR | 4.47$\pm~$0.02 | 0.38$\pm~$0.00 | ||||
LR | 0.48$\pm~$0.01 | 4.79$\pm~$0.05 | 0.31$\pm~$0.02 | 0.67$\pm~$0.03 | 0.41$\pm~$0.02 | |
Rattus norvegicus | TrAdaBoost | 4.25$\pm~$0.00 | 0.27$\pm~$0.01 | 0.62$\pm~$0.01 | 0.33$\pm~$0.01 | |
AdaBoost | 0.46$\pm~$0.02 | 4.49$\pm~$0.01 | 0.34$\pm~$0.02 | 0.66$\pm~$0.02 | 0.35$\pm~$0.02 | |
DALR | 0.17$\pm~$0.00 | 0.60$\pm~$0.01 | 0.36$\pm~$0.09 | |||
LR | 0.44$\pm~$0.00 | 4.61$\pm~$0.05 | 0.19$\pm~$0.01 | 0.67$\pm~$0.00 | 0.39$\pm~$0.00 | |
Saccharomyces cerevisiae | TrAdaBoost | 0.49$\pm~$0.01 | 4.43$\pm~$0.01 | 0.30$\pm~$0.00 | 0.60$\pm~$0.01 | 0.41$\pm~$0.04 |
AdaBoost | 0.45$\pm~$0.01 | 4.69$\pm~$0.09 | 0.37$\pm~$0.01 | 0.66$\pm~$0.01 | 0.48$\pm~$0.01 | |
DALR | 4.24$\pm~$0.02 | 0.26$\pm~$0.00 | ||||
LR | 0.35$\pm~$0.01 | 4.33$\pm~$0.07 | 0.28$\pm~$0.00 | 0.84$\pm~$0.01 | 0.44$\pm~$0.01 |
a) The best result on each evaluation metric is shown in bold.