SCIENCE CHINA Information Sciences, Volume 63, Issue 11: 210205 (2020). https://doi.org/10.1007/s11432-020-3082-9

## Active learning based on belief functions

• Accepted: Sep 10, 2020
• Published: Oct 22, 2020

### Acknowledgment

This work was supported by National Natural Science Foundation of China (Grant No. 61671370), Postdoctoral Science Foundation of China (Grant No. 2016M592790), and Postdoctoral Science Research Foundation of Shaanxi Province (Grant No. 2016BSHEDZZ46).


• Figure 1: The process of active learning.

• Figure 2: (Color online) The problem of traditional uncertainty sampling.

• Figure 3: Uncertainty sampling based on belief functions.

• Figure 5: (Color online) Distribution of the artificial dataset Toy2.

• Figure 6: (Color online) Active learning performance on the Heart dataset based on LR. (a) Learning curve; (b) distributions of the classifier's accuracy.

• Figure 14: (Color online) Active learning performance on the Toy2 dataset based on LR. (a) Learning curve; (b) distributions of the classifier's accuracy.

• Table 1: Uncertainty based on probability entropy

|  | $p_{1}$ | $p_{2}$ | Probability entropy |
|---|---|---|---|
| Sample 1 | 0.5 | 0.5 | **0.6931** |
| Sample 2 | 0.9933 | 0.0067 | 0.0402 |
| Sample 3 | 0.9933 | 0.0067 | 0.0402 |
| Sample 4 | 0.5 | 0.5 | **0.6931** |
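The entropy column of Table 1 can be reproduced directly from the natural-log Shannon entropy of the classifier's output distribution. A minimal sketch (the function name `probability_entropy` is ours, not from the paper):

```python
import math

def probability_entropy(probs):
    """Shannon entropy (natural log) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Rows of Table 1: (p1, p2)
samples = [(0.5, 0.5), (0.9933, 0.0067), (0.9933, 0.0067), (0.5, 0.5)]
for i, probs in enumerate(samples, start=1):
    print(f"Sample {i}: {probability_entropy(probs):.4f}")
# Sample 1: 0.6931, Sample 2: 0.0402, Sample 3: 0.0402, Sample 4: 0.6931
```

Samples 1 and 4 attain the maximum $\ln 2 \approx 0.6931$, which is why plain entropy cannot distinguish between them.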
•

Algorithm 1 Uncertainty sampling

Require: A set of labeled samples $L$, a set of unlabeled samples $U$.

while Termination condition not satisfied do

Train a classifier $\phi_{c}(\cdot~|~L)$ based on labeled samples;

for $i=1:|U|$ do

Calculate the uncertainty of the sample, ${\rm~Un}(x_{i}^u)$;

end for

$x^*=~\mathop{\rm~argmax}\nolimits_{i}{\rm~Un}(x_{i}^u)$;

$U=U-x^*$;

$L=L~\cup~(x^*,{\rm~GetLabel}(x^*))$;

end while
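The loop above can be sketched in Python as a generic pool-based routine; `train`, `predict_proba`, and `get_label` are illustrative placeholders for the classifier and the labeling oracle, not names from the paper:

```python
import math

def entropy_uncertainty(probs):
    """Un(x): Shannon entropy of the classifier's posterior for x."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def uncertainty_sampling(L, U, train, predict_proba, get_label, n_queries):
    """Pool-based uncertainty sampling (Algorithm 1).

    L: list of (x, label) pairs; U: list of unlabeled samples.
    train, predict_proba, get_label are user-supplied callables.
    """
    for _ in range(n_queries):                       # termination condition
        clf = train(L)                               # phi_c(. | L)
        scores = [entropy_uncertainty(predict_proba(clf, x)) for x in U]
        i_star = max(range(len(U)), key=scores.__getitem__)  # argmax Un
        x_star = U.pop(i_star)                       # U = U - x*
        L.append((x_star, get_label(x_star)))        # query the oracle
    return L, U
```

With a toy model whose posterior for a scalar $x$ is $(x, 1-x)$, the sample closest to 0.5 (the most uncertain one) is queried first.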

•

Algorithm 2 USBF

Require: A set of labeled samples $L$, a set of unlabeled samples $U$.

while Termination condition not satisfied do

Train a classifier $\phi_{c}(\cdot~|~L)$ based on labeled samples;

$U_{R}=~\emptyset$;

for $i=1:|U|$ do

$p(x_{i}^u)=\phi_{c}(~x_{i}^u|~L)$;

Calculate the belief functions of $x_{i}^u$, $m_{x_{i}^u}(\theta)$;

if ($m_{x_{i}^u}(\emptyset)<\eta$) then

$U_{R}=U_{R}+x_{i}^u$;

end if

end for

for $i=1:|U_{R}|$ do

Calculate the ambiguity measure ${\rm~AM}(x_{i}^u)$;

end for

$x^*=~\mathop{\rm~argmax}\nolimits_{i}{\rm~AM}(x_{i}^u)$;

$U=U-x^*$;

$L=L~\cup~(x^*,{\rm~GetLabel}(x^*))$;

end while
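A minimal sketch of one USBF query round: samples whose conflict mass $m(\emptyset)$ reaches the threshold $\eta$ are discarded as unreliable, and the most ambiguous of the retained samples is queried. How the masses and AM are computed from the classifier's probabilities follows the paper's belief-function construction and is not reproduced here; `masses` and `ambiguity` are caller-supplied:

```python
def usbf_select(U, masses, ambiguity, eta):
    """One USBF query (inner selection of Algorithm 2).

    U: unlabeled pool; masses: dict x -> mass dict with key 'empty'
    for m(emptyset); ambiguity: callable AM(mass) -> float;
    eta: threshold on m(emptyset).
    """
    # U_R: samples whose conflict mass is below the threshold
    U_R = [x for x in U if masses[x]['empty'] < eta]
    # x* = argmax AM over the retained pool
    return max(U_R, key=lambda x: ambiguity(masses[x]))
```

For example, with a toy stand-in for AM (here simply the mass on $\Theta$) and $\eta = 0.3$, a sample with $m(\emptyset) = 0.4$ is filtered out even if its ambiguity is largest, mirroring Sample 4 in Table 3.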

• Table 2: Symbols used in the algorithms

| Symbol | Description |
|---|---|
| $L$ | All labeled samples |
| $U$ | All unlabeled samples |
| $x_{i}^u$ | An unlabeled sample |
| $\vert U\vert$ | The number of unlabeled samples |
| $x^*$ | The selected sample |
| $\phi_{c}(\cdot~\mid~L)$ | Classifier trained on the labeled samples |
| $p(x_{i}^u)$ | The output probability of $x_{i}^u$ |
| ${\rm~Un}(x_{i}^u)$ | The uncertainty of $x_{i}^u$ |
| $m_{x_{i}^u}(\theta)$ | The belief function of $x_{i}^u$ |
| ${\rm~AM}(x_{i}^u)$ | The ambiguity measure of $x_{i}^u$ |
| $\eta$ | Threshold on $m(\emptyset)$ |
• Table 3: Uncertainty based on belief functions

|  | $m(\emptyset)$ | $m(\theta_{1})$ | $m(\theta_{2})$ | $m(\Theta)$ | AM |
|---|---|---|---|---|---|
| Sample 1 | 0.1146 | 0.1839 | 0.1839 | 0.5175 | **0.6931** |
| Sample 2 | 0.0374 | 0.6025 | 0.0041 | 0.3560 | 0.4850 |
| Sample 3 | 0.0911 | 0.0420 | $2.833\times10^{-4}$ | 0.8665 | **0.6921** |
| Sample 4 | **0.3739** | 0.0175 | 0.0175 | 0.5853 | – |
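The AM column of Table 3 is numerically consistent with the Shannon entropy of the pignistic probabilities derived from each mass function (in the spirit of Smets's transferable belief model). A sketch under that assumption, which reproduces the table's values:

```python
import math

def pignistic(m):
    """Pignistic probabilities BetP over {theta1, theta2} for a mass
    function on {emptyset, theta1, theta2, Theta}: the mass on Theta is
    split evenly between the singletons, and the conflict m(emptyset)
    is normalized away."""
    bet1 = m['theta1'] + m['Theta'] / 2
    bet2 = m['theta2'] + m['Theta'] / 2
    total = bet1 + bet2          # equals 1 - m(emptyset) up to rounding
    return bet1 / total, bet2 / total

def ambiguity_measure(m):
    """AM: Shannon entropy (natural log) of the pignistic distribution."""
    return -sum(p * math.log(p) for p in pignistic(m) if p > 0)

# Sample 2 of Table 3
m2 = {'empty': 0.0374, 'theta1': 0.6025, 'theta2': 0.0041, 'Theta': 0.3560}
print(round(ambiguity_measure(m2), 4))  # 0.4850
```

Under this reading, Sample 3 (nearly all mass on the ignorance set $\Theta$) still scores close to the maximum ambiguity, which is exactly the case probability entropy in Table 1 could not separate.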
• Table 4: Details of the datasets

| Data | Categories | Num | Proportion | Features |
|---|---|---|---|---|
| Heart | 2 | 270 | 150/120 | 13 |
| Breast | 2 | 683 | 444/239 | 9 |
| LetterDP | 2 | 1608 | 805/803 | 16 |
| LetterIJ | 2 | 1502 | 755/747 | 16 |
| LetterVY | 2 | 1550 | 764/786 | 16 |
| LetterEF | 2 | 1543 | 768/755 | 16 |
| 7VS9 | 2 | 14251 | 7293/6958 | 784 (PCA 10) |
| Toy1 | 2 | 1000 | 500/500 | 2 |
| Toy2 | 2 | 400 | 200/200 | 2 |
• Table 5: Performance comparison of active learning algorithms in terms of ALC (the classifier is LR)$^{\rm~a)}$

|  | Random | MaxEntropy | MaxAM | MMC | VR | Diversity | Pr |
|---|---|---|---|---|---|---|---|
| Heart | 20.7303 | 21.3208 | **22.5876** | 21.3792 | 20.9729 | 22.1399 | $<$0.001 |
| Breast | 23.8481 | 24.1926 | **24.3251** | 23.6402 | 24.1977 | 23.6796 | $<$0.001 |
| DvsP | 27.3869 | 28.5772 | **28.6395** | 28.6394 | 27.6903 | 28.2065 | $<$0.001 |
| IvsJ | 25.2297 | 26.1200 | **26.2846** | 25.5102 | 25.6006 | 25.4774 | $<$0.001 |
| VvsY | 24.8488 | 25.5989 | **25.7868** | 25.5066 | 25.1008 | 25.0405 | 0.281 |
| EvsF | 27.1054 | 28.1053 | **28.1780** | 27.4638 | 27.2350 | 27.6923 | $<$0.001 |
| 7VS9 | 26.7563 | 26.9520 | **26.9590** | 26.7247 | 26.8880 | 26.8400 | $<$0.001 |
| Toy1 | 29.1916 | 29.3444 | 29.3830 | **29.3839** | 29.3553 | 29.3433 | $<$0.001 |
| Toy2 | 29.1606 | 27.1131 | **29.7385** | 27.2998 | 27.2559 | 29.1533 | $<$0.001 |
| Win | 0 | 0 | **8** | 1 | 0 | 0 | – |
| Rank | 5.20 | 3.11 | **1.11** | 3.67 | 3.89 | 4 | – |

• Table 6: Performance comparison of active learning algorithms in terms of ALC (the classifier is SVM)$^{\rm~a)}$

|  | Random | MaxEntropy | MaxAM | MMC | VR | Diversity | Pr |
|---|---|---|---|---|---|---|---|
| Heart | 19.9130 | 19.8200 | **20.4494** | 19.6234 | 19.6234 | 19.9469 | 0.6839 |
| Breast | 23.1817 | 23.5343 | **23.8023** | 23.5646 | 23.4649 | 23.6947 | $<$0.001 |
| DvsP | 28.0414 | 28.4266 | **28.7438** | 28.3654 | 28.0947 | 28.2963 | $<$0.001 |
| IvsJ | 26.7640 | 25.5787 | **25.8420** | 25.1469 | 24.8652 | 25.3375 | $<$0.001 |
| VvsY | 24.3555 | 25.2773 | **25.6406** | 24.8179 | 24.0420 | 24.1685 | $<$0.001 |
| EvsF | 27.4661 | 28.1299 | **28.2953** | 27.7841 | 27.5363 | 27.4708 | $<$0.001 |
| 7VS9 | 26.5471 | 26.8171 | **26.8912** | 26.6995 | 26.7957 | 26.5487 | $<$0.001 |
| Toy1 | 29.2302 | 29.4354 | **29.4426** | 29.3466 | 29.4112 | 29.3556 | $<$0.001 |
| Toy2 | 26.8595 | 26.4866 | **28.5427** | 27.2929 | 26.2620 | 28.0407 | $<$0.001 |
| Win | 0 | 0 | **9** | 0 | 0 | 0 | – |
| Rank | 5.38 | 2.5 | **1** | 3.88 | 4.63 | 3.75 | – |

• Table 7: Average run time of active learning algorithms

|  | Random | MaxEntropy | MaxAM | MMC | VR | Diversity |
|---|---|---|---|---|---|---|
| LR | 0.023 | 0.026 | **0.645** | 0.052 | 0.263 | 0.058 |
| SVM | 0.024 | 0.828 | **1.687** | 0.923 | 1.122 | 0.056 |
