
SCIENCE CHINA Information Sciences, Volume 63, Issue 11: 210205 (2020). https://doi.org/10.1007/s11432-020-3082-9

Active learning based on belief functions

  • Received: Jun 27, 2020
  • Accepted: Sep 10, 2020
  • Published: Oct 22, 2020

Abstract


Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant No. 61671370), the Postdoctoral Science Foundation of China (Grant No. 2016M592790), and the Postdoctoral Science Research Foundation of Shaanxi Province (Grant No. 2016BSHEDZZ46).



  • Figure 1

    The process of active learning.

  • Figure 2

    (Color online) The problem of traditional uncertainty sampling.

  • Figure 3

    Uncertainty sampling based on belief functions.

  • Figure 5

    (Color online) Distribution of artificial dataset Toy2.

  • Figure 6

    (Color online) Active learning performance on Heart dataset based on LR. (a) Learning curve; (b) distributions of the classifier's accuracy.

  • Figure 14

    (Color online) Active learning performance on Toy2 dataset based on LR. (a) Learning curve; (b) distributions of the classifier's accuracy.

  • Table 1

    Table 1  Uncertainty based on probability entropy

              $p_{1}$   $p_{2}$   Probability entropy
    Sample 1  0.5       0.5       0.6931
    Sample 2  0.9933    0.0067    0.0402
    Sample 3  0.9933    0.0067    0.0402
    Sample 4  0.5       0.5       0.6931
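    The tabulated values are the Shannon entropy with the natural logarithm, ${\rm Un}(x) = -\sum_{k} p_{k}\ln p_{k}$. For Sample 2, for instance,

    $${\rm Un}(x) = -(0.9933\ln 0.9933 + 0.0067\ln 0.0067) \approx 0.0402,$$

    while the maximally ambiguous Samples 1 and 4 reach $\ln 2 \approx 0.6931$.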
  •   

    Algorithm 1 Traditional uncertainty sampling

    Require: A set of labeled samples $L$, a set of unlabeled samples $U$.

    while Termination condition not satisfied do

    Train a classifier $\phi_{c}(\cdot \mid L)$ on the labeled samples;

    for $i = 1:|U|$ do

    Calculate the uncertainty of the sample, ${\rm Un}(x_{i}^{u})$;

    end for

    $x^{*} = \mathop{\rm argmax}\nolimits_{i} {\rm Un}(x_{i}^{u})$;

    $U = U \setminus \{x^{*}\}$;

    $L = L \cup \{(x^{*}, {\rm GetLabel}(x^{*}))\}$;

    end while
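    A minimal Python sketch of Algorithm 1, assuming a scikit-learn probabilistic classifier (logistic regression here, matching the paper's LR setting) and a caller-supplied oracle get_label; both of these names are illustrative, not part of the original algorithm statement.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def shannon_entropy(proba):
            # Row-wise Shannon entropy of predicted class probabilities, i.e. Un(x_i^u).
            p = np.clip(proba, 1e-12, 1.0)
            return -np.sum(p * np.log(p), axis=1)

        def uncertainty_sampling(X_lab, y_lab, X_pool, get_label, n_queries):
            # One query per iteration: train phi_c(. | L), pick x* = argmax Un, label it.
            for _ in range(n_queries):
                clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
                un = shannon_entropy(clf.predict_proba(X_pool))
                best = int(np.argmax(un))                  # x^*
                x_star = X_pool[best:best + 1]
                X_lab = np.vstack([X_lab, x_star])         # L = L u {(x^*, GetLabel(x^*))}
                y_lab = np.append(y_lab, get_label(x_star))
                X_pool = np.delete(X_pool, best, axis=0)   # U = U \ {x^*}
            return X_lab, y_lab, X_pool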

  •   

    Algorithm 2 USBF

    Require: A set of labeled samples $L$, a set of unlabeled samples $U$.

    while Termination condition not satisfied do

    Train a classifier $\phi_{c}(\cdot \mid L)$ on the labeled samples;

    $U_{R} = \emptyset$;

    for $i = 1:|U|$ do

    $p(x_{i}^{u}) = \phi_{c}(x_{i}^{u} \mid L)$;

    Calculate the belief functions of $x_{i}^{u}$, $m_{x_{i}^{u}}(\theta)$;

    if $m_{x_{i}^{u}}(\emptyset) < \eta$ then

    $U_{R} = U_{R} \cup \{x_{i}^{u}\}$;

    end if

    end for

    for $i = 1:|U_{R}|$ do

    Calculate ${\rm AM}(x_{i}^{u})$;

    end for

    $x^{*} = \mathop{\rm argmax}\nolimits_{i} {\rm AM}(x_{i}^{u})$;

    $U = U \setminus \{x^{*}\}$;

    $L = L \cup \{(x^{*}, {\rm GetLabel}(x^{*}))\}$;

    end while
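    A Python sketch of the selection step of Algorithm 2. The construction of the masses $m_{x}(\cdot)$ from the classifier output is not reproduced here (the input masses is assumed given), and AM is computed as the Shannon entropy of the pignistic probability, an assumption that is consistent with the AM column of Table 3 below.

        import numpy as np

        def pignistic(m_empty, m_theta1, m_theta2, m_Theta):
            # Pignistic transform on the frame {theta1, theta2}: each singleton gets
            # its own mass plus half of m(Theta), renormalized by 1 - m(empty).
            bet1 = (m_theta1 + m_Theta / 2.0) / (1.0 - m_empty)
            bet2 = (m_theta2 + m_Theta / 2.0) / (1.0 - m_empty)
            return np.array([bet1, bet2])

        def ambiguity_measure(m):
            # AM(x): Shannon entropy (natural log) of the pignistic probabilities.
            bet = np.clip(pignistic(*m), 1e-12, 1.0)
            return float(-np.sum(bet * np.log(bet)))

        def usbf_select(masses, eta):
            # masses: one tuple (m(empty), m(theta1), m(theta2), m(Theta)) per
            # unlabeled sample; eta: conflict threshold on m(empty).
            retained = [i for i, m in enumerate(masses) if m[0] < eta]        # U_R
            if not retained:
                return None
            return max(retained, key=lambda i: ambiguity_measure(masses[i]))  # x^*

    For example, ambiguity_measure((0.1146, 0.1839, 0.1839, 0.5175)) returns $\ln 2 \approx 0.6931$, matching Sample 1 in Table 3.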

  • Table 2

    Table 2  Symbols used in the algorithms

    Symbol                     Description
    $L$                        All labeled samples
    $U$                        All unlabeled samples
    $x^{*}$                    The selected sample
    ${\rm AM}(x_{i}^{u})$      The ambiguity measure of $x_{i}^{u}$
    $\phi_{c}(\cdot \mid L)$   Classifier trained on the labeled samples
    $p(x_{i}^{u})$             The output probability of $x_{i}^{u}$
    $|U|$                      The number of unlabeled samples
    $x_{i}^{u}$                An unlabeled sample
    ${\rm Un}(x_{i}^{u})$      The uncertainty of $x_{i}^{u}$
    $m_{x_{i}^{u}}(\theta)$    The belief function of $x_{i}^{u}$
    $\eta$                     Threshold on $m(\emptyset)$
  • Table 3

    Table 3  Uncertainty based on belief functions

              $m(\emptyset)$  $m(\theta_{1})$  $m(\theta_{2})$         $m(\Theta)$  AM
    Sample 1  0.1146          0.1839           0.1839                  0.5175       0.6931
    Sample 2  0.0374          0.6025           0.0041                  0.3560       0.4850
    Sample 3  0.0911          0.0420           $2.833\times 10^{-4}$   0.8665       0.6921
    Sample 4  0.3739          0.0175           0.0175                  0.5853       –
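    The AM column is reproduced by taking the Shannon entropy of the pignistic probability ${\rm BetP}(\theta_{k}) = \big(m(\theta_{k}) + m(\Theta)/2\big)/\big(1 - m(\emptyset)\big)$; this is an inference from the tabulated numbers, not a definition quoted from the paper. For Sample 2:

    $${\rm BetP}(\theta_{1}) = \frac{0.6025 + 0.3560/2}{1 - 0.0374} \approx 0.8108, \qquad {\rm BetP}(\theta_{2}) \approx 0.1892,$$

    $${\rm AM} = -0.8108\ln 0.8108 - 0.1892\ln 0.1892 \approx 0.4850.$$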
  • Table 4

    Table 4  Details of datasets

    Dataset    Classes  Samples  Class proportion  Features
    Heart      2        270      150/120           13
    Breast     2        683      444/239           9
    LetterDP   2        1608     805/803           16
    LetterIJ   2        1502     755/747           16
    LetterVY   2        1550     764/786           16
    LetterEF   2        1543     768/755           16
    7VS9       2        14251    7293/6958         784 (PCA 10)
    Toy1       2        1000     500/500           2
    Toy2       2        400      200/200           2
  • Table 5

    Table 5  Performance comparison of active learning algorithms in terms of ALC (the classifier is LR)

    Dataset  Random   MaxEntropy  MaxAM    MMC      VR       Diversity  Pr
    Heart    20.7303  21.3208     22.5876  21.3792  20.9729  22.1399    $<$0.001
    Breast   23.8481  24.1926     24.3251  23.6402  24.1977  23.6796    $<$0.001
    DvsP     27.3869  28.5772     28.6395  28.6394  27.6903  28.2065    $<$0.001
    IvsJ     25.2297  26.1200     26.2846  25.5102  25.6006  25.4774    $<$0.001
    VvsY     24.8488  25.5989     25.7868  25.5066  25.1008  25.0405    0.281
    EvsF     27.1054  28.1053     28.1780  27.4638  27.2350  27.6923    $<$0.001
    7VS9     26.7563  26.9520     26.9590  26.7247  26.8880  26.8400    $<$0.001
    Toy1     29.1916  29.3444     29.3830  29.3839  29.3553  29.3433    $<$0.001
    Toy2     29.1606  27.1131     29.7385  27.2998  27.2559  29.1533    $<$0.001
    Win      0        0           8        1        0        0
    Rank     5.20     3.11        1.11     3.67     3.89     4
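    ALC is taken here to be the area under the learning curve (accuracy as a function of the number of queried labels, as in the Figure 6-style plots); under that assumption, a minimal sketch of the score:

        import numpy as np

        def alc(n_labels, accuracy):
            # Area under the learning curve via the trapezoidal rule; n_labels and
            # accuracy are the x- and y-values of one learning curve.
            x = np.asarray(n_labels, dtype=float)
            y = np.asarray(accuracy, dtype=float)
            return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)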

  • Table 6

    Table 6  Performance comparison of active learning algorithms in terms of ALC (the classifier is SVM)

    Dataset  Random   MaxEntropy  MaxAM    MMC      VR       Diversity  Pr
    Heart    19.9130  19.8200     20.4494  19.6234  19.6234  19.9469    0.6839
    Breast   23.1817  23.5343     23.8023  23.5646  23.4649  23.6947    $<$0.001
    DvsP     28.0414  28.4266     28.7438  28.3654  28.0947  28.2963    $<$0.001
    IvsJ     26.7640  25.5787     25.8420  25.1469  24.8652  25.3375    $<$0.001
    VvsY     24.3555  25.2773     25.6406  24.8179  24.0420  24.1685    $<$0.001
    EvsF     27.4661  28.1299     28.2953  27.7841  27.5363  27.4708    $<$0.001
    7VS9     26.5471  26.8171     26.8912  26.6995  26.7957  26.5487    $<$0.001
    Toy1     29.2302  29.4354     29.4426  29.3466  29.4112  29.3556    $<$0.001
    Toy2     26.8595  26.4866     28.5427  27.2929  26.2620  28.0407    $<$0.001
    Win      0        0           9        0        0        0
    Rank     5.38     2.5         1        3.88     4.63     3.75

  • Table 7

    Table 7  Average run time of active learning algorithms

    Classifier  Random  MaxEntropy  MaxAM  MMC    VR     Diversity
    LR          0.023   0.026       0.645  0.052  0.263  0.058
    SVM         0.024   0.828       1.687  0.923  1.122  0.056