logo

SCIENTIA SINICA Informationis, Volume 50 , Issue 10 : 1529(2020) https://doi.org/10.1360/N112018-00188

A negative selection algorithm hole improvement method based on MHC

More info
  • ReceivedFeb 25, 2019
  • AcceptedAug 8, 2019
  • PublishedOct 13, 2020

Abstract


Funded by

国家重点研发计划(2016YFB0800600)

国家自然科学基金(U1736212,61572334)

四川省重点研发项目(2018GZ0183)


References

[1] Castro L R D, Timmis J. Artificial Immune Systems: A New Computational Intelligence Paradigm. New York: Springer, 2002. 368. Google Scholar

[2] Forrest S, Perelson A S, Allen L, et al. Self-nonself discrimination in a computer. In: Proceedings of IEEE Symposium on Security and Privacy, Oakland, 1994. 202--212. Google Scholar

[3] Klarreich E. Inspired by immunity. Nature, 2002, 415: 468-470 CrossRef PubMed ADS Google Scholar

[4] Balthrop J. Computer science. Technological networks and the spread of computer viruses.. Science, 2004, 304: 527-529 CrossRef PubMed Google Scholar

[5] Ji Z, Dasgupta D. Revisiting negative selection algorithms.. Evolary Computation, 2007, 15: 223-251 CrossRef PubMed Google Scholar

[6] Ji Z, Dasgupta D. V-detector: An efficient negative selection algorithm with "probably adequate" detector coverage. Inf Sci, 2009, 179: 1390-1406 CrossRef Google Scholar

[7] Gonzàlez F A, Dasgupta D, Ni no L F. A Randomized Real-Valued Negative Selection Algorithm. In: Proceedings of the 2nd International Conference on Artificial Immune Systems, Edinburgh, 2003. 261--272. Google Scholar

[8] Stibor T, Timmis J, Eckert C. A comparative study of real-valued negative selection to statistical anomaly detection techniques. In: Proceedings of International Conference on Artificial Immune Systems, Banff, 2005. 262--275. Google Scholar

[9] Zhang H, Wu L F, Zhang Y S, et al. An algorithm of $r$-adjustable negative selection algorithm and its simulation analysis. Chin J Comput, 2005, 28: 1614--1619. Google Scholar

[10] He S. A Negative Selection Algorithm with the Variable Length Detector. J Software, 2007, 18: 1361-1368 CrossRef Google Scholar

[11] Idris I, Selamat A, Thanh Nguyen N. A combined negative selection algorithm-particle swarm optimization for an email spam detection system. Eng Appl Artificial Intelligence, 2015, 39: 33-44 CrossRef Google Scholar

[12] Cui L, Pi D, Chen C. BIORV-NSA: Bidirectional inhibition optimization r-variable negative selection algorithm and its application. Appl Soft Computing, 2015, 32: 544-552 CrossRef Google Scholar

[13] Gong M, Zhang J, Ma J. An efficient negative selection algorithm with further training for anomaly detection. Knowledge-Based Syst, 2012, 30: 185-191 CrossRef Google Scholar

[14] Fouladvand S, Osareh A, Shadgar B. DENSA: An effective negative selection algorithm with flexible boundaries for self-space and dynamic number of detectors. Eng Appl Artificial Intelligence, 2017, 62: 359-372 CrossRef Google Scholar

[15] Li D, Liu S, Zhang H. Negative selection algorithm with constant detectors for anomaly detection. Appl Soft Computing, 2015, 36: 618-632 CrossRef Google Scholar

[16] Ljunggren H G, K?rre K. In search of the 'missing self': MHC molecules and NK cell recognition. Immunol Today, 1990, 11: 237-244 CrossRef Google Scholar

[17] Yates A J. Theories and quantification of thymic selection.. Front Immunol, 2014, 5 CrossRef PubMed Google Scholar

[18] Carpenter A C, Bosselut R. Decision checkpoints in the thymus.. Nat Immunol, 2010, 11: 666-673 CrossRef PubMed Google Scholar

[19] De Berg M. Computational Geometry: Algorithms and Applications. Berlin: Springer, 2008. Google Scholar

[20] Klee V. On the complexity ofd- dimensional Voronoi diagrams. Arch Math, 1980, 34: 75-80 CrossRef Google Scholar

[21] Herceg M, Kvasnica M, Jones C, et al. Multi-parametric toolbox 3.0. In: Proceedings of the European control conference, Zurich, 2013. 502--510. Google Scholar

[22] Costa Silva G, Caminhas W M, Palhares R M. Artificial immune systems applied to fault detection and isolation: A brief review of immune response-based approaches and a case study. Appl Soft Computing, 2017, 57: 118-131 CrossRef Google Scholar

[23] Yu Z, Tsai J J P. Intrusion Detection: A Machine Learning Approach. London: Imperial College Press, 2011. Google Scholar

[24] Yamanishi K, Takeuchi J-I, Williams G, et al. On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms. In: Proceedings of the sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2000. 320--324. Google Scholar

[25] Liu F T, Ting K M, Zhou Z. Isolation Forest. In: Proceedings of 2008 8th IEEE International Conference on Data Mining, Oakland, 2008. 413--422. Google Scholar

[26] Ting K M, Zhou G-T, Liu F T, et al. Mass estimation and its applications. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2010. 989--998. Google Scholar

  • Figure 1

    (Color online) A partition of feature space. $S_1$–$S_5$ are 5 base points.The solid lines are the Voronoi diagram ${\rm~Vor}(S)$. The feature space $[0,1]^2$ is partitioned into 5 cells $\nu(S_1)$–$\nu(S_5)$. $p$ is a center of MHC-I detector. $q$ is a center of MHC-II detector

  • Figure 2

    (Color online) The distribution of holes. $S_1$ and $S_2$ are self samples. The points $a,b,c$ and $d$ are vertexes in the cell.The pure color circle centered on $S_1$ and $S_2$ represents the self region.The fringe shadow circle centered on point $a,~b,~c$ and $d$ represents the MHC-I detectoror the MHC-II detector. The projection of $S_1$ on line $ab$ is located in line segment $ab$, and the distance between $S_1$ and line $ab$ is (a) greater than self radius $r_s$, and (b) less than self radius $r_s$.The projection of $S_2$ on line $cd$ is not located in line segment $cd$, and the distance between $S_2$ and line $cd$ is (c) greater than self radius $r_s$, and (d) less than self radius $r_s$

  • Figure 3

    (Color online) Part of the hole.The sector with a point $S$ as the center is a self region. The sector with a point $A$ or $B$ as the center is an MHC-I or MHC-II detector region. The blank area is the hole

  •   

    Algorithm 1 To calculate MHC-I detectors and MHC-II detectors

    Require:Output: Set of the hole improvement detectors $D$.

    $S~\Leftarrow$ normalize $S$ into $[0,1]^d$;

    ${\rm~Vor}\left(~S~\right)~\Leftarrow$ construct voronoi diagram by $S$;

    ${\rm~Vet}\left(~S_i~\right)$ $\Leftarrow$ get all vertex in ${\rm~Vor}\left(~S~\right)$;

    ${\rm~Vet}\left(~S_j~\right)$ $\Leftarrow$ get duplicate samples in ${\rm~Vet}\left(~S_i~\right)$;

    for ${\rm~Vet}\left(~S_k~\right)~\in~{\rm~Vet}\left(~S_j~\right)$

    $r~=~{\rm~dist}\left(~{\rm~Vet}\left(~S_k~\right),~S_k~\right)$;

    if $r~>~\theta$ then

    $D$ $\Leftarrow$ $\langle{\rm~Vet}\left(~S_k~\right),~r\rangle$;

    end if

    end for

    return $D$.

  • Table 1   Comparison of time complexity
    Time complexity Comment
    NNSA [2] $O(~-\frac{\ln~{{P}_{f}}\cdot{{N}_{S}}}{{{P}_{m}}{{\left(~1-{{P}_{m}}~\right)}^{{{N}_{S}}}}}~)$ Exponential level
    RNSA [7] $O(~\frac{\left|~D~\right|\cdot{{N}_{S}}}{{{\left(~1-{{P}_{m}}~\right)}^{{{N}_{S}}}}}~)$ Exponential level
    V-Detector [6] $O(~\frac{\left|~D~\right|\cdot{{N}_{S}}}{{{\left(~1-{{P}_{m}}~\right)}^{{{N}_{S}}}}}~)$ Exponential level
    MHC-NSA $O(~{{N}_{S}}\log~{{N}_{S}}+{{N}_{S}}^{\left\lceil~{d}/{2}\;~\right\rceil~}~)$ Polynomial level
  • Table 2   The results on Balance Scale dataset
    lllll
    DR (%)FAR (%)Generated detectorsEliminated detectorsTraining time (s)
    Mean SD Mean SD Mean SD Mean SD Mean SD
    RNSA 55.18 1.56 1.43 1.76 1500.00 0 2700.45 171.09 6.83 0.29
    V-Detector 64.98 1.84 4.18 2.04 1485.95 62.83 3769.75 288.11 88.52 7.66
    BIORV-NSA 79.61 1.60 4.18 2.14 1500.00 0 230.30 26.07 43.04 0.06
    MHC-NSA 93.57 0.56 4.49 2.15 366.60 24.67 60.10 7.57 3.16 0.76

    a

  • Table 3   The simulation parameters in KDD CUP 99 dataset
    llll
    MHC-NSAOne-Class SVMIsolation ForestLOF
    $N_S$ $r_s$ $\theta$ Kernel $\nu$ $\gamma$ Max_samples Contamination $k$ Contamination
    HTTP 300 0.2 0.001 RBF 0.004 0.33 256 0.004 20 0.004
    FTP 100 0.05 0.001 RBF 0.0772 0.33 256 0.0772 20 0.0772
    FTP-Data 300 0.004 0.0001 RBF 0.0237 0.33 256 0.0237 20 0.0237
  • Table 4   The results on KDD CUP 99 dataset
    llll
    MHC-NSAOne-Class SVMIsolation ForestLOF
    DR (%) FAR (%) DR (%) FAR (%) DR (%) FAR (%) DR (%) FAR (%)
    HTTP 99.88 0.40 99.02 0.39 99.74 89.14 99.43 51.79
    FTP 99.92 1.45 91.60 61.64 90.70 0.27 91.94 0.00
    FTP-Data 94.56 3.23 97.49 84.52 97.55 85.00 97.99 54.93
    Mean 98.12 1.69 96.03 48.85 96.00 58.14 96.45 35.57

    a