logo

SCIENTIA SINICA Informationis, Volume 51 , Issue 7 : 1116(2021) https://doi.org/10.1360/SSI-2020-0209

Network-splitter: a network feature extraction algorithm based on overlapping community and its application in link prediction

More info
  • ReceivedJul 13, 2020
  • AcceptedOct 11, 2020
  • PublishedJun 30, 2021

Abstract


Funded by

国家自然科学基金(61803266,61703281,62072311,71874172)

广东省自然科学基金(2019A1515011173,2019A1515011064,2017B030314073)

深圳市自然科学基金(JCYJ20190808162601658,JCYJ20180305124628810)


References

[1] Zhang S, Yao L, Sun A. Deep Learning Based Recommender System. ACM Comput Surv, 2019, 52: 1-38 CrossRef Google Scholar

[2] Lü L, Zhou T. Link prediction in complex networks: A survey. Physica A-Statistical Mech its Appl, 2011, 390: 1150-1170 CrossRef ADS arXiv Google Scholar

[3] Wang Z T, Chen C Y, Li W J. Predictive network representation learning for link prediction. In: Proceedings of the 40th International ACM SIGIR Conference, 2017. 969--972. Google Scholar

[4] Costa L F, Oliveira Jr. O N, Travieso G. Analyzing and modeling real-world phenomena with complex networks: a survey of applications. Adv Phys, 2011, 60: 329-412 CrossRef ADS arXiv Google Scholar

[5] Wen G, Yu W, Yu X. Complex cyber-physical networks: From cybersecurity to security control. J Syst Sci Complex, 2017, 30: 46-67 CrossRef Google Scholar

[6] Kossinets G. Effects of missing data in social networks. Social Networks, 2006, 28: 247-268 CrossRef Google Scholar

[7] Guimerà R, Sales-Pardo M. Missing and spurious interactions and the reconstruction of complex networks. Proc Natl Acad Sci USA, 2009, 106: 22073-22078 CrossRef PubMed ADS arXiv Google Scholar

[8] Dhote Y, Mishra N, Sharma S. Survey and analysis of temporal link prediction in online social networks. In: Proceedings of the 2013 International Conference on Advances in Computing, Communications and Informatics(ICACCI), India, 2013. 1178--1183. Google Scholar

[9] Adamic L A, Adar E. Friends and neighbors on the Web. Social Networks, 2003, 25: 211-230 CrossRef Google Scholar

[10] Newman M E J. From the Cover: The structure of scientific collaboration networks. Proc Natl Acad Sci USA, 2001, 98: 404-409 CrossRef ADS arXiv Google Scholar

[11] Watts D J, Strogatz S H. Collective dynamics of `small-world' networks. Nature, 1998, 393: 440-442 CrossRef PubMed ADS Google Scholar

[12] Sun J, Feng L, Xie J. Revealing the predictability of intrinsic structure in complex networks. Nat Commun, 2020, 11: 574 CrossRef PubMed ADS Google Scholar

[13] Zhou T, Lü L, Zhang Y C. Predicting missing links via local information. Eur Phys J B, 2009, 71: 623-630 CrossRef ADS arXiv Google Scholar

[14] Barabási A L, Albert R. Emergence of Scaling in Random Networks. Science, 1999, 286: 509-512 CrossRef PubMed ADS arXiv Google Scholar

[15] Rücker G. Network meta-analysis, electrical networks and graph theory.. Res Syn Meth, 2012, 3: 312-324 CrossRef PubMed Google Scholar

[16] Liu H, Hu Z, Haddadi H. Hidden link prediction based on node centrality and weak ties. EPL, 2013, 101: 18004 CrossRef ADS Google Scholar

[17] Neville J, Jensen D. Relational dependency networks. J MACH LEARN RES, 2007, 8(24): 653-692. Google Scholar

[18] Yu K, Chu W, Yu S P, et al. Stochastic relational models for discriminative link prediction. In: Proceedings of the 19th International Conference on Neural Information Processing Systems, 2007. 1553--1560. Google Scholar

[19] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems, 2013. 3111--3119. Google Scholar

[20] Ou M D, Cui P, Pei J, et al. Asymmetric transitivity preserving graph embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016. 1105--1114. Google Scholar

[21] Zhang Z, Cui P, Zhu W. Deep Learning on Graphs: A Survey. IEEE Trans Knowl Data Eng, 2020, : 1-1 CrossRef Google Scholar

[22] Coscia M, Rossetti G, Giannotti F. Uncovering Hierarchical and Overlapping Communities with a Local-First Approach. ACM Trans Knowl Discov Data, 2014, 9: 1-27 CrossRef Google Scholar

[23] Epasto A, Lattanzi S, Paes Leme R. Ego-splitting framework: from non-overlapping to overlapping clusters. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017. 145--154. Google Scholar

[24] Rees B S, Gallagher K B. Overlapping community detection by collective friendship group inference.In: Proceedings of International Conference on Advances in Social Networks Analysis and Mining(ASONAM), Denmark, 2010. 375--379. Google Scholar

[25] Jaccard P. THE DISTRIBUTION OF THE FLORA IN THE ALPINE ZONE.1. New Phytol, 1912, 11: 37-50 CrossRef Google Scholar

[26] Perozzi B, Al-Rfou R, Skiena S. Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014. 701--710. Google Scholar

[27] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems, 2013. 3111--3119. Google Scholar

[28] Grover A, Leskovec J. Node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International, 2016. 855--864. Google Scholar

[29] Tang J, Qu M, Wang M Z, et al. Line: Large-scale information network embedding. WWW, 2015: 1067-1077. doi: 10.1145/2736277.2741093. Google Scholar

[30] Wang H, Wang J, Wang J. Learning Graph Representation with Generative Adversarial Nets. IEEE Trans Knowl Data Eng, 2020, : 1-1 CrossRef Google Scholar

[31] Epasto A, Perozzi B. Is a single embedding enough? Learning node representations that capture multiple social contexts. In: Proceedings of the World Wide Web Conference, 2019. 394--404. Google Scholar

[32] Fawcett T. An introduction to ROC analysis. Pattern Recognition Lett, 2006, 27: 861-874 CrossRef Google Scholar

[33] Hochreiter S, Schmidhuber J. Long short-term memory.. Neural Computation, 1997, 9: 1735-1780 CrossRef PubMed Google Scholar

[34] Zhou P, Shi W, Tian J, et al. Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016. 207--212. Google Scholar

[35] Cho K, Van Merriënboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of Conference on Empirical Methods in Natural Language Processing(EMNLP), 2014. Google Scholar

[36] Raghavan U N, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E, 2007, 76: 036106 CrossRef PubMed ADS arXiv Google Scholar

  • Figure 1

    (Color online) The modeling process of network-splitter

  • Figure 2

    (Color online) LSTM model

  • Figure 4

    (Color online) Influence of different dimensions on the algorithms in the four real datasets. (a) Musae-Facebook; protectłinebreak (b) Musae-Github; (c) PPI; (d) power-grid

  • Figure 5

    (Color online) Influence of data partition on the algorithms in the three real datasets. (a) Musae-Github;protect łinebreak (b) PPI; (c) power-grid

  • Figure 6

    (Color online) Comparison of visualization results of the five methods. (a) Zachary Karate Club network;protectłinebreak (b) Node2vec; (c) network-LSTM; (d) network-BiLSTM; (e) network-GRU; (f) network-BiGRU

  • Table 1   Running time of network-splitter model in 64-dimension-vectorizing PPI dataset
    Model Running time (s)
    Splitter 399.48($\pm44.28$)
    Network-LSTM 404.04($\pm44.13$)
    Network-BiLSTM 410.87($\pm43.74$)
    Network-GRU 403.80($\pm44.19$)
    Network-BiGRU 409.27($\pm43.56$)
  • Table 2   Basic properties of the four real networks
    Musae-Facebook Musae-Github PPI Power-grid
    Nodes 22470 37700 3852 4941
    Edges 171002 289003 20881 6594
    Publication year 2017 2019 2008 1998
    Average degree 15.21 15.33 9.85 2.67
  • Table 3   AUC comparison of algorithms in 64-dimension-vectorizing networks
    Musae-Facebook Musae-Github PPI Power-grid
    AA [9] 0.945 0.855 0.913 0.628
    RA [13] 0.945 0.856 0.913 0.628
    JC [25] 0.944 0.802 0.912 0.628
    Deepwalk [26] 0.957($\pm0.00015$) 0.903($\pm0.00037$) 0.926($\pm0.00044$) 0.798($\pm0.00089$)
    Node2vec [28] 0.958($\pm0.00021$) 0.902($\pm0.00014$) 0.926($\pm0.00011$) 0.799($\pm0.00049$)
    LINE [29] 0.923($\pm0.00025$) 0.854($\pm0.00024$) 0.893($\pm0.00027$) 0.773($\pm0.00747$)
    Graph-GAN [30] 0.942($\pm0.00029$) 0.876($\pm0.00038$) 0.936($\pm0.00041$) 0.766($\pm0.00035$)
    Splitter [31] 0.958($\pm0.00112$) 0.886($\pm0.00308$) 0.937($\pm0.00224$) 0.849($\pm0.01043$)
    Network-splitter 0.959($\pm$0.00098) 0.914($\pm$0.00017) 0.955($\pm$0.00392) 0.903($\pm$0.00641)
  • Table 4   Precision comparison of algorithms in 64-dimension-vectorizing networks
    Musae-Facebook Musae-Github PPI Power-grid
    AA [9] 0.945 0.864 0.914 0.628
    RA [13] 0.946 0.864 0.913 0.628
    JC [25] 0.944 0.727 0.913 0.627
    Deepwalk [26] 0.956($\pm0.00027$) 0.918($\pm0.00033$) 0.922($\pm0.00083$) 0.800($\pm0.00192$)
    Node2vec [28] 0.958($\pm0.00012$) 0.917($\pm0.00016$) 0.927($\pm0.00035$) 0.801($\pm0.00057$)
    LINE [29] 0.931($\pm0.00022$) 0.859($\pm0.00058$) 0.824($\pm0.00036$) 0.838($\pm0.00087$)
    Graph-GAN [30] 0.945($\pm0.00023$) 0.880($\pm0.00058$) 0.926($\pm0.00048$) 0.747($\pm0.00061$)
    Splitter [31] 0.963($\pm0.00146$) 0.895($\pm0.00292$) 0.934($\pm0.00282$) 0.887($\pm0.00693$)
    Network-splitter 0.966($\pm$0.00117) 0.924($\pm$0.00045) 0.965($\pm$0.00377) 0.926($\pm$0.00485)
qqqq

Contact and support