SCIENTIA SINICA Informationis, Volume 51 , Issue 7 : 1131(2021) https://doi.org/10.1360/SSI-2020-0229

A multi-objective reinforcement learning framework for community deception

More info
  • ReceivedJul 26, 2020
  • AcceptedOct 27, 2020
  • PublishedJul 13, 2021


Funded by





[1] Newman M E J. Communities, modules and large-scale structure in networks. Nat Phys, 2012, 8: 25-31 CrossRef Google Scholar

[2] Li C Y, Tang Y, Lin H, et al. Parallel overlapping community detection algorithm in complex networks based on label propagation. Sci Sin Inform, 2016, 46: 212-227. Google Scholar

[3] Qiao S J, Han N, Zhang K F, et al. Algorithm for detecting overlapping communities from complex network big data. J Softw, 2017, 28: 631-647. Google Scholar

[4] Li D, Cheng M, Xu Y. Optimal community detection method based on average mutual information. Sci Sin-Inf, 2019, 49: 613-629 CrossRef Google Scholar

[5] Barbieri N, Bonchi F, Manco G. Influence-based network-oblivious community detection. In: Proceedings of IEEE International Conference on Data Mining. Dallas, TX, USA, 2013. 955-960. Google Scholar

[6] Zhang K, Bhattacharyya S, Ram S. Large-Scale Network Analysis for Online Social Brand Advertising. MISQ, 2016, 40: 849-868 CrossRef Google Scholar

[7] Kearns M, Roth A, Wu Z S. Private algorithms for the protected in social network search. Proc Natl Acad Sci USA, 2016, 113: 913-918 CrossRef Google Scholar

[8] Akoglu L, Tong H, Koutra D. Graph based anomaly detection and description: a survey. Data Min Knowl Disc, 2015, 29: 626-688 CrossRef Google Scholar

[9] Johnson N F, Zheng M, Vorobyeva Y. New online ecology of adversarial aggregates: ISIS and beyond. Science, 2016, 352: 1459-1463 CrossRef Google Scholar

[10] Li H J, Bu Z, Li A. Fast and Accurate Mining the Community Structure: Integrating Center Locating and Membership Optimization. IEEE Trans Knowl Data Eng, 2016, 28: 2349-2362 CrossRef Google Scholar

[11] Balasundaram B, Butenko S, Hicks I V. Operations Res, 2011, 59: 133-142 CrossRef Google Scholar

[12] Pizzuti C. Evolutionary computation for community detection in networks: a review. IEEE Transactions on Evolutionary Computation, 2017, 22(3): 464-483. Google Scholar

[13] Nagaraja S. The impact of unlinkability on adversarial community detection: effects and countermeasures. In: Proceedings of International Symposium on Privacy Enhancing Technologies Symposium. Berlin, Germany, 2010. 253-272. Google Scholar

[14] Waniek M, Michalak T P, Wooldridge M J. Hiding individuals and communities in a social network. Nat Hum Behav, 2018, 2: 139-147 CrossRef Google Scholar

[15] Fionda V, Pirro G. Community deception or: How to stop fearing community detection algorithms. IEEE Transactions on Knowledge and Data Engineering, 2017, 30(4): 660-673. Google Scholar

[16] Chen J, Chen L, Chen Y. GA-Based Q-Attack on Community Detection. IEEE Trans Comput Soc Syst, 2019, 6: 491-503 CrossRef Google Scholar

[17] Liu Y, Liu J, Zhang Z, et al. REM: From Structural Entropy to Community Structure Deception. In: Proceedings of Advances in Neural Information Processing Systems. Vancouver, BC, Canada, 2019. 12918-12928. Google Scholar

[18] Li A, Pan Y. Structural Information and Dynamical Complexity of Networks. IEEE Trans Inform Theor, 2016, 62: 3290-3339 CrossRef Google Scholar

[19] Traag V A, Waltman L, van Eck N J. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep, 2019, 9: 1--12. Google Scholar

[20] Frieze A, Karonski M. Introduction to Random Graphs. Cambridge University Press, 2016. ISBN: 9781107118508. Google Scholar

[21] Barabasi A L, Network Science, Cambridge University Press, Cambridge, 2016, ISBN: 9781107076266. Google Scholar

[22] Ghadge S, Killingback T, Sundaram B. A statistical construction of power-law networks. Int J Parallel Emergent Distributed Syst, 2010, 25: 223-235 CrossRef Google Scholar

[23] Arulkumaran K, Deisenroth M P, Brundage M. Deep Reinforcement Learning: A Brief Survey. IEEE Signal Process Mag, 2017, 34: 26-38 CrossRef Google Scholar

[24] Kiumarsi B, Vamvoudakis K G, Modares H, et al. Optimal and autonomous control using reinforcement learning: A survey. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(6): 2042-2062. Google Scholar

[25] Barrett L, Narayanan S. Learning all optimal policies with multiple criteria. In: Proceedings of the 25th international conference on Machine learning. Helsinki, Finland, 2008: 41-47. Google Scholar

[26] Van Moffaert K, Nowé A. Multi-objective reinforcement learning using sets of pareto dominating policies. The Journal of Machine Learning Research, 2014, 15(1): 3483-3512. Google Scholar

[27] Shang K, Ishibuchi H, He L. A Survey on the Hypervolume Indicator in Evolutionary Multi-objective Optimization. IEEE Trans Evol Computat, 2020, : 1-1 CrossRef Google Scholar

  • Figure 1

    (Color online) Community deception performance for three community detection methods with different budgets on Karate. (a) Louvain; (b) CPM; (c) LPA

  • Figure 3

    (Color online) Learning curves on Karate

  • Figure 4

    (Color online) Performance comparison of HD, REM, and ParetoRein on four datasets

  • Figure 5

    (Color online) Communities detected on the original and rewired Karate networks

  • Table 1   A comparison of different methods for community deception
    Methods Updates Techniques Measures Categories
    Nagaraja [13] Edge additions Centrality measure Miss-ratio (false negative rate) G1
    Waniek [14] Edge additions or deletions Modularity Concealment G1
    Fionda [15] Edge additions or deletions Safeness & modularity Deception score G1
    Chen [16] Edge additions and deletions Modularity Modularity & NMI G2
    Liu [17] Edge additions Normalized residual entropy Partition similarity & mutual G2
    information & query accuracy
    Our method Node and edge Multi-objective Negative ratio association G2
    additions reinforcement learning & ratio cut

    Algorithm 1 Community deception based on the network growth model

    Input: The original network $G=\{V,~E\}$, the number of edge additions $b$, the set of node additions $V^{+}$. Output: The rewired network $G'~=~\{V',~E'\}$.

    Initialize ${\rm~budget}~=~0$, $G'~=~G$;

    for ${\rm~budget}\texttt{++}~<~b$

    while True do

    Choose $V_a$ randomly from $V^{+}$, add links between $V_a$ and $V_i$ in $G$ based on the four network growth models, i.e., $e_{ia}$;

    if ${e_{ia}}$ is not in $G&apos;$ then




    end if

    end while

    end for

    Return $G&apos;$.

  • Table 2   Real networks
    Name $|V|$ $|E|$ $\langle~k~\rangle~$ $\langle~C~\rangle~$
    Karate 34 78 4.59 0.588
    Dolphins 62 159 5.13 0.303
    Football 115 616 10.66 0.403
    Email 1133 5451 9.62 0.254
    Power 4941 6594 2.67 0.107

    Algorithm 2 Action strategy

    SelectAction(evalFunction, $Q$, ${\boldsymbol~s}$, $\epsilon$).

    Initialize an empty list ${\rm~evalList}~\gets~\{\}$;

    Get a random number rand, ${\rm~rand}~\in~[0,~1]$;

    if ${\rm~rand}~\leq~\epsilon$ then

    for each action ${\boldsymbol~a}~\in~\mathcal{A}$

    Get the index value $I_{{\boldsymbol~a}}~=~{\rm~evalFunction}(Q({\boldsymbol~s},{\boldsymbol~a}))$;

    Add $({\boldsymbol~a},~I_{{\boldsymbol~a}})$ into evalList;

    end for

    Choose the action ${\boldsymbol~a}^{\rm~final}$ with respect to the largest $I_{{\boldsymbol~a}}$ in evalList;


    Choose ${\boldsymbol~a}^{\rm~final}$ randomly in $\mathcal{A}$;

    end if

    Return ${\boldsymbol~a}^{\rm~final}$.


    Algorithm 3 Community deception based on scalarized multi-objective $Q$-learning

    Input: The number of edge additions $b$, the set of node additions $V^{+}$, the number of iterations $T$, $\epsilon~\in~[0,~1]$. Output: The rewired network.

    Initialize $Q({\boldsymbol~s},{\boldsymbol~a})\gets~0$, $\forall~({\boldsymbol~s},{\boldsymbol~a})\in\mathcal{S}\times\mathcal{A}$;

    for ${\rm~episode}=1$ to $~T$


    for $\tau~=~1$ to $b$

    Call the Algorithm 2 based on Eq. (12), i.e., SelectAction($f({\boldsymbol~Q},~{\boldsymbol~w})$, $Q$, ${\boldsymbol~s}^\tau$, $\epsilon$) and get the action ${\boldsymbol~a}^\tau~\in~\mathcal{A}$;

    Update the network based on the action ${\boldsymbol~a}^\tau~\in~\mathcal{A}$, i.e., add links between two nodes in $V^{+}$ and $V$, respectively;

    Get the next state ${\boldsymbol~s}^{\tau+1}$ and reward ${\boldsymbol~r}$ based on the action ${\boldsymbol~a}^\tau~\in~\mathcal{A}$;

    Call the SelectAction($Q_{{\boldsymbol~w}}$, $Q$, ${\boldsymbol~s}^{\tau+1}$, $\epsilon$) in which $\epsilon=1$ and then get the action ${\boldsymbol~a}^{\tau~+~1}~\in~\mathcal{A}$;

    for each objective $o$

    Update $Q({\boldsymbol~s},{\boldsymbol~a})$:


    end for


    end for

    end for

    Return the rewired network.

  • Table 31  

    Algorithm 4 Community deception based on Pareto multi-objective $Q$-learning

    Input: The number of edge additions $b$, the set of node additions $V^{+}$, the number of iterations $T$, $\epsilon~\in~[0,~1]$. Output: The rewired network.

    Initialize $Q({\boldsymbol~s},{\boldsymbol~a})\gets~0$, $\forall~({\boldsymbol~s},{\boldsymbol~a})\in\mathcal{S}\times\mathcal{A}$;

    for ${\rm~episode}~=1$ to $T$


    for $\tau~=~1$ to $b$

    Call the Algorithm 2 based on the Hypervolum, i.e., SelectAction(Hypervolum, $Q$, ${\boldsymbol~s}^\tau$, $\epsilon$) and get the action${\boldsymbol~a}^\tau~\in~\mathcal{A}$;

    Update the network based on the action ${\boldsymbol~a}^\tau~\in~\mathcal{A}$, i.e., add links between two nodes in $V^{+}$ and $V$, respectively;

    Get the next state ${\boldsymbol~s}^{\tau+1}$ and reward ${\boldsymbol~r}$ based on the action ${\boldsymbol~a}^\tau~\in~\mathcal{A}$;

    Update the non-dominated set, i.e., ${\rm~ND}({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})~=~{\rm~ND}(\cup_{{\boldsymbol~a}&apos;}Q({\boldsymbol~s}^{\tau+1},{\boldsymbol~a}&apos;))$;

    Update $\overline{R}({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})~=~~\overline{R}({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})~+~\frac{{\boldsymbol~r}~-~~\overline{R}({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})}{n({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})}$;

    Update $Q({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})~=~\overline{R}({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})~\oplus~\gamma~{\rm~ND}({\boldsymbol~s}^{\tau},{\boldsymbol~a}^{\tau})$;

    end for

    end for

    Return the rewired network.


Contact and support