logo

SCIENTIA SINICA Informationis, Volume 49 , Issue 12 : 1586-1605(2019) https://doi.org/10.1360/SSI-2019-0119

Bayesian network-based high-dimensional crowdsourced data publication with local differential privacy

More info
  • ReceivedJun 4, 2019
  • AcceptedAug 26, 2019
  • PublishedDec 16, 2019

Abstract


Funded by

国家自然科学基金(61572398,61772410,61802298,U1811461,11690011)

国家重点研发计划(2017YFB 1010004)

中央高校基本科研业务费(xjj2018237)

中国博士后基金(2017M623177)


References

[1] Guo B, Wang Z, Yu Z. Mobile Crowd Sensing and Computing. ACM Comput Surv, 2015, 48: 1-31 CrossRef Google Scholar

[2] Yurur O, Liu C H, Sheng Z. Context-Awareness for Mobile Sensing: A Survey and Future Directions. IEEE Commun Surv Tutorials, 2016, 18: 68-93 CrossRef Google Scholar

[3] Li G, Wang J, Zheng Y. Crowdsourced Data Management: A Survey. IEEE Trans Knowl Data Eng, 2016, 28: 2296-2319 CrossRef Google Scholar

[4] Mohammed N, Chen R, Fung B, et al. Differentially private data release for data mining. In: Proceedings of ACM SIGKDD. New York: ACM, 2011. 493--501. Google Scholar

[5] Naveed M, Ayday E, Clayton E W. Privacy in the Genomic Era.. ACM Comput Surv, 2015, 48: 1-44 CrossRef PubMed Google Scholar

[6] Kohavi R, Provost F. Applications of data mining to electronic commerce. Data Min Knowl Discov, 2001, 5: 5--10. Google Scholar

[7] Clarke R. What's `privacy'. 2006. http://www.rogerclarke.com/DV/Privacy.html. Google Scholar

[8] Sweeney L, Abu A, Winn J. Identifying participants in the personal genome project by name. 2013,. arXiv Google Scholar

[9] Zhou X, Demetriou S, He D, et al. Identity, location, disease and more: Inferring your secrets from android public resources. In: Proceedings of ACM CCS. New York: ACM, 2013. 1017--1028. Google Scholar

[10] Jin H M, Su L, Ding B L, et al. Enabling privacy-preserving incentives for mobile crowd sensing systems. In: Proceedings of IEEE ICDCS, Nara, 2016. 344--353. Google Scholar

[11] Sweeney L. k-ANONYMITY: A MODEL FOR PROTECTING PRIVACY. Int J Unc Fuzz Knowl Based Syst, 2002, 10: 557-570 CrossRef Google Scholar

[12] Rongxing Lu , Xiaohui Liang , Xu Li . EPPA: An Efficient and Privacy-Preserving Aggregation Scheme for Secure Smart Grid Communications. IEEE Trans Parallel Distrib Syst, 2012, 23: 1621-1631 CrossRef Google Scholar

[13] Mármol F, Sorge C, Ugus O. Do not snoop my habits: preserving privacy in the smart grid. IEEE Commun Mag, 2012, 50: 166-172 CrossRef Google Scholar

[14] Narayanan A, Shmatikov V. Robust de-anonymization of large sparse datasets. In: Proceedings of IEEE S&P, Washington, 2008. 111--125. Google Scholar

[15] Dwork C, Roth A. The Algorithmic Foundations of Differential Privacy. FNT Theor Comput Sci, 2014, 9: 211-407 CrossRef Google Scholar

[16] Dwork C. Differential privacy. In: Proceedings of ICALP. Berlin: Springer, 2006. 1--12. Google Scholar

[17] Acs G, Castelluccia C. I have a dream (differentially private smart metering). In: International Workshop on Information Hiding. Berlin: Springer, 2011. 118--132. Google Scholar

[18] Tianqing Zhu , Ping Xiong , Gang Li . Correlated Differential Privacy: Hiding Information in Non-IID Data Set. IEEE TransInformForensic Secur, 2015, 10: 229-242 CrossRef Google Scholar

[19] McSherry F D. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In: Proceedings of ACM SIGMOD. New York: ACM, 2009. 19--30. Google Scholar

[20] Zhang J, Cormode G, Procopiuc C M, et al. Privbayes: private data release via Bayesian networks. In: Proceedings of ACM SIGMOD. New York: ACM, 2014. 1423--1434. Google Scholar

[21] Chen R, Xiao Q, Zhang Y, et al. Differentially private high-dimensional data publication via sampling-based inference. In: Proceedings of ACM SIGKDD. New York: ACM, 2015. 129--138. Google Scholar

[22] Sun L, Zhang L, Ge C. An algorithm for differential privacy streaming data publication based on matrix mechanism under exponential decay mode. Sci Sin-Inf, 2017, 47: 1493-1509 CrossRef Google Scholar

[23] Su D, Cao J, Li N. PrivPfC: differentially private data publication for classification. VLDB J, 2018, 27: 201-223 CrossRef Google Scholar

[24] Zhu T, Li G, Zhou W. Differentially Private Data Publishing and Analysis: A Survey. IEEE Trans Knowl Data Eng, 2017, 29: 1619-1638 CrossRef Google Scholar

[25] Qardaji W, Yang W N, Li N H. Priview: practical differentially private release of marginal contingency tables. In: Proceedings of ACM SIGMOD. New York: ACM, 2014. 1435--1446. Google Scholar

[26] Wang L, Wang W P, Meng D. Privacy preserving data publishing via weighted Bayesian networks, J Comput Res Develop, 2016, 53: 2343--2353. Google Scholar

[27] Zhang X J, Chen L, Jin K Z, et al. Private high-dimensional data publication with junction tree. J Comput Res Develop, 2018, 55: 2794--2809. Google Scholar

[28] Ye Q Q, Meng X F, Zhu M J, et al. Survey on local differential privacy. J Soft, 2018, 29: 159--183. Google Scholar

[29] Erlingsson Ú Korolova A, Pihur V. Rappor: randomized aggregatable privacy-preserving ordinal response. In: Proceedings of ACM CCS. New York: ACM, 2014. 1054--1067. Google Scholar

[30] Chen R, Li H R, Qin A K, et al. Private spatial data aggregation in the local setting. In: Proceedings of IEEE ICDE, Piscataway, 2016. 289--300. Google Scholar

[31] Cormode G, Kulkarni T, Srivastava D. Marginal release under local differential privacy. In: Proceedings of ACM SIGMOD. New York: ACM, 2018. 131--146. Google Scholar

[32] Wang N, Xiao X K, Yang Y, et al. Privtrie: effective frequent term discovery under local differential privacy. In: Proceedings of IEEE ICDE, Piscataway, 2018. 821--832. Google Scholar

[33] Ye Q Q, Hu H B, Meng X F, et al. Privkv: key-value data collection with local differential privacy. In: Proceedings of IEEE S&P, Piscataway, 2019. 1--15. Google Scholar

[34] Zhang Z K, Wang T H, Li N H, et al. CALM: consistent adaptive local marginal for marginal release under local differential privacy. In: Proceedings of ACM CCS. New York: ACM, 2018. 212--229. Google Scholar

[35] Xiong S J, Sarwate A D, Mandayam N B. Randomized requantization with local differential privacy. In: Proceedings of IEEE ICASSP, Shanghai, 2016. 2189--2193. Google Scholar

[36] Sarwate A D, Sankar L. A rate-disortion perspective on local differential privacy. In: Proceedings of IEEE Allerton Conference on Communication, Control, and Computing, Monticello, 2015. 903--908. Google Scholar

[37] Warner S L. Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias. J Am Statistical Association, 1965, 60: 63-69 CrossRef Google Scholar

[38] Kairouz P, Bonawitz K, Ramage D. Discrete distribution estimation under local privacy. In: Proceedings of ICML, 2016. 2436--2444. Google Scholar

[39] Kairouz P, Oh S, Viswanath P. Extremal mechanisms for local differential privacy. In: Proceedings of NIPS. Cambridge: MIT Press, 2014. 2879--2887. Google Scholar

[40] Wang S W, Huang L S, Wang P Z, et al. Mutual information optimally local private discrete distribution estimation. 2016,. arXiv Google Scholar

[41] Qin Z, Yu T, Yang Y, et al. Generating synthetic decentralized social graphs with local differential privacy. In: Proceedings of ACM CCS. New York: ACM, 2017. 425--438. Google Scholar

[42] Qin Z, Yang Y, Yu T, et al. Heavy hitter estimation over set-valued data with local differential privacy. In: Proceedings of ACM CCS. New York: ACM, 2016. 192--203. Google Scholar

[43] 王璞玉, 张海. 分布式隐私保护-Logistic回归. 中国科学: 信息科学, https://doi.org/10.1360/N112018-00214. Google Scholar

[44] Duchi J C, Jordan M I, Wainwright M J. Local privacy and statistical minimax rates. In: Proceedings of IEEE FOCS, Washington, 2013. 429--438. Google Scholar

[45] Fanti G, Pihur V, Erlingsson . Building a RAPPOR with the Unknown: Privacy-Preserving Learning of Associations and Data Dictionaries. Proc Privacy Enhancing Technologies, 2016, 2016(3): 41-61 CrossRef Google Scholar

[46] Ren X, Yu C M, Yu W. IEEE TransInformForensic Secur, 2018, 13: 2151-2166 CrossRef Google Scholar

[47] Zhang J, Cormode G, Procopiuc C M, et al. PrivBayes: private data release via Bayesian networks. ACM Trans Database Syst, 2017, 42: 25. Google Scholar

[48] Acs G, Castelluccia C, Chen R. Differentially private histogram publishing through lossy compression. In: Proceedings of IEEE ICDE, Washington, 2012. 1--10. Google Scholar

  • Table 1   Details of datasets
    Dataset Data type Dataset size Dimension Domain size (processed)
    NLTCS Binary 25174 16 ${2^{16}}$
    Adult Non-Binary 45222 15 $~\approx~{2^{26}}$
    TPC-E Non-Binary 40000 24 $~\approx~{2^{38}}$