
SCIENTIA SINICA Informationis, Volume 50, Issue 11: 1629 (2020). https://doi.org/10.1360/SSI-2020-0067

Context-aware adaptation of deep learning models for IoT devices

More info
  • Received: Mar 21, 2020
  • Accepted: Aug 31, 2020
  • Published: Nov 9, 2020

Abstract


Funded by

National Key Research and Development Program of China (2017YFB1001800)

National Natural Science Foundation of China (61772428, 61725205)


References

[1] iResearch. White paper on China's artificial intelligence Internet of Things (AIoT) in 2020. iResearch Series of Research Reports, 2020, 2: 36--80.

[2] Zhou Z, Chen X, Li E. Edge intelligence: paving the last mile of artificial intelligence with edge computing. Proc IEEE, 2019, 107: 1738--1762.

[3] Liu S C, Lin Y Y, Zhou Z M, et al. On-demand deep model compression for mobile devices: a usage-driven model selection framework. In: Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, 2018. 389--400.

[4] Teerapittayanon S, McDanel B, Kung H T. BranchyNet: fast inference via early exiting from deep neural networks. In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR), 2016. 2464--2469.

[5] Zhao Z, Barijough K M, Gerstlauer A. DeepThings: distributed adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Trans Comput-Aided Des Integr Circuits Syst, 2018, 37: 2348--2359.

[6] Yang Q, Zhang Y, Dai W, et al. Transfer Learning. Cambridge: Cambridge University Press, 2020.

[7] Wang L, Guo B, Yang Q. Smart city development with transfer learning. Computer, 2018, 51: 32--41.

[8] Jordan M I, Mitchell T M. Machine learning: trends, perspectives, and prospects. Science, 2015, 349: 255--260.

[9] Chen Z, Liu B. Lifelong machine learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2018, 12: 1--207.

[10] Mitchell T, Cohen W, Hruschka E, et al. Never-ending learning. Communications of the ACM, 2018, 61(5): 103--115.

[11] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014. arXiv preprint.

[12] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016. 770--778.

[13] Han S, Mao H, Dally W J. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. 2015. arXiv preprint.

[14] He Y, Lin J, Liu Z, et al. AMC: AutoML for model compression and acceleration on mobile devices. In: Proceedings of the European Conference on Computer Vision (ECCV), 2018. 784--800.

[15] Cai H, Gan C, Wang T, et al. Once-for-all: train one network and specialize it for efficient deployment. 2019. arXiv preprint.

[16] Chen T, Moreau T, Jiang Z, et al. TVM: an automated end-to-end optimizing compiler for deep learning. In: Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018. 578--594.

[17] Ma X, Guo F M, Niu W, et al. PCONV: the missing but desirable sparsity in DNN weight pruning for real-time execution on mobile devices. In: Proceedings of the AAAI Conference on Artificial Intelligence, 2020. 5117--5124.

[18] Kang Y, Hauswald J, Gao C, et al. Neurosurgeon: collaborative intelligence between the cloud and mobile edge. SIGARCH Comput Archit News, 2017, 45: 615--629.

[19] Li E, Zhou Z, Chen X. Edge intelligence: on-demand deep learning model co-inference with device-edge synergy. In: Proceedings of the 2018 Workshop on Mobile Edge Communications, 2018. 31--36.

[20] Ganin Y, Ustinova E, Ajakan H, et al. Domain-adversarial training of neural networks. J Machine Learn Res, 2016, 17: 2096--2030.

[21] Wang X, Li L, Ye W. Transferable attention for domain adaptation. In: Proceedings of the AAAI Conference on Artificial Intelligence, 2019. 5345--5352.

[22] Nagabandi A, Clavera I, Liu S, et al. Learning to adapt in dynamic, real-world environments through meta-reinforcement learning. 2018. arXiv preprint.

[23] Kirkpatrick J, Pascanu R, Rabinowitz N. Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci USA, 2017, 114: 3521--3526.

[24] Wen W, Wu C, Wang Y, et al. Learning structured sparsity in deep neural networks. In: Proceedings of Advances in Neural Information Processing Systems, 2016. 2074--2082.

[25] Luo J H, Wu J. An entropy-based pruning method for CNN compression. 2017. arXiv preprint.

[26] Guo Y, Yao A, Chen Y. Dynamic network surgery for efficient DNNs. In: Proceedings of Advances in Neural Information Processing Systems, 2016. 1379--1387.

[27] Han S, Pool J, Tran J, et al. Learning both weights and connections for efficient neural network. In: Proceedings of Advances in Neural Information Processing Systems, 2015. 1135--1143.

[28] He Y, Zhang X, Sun J. Channel pruning for accelerating very deep neural networks. In: Proceedings of the IEEE International Conference on Computer Vision, 2017. 1389--1397.

[29] Zhang T, Ye S, Zhang K, et al. A systematic DNN weight pruning framework using alternating direction method of multipliers. In: Proceedings of the European Conference on Computer Vision (ECCV), 2018. 184--199.

[30] Zhang T, Zhang K, Ye S, et al. ADAM-ADMM: a unified, systematic framework of structured weight pruning for DNNs. 2018. arXiv preprint.

[31] LeCun Y, Denker J S, Solla S A. Optimal brain damage. In: Proceedings of Advances in Neural Information Processing Systems, 1990. 598--605.

[32] Luo J H, Wu J, Lin W. ThiNet: a filter level pruning method for deep neural network compression. In: Proceedings of the IEEE International Conference on Computer Vision, 2017. 5058--5066.

[33] Polyak A, Wolf L. Channel-level acceleration of deep face representations. IEEE Access, 2015, 3: 2163--2175.

[34] Liu N, Ma X, Xu Z, et al. AutoCompress: an automatic DNN structured pruning framework for ultra-high compression rates. In: Proceedings of the AAAI Conference on Artificial Intelligence, 2020. 4876--4883.

[35] Yao S, Zhao Y, Zhang A, et al. DeepIoT: compressing deep neural network structures for sensing systems with a compressor-critic framework. In: Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems, 2017. 1--14.

[36] Dai X, Zhang P, Wu B, et al. ChamNet: towards efficient network design through platform-aware model adaptation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019. 11398--11407.

[37] Bort J. The 'Google Brain' is a real thing but very few people have seen it. Business Insider, 2016-09-02. https://www.businessinsider.com.au/what-is-google-brain-2016-9.htm.

[38] Jouppi N. Google supercharges machine learning tasks with TPU custom chip. Google Cloud Blog, 2016-05-19. https://cloud.google.com/blog/products/gcp/google-supercharges-machine-learning-tasks-with-custom-chip.htm.

[39] Siegler M G. Apple's massive new data center set to host Nuance tech. TechCrunch, 2011-05-10. https://techcrunch.com/2011/05/09/apple-nuance-data-center-deal/.

[40] Lovejoy B. Apple moves to third-generation Siri back-end, built on open-source Mesos platform. 9to5Mac, 2015-04-27. http://9to5mac.com/2015/04/27/siri-backend-mesos/.

[41] Osia S A, Shahin Shamsabadi A, Sajadmanesh S. A hybrid deep learning architecture for privacy-preserving mobile analytics. IEEE Internet Things J, 2020, 7: 4505--4518.

[42] Mao Y, Yi S, Li Q, et al. A privacy-preserving deep learning approach for face recognition with edge computing. In: Proceedings of the USENIX Workshop on Hot Topics in Edge Computing (HotEdge), 2018. 1--6.

[43] Ko J H, Na T, Amir M F, et al. Edge-host partitioning of deep neural networks with feature space encoding for resource-constrained Internet-of-Things platforms. In: Proceedings of the 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2018. 1--6.

[44] Li H, Hu C, Jiang J, et al. JALAD: joint accuracy- and latency-aware deep structure decoupling for edge-cloud execution. In: Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2018. 671--678.

[45] Xu M, Qian F, Zhu M. DeepWear: adaptive local offloading for on-wearable deep learning. IEEE Trans Mobile Comput, 2020, 19: 314--330.

[46] Xiao D, Huang Y, Zhao L. Domain adaptive motor fault diagnosis using deep transfer learning. IEEE Access, 2019, 7: 80937--80949.

[47] Xu G, Liu M, Jiang Z. Online fault diagnosis method based on transfer convolutional neural networks. IEEE Trans Instrum Meas, 2020, 69: 509--520.

[48] Li X, Zhang W, Ding Q. Cross-domain fault diagnosis of rolling element bearings using deep generative neural networks. IEEE Trans Ind Electron, 2019, 66: 5525--5534.

[49] Shao S, McAleer S, Yan R. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans Ind Inf, 2019, 15: 2446--2455.

[50] Liu Z H, Lu B L, Wei H L. Deep adversarial domain adaptation model for bearing fault diagnosis. IEEE Trans Syst Man Cybern Syst, 2020: 1--10.

[51] Fang J, Sun Y, Peng K, et al. Fast neural network adaptation via parameter remapping and architecture search. 2020. arXiv preprint.

[52] Boyd S. Distributed optimization and statistical learning via the alternating direction method of multipliers. FNT Machine Learning, 2010, 3: 1--122.

[53] Ye S, Xu K, Liu S, et al. Adversarial robustness vs. model compression, or both? In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019. 111--120.

[54] Li J, Guo B, Wang Z, et al. Where to place the next outlet? Harnessing cross-space urban data for multi-scale chain store recommendation. In: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, 2016. 149--152.

[55] Li N, Guo B, Liu Y, et al. Commercial site recommendation based on neural collaborative filtering. In: Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers, 2018. 138--141.

[56] Liu Y, Yao L, Guo B. DeepStore: an interaction-aware wide&deep model for store site recommendation with attentional spatial embeddings. IEEE Internet Things J, 2019, 6: 7319--7333.

[57] Guo B, Li J, Zheng V W, et al. CityTransfer: transferring inter- and intra-city knowledge for chain store site recommendation via deep transfer learning. Proc ACM Interact Mob Wearable Ubiquitous Technol, 2018, 1: 1--23.

  • Figure 1

    (Color online) A²IoT system architecture

  • Figure 2

    (Color online) X-ADMM model framework

  • Figure 3

    (Color online) CityTransfer system framework

  • Table 1   Internet of Things terminal resources
    Type | Device | Processor | DRAM | Battery (mAh)
    Smart phone | Redmi 7A | Snapdragon 439 | 2 GB | 4000
    Smart phone | Redmi 8A | Snapdragon 439 | 3 GB | 5000
    Smart phone | Redmi Note 8 | Snapdragon 665 | 4 GB | 4000
    Smart phone | Redmi K30 | Snapdragon 730 | 6 GB | 6400
    Smart phone | Huawei Changxiang 9 Plus | HiSilicon Kirin 710 | 4 GB | 4000
    Smart phone | Huawei nova 5z | HiSilicon Kirin 810 | 6 GB | 4000
    Smart watch | Sony SmartWatch 3 (SW3) | ARM Cortex-A7 | 512 MB | 420
    Smart watch | HUAWEI WATCH 2 Pro | Snapdragon Wear 2100 | 768 MB | 420
    Smart watch | Xiaomi Watch | Qualcomm 3100 | 1 GB | 570
    Smart bracelet | Huawei Band B5 | ARM Cortex-M4 | 384 KB | 108
    Smart bracelet | Xiaomi Band 4 | Dialog DA14681 | 512 KB | 135
  • Table 2   Deep learning model parameters
    Network name | Parameter number (M) | Required storage (MB) | FLOPs
    AlexNet | 60 | 233 | 727 MFLOPs
    GoogleNet | 6.8 | 51 | 1.5 GFLOPs
    ResNet-18 | 33 | 44.7 | 1.8 GFLOPs
    ResNet-50 | 25.5 | 97.8 | 4.1 GFLOPs
    ResNet-152 | 117 | 230 | 11 GFLOPs
    VGG-16 | 138 | 528 | 16 GFLOPs
    VGG-19 | 144 | 548 | 20 GFLOPs
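
    The "parameter number" and storage columns of Table 2 can be checked directly: with 32-bit floats, storage ≈ parameters × 4 bytes. Below is a minimal sketch, assuming PyTorch and torchvision are available; the model constructors are torchvision's, not the paper's artifacts.

        import torch
        from torchvision import models

        def profile(model: torch.nn.Module, name: str) -> None:
            # Count every weight, then assume float32 storage (4 bytes per weight).
            n_params = sum(p.numel() for p in model.parameters())
            storage_mb = n_params * 4 / (1024 ** 2)
            print(f"{name}: {n_params / 1e6:.1f} M params, ~{storage_mb:.0f} MB")

        profile(models.alexnet(), "AlexNet")     # ~61.1 M params, ~233 MB
        profile(models.vgg16(), "VGG-16")        # ~138.4 M params, ~528 MB
        profile(models.resnet50(), "ResNet-50")  # ~25.6 M params, ~98 MB

    The printed numbers land close to the AlexNet, VGG-16, and ResNet-50 rows above, which suggests the table assumes uncompressed float32 weights.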
  • Table 3   Experimental results of model compression and partition
    Network name | Original accuracy (%) | Original inference time (ms) | Compression rate (%) | RAP-ADMM inference time (ms) | RAP-ADMM accuracy (%) | X-ADMM inference time (ms) | X-ADMM accuracy (%)
    AlexNet | 85.82 | 46.8 | 16.0 | 38.78 | 84.21 | 32.1 | 84.17
    GoogleNet | 87.48 | 943.6 | 16.0 | 883.95 | 84.91 | 532.6 | 84.60
    ResNet-18 | 91.60 | 285.5 | 16.0 | 267.7 | 90.01 | 213.5 | 89.80
    VGG-16 | 91.66 | 203.7 | 16.0 | 186.9 | 89.59 | 88.3 | 89.20
    MobileNet | 89.60 | 219.2 | 2.0 | 207.89 | 87.96 | 179.2 | 87.90
    MobileNet | 89.60 | 219.2 | 4.0 | 196.69 | 80.60 | 160.3 | 80.57
    ShuffleNet | 88.14 | 202.8 | 4.0 | 183.79 | 84.25 | 153.4 | 84.19
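
    The RAP-ADMM and X-ADMM columns of Table 3 come from the paper's own ADMM-based compression pipeline, which is not reproduced here. As a hedged illustration of the underlying technique, systematic ADMM weight pruning in the style of [29, 52], the sketch below runs one ADMM round on a single layer; the layer shape, rho, learning rate, keep_ratio, and toy data are illustrative assumptions, not the paper's settings.

        import torch
        import torch.nn as nn

        def project_sparse(w: torch.Tensor, keep_ratio: float) -> torch.Tensor:
            # Euclidean projection onto the sparsity constraint: keep the k
            # largest-magnitude weights, zero the rest (ties may keep a few extra).
            k = max(1, int(w.numel() * keep_ratio))
            thresh = w.abs().flatten().topk(k).values.min()
            return torch.where(w.abs() >= thresh, w, torch.zeros_like(w))

        def admm_round(layer, loss_fn, data, rho=1e-3, keep_ratio=0.16, lr=1e-2):
            # keep_ratio is illustrative; Table 3's "compression rate" may be defined differently.
            w = layer.weight
            z = project_sparse(w.detach(), keep_ratio)  # auxiliary variable Z
            u = torch.zeros_like(z)                     # scaled dual variable U
            opt = torch.optim.SGD([w], lr=lr)
            for x, y in data:
                # W-update: task loss plus the quadratic penalty (rho/2)||W - Z + U||^2.
                opt.zero_grad()
                loss = loss_fn(layer(x), y) + (rho / 2) * (w - z + u).pow(2).sum()
                loss.backward()
                opt.step()
            z = project_sparse(w.detach() + u, keep_ratio)  # Z-update: projection
            u = u + w.detach() - z                          # dual update
            return z, u

        # Toy usage on a hypothetical linear classifier; real pipelines repeat
        # admm_round until W and Z agree, then hard-prune and fine-tune.
        layer = nn.Linear(64, 10)
        data = [(torch.randn(32, 64), torch.randint(0, 10, (32,))) for _ in range(8)]
        z, _ = admm_round(layer, nn.CrossEntropyLoss(), data)
        layer.weight.data.copy_(z)  # adopt the sparse solution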
  • Table 4   The results of inter-city knowledge association and semantic characteristics
    Method | Hanting Inn NDCG | Hanting Inn RMSE | 7 Days Inn NDCG | 7 Days Inn RMSE | Home Inn NDCG | Home Inn RMSE
    MF | 0.663 | 1.469 | 0.652 | 1.592 | 0.628 | 1.435
    MF_SE | 0.683 | 1.413 | 0.788 | 1.098 | 0.736 | 1.346
    MF_KA | 0.741 | 1.689 | 0.623 | 1.716 | 0.782 | 1.735
    CityTransfer | 0.769 | 1.548 | 0.812 | 1.205 | 0.701 | 1.261
  • Table 5   The effect of transferring knowledge from different source cities to Xi'an
    Source city | Hanting Inn | 7 Days Inn | Home Inn
    Beijing | 0.859 | 0.826 | 0.814
    Shanghai | 0.729 | 0.798 | 0.729
  • Table 6   The effect of transferring knowledge from different source cities to Nanjing
    Source city | Hanting Inn | 7 Days Inn | Home Inn
    Beijing | 0.754 | 0.821 | 0.764
    Shanghai | 0.801 | 0.848 | 0.765
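
    Tables 4-6 are reported in NDCG (higher is better) and RMSE (lower is better). Below is a minimal NumPy sketch of both metrics; the toy ratings are illustrative, not the paper's data.

        import numpy as np

        def rmse(pred: np.ndarray, truth: np.ndarray) -> float:
            # Root-mean-square error between predicted and true ratings.
            return float(np.sqrt(np.mean((pred - truth) ** 2)))

        def ndcg(pred: np.ndarray, truth: np.ndarray, k: int = 10) -> float:
            # DCG of the predicted ranking divided by the ideal (best achievable) DCG.
            order = np.argsort(pred)[::-1][:k]  # items ranked by predicted score
            ideal = np.sort(truth)[::-1][:k]    # true scores in their best order
            discounts = 1.0 / np.log2(np.arange(2, k + 2))
            dcg = float(np.sum(truth[order] * discounts[: len(order)]))
            idcg = float(np.sum(ideal * discounts[: len(ideal)]))
            return dcg / idcg if idcg > 0 else 0.0

        truth = np.array([5.0, 3.0, 4.0, 1.0, 2.0])  # toy ground-truth scores
        pred = np.array([4.6, 3.8, 3.5, 1.7, 2.2])   # toy model predictions
        print(f"NDCG@5 = {ndcg(pred, truth, k=5):.3f}, RMSE = {rmse(pred, truth):.3f}")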