SCIENTIA SINICA Informationis, Volume 49, Issue 3: 314-333 (2019). https://doi.org/10.1360/N112018-00282

Research on low-power neural network computing accelerator

  • Received: Oct 18, 2018
  • Accepted: Feb 21, 2019
  • Published: Mar 20, 2019

Abstract


Funded by

National Natural Science Foundation of China (Grant No. 61774094)

National Science and Technology Major Project (Grant No. 2018ZX01031101-002)


References

[1] Wei S J, Liu L B, Yin S Y. Reconfigurable Computing. Beijing: Science Press, 2014.

[2] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. 770--778.

[3] Chen Y H, Krishna T, Emer J S, et al. Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE J Solid-State Circuits, 2017, 52: 127-138.

[4] Gao C, Neil D, Ceolini E, et al. DeltaRNN: a power-efficient recurrent neural network accelerator. In: Proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018. 21--30.

[5] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014. arXiv:1409.1556.

[6] Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[7] He K M, Zhang X Y, Ren S Q, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of IEEE International Conference on Computer Vision, 2015. 1026--1034.

[8] Jouppi N P, Young C, Patil N, et al. In-datacenter performance analysis of a tensor processing unit. 2017. arXiv:1704.04760.

[9] Donahue J, Hendricks L A, Guadarrama S, et al. Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[10] Yin S Y, Ouyang P, Tang S B, et al. A 1.06-to-5.09 TOPS/W reconfigurable hybrid-neural-network processor for deep learning applications. In: Proceedings of Symposium on VLSI Circuits, 2017.

[11] Moons B, Uytterhoeven R, Dehaene W, et al. 14.5 Envision: a 0.26-to-10 TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28 nm FDSOI. In: Proceedings of IEEE International Solid-State Circuits Conference (ISSCC), 2017. 246--247.

[12] Yan J, Yin S, Tu F, et al. GNA: reconfigurable and efficient architecture for generative network acceleration. IEEE Trans Comput-Aided Des Integr Circuits Syst, 2018, 37: 2519-2529.

[13] Tu F, Yin S, Ouyang P, et al. Deep convolutional neural network architecture with reconfigurable computation patterns. IEEE Trans VLSI Syst, 2017, 25: 2220-2233.

[14] Chen T S, Du Z D, Sun N H, et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. ACM SIGPLAN Notices, 2014, 49: 269--284.

[15] Zhang C, Li P, Sun G Y, et al. Optimizing FPGA-based accelerator design for deep convolutional neural networks. In: Proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015. 161--170.

[16] Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution. In: Proceedings of European Conference on Computer Vision, 2016.

[17] Ledig C, Theis L, Huszar F, et al. Photo-realistic single image super-resolution using a generative adversarial network. 2016. arXiv:1609.04802.

[18] Tu F B, Wu W W, Yin S Y, et al. RANA: towards efficient neural acceleration with refresh-optimized embedded DRAM. In: Proceedings of the 45th Annual International Symposium on Computer Architecture, 2018. 340--352.

[19] Chi P, Li S C, Xu C, et al. PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory. In: Proceedings of the 43rd Annual International Symposium on Computer Architecture (ISCA), 2016. 27--39.

[20] Yin S Y, Ouyang P, Yang J X, et al. An ultra-high energy-efficient reconfigurable processor for deep neural networks with binary/ternary weights in 28 nm CMOS. In: Proceedings of Symposia on VLSI Technology and Circuits, Honolulu, 2018.
