
SCIENTIA SINICA Informationis, Volume 49, Issue 3: 247-255 (2019) https://doi.org/10.1360/N112018-00283

Research on homegrown manycore architecture for intelligent computing

  • Received: Oct 18, 2018
  • Accepted: Mar 7, 2019
  • Published: Mar 15, 2019

Abstract


Funded by

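National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (HGJ): Intelligent Computing Unit for Data Centers (Cloud Platforms) and Cluster Computing (2018ZX01028-102)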

