
SCIENTIA SINICA Informationis, Volume 50, Issue 9: 1407 (2020) https://doi.org/10.1360/SSI-2020-0130

Reconfigurable computing: toward software defined chips

  • Received: May 11, 2020
  • Accepted: Aug 4, 2020
  • Published: Sep 23, 2020

Abstract


Funded by

National Natural Science Foundation of China, Key Program (61834002)

National Key Research and Development Program of China (2018YFB2202100)

