SCIENCE CHINA Information Sciences, Volume 60, Issue 6: 062402 (2017) https://doi.org/10.1007/s11432-016-0306-y

HyBar: high efficient barrier synchronization based on a hybrid packet-circuit switching Network-on-Chip

  • Received Aug 11, 2016
  • Accepted Nov 6, 2016
  • Published Feb 9, 2017

Abstract


Acknowledgment

This work was partially supported by the Equipment Pre-Research Foundation of China (Grant No. 9140A08010414JW03025).


References

[1] Wilkinson B, Allen M. Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers. Upper Saddle River: Prentice Hall, 2004.

[2] Sartori J, Kumar R. Low-overhead, high-speed multi-core barrier synchronization. In: Proceedings of the 5th International Conference on High Performance Embedded Architectures and Compilers (HiPEAC'10), Pisa, 2010. 18-34.

[3] Shen X B. Evolution of MPP SoC architecture techniques. Sci China Ser F-Inf Sci, 2008, 51: 756-764.

[4] Villa O, Palermo G, Silvano C. Efficiency and scalability of barrier synchronization on NoC based many-core architectures. In: Proceedings of International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES'08), New York, 2008. 81-90.

[5] Monchiero M, Palermo G, Silvano C, et al. Efficient synchronization for embedded on-chip multiprocessors. IEEE Trans Very Large Scale Integration Syst, 2006, 14: 1049-1062.

[6] Xiao H, Wu N, Ge F, et al. Efficient synchronization for distributed embedded multiprocessors. IEEE Trans Very Large Scale Integration Syst, 2016, 24: 779-783.

[7] Wei Z Q, Liu P L, Sun R D, et al. TAB barrier: hybrid barrier synchronization for NoC-based processors. In: Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS'15), Lisbon, 2015. 409-412.

[8] Chen X, Lu Z, Jantsch A, et al. Cooperative communication based barrier synchronization in on-chip mesh architectures. IEICE Electron Expr, 2011, 8: 1856-1862.

[9] Chen X W, Lu Z, Jantsch A, et al. Cooperative communication for efficient and scalable all-to-all barrier synchronization on mesh-based many-core NoCs. IEICE Electron Expr, 2014, 11: 20140542.

[10] Abellan J L, Fernandez J, Acacio M E, et al. Design of a collective communication infrastructure for barrier synchronization in cluster-based nanoscale MPSoCs. In: Proceedings of Design, Automation and Test in Europe Conference and Exhibition (DATE'12), Dresden, 2012. 491-496.

[11] Oh J, Prvulovic M, Zajic A. TLSync: support for multiple fast barriers using on-chip transmission lines. In: Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA'11), San Jose, 2011. 105-115.

[12] Kumar A, Peh L S, Kundu P, et al. Express virtual channels: towards the ideal interconnection fabric. In: Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA'07), San Diego, 2007. 150-161.

[13] Krishna T, Peh L S. Single-cycle collective communication over a shared network fabric. In: Proceedings of the 8th IEEE/ACM International Symposium on Networks-on-Chip (NoCS'14), Ferrara, 2014. 1-8.

[14] Daneshtalab M, Ebrahimi M, Mohammadi S, et al. Low-distance path-based multicast routing algorithm for network-on-chips. IET Comput Digit Tech, 2009, 3: 430-442.

[15] Modarressi M, Sarbazi-Azad H, Arjomand M. A hybrid packet-circuit switched on-chip network based on SDM. In: Proceedings of Conference on Design, Automation and Test in Europe (DATE'09), Nice, 2009. 566-569.

[16] Lin J, Zhou W, Yu Z, et al. A hybrid router combining circuit switching and packet switching with virtual channels for on-chip networks. In: Proceedings of the 10th IEEE International Conference on ASIC (ASICON'13), Shenzhen, 2013. 1-4.

[17] Abousamra A K, Melhem R G, Jones A K. Déjà Vu switching for multiplane NoCs. In: Proceedings of the 6th IEEE/ACM International Symposium on Networks on Chip (NoCS'12), Copenhagen, 2012. 11-18.

[18] Ou P, Zhang J, Quan H, et al. A 65nm 39 GOPS/W 24-core processor with 11 Tb/s/W packet-controlled circuit-switched double-layer network-on-chip and heterogeneous execution array. In: Proceedings of IEEE International Solid-State Circuits Conference (ISSCC'13), San Francisco, 2013. 56-57.

[19] Jerger N D E, Peh L S, Lipasti M H. Circuit-switched coherence. In: Proceedings of the 2nd IEEE/ACM International Symposium on Networks-on-Chip (NoCS'08), Newcastle upon Tyne, 2008. 193-202.

[20] Chen G, Anders M A, Kaul H, et al. A 340 mV-to-0.9V 20.2 Tb/s source-synchronous hybrid packet/circuit-switched 16×16 network-on-chip in 22nm tri-gate CMOS. In: Proceedings of IEEE International Solid-State Circuits Conference (ISSCC'14), San Francisco, 2014. 276-277.

[21] Glass C J, Ni L M. The turn model for adaptive routing. In: Proceedings of the 19th Annual International Symposium on Computer Architecture (ISCA'92). New York: ACM, 1992. 278-287.

[22] Becker D U. Efficient microarchitecture for network-on-chip routers. Dissertation for Ph.D. Degree. Palo Alto: Stanford University, 2012.

[23] McMahon F H. Livermore Fortran Kernels: a Computer Test of Numerical Performance Range. Technical Report UCRL-53745. 1986.