logo

SCIENCE CHINA Information Sciences, Volume 63 , Issue 2 : 122301(2020) https://doi.org/10.1007/s11432-019-9937-7

Storage and repair bandwidth tradeoff for heterogeneous cluster distributed storage systems

More info
  • ReceivedApr 22, 2019
  • AcceptedJul 3, 2019
  • PublishedJan 15, 2020

Abstract


Acknowledgment

This work was partially supported by National Natural Science Foundation of China (Grant No. 61571293), China Program of International S$\&$T Cooperation (Grant No. 2016YFE0100300), and SJTU-CUHK Joint Research Collaboration Fund 2018.


References

[1] Hu Y C, Li X L, Zhang M. Optimal Repair Layering for Erasure-Coded Data Centers. ACM Trans Storage, 2017, 13: 1-24 CrossRef Google Scholar

[2] Huang C, Chen M H, Li J. Pyramid codes: flexible schemes to trade space for access efficiency in reliable data storage systems. ACM Transactions on Storage, 2013, 9(1): 79--86 DOI: 10.1109/NCA.2007.37. Google Scholar

[3] Zhang H Y, Li H, Li S Y R. Repair Tree: Fast Repair for Single Failure in Erasure-Coded Distributed Storage Systems. IEEE Trans Parallel Distrib Syst, 2017, 28: 1728-1739 CrossRef Google Scholar

[4] Balaji S B, Krishnan M N, Vajha M. Erasure coding for distributed storage: an overview. Sci China Inf Sci, 2018, 61: 100301 CrossRef Google Scholar

[5] Hou H X, Han Y S. A class of binary MDS array codes with asymptotically weak-optimal repair. Sci China Inf Sci, 2018, 61: 100302 CrossRef Google Scholar

[6] Dimakis A G, Godfrey P B, Wu Y N. Network Coding for Distributed Storage Systems. IEEE Trans Inform Theor, 2010, 56: 4539-4551 CrossRef Google Scholar

[7] Li S Y R, Yeung R W, Cai N. Linear network coding. IEEE Trans Inform Theor, 2003, 49: 371-381 CrossRef Google Scholar

[8] Yang B, Tang X, Li J. A Systematic Piggybacking Design for Minimum Storage Regenerating Codes. IEEE Trans Inform Theor, 2015, 61: 5779-5786 CrossRef Google Scholar

[9] Goparaju S, Fazeli A, Vardy A. Minimum Storage Regenerating Codes for All Parameters. IEEE Trans Inform Theor, 2017, 63: 6318-6328 CrossRef Google Scholar

[10] Rashmi K V, Shah N B, Kumar P V. Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction. IEEE Trans Inform Theor, 2011, 57: 5227-5239 CrossRef Google Scholar

[11] Rashmi K V, Shah N B, Ramchandran K. A Piggybacking Design Framework for Read-and Download-efficient Distributed Storage Codes. IEEE Trans Inform Theor, 2017, : 1-1 CrossRef Google Scholar

[12] Tang X, Yang B, Li J. A New Repair Strategy for the Hadamard Minimum Storage Regenerating Codes for Distributed Storage Systems. IEEE Trans Inform Theor, 2015, 61: 5271-5279 CrossRef Google Scholar

[13] Ford D, Labelle F, Popovici F I, et al. Availability in globally distributed storage systems. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, Vancouver, 2010. 61--74. Google Scholar

[14] Kubiatowicz J, Bindel D, Chen Y. OceanStore. SIGPLAN Not, 2000, 35: 190-201 CrossRef Google Scholar

[15] Ahmad F, Chakradhar S T, Raghunathan A, et al. Shufflewatcher: Shuffle-aware scheduling in multi-tenant mapreduce clusters. In: Proceedings of 2014 USENIX Annual Technical Conference (USENIX ATC 14), Philadelphia, 2014. 1--13. Google Scholar

[16] Prakash N, Abdrashitov V, Medard M. The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems. IEEE Trans Inform Theor, 2018, 64: 5783-5805 CrossRef Google Scholar

[17] Sohn J Y, Choi B, Yoon S W. Capacity of Clustered Distributed Storage. IEEE Trans Inform Theor, 2019, 65: 81-107 CrossRef Google Scholar

[18] Hou H X, Lee P P C, Shum K W. Rack-Aware Regenerating Codes for Data Centers. IEEE Trans Inform Theor, 2019, 65: 4730-4745 CrossRef Google Scholar

[19] Hu Y C, Lee P P C, Zhang X. Double regenerating codes for hierarchical data centers. In: Proceedings of 2016 IEEE International Symposium on Information Theory (ISIT), 2016. 245--249. Google Scholar

[20] Abdrashitov V, Prakash N, and Médard M. The storage vs repair bandwidth trade-off for multiple failures in clustered storage networks. In: Proceedings of 2017 IEEE Information Theory Workshop (ITW), 2017. 46--50. Google Scholar

[21] Akhlaghi S, Kiani A, Ghanavati M R. Cost-bandwidth tradeoff in distributed storage systems. Comput Commun, 2010, 33: 2105-2115 CrossRef Google Scholar

[22] Wang J Z, Wang T H, Luo Y. Storage and repair bandwidth tradeoff for distributed storage systems with clusters and separate nodes. Sci China Inf Sci, 2018, 61: 100303 CrossRef Google Scholar

[23] Pernas J, Yuen C, Gastón B, et al. Non-homogeneous two-rack model for distributed storage systems. In: Proceedings of 2013 IEEE International Symposium on Information Theory (ISIT), 2013. 1237--1241. Google Scholar

[24] Sohn J Y, Choi B, Yoon S W, et al. Capacity of clustered distributed storage. In: Proceedings of 2017 IEEE International Conference on Communications (ICC), Paris, 2017. Google Scholar

[25] Ernvall T, El Rouayheb S, Hollanti C. Capacity and Security of Heterogeneous Distributed Storage Systems. IEEE J Sel Areas Commun, 2013, 31: 2701-2709 CrossRef Google Scholar

[26] Golrezaei N, Dimakis A G, Molisch A F. Wireless device-to-device communications with distributed caching. In: Proceedings of IEEE International Symposium on Information Theory, Cambridge, 2012. Google Scholar

[27] Yu Q, Shum K W, Sung C W. Tradeoff between storage cost and repair cost in heterogeneous distributed storage systems. Trans Emerging Tel Tech, 2015, 26: 1201-1211 CrossRef Google Scholar

[28] Wang J Z, Wang T H, Luo Y, et al. Capacity of Distributed Storage Systems with Clusters and Separate Nodes. 2019,. arXiv Google Scholar

[29] Dimakis A G, Ramchandran K, Wu Y N. A Survey on Network Codes for Distributed Storage. Proc IEEE, 2011, 99: 476-489 CrossRef Google Scholar

[30] Ahlswede R, Cai N, Li S Y R. Network information flow. IEEE Trans Inform Theor, 2000, 46: 1204-1216 CrossRef Google Scholar

[31] Ahuja R K, Magnanti T L, Orlin J B. Network flows: theory, algorithms, and applications. Upper Saddle River: Prentice Hall, 1993. 77--78. Google Scholar

[32] Zorgui M, Wang Z Y. Code constructions for multi-node exact repair in distributed storage. Sci China Inf Sci, 2018, 61: 100304 CrossRef Google Scholar

[33] Wu Y N, and Dimakis A G. Reducing repair traffic for erasure coding-based storage via interference alignment. In: Proceedings of 2009 IEEE International Symposium on Information Theory (ISIT), 2009. 2276--2280. Google Scholar

  • Figure 1

    The HC-DSS model.

  • Figure 2

    (Color online) An information flow graph.

  • Figure 3

    (Color online) The part-cut values of cutting $\{x^{t_i}\}_{i=1}^k$ in the topological ordering.

  • Figure 4

    (Color online) As the numbered nodes show, $\boldsymbol{\pi}$ and $\boldsymbol{\pi}'$ are generated by Algorithm 1 for the same ${\boldsymbol~s}=(2,4,2,0)$ and different separate location vectors. (a) $\boldsymbol{\pi}=(1,2,0,1,2,1,0,1)$; (b) $\boldsymbol{\pi}'=(1,2,1,0,2,1,0,1)$.

  • Figure 5

    The cluster nodes of ${\boldsymbol~s}=(2,3,3,0)$ and $\boldsymbol{\pi}=(1,2,1,0,2,1,0,2)$ shown in (a) are represented with $\overline{{\boldsymbol~s}}=(0,3,3,0)$ and $\overline{\boldsymbol{\pi}}=(1,2,1,2,1,2)$ respectively in (b).

  • Figure 6

    In (a), there are two selected separate node and the cluster order is $\boldsymbol{\pi}^*(2)=(1,2,1,2,1,1,0,0)$. In (b), a selected separate node is added, and the cluster changes to $\boldsymbol{\pi}^*(3)=(1,2,1,1,1,0,0,0)$

  • Figure 7

    (Color online) Optimal tradeoff bounds between node storage $\alpha$ and repair bandwidth parameter $\beta_C$ for the $(n=14,k=8,L=3,R=4,E)$ HC-DSS model with the original file size $\mathcal{M}=32$. (a) shows the tradeoffs of various $d_C$ when the amount of separate nodes $E=2$ and the bandwidth parameter constraint $\tau=\beta_I/\beta_C=2$ are fixed. (b) illustrates the tradeoffs of various $E$ when $d_C=$ and $\tau=2$ are fixed. (c) compares the tradeoffs of various $\tau=\beta_I/\beta_C$ when $d_C=6$ and $E=2$ are fixed.

  • Figure 8

    (Color online) In the $(n=6,k=4,L=2,R=2,E=2)$ HC-DSS model with $\tau=\beta_I/\beta_C=2$, $d_I=1$, $d_C=4$ and $\mathcal{M}=8$, (a) illustrates the optimal tradeoff bound between $\alpha$ and $\beta_C$, (b) shows the data assignment in the MSR code construction example.

  •   

    Algorithm 1 Heterogeneous vertical order algorithm

    Require:Selected node distribution ${\boldsymbol~s}=(s_0,s_1,\ldots,s_L)$; separate location vector ${\boldsymbol~v}=(v_1,v_2,\ldots,v_k)$.

    Output:Cluster order $\boldsymbol{\pi}^*=(\pi_1^*,\ldots,\pi_k^*)$.

    Initial cluster label $j~\leftarrow~1$ and count variable $t\gets~1$;

    for $i=1$ to $k$

    if $i=v_t$ then

    $\pi^*_i\gets~0$; $t\gets~t+1$; continue;

    end if

    if $s_j=0$ then

    $j\gets~1;$

    else

    $\pi_i^*\gets~j$; $s_j\gets~s_j-1$; $j\gets~(j\mod~L)+1$;

    end if

    end for

  •   

    Algorithm 2 Heterogeneous horizontal selection algorithm

    Require:Node parameters $(n,k,L,R,E)$; the amount of selected separate nodes $N_{\rm~sep}$ ($1\leq~N_{\rm~sep}\leq~E$)

    Output:Selected node distribution ${\boldsymbol~s}^*=(s_0^*,~s_1^*,\ldots,s_L^*)$.

    $s_0^*~\gets~N_{\rm~sep}$;

    for $i=1$ to $L$

    if $i~\leq~\left\lfloor~\frac{k-s_0^*}{R}\right\rfloor$ then

    $s_i^*~\gets~R$;

    end if

    if $i~=~\left\lfloor~\frac{k-s_0^*}{R}\right\rfloor+1$ then

    $s_i^*~\gets~k-s_0^*-\left\lfloor~\frac{k-s_0^*}{R}\right\rfloor~R$;

    end if

    if $i~>~\left\lfloor~\frac{k-s_0^*}{R}\right\rfloor+1$ then

    $s_i^*~\gets~0$;

    end if

    end for