SCIENCE CHINA Information Sciences, Volume 65, Issue 3: 132101 (2022) https://doi.org/10.1007/s11432-019-2833-1

A self-tuning client-side metadata prefetching scheme for wide area network file systems

  • Received: Sep 28, 2019
  • Accepted: Mar 17, 2020
  • Published: Feb 22, 2021



This work was supported by the National Key R&D Program of China (Grant No. 2018YFB0203901), the National Natural Science Foundation of China (Grant No. 61772053), the Fund of the State Key Laboratory of Software Development Environment (Grant No. SKLSDE-2018ZX-10), and the Science Challenge Project (Grant No. TZ2016002).

