Foundations and Trends® in Communications and Information Theory > Vol 18 > Issue 3

Modeling and Optimization of Latency in Erasure-coded Storage Systems

By Vaneet Aggarwal, Purdue University, USA, vaneet@purdue.edu | Tian Lan, George Washington University, USA, tlan@gwu.edu

 
Suggested Citation
Vaneet Aggarwal and Tian Lan (2021), "Modeling and Optimization of Latency in Erasure-coded Storage Systems", Foundations and Trends® in Communications and Information Theory: Vol. 18: No. 3, pp 380-525. http://dx.doi.org/10.1561/0100000108

Publication Date: 07 Jul 2021
© 2021 Vaneet Aggarwal and Tian Lan
 
Subjects
Coding theory and practice,  Information theory and computer science,  Communication system design,  Storage and recording codes,  Queuing theory,  Markov decision processes,  Stochastic optimization,  Modeling and analysis,  Dynamics and asymptotic behavior of networks,  Coding and compression,  Distributed computing,  Storage, access methods, and indexing
 

In this article:
1. Introduction
2. MDS-Reservation Scheduling Approach
3. Fork-Join Scheduling Approach
4. Probabilistic Scheduling Approach
5. Delayed-Relaunch Scheduling Approach
6. Analyzing Latency for Video Content
7. Lessons from prototype implementation
References

Abstract

As consumers increasingly engage in social networking and e-commerce, businesses come to rely on Big Data analytics for intelligence, and traditional IT infrastructures continue to migrate to the cloud and edge, demand for distributed data storage is rising at an unprecedented speed. Erasure coding has quickly emerged as a promising technique for reducing storage cost while providing reliability similar to that of replicated systems, and it has been widely adopted by companies such as Facebook, Microsoft, and Google. However, it also brings new challenges in characterizing and optimizing access latency when data objects are erasure coded in distributed storage. The aim of this monograph is to provide a review of recent progress (both theoretical and practical) on systems that employ erasure codes for distributed storage.

In this monograph, we will first identify the key challenges and taxonomy of the research problems and then give an overview of different models and approaches that have been developed to quantify latency of erasure-coded storage. This includes recent work leveraging MDS-Reservation, Fork-Join, Probabilistic, and Delayed-Relaunch scheduling policies, as well as their applications to characterizing access latency (e.g., mean, tail, and asymptotic latency) of erasure-coded distributed storage systems. We will also extend the discussions to video streaming from erasure-coded distributed storage systems. Next, we will bridge the gap between theory and practice, and discuss lessons learned from prototype implementations. In particular, we will discuss exemplary implementations of erasure-coded storage, illuminate key design degrees of freedom and tradeoffs, and summarize remaining challenges in real-world storage systems such as in content delivery and caching. Open problems for future research are discussed at the end of each chapter.

DOI:10.1561/0100000108
ISBN: 978-1-68083-842-8 (paperback), 156 pp., $99.00
ISBN: 978-1-68083-843-5 (e-book, .pdf), 156 pp., $140.00

Modeling and Optimization of Latency in Erasure-coded Storage Systems

The advent of Big Data analytics and cloud computing has resulted in an unprecedented increase in the demand for distributed data storage. Companies are constantly looking for ways to reduce storage cost and improve reliability. Erasure coding has emerged as a promising technique to achieve these goals, and most major tech companies have adopted it. However, one major issue in such systems is the characterization and optimization of access latency when data objects are erasure coded in distributed storage.

In this monograph, the authors provide a review of recent theoretical and practical progress on systems that employ erasure codes for distributed storage. Starting with the key challenges and research problems, the authors give an overview of the different models and approaches that have been developed to quantify the latency of erasure-coded storage. They also extend the discussion to video streaming from erasure-coded distributed storage systems. Practical implementations of erasure-coded storage are then discussed in the context of real-world storage systems, such as those used in content delivery and caching.

This monograph is aimed at students, researchers, and practitioners in information theory who are active in the research and development of modern-day distributed storage systems.

 
CIT-108