Foundations and Trends® in Databases, Vol. 9, Issue 1

Distributed Learning Systems with First-Order Methods

By Ji Liu, University of Rochester and Kuaishou Inc., USA, ji.liu.uwisc@gmail.com | Ce Zhang, ETH Zurich, Switzerland, ce.zhang@inf.ethz.ch

 
Suggested Citation
Ji Liu and Ce Zhang (2020), "Distributed Learning Systems with First-Order Methods", Foundations and Trends® in Databases: Vol. 9: No. 1, pp 1-100. http://dx.doi.org/10.1561/1900000062

Publication Date: 24 Jun 2020
© 2020 Ji Liu and Ce Zhang
 
Subjects
Parallel and Distributed Database Systems, Optimization
 


Abstract

Scalable and efficient distributed learning is one of the main driving forces behind the recent rapid advancement of machine learning and artificial intelligence. One prominent feature of this topic is that recent progress has been made by researchers in two communities: (1) the systems community, including databases, data management, and distributed systems, and (2) the machine learning and mathematical optimization community. The interaction and knowledge sharing between these two communities have led to the rapid development of new distributed learning systems and theory. In this monograph, we hope to provide a brief introduction to some recently developed distributed learning techniques, namely lossy communication compression (e.g., quantization and sparsification), asynchronous communication, and decentralized communication. One special focus of this monograph is on making sure that it can be easily understood by researchers in both communities: on the system side, we rely on a simplified system model that hides many system details not necessary for the intuition behind the system speedups; on the theory side, we rely on minimal assumptions and significantly simplify the proofs of some recent work to achieve comparable results.
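As a concrete illustration of the first of these techniques, the sketch below shows how lossy communication compression can be slotted into a data-parallel SGD step: each simulated worker sparsifies its stochastic gradient to its top-k entries before the gradients are averaged. This is a minimal sketch for intuition only, not code from the monograph; the worker count, learning rate, least-squares objective, and helper names (top_k, worker_gradient) are assumptions made for this example.

    # Illustrative sketch only: lossy gradient compression (top-k sparsification)
    # inside a data-parallel SGD loop. All names and parameters below are
    # assumptions for this example, not code from the monograph.
    import numpy as np

    def top_k(vector, k):
        """Keep the k largest-magnitude entries of the gradient; zero the rest."""
        compressed = np.zeros_like(vector)
        idx = np.argpartition(np.abs(vector), -k)[-k:]
        compressed[idx] = vector[idx]
        return compressed

    def worker_gradient(w, X, y):
        """Stochastic gradient of a least-squares loss on one worker's mini-batch."""
        return 2.0 * X.T @ (X @ w - y) / len(y)

    rng = np.random.default_rng(0)
    d, num_workers, lr = 20, 4, 0.1
    w = np.zeros(d)
    # Each simulated worker holds its own data shard.
    shards = [(rng.normal(size=(32, d)), rng.normal(size=32)) for _ in range(num_workers)]

    for step in range(100):
        # Workers compress their gradients before "communicating" them.
        grads = [top_k(worker_gradient(w, X, y), k=4) for X, y in shards]
        # The server averages the (sparse) gradients and updates the model.
        w -= lr * np.mean(grads, axis=0)

In a real system the compressed gradients would be sent over the network (often together with error feedback); here the averaging step simply stands in for that communication.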

DOI: 10.1561/1900000062
ISBN (paperback): 978-1-68083-700-1, 108 pp., $75.00
ISBN (e-book, PDF): 978-1-68083-701-8, 108 pp., $140.00
Table of contents:
1. Introduction
2. Distributed Stochastic Gradient Descent
3. System Relaxation 1: Lossy Communication Compression
4. System Relaxation 2: Asynchronous Training
5. System Relaxation 3: Decentralized Communication
6. Further Reading
References

Distributed Learning Systems with First-Order Methods


This monograph provides a brief introduction to three recently developed distributed learning techniques: lossy communication compression, asynchronous communication, and decentralized communication. These techniques have had a significant impact on work in both the systems community and the machine learning and mathematical optimization community, but to fully realize their potential, researchers need to understand the whole picture. This monograph serves as a bridge between the two communities: a simplified introduction to the essential aspects of each enables researchers to gain insight into the factors influencing both.

The monograph gives students and researchers the groundwork for developing faster and better results in this dynamic area of research.

 