Scalable Algorithms for Data and Network Analysis

Shang-Hua Teng

doi:10.1561/0400000051

Foundations and Trends® in Theoretical Computer Science > Vol 12 > Issue 1–2

Scalable Algorithms for Data and Network Analysis

By Shang-Hua Teng, University of Southern California, Los Angeles, USA, shanghua@usc.edu

Suggested Citation

Shang-Hua Teng (2016), "Scalable Algorithms for Data and Network Analysis", Foundations and Trends® in Theoretical Computer Science: Vol. 12: No. 1–2, pp 1-274. http://dx.doi.org/10.1561/0400000051

Publication Date: 30 May 2016

Subjects

Algorithmic game theory, Computational aspects of combinatorics and graph theory, Computational complexity, Computational geometry, Computational models and complexity, Data structures, Design and analysis of algorithms, Operations research, Parallel algorithms, Randomness in computation, Data mining, Economics of information and the Web, Scalability, Social networking, Spectral methods, Robustness, Optimization, Markov chain Monte Carlo, Graphical models, Game theoretic learning, Dimensionality reduction, Data mining, Clustering, Web search, Natural language processing for IR

Journal details

Download article

In this article:

Abstract

In the age of Big Data, efficient algorithms are now in higher demand more than ever before. While Big Data takes us into the asymptotic world envisioned by our pioneers, it also challenges the classical notion of efficient algorithms: Algorithms that used to be considered efficient, according to polynomial-time characterization, may no longer be adequate for solving today’s problems. It is not just desirable, but essential, that efficient algorithms should be scalable. In other words, their complexity should be nearly linear or sub-linear with respect to the problem size. Thus, scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation. In this tutorial, I will survey a family of algorithmic techniques for the design of provably-good scalable algorithms. These techniques include local network exploration, advanced sampling, sparsification, and geometric partitioning. They also include spectral graph-theoretical methods, such as those used for computing electrical flows and sampling from Gaussian Markov random fields. These methods exemplify the fusion of combinatorial, numerical, and statistical thinking in network analysis. I will illustrate the use of these techniques by a few basic problems that are fundamental in network analysis, particularly for the identification of significant nodes and coherent clusters/communities in social and information networks. I also take this opportunity to discuss some frameworks beyond graph-theoretical models for studying conceptual questions to understand multifaceted network data that arise in social influence, network dynamics, and Internet economics.

DOI:10.1561/0400000051

Book details

ISBN: 978-1-68083-130-6

290 pp. $99.00

Buy book (pb)

ISBN: 978-1-68083-131-3

290 pp. $260.00

Buy E-book (.pdf)

Table of contents:

Preface

1. Scalable Algorithms

2. Networks and Data

3. Significant Nodes: Sampling - Making Data Smaller

4. Clustering: Local Exploration of Networks

5. Partitioning: Geometric Techniques for Data Analysis

6. Sparsification: Sparsification - Making Networks Simpler

7. Electrical Flows: Laplacian Paradigm for Network Analysis

8. Remarks and Discussions

Acknowledgements

References

Scalable Algorithms for Data and Network Analysis

In the age of Big Data, efficient algorithms are in higher demand more than ever before. While Big Data takes us into the asymptotic world envisioned by our pioneers, it also challenges the classical notion of efficient algorithms: Algorithms that used to be considered efficient, according to polynomial-time characterization, may no longer be adequate for solving today's problems. It is not just desirable but essential that efficient algorithms should be scalable. In other words, their complexity should be nearly linear or sub-linear with respect to the problem size. Thus, scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation.

Scalable Algorithms for Data and Network Analysis surveys a family of algorithmic techniques for the design of scalable algorithms. These techniques include local network exploration, advanced sampling, sparsification, and geometric partitioning. They also include spectral graph-theoretical methods, such as are used for computing electrical flows and sampling from Gaussian Markov random fields. These methods exemplify the fusion of combinatorial, numerical, and statistical thinking in network analysis. Scalable Algorithms for Data and Network Analysis illustrates the use of these techniques by a few basic problems that are fundamental in analyzing network data, particularly for the identification of significant nodes and coherent clusters/communities in social and information networks. It also discusses some frameworks beyond graph-theoretical models for studying conceptual questions that arise in network analysis and social influences.

Scalable Algorithms for Data and Network Analysis

Free Preview:

Share

Journal details

Abstract

Book details

Scalable Algorithms for Data and Network Analysis