Foundations and Trends® in Machine Learning >
Vol 13 > Issue 4

Ljubiša Stanković, Danilo Mandic, Miloš Daković, Miloš Brajović, Bruno Scalzo, Shengxi Li and Anthony G. Constantinides (2020), "Data Analytics on Graphs Part III: Machine Learning on Graphs, from Graph Topology to Applications", Foundations and Trends® in Machine Learning: Vol. 13: No. 4, pp 332-530. http://dx.doi.org/10.1561/2200000078-3

© 2020 Ljubiša Stankovic, Danilo Mandic, Miloš Dakovic, Miloš Brajovic, Bruno Scalzo, Shengxi Li and Anthony G. Constantinides

Statistical learning theory, Spectral methods, Big data analytics and privacy, Statistical machine learning, Multidimensional signal processing

Download article
**In this article:**

Modern data analytics applications on graphs often operate on domains where graph topology is not known a priori, and hence its determination becomes part of the problem definition, rather than serving as prior knowledge which aids the problem solution. Part III of this monograph starts by a comprehensive account of ways to learn the pertinent graph topology, ranging from the simplest case where the physics of the problem already suggest a possible graph structure, through to general cases where the graph structure is to be learned from the data observed on a graph. A particular emphasis is placed on the use of standard “relationship measures” in this context, including the correlation and precision matrices, together with the ways to combine these with the available prior knowledge and structural conditions, such as the smoothness of the graph signals or sparsity of graph connections. Next, for learning sparse graphs (that is, graphs with a small number of edges), the utility of the least absolute shrinkage and selection operator, known as LASSO is addressed, along with its graph specific variant, the graphical LASSO. For completeness, both variants of LASSO are derived in an intuitive way, starting from basic principles. An in-depth elaboration of the graph topology learning paradigm is provided through examples on physically well defined graphs, such as electric circuits, linear heat transfer, social and computer networks, and springmass systems. We also review main trends in graph neural networks (GNN) and graph convolutional networks (GCN) from the perspective of graph signal filtering. Particular insight is given to the role of diffusion processes over graphs, to show that GCNs can be understood from the graph diffusion perspective. Given the largely heuristic nature of the existing GCNs, their treatment through graph diffusion processes may also serve as a basis for new designs of GCNs. Tensor representation of lattice-structured graphs is next considered, and it is shown that tensors (multidimensional data arrays) can be treated a special class of graph signals, whereby the graph vertices reside on a high-dimensional regular lattice structure. The concept of graph tensor networks then provides a unifying framework for learning on irregular domains. This part of monograph concludes with an in-dept account of emerging applications in financial data processing and underground transportation network modeling. By means of portfolio cuts of an asset graph, we show how domain knowledge can be meaningfully incorporated into investment analysis. In the underground transportation example, we demonstrate how graph theory can be used to identify those stations in the London underground network which have the greatest influence on the functionality of the traffic, and proceed, in an innovative way, to assess the impact of a station closure on service levels across the city.

560 pp. $145.00

Buy book (pb)
560 pp. $140.00

Buy E-book (.pdf)
Part I: Graphs and Spectra on Graphs

Part II: Signals on Graphs

Part III: Machine Learning on Graphs, from Graph Topology to Applications

The current availability of powerful computers and huge data sets is creating new opportunities in computational mathematics to bring together concepts and tools from graph theory, machine learning and signal processing, creating Data Analytics on Graphs.

In discrete mathematics, a graph is merely a collection of points (nodes) and lines connecting some or all of them. The power of such graphs lies in the fact that the nodes can represent entities as diverse as the users of social networks or financial market data, and that these can be transformed into signals which can be analyzed using data analytics tools. Data Analytics on Graphs is a comprehensive introduction to generating advanced data analytics on graphs that allows us to move beyond the standard regular sampling in time and space to facilitate modelling in many important areas, including communication networks, computer science, linguistics, social sciences, biology, physics, chemistry, transport, town planning, financial systems, personal health and many others.

The authors revisit graph topologies from a modern data analytics point of view, and proceed to establish a taxonomy of graph networks. With this as a basis, the authors show how the spectral analysis of graphs leads to even the most challenging machine learning tasks, such as clustering, being performed in an intuitive and physically meaningful way. The authors detail unique aspects of graph data analytics, such as their benefits for processing data acquired on irregular domains, their ability to finely-tune statistical learning procedures through local information processing, the concepts of random signals on graphs and graph shifts, learning of graph topology from data observed on graphs, and confluence with deep neural networks, multi-way tensor networks and Big Data. Extensive examples are included to render the concepts more concrete and to facilitate a greater understanding of the underlying principles.

Aimed at readers with a good grasp of the fundamentals of data analytics, this book sets out the fundamentals of graph theory and the emerging mathematical techniques for the analysis of a wide range of data acquired on graph environments. Data Analytics on Graphs will be a useful friend and a helpful companion to all involved in data gathering and analysis irrespective of area of application.