Foundations and Trends® in Computer Graphics and Vision > Vol 17 > Issue 2

Demystifying Variational Diffusion Models

By Fabio De Sousa Ribeiro, Imperial College London, UK, f.de-sousa-ribeiro@imperial.ac.uk | Ben Glocker, Imperial College London, UK, b.glocker@imperial.ac.uk

 
Suggested Citation
Fabio De Sousa Ribeiro and Ben Glocker (2025), "Demystifying Variational Diffusion Models", Foundations and Trends® in Computer Graphics and Vision: Vol. 17: No. 2, pp 76-170. http://dx.doi.org/10.1561/0600000113

Publication Date: 28 Apr 2025
© 2025 F. De Sousa Ribeiro and B. Glocker
 
Subjects
Deep learning, Variational inference, Graphical models, Bayesian learning, Statistical/Machine learning, Image and video processing, Decoding and inverse problems, Learning and statistical methods, Mathematical modelling, Probability and statistics: Bayesian inference
 

In this article:
1. Introduction
2. Latent Variable Models
3. Variational Diffusion Models
4. Understanding Diffusion Objectives
5. Discussion and Outlook
Acknowledgements
Appendix
References

Abstract

Despite the growing interest in diffusion models, gaining a deep understanding of the model class remains an elusive endeavour, particularly for the uninitiated in non-equilibrium statistical physics. Thanks to the rapid rate of progress in the field, most existing work on diffusion models focuses on either applications or theoretical contributions. Unfortunately, the theoretical material is often inaccessible to practitioners and new researchers, leading to a risk of superficial understanding in ongoing research. Given that diffusion models are now an indispensable tool, a clear and consolidating perspective on the model class is needed to properly contextualize recent advances in generative modelling and lower the barrier to entry for new researchers. To that end, we revisit predecessors to diffusion models, such as hierarchical latent variable models, and synthesize a holistic perspective using only directed graphical modelling and variational inference principles. The resulting narrative is easier to follow as it imposes fewer prerequisites on the average reader relative to the view from non-equilibrium thermodynamics or stochastic differential equations.

DOI:10.1561/0600000113
ISBN (paperback): 978-1-63828-560-1, 108 pp., $75.00
ISBN (e-book, PDF): 978-1-63828-561-8, 108 pp., $160.00

Demystifying Variational Diffusion Models

A generative model is a simulation of a data-generating process. Understanding the true generative process of data is valuable as it naturally reveals causal relationships. These causal relationships are advantageous as they tend to generalize more effectively to new situations than mere correlations, which may be spurious and unreliable. Although various generative modelling strategies exist, diffusion models have emerged as the latest dominant paradigm. Yet gaining a deep understanding of the model class remains an elusive endeavour, particularly for those uninitiated in non-equilibrium statistical physics.

Owing to the rapid rate of progress in the field, most existing work on diffusion models focuses on either applications or theoretical contributions. Unfortunately, the theoretical material is often inaccessible to practitioners and new researchers, leading to a risk of superficial understanding in ongoing research. Given that diffusion models are now an indispensable tool, a clear and consolidating perspective on the model class is needed to properly contextualize recent advances in generative modelling and lower the barrier to entry for new researchers. This monograph aims to provide that perspective. It revisits predecessors to diffusion models, such as hierarchical latent variable models, and synthesizes a holistic perspective using only directed graphical modelling and variational inference principles. The resulting narrative is easier to follow, as it imposes fewer prerequisites on the average reader relative to the view from non-equilibrium thermodynamics or stochastic differential equations.

 
CGV-113