
Safety and Trust in Artificial Intelligence with Abstract Interpretation

By Gagandeep Singh, University of Illinois Urbana-Champaign, USA, ggnds@illinois.edu | Jacob Laurel, Georgia Institute of Technology, USA, jlaurel6@gatech.edu | Sasa Misailovic, University of Illinois Urbana-Champaign, USA, misailo@illinois.edu | Debangshu Banerjee, University of Illinois Urbana-Champaign, USA, db21@illinois.edu | Avaljot Singh, University of Illinois Urbana-Champaign, USA, avaljot2@illinois.edu | Changming Xu, University of Illinois Urbana-Champaign, USA, cx23@illinois.edu | Shubham Ugare, University of Illinois Urbana-Champaign, USA, sugare2@illinois.edu | Huan Zhang, University of Illinois Urbana-Champaign, USA, huanz@illinois.edu

 
Suggested Citation
Gagandeep Singh, Jacob Laurel, Sasa Misailovic, Debangshu Banerjee, Avaljot Singh, Changming Xu, Shubham Ugare and Huan Zhang (2025), "Safety and Trust in Artificial Intelligence with Abstract Interpretation", Foundations and Trends® in Programming Languages: Vol. 8: No. 3-4, pp 250-408. http://dx.doi.org/10.1561/2500000062

Publication Date: 26 Jun 2025
© 2025 G. Singh et al.
 
Subjects
Robustness, Deep learning, Computational geometry, Program verification, Static and dynamic program analysis, Abstract interpretation, Optimization, Calculus and mathematical analysis: Differential calculus and equations
 


Abstract

Deep neural networks (DNNs) now dominate the AI landscape and have shown impressive performance in diverse application domains, including vision, natural language processing (NLP), and healthcare. However, both public and private entities have increasingly expressed concern about the potential of state-of-the-art AI models to cause societal and financial harm. This lack of trust stems from their black-box construction and their vulnerability to natural and adversarial noise.

As a result, researchers have spent considerable effort developing automated methods for building safe and trustworthy DNNs. Among the various approaches, abstract interpretation has emerged as the most popular framework for efficiently analyzing realistic DNNs. However, due to fundamental differences in the computational structure of DNNs compared to traditional programs (e.g., their high nonlinearity), developing efficient DNN analyzers has required tackling research challenges significantly different from those encountered for traditional programs.

In this monograph, we describe state-of-the-art approaches based on abstract interpretation for analyzing DNNs. These approaches include the design of new abstract domains, the synthesis of novel abstract transformers, abstraction refinement, and incremental analysis. We discuss how the analysis results can be used to: (i) formally check whether a trained DNN satisfies desired output and gradient-based safety properties, (ii) guide model updates during training towards satisfying safety properties, and (iii) reliably explain and interpret the black-box workings of DNNs.
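To make the setting concrete, the following minimal sketch (not taken from the monograph; the network weights, the perturbation radius, and helper names such as affine_bounds are invented for illustration) shows how the interval (box) abstract domain can propagate an L-infinity perturbation region through a tiny ReLU network and certify a local robustness property, in the spirit of the analyses discussed above.

# Illustrative sketch: interval (box) abstract interpretation of a tiny ReLU network.
# All weights and names below are hypothetical; this is not the monograph's code.
import numpy as np

def affine_bounds(lo, hi, W, b):
    # Sound interval propagation through x -> W @ x + b,
    # splitting W into its positive and negative parts.
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def relu_bounds(lo, hi):
    # ReLU is monotone, so applying it to both interval ends is exact in the box domain.
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# A 2-input, 2-class network with one hidden layer (arbitrary example weights).
W1 = np.array([[1.0, -0.5], [0.3, 0.8]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[0.6, -1.0], [-0.4, 0.9]]); b2 = np.array([0.0, 0.0])

# L-infinity ball of radius eps around a concrete input x0.
x0, eps = np.array([0.5, 0.2]), 0.05
lo, hi = x0 - eps, x0 + eps

lo, hi = relu_bounds(*affine_bounds(lo, hi, W1, b1))
lo, hi = affine_bounds(lo, hi, W2, b2)

# The property "class 0 is always predicted" holds if a lower bound on the
# margin logit_0 - logit_1 is positive over the whole input region.
worst_margin = lo[0] - hi[1]
print("verified" if worst_margin > 0 else "inconclusive (bounds too loose)")

Because the box domain ignores correlations between neurons, such a check can be inconclusive even for robust networks; the more precise abstract domains and transformers surveyed in the monograph aim to tighten exactly these bounds.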

DOI: 10.1561/2500000062
ISBN: 978-1-63828-586-1 (paperback), 172 pp., $99.00
ISBN: 978-1-63828-587-8 (e-book, PDF), 172 pp., $320.00
Table of contents:
1. Introduction
2. Background
3. Formal Verification of DNNs
4. Training with Differentiable Abstract Interpreters
5. Explaining and Interpreting DNNs
6. Analyzing and Verifying Differentiable Programs
7. Conclusion
References

Safety and Trust in Artificial Intelligence with Abstract Interpretation

Deep neural networks (DNNs) are currently the dominant technology in Artificial Intelligence (AI) and have shown impressive performance in diverse applications, including autonomous driving, medical diagnosis, text generation, and logical reasoning. However, they lack transparency due to their black-box construction and are vulnerable to environmental and adversarial noise. These issues have raised concerns about their safety and trustworthiness when deployed in the real world. Although standard training optimizes a model's accuracy, it does not take into account desirable safety properties such as robustness, fairness, and monotonicity.

As a result, researchers have spent considerable effort developing automated methods for building safe and trustworthy DNNs. Among the various approaches, abstract interpretation has emerged as the most popular framework for efficiently analyzing realistic DNNs. However, due to fundamental differences in the computational structure of DNNs compared to traditional programs, developing efficient DNN analyzers has required tackling research challenges significantly different from those encountered for traditional programs.

This monograph describes state-of-the-art approaches based on abstract interpretation for analyzing DNNs. These approaches include the design of new abstract domains, the synthesis of novel abstract transformers, abstraction refinement, and incremental analysis. It discusses how the analysis results can be used to: (i) formally check whether a trained DNN satisfies desired output and gradient-based safety properties, (ii) guide model updates during training towards satisfying safety properties, and (iii) reliably explain and interpret the black-box workings of DNNs.

 
PGL-062