Foundations and Trends® in Databases > Vol 9 > Issue 3-4

Data Provenance

Boris Glavic, Illinois Institute of Technology, USA, bglavic@iit.edu
 
Suggested Citation
Boris Glavic (2021), "Data Provenance", Foundations and Trends® in Databases: Vol. 9: No. 3-4, pp 209-441. http://dx.doi.org/10.1561/1900000068

Publication Date: 28 Apr 2021
© 2021 Boris Glavic
 
Subjects
Metadata Management, Trust and Provenance
 

In this article:
1. Introduction
2. Provenance Models - Formalizing Provenance Semantics
3. Applications
4. Provenance Capture, Storage, and Querying
5. Connection to Other Research Fields
6. Summary and Conclusions
Acknowledgements
References
Index

Abstract

Data provenance has evolved from a niche topic to a mainstream area of research in databases and other research communities. This article gives a comprehensive introduction to data provenance. The main focus is on provenance in the context of databases, but the article also considers connections to related research in programming languages, software engineering, semantic web, formal logic, and other communities. The target audience is researchers and practitioners who want to gain a solid understanding of data provenance and the state of the art in this research area. The article assumes only that the reader has a basic understanding of database concepts; no prior exposure to provenance is required.

DOI:10.1561/1900000068
ISBN: 978-1-68083-828-2 (paperback), 246 pp., $99.00
ISBN: 978-1-68083-829-9 (e-book, PDF), 246 pp., $280.00

Data Provenance: Origins, Applications, Algorithms, and Models

The term provenance is used in the art world to describe a record of the history of ownership of a piece of art. The database community has adapted this term to describe a record of the origin of a piece of data. Data provenance emerged as a research topic in the database community in the late 1990s. By explaining how the result of an operation was derived from its inputs, data provenance has proven useful in a wide variety of applications.

This monograph gives a comprehensive introduction to data provenance concepts, algorithms, and methodology developed in the last few decades. It introduces the reader to the formalisms, algorithms, and system developments in this fascinating field and provides a collection of relevant literature references for further research. The monograph offers a concise starting point for researching and using data provenance. Although it focuses on data provenance in databases, pointers to work in other fields are given throughout.

The intended audience is researchers and practitioners unfamiliar with the topic who want to develop a basic understanding of provenance techniques and the state of the art in the field, as well as researchers with prior experience in provenance who want to broaden their horizons.

 