Provenance in Databases: Why, How, and Where

James Cheney; Laura Chiticariu; Wang-Chiew Tan

doi:10.1561/1900000006

Foundations and Trends® in Databases > Vol 1 > Issue 4

By James Cheney, University of Edinburgh, UK, jcheney@inf.ed.ac.uk | Laura Chiticariu, IBM Almaden Research Center, USA, chiti@almaden.ibm.com | Wang-Chiew Tan, University of California, USA, wctan@cs.ucsc.edu

Suggested Citation

James Cheney, Laura Chiticariu and Wang-Chiew Tan (2009), "Provenance in Databases: Why, How, and Where", Foundations and Trends® in Databases: Vol. 1: No. 4, pp 379-474. http://dx.doi.org/10.1561/1900000006

Publication Date: 02 Jun 2009

Subjects

Private and Secure Data Management

Abstract

Different notions of provenance for database queries have been proposed and studied in the past few years. In this article, we detail three main notions of database provenance and some of their applications, and compare and contrast them. Specifically, we review why-, how-, and where-provenance, describe the relationships among these notions, and describe some of their applications in confidence computation, view maintenance and update, debugging, and annotation propagation.

Book details

ISBN: 978-1-60198-233-9

100 pp. $100.00

Table of contents:

1. Introduction

2. Why-Provenance

3. How-Provenance

4. Where-Provenance

5. Comparing Models of Provenance

6. Conclusions

Acknowledgements

References

Provenance in Databases

In September 2008, Google News promoted an undated article about United Airlines' near bankruptcy in 2002. In the ensuing panic, the share price of United Airlines dropped by around 75% in a few hours. The problem was due in part to the article's lack of provenance, which readers could have used to determine that it was out of date. In an increasingly networked world, an understanding of provenance is essential for establishing trust in data stored in databases and exchanged among Web sites. It is also critical to making key business, scientific, and governmental decisions. Modern database systems can produce answers efficiently. However, they generally lack the capability to explain the provenance of those answers: why and how they were produced, or where the data in the result came from. In recent years, different notions of provenance for database queries have been studied by the authors and a growing community of researchers in databases and scientific computation.

Provenance in Databases reviews research over the past ten years on why-, how-, and where-provenance, clarifies the relationships among these notions of provenance, and describes some of their applications in confidence computation, view maintenance and update, debugging, and annotation propagation. It is intended for engineers and researchers who would like to familiarize themselves with the foundations of database provenance, as well as the many challenges in the field.
