Foundations and Trends® in Information Retrieval > Vol 5 > Issue 2–3

Automatic Summarization

Ani Nenkova, University of Pennsylvania, USA, nenkova@seas.upenn.edu
Kathleen McKeown, Columbia University, USA, kathy@cs.columbia.edu
 
Suggested Citation
Ani Nenkova and Kathleen McKeown (2011), "Automatic Summarization", Foundations and Trends® in Information Retrieval: Vol. 5: No. 2–3, pp 103-233. http://dx.doi.org/10.1561/1500000015

Published: 30 Jun 2011
© 2011 A. Nenkova and K. McKeown
 
Subjects
Summarization
 

In this article:
1 Introduction
2 Sentence Extraction: Determining Importance
3 Methods Using Semantics and Discourse
4 Generation for Summarization
5 Genre and Domain Specific Approaches
6 Intrinsic Evaluation
7 Conclusions
References

Abstract

It has now been 50 years since the publication of Luhn's seminal paper on automatic summarization. During these years, the practical need for automatic summarization has become increasingly urgent, and numerous papers have been published on the topic. As a result, it has become harder to find a single reference that gives an overview of past efforts or a complete view of summarization tasks and necessary system components. This article attempts to fill this void by providing a comprehensive overview of research in summarization, including the more traditional efforts in sentence extraction as well as the most novel recent approaches for determining important content, for domain- and genre-specific summarization, and for evaluation of summarization. We also discuss the challenges that remain open, in particular the need for language generation and deeper semantic understanding of language that would be necessary for future advances in the field.

DOI:10.1561/1500000015
Book ISBN: 978-1-60198-470-8, 148 pp., $99.00
E-book ISBN: 978-1-60198-471-5, 148 pp., $175.00

Automatic Summarization

Today's world is all about information, most of it online. The World Wide Web contains billions of documents and is growing at an exponential pace. Tools that provide timely access to, and digests of, various sources are necessary to alleviate the information overload people are facing. The need for such tools sparked interest in the development of automatic summarization systems. Such systems are designed to take a single article, a cluster of news articles, a broadcast news show, or an email thread as input, and produce a concise and fluent summary of the most important information. Recent years have seen the development of numerous summarization applications for news, email threads, lay and professional medical information, scientific articles, spontaneous dialogues, voicemail, broadcast news and video, and meeting recordings. These systems, imperfect as they are, have already been shown to help users and to enhance other automatic applications and interfaces. Automatic Summarization provides a comprehensive overview of research in summarization, including the more traditional efforts in sentence extraction as well as the most novel recent approaches for determining important content, for domain- and genre-specific summarization, and for evaluation of summarization. It also discusses the challenges that remain open, in particular the need for language generation and deeper semantic understanding of language that would be necessary for future advances in the field.

 
INR-015