Foundations and Trends® in Information Retrieval > Vol 10 > Issue 4

A Survey of Query Auto Completion in Information Retrieval

Fei Cai, Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China and University of Amsterdam, The Netherlands, f.cai@uva.nl Maarten de Rijke, University of Amsterdam, The Netherlands, derijke@uva.nl
 
Suggested Citation
Fei Cai and Maarten de Rijke (2016), "A Survey of Query Auto Completion in Information Retrieval", Foundations and Trends® in Information Retrieval: Vol. 10: No. 4, pp 273-363. http://dx.doi.org/10.1561/1500000055

Published: 19 Sep 2016
© 2016 F. Cai and M. de Rijke
 
Subjects
Indexing and retrieval of structured documents,  Information extraction,  Natural language processing for IR,  Question Answering,  Text Mining,  Semantic Web
 

Free Preview:

Article Help

Share

Download article
In this article:
1. Introduction
2. Query Auto Completion
3. Heuristic Approaches to Query Auto Completion
4. Learning-based Approaches to Query Auto Completion
5. Evaluation
6. Efficiency and Robustness
7. Presentation and Interaction
8. Related Tasks
9. Conclusions
Glossary
Acknowledgements
References

Abstract

Abstract In information retrieval, query auto completion (QAC), also known as type-ahead [Xiao et al., 2013, Cai et al., 2014b] and auto-complete suggestion [Jain and Mishne, 2010], refers to the following functionality: given a prefix consisting of a number of characters entered into a search box, the user interface proposes alternative ways of extending the prefix to a full query. Ranking query completions is a challenging task due to the limited length of prefixes entered by users, the large volume of possible query completions matching a prefix, and the broad range of possible search intents. In recent years, a large number of query auto completion approaches have been proposed that produce ranked lists of alternative query completions by mining query logs. In this survey, we review work on query auto completion that has been published before 2016. We focus mainly on web search and provide a formal definition of the query auto completion problem. We describe two dominant families of approaches to the query auto completion problem, one based on heuristic models and the other based on learning to rank. We also identify dominant trends in published work on query auto completion, viz. the use of time-sensitive signals and the use of user-specific signals. We describe the datasets and metrics that are used to evaluate algorithms for query auto completion. We also devote a chapter to efficiency and a chapter to presentation and interaction aspects of query auto completion. We end by discussing related tasks as well as potential research directions to further the area.

DOI:10.1561/1500000055
ISBN: 978-1-68083-200-6
108 pp. $75.00
Buy book
 
ISBN: 978-1-68083-201-3
108 pp. $130.00
Buy E-book
Table of contents:
1. Introduction
2. Query Auto Completion
3. Heuristic Approaches to Query Auto Completion
4. Learning-based Approaches to Query Auto Completion
5. Evaluation
6. Efficiency and Robustness
7. Presentation and Interaction
8. Related Tasks
9. Conclusions
Glossary
Acknowledgements
References

A Survey of Query Auto Completion in Information Retrieval

In information retrieval, query auto completion (QAC), also known as type-ahead and auto-complete suggestion, refers to the following functionality: given a prefix consisting of a number of characters entered into a search box, the user interface proposes alternative ways of extending the prefix to a full query. QAC helps users to formulate their query when they have an intent in mind but not a clear way of expressing this in a query. It helps to avoid possible spelling mistakes, especially on devices with small screens. It saves keystrokes and cuts down the search duration of users which implies a lower load on the search engine, and results in savings in machine resources and maintenance. Because of the clear benefits of QAC, a considerable number of algorithmic approaches to QAC have been proposed in the past few years. Query logs have proven to be a key asset underlying most of the recent research. This monograph surveys this research. It focuses on summarizing the literature on QAC and provides a general understanding of the wealth of QAC approaches that are currently available.

A Survey of Query Auto Completion in Information Retrieval is an ideal reference on the topic. Its contributions can be summarized as follows: It provides researchers who are working on query auto completion or related problems in the field of information retrieval with a good overview and analysis of state-of-the-art QAC approaches. In particular, for researchers new to the field, the survey can serve as an introduction to the state-of-the-art. It also offers a comprehensive perspective on QAC approaches by presenting a taxonomy of existing solutions. In addition, it presents solutions for QAC under different conditions such as available high-resolution query logs, in-depth user interactions with QAC using eye-tracking, and elaborate user engagements in a QAC process. It also discusses practical issues related to QAC. Lastly, it presents a detailed discussion of core challenges and promising open directions in QAC.

 
INR-055