Foundations and Trends® in Information Retrieval > Vol 7 > Issue 5

Semantic Matching in Search

By Hang Li, Huawei Technologies, Hong Kong, hangli.hl@huawei.com | Jun Xu, Huawei Technologies, Hong Kong, nkxujun@gmail.com

 
Suggested Citation
Hang Li and Jun Xu (2014), "Semantic Matching in Search", Foundations and Trends® in Information Retrieval: Vol. 7: No. 5, pp 343-469. http://dx.doi.org/10.1561/1500000035

Publication Date: 12 Jun 2014
© 2014 H. Li and J. Xu
 
Subjects
Formal models and language models for IR,  Web search
 

In this article:
1. Introduction
2. Semantic Matching in Search 
3. Matching by Query Reformulation 
4. Matching with Term Dependency Model 
5. Matching with Translation Model 
6. Matching with Topic Model 
7. Matching with Latent Space Model 
8. Learning to Match 
9. Conclusion and Open Problems 
Acknowledgements 
References 

Abstract

Relevance is the most important factor in user satisfaction with search, and the success of a search engine depends heavily on its relevance performance. It has been observed that most cases of dissatisfaction with relevance are due to term mismatch between queries and documents (e.g., the query “NY times” does not match well with a document containing only “New York Times”), because term matching, i.e., the bag-of-words approach, still functions as the main mechanism of modern search engines. It is therefore no exaggeration to say that mismatch between query and document poses the most critical challenge in search. Ideally, a query and a document should match whenever they are topically relevant. Recently, researchers have expended significant effort to address the problem. The major approach is to conduct semantic matching, i.e., to perform deeper query and document understanding in order to represent their meanings, and then to match the enriched query and document representations. With the availability of large amounts of log data and advanced machine learning techniques, this has become more feasible, and significant progress has been made recently. This survey gives a systematic and detailed introduction to newly developed machine learning technologies for query-document matching (semantic matching) in search, particularly web search. It focuses on the fundamental problems, as well as the state-of-the-art solutions, of query-document matching with respect to five aspects: form, phrase, word sense, topic, and structure. The ideas and solutions explained may motivate industrial practitioners to turn the research results into products. The methods introduced and the discussions made may also stimulate academic researchers to find new research directions and approaches.
Matching between query and document is not limited to search; similar problems arise in question answering, online advertising, cross-language information retrieval, machine translation, recommender systems, link prediction, image annotation, drug design, and other applications, all of which are instances of the general task of matching objects from two different spaces. The technologies introduced can be generalized into broader machine learning techniques, which are referred to as learning to match in this survey.
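
The term mismatch described in the abstract can be illustrated with a minimal sketch. It is not taken from the monograph: the tokenizer, the `term_overlap` score, and the one-entry `REWRITES` table are hypothetical simplifications chosen only to show why a bag-of-words comparison of “NY times” and “New York Times” falls short, and how a simple query reformulation could close the gap.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def term_overlap(query, document):
    """Fraction of distinct query terms that also occur in the document."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(document))
    if not q_terms:
        return 0.0
    return len(q_terms & d_terms) / len(q_terms)


# Hypothetical hand-written rewrite table; a real system would learn
# such rewrites from data (for example, search logs) rather than hard-code them.
REWRITES = {"ny": "new york"}


def reformulate(query):
    """Replace each query term by its rewrite, if one is listed."""
    return " ".join(REWRITES.get(term, term) for term in tokenize(query))


query = "NY times"
document = "New York Times"

# Only "times" is shared, so half of the query terms match.
print(term_overlap(query, document))               # 0.5

# After rewriting "ny" to "new york", every query term matches.
print(term_overlap(reformulate(query), document))  # 1.0
```

The two scores differ only because the surface forms differ; the query and the document refer to the same thing in both cases, which is the gap that semantic matching aims to bridge.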

DOI:10.1561/1500000035
ISBN: 978-1-60198-804-1 (paperback), 140 pp., $95.00
ISBN: 978-1-60198-805-8 (e-book, PDF), 140 pp., $120.00

Semantic Matching in Search

Semantic Matching in Search is a systematic and detailed introduction to newly developed machine learning technologies for query-document matching (semantic matching) in search, particularly in web search. It focuses on the fundamental problems, as well as the state-of-the-art solutions, of query-document matching with respect to five aspects: form, phrase, word sense, topic, and structure. Matching between query and document is not limited to search, and similar problems can be found in question answering, online advertising, cross-language information retrieval, machine translation, recommender systems, link prediction, image annotation, drug design, and other applications where one is faced with the general task of matching objects from two different spaces. The technologies introduced in this monograph can be generalized into broader machine learning techniques, which are referred to as learning to match in this survey.

It is hoped that the ideas and solutions explained in Semantic Matching in Search may motivate industrial practitioners to turn the research results into products. The methods introduced and the discussions around them should also stimulate academic researchers to find new research directions and approaches.

 
INR-035