The Probabilistic Relevance Framework:
BM25 and Beyond
Foundations and Trends® in Information Retrieval
Volume 3 Issue 4
Stephen Robertson
Microsoft Research ser@microsoft.com
Hugo Zaragoza
Yahoo! Research hugoz@yahoo-inc.com
SUGGESTED CITATION:
Stephen Robertson and Hugo Zaragoza (2009)
"The Probabilistic Relevance Framework: BM25 and Beyond",
Foundations and Trends® in Information Retrieval: Vol. 3: No. 4, pp. 333-389.
http://dx.doi.org/10.1561/1500000019
Abstract
The Probabilistic Relevance Framework (PRF) is a formal framework
for document retrieval, grounded in work done in the 1970s and 1980s, which led to the development
of one of the most successful text-retrieval algorithms, BM25. In recent years, research
in the PRF has yielded new retrieval models capable of taking into account document
metadata (especially structure and link-graph information). Again, this has led to
one of the most successful web-search and corporate-search algorithms, BM25F. This
work presents the PRF from a conceptual point of view, describing the probabilistic
modelling assumptions behind the framework and the different ranking algorithms that
result from its application: the binary independence model, relevance feedback models,
BM25, and BM25F. It also discusses the relation between the PRF and other statistical models
for IR, and covers some related topics, such as the use of non-textual features and parameter
optimisation for models with free parameters.