Foundations and Trends® in Information Retrieval > Vol 12 > Issue 1

Web Forum Retrieval and Text Analytics: A Survey

By Doris Hoogeveen, University of Melbourne, Australia, dhoogeveen@student.unimelb.edu.au | Li Wang, Evernote, USA, li@liwang.info | Timothy Baldwin, University of Melbourne, Australia, tb@ldwin.net | Karin M. Verspoor, University of Melbourne, Australia, karin.verspoor@unimelb.edu.au

 
Suggested Citation
Doris Hoogeveen, Li Wang, Timothy Baldwin and Karin M. Verspoor (2018), "Web Forum Retrieval and Text Analytics: A Survey", Foundations and Trends® in Information Retrieval: Vol. 12: No. 1, pp 1-163. http://dx.doi.org/10.1561/1500000062

Publication Date: 03 Jan 2018
© 2018 D. Hoogeveen, L. Wang, T. Baldwin, K. M. Verspoor
 
Subjects
Indexing and retrieval of structured documents,  Information categorization and clustering,  Natural language processing for IR,  Question answering
 

In this article:
1. Introduction
2. Post classification
3. Post retrieval
4. Thread level tasks
5. Social forum analysis
6. Conclusion
Acknowledgements
References

Abstract

This survey presents an overview of information retrieval, natural language processing and machine learning research that makes use of forum data, including both discussion forums and community question-answering (cQA) archives. The focus is on automated analysis, with the goal of gaining a better understanding of the data and its users. We discuss the different strategies used for both retrieval tasks (post retrieval, question retrieval, and answer retrieval) and classification tasks (post type classification, question classification, post quality assessment, subjectivity, and viewpoint classification) at the post level, as well as at the thread level (thread retrieval, solvedness and task orientation, discourse structure recovery and dialogue act tagging, QA-pair extraction, and thread summarisation). We also review work on forum users, including user satisfaction, expert finding, question recommendation and routing, and community analysis. The survey includes a brief history of forums, an overview of the different kinds of forums, a summary of publicly available datasets for forum research, and a short discussion on the evaluation of retrieval tasks using forum data. The aim is to give a broad overview of the different kinds of forum research, a summary of the methods that have been applied, some insights into successful strategies, and potential areas for future research.

DOI:10.1561/1500000062
ISBN: 978-1-68083-350-8 (paperback)
156 pp. $99.00

ISBN: 978-1-68083-351-5 (e-book, .pdf)
156 pp. $140.00

Web Forum Retrieval and Text Analytics: A Survey presents an overview of information retrieval, natural language processing and machine learning research that makes use of forum data, including both discussion forums and community question-answering (cQA) archives. The focus is on automated analysis, with the goal of providing the reader with a better understanding of the data and its users. It discusses the different strategies used for both retrieval tasks (post retrieval, question retrieval, and answer retrieval) and classification tasks (post type classification, question classification, post quality assessment, subjectivity, and viewpoint classification) at the post level, as well as at the thread level (thread retrieval, solvedness and task orientation, discourse structure recovery and dialogue act tagging, QA-pair extraction, and thread summarisation). It also reviews work on forum users, including user satisfaction, expert finding, question recommendation and routing, and community analysis.

Web Forum Retrieval and Text Analytics: A Survey includes a brief history of forums, an overview of the different kinds of forums, a summary of publicly available datasets for forum research, and a short discussion on the evaluation of retrieval tasks using forum data. Covering 450 papers, it provides the reader with a broad overview of the different kinds of forum research, a summary of the methods that have been applied, some insights into successful strategies, and potential areas for future research.

 
INR-062