Foundations and Trends® in Information Retrieval > Vol 7 > Issue 4

Arabic Information Retrieval

By Kareem Darwish, Qatar Computing Research Institute, Qatar, kdarwish@qf.org.qa | Walid Magdy, Qatar Computing Research Institute, Qatar, wmagdy@qf.org.qa

 
Suggested Citation
Kareem Darwish and Walid Magdy (2014), "Arabic Information Retrieval", Foundations and Trends® in Information Retrieval: Vol. 7: No. 4, pp 239-342. http://dx.doi.org/10.1561/1500000031

Publication Date: 05 Feb 2014
© 2013 K. Darwish and W. Magdy
 
Subjects
Search,  Languages on the web,  Databases on the web,  Information retrieval,  Natural language processing for IR,  Cross-lingual and multilingual IR,  Applications of IR
 
Keywords
Arabic IR, Arabic NLP
 

In this article:
1. Introduction
2. Arabic Features Affecting Retrieval 
3. Arabic Preprocessing and Indexing 
4. Arabic IR in Shared-Task Evaluations 
5. Domain-specific IR 
6. Open Research Areas in Arabic IR 
7. Conclusions 
Appendices 
References 

Abstract

In the past several years, Arabic Information Retrieval (IR) has garnered significant attention. The main research interests have focused on retrieval of formal language, mostly in the news domain, with ad hoc retrieval, OCR document retrieval, and cross-language retrieval. The literature on other aspects of retrieval continues to be sparse or non-existent, though some of these aspects have been investigated by industry. Other aspects of Arabic retrieval that have received attention include document image retrieval, speech search, social media and web search, and filtering. However, efforts on these aspects of Arabic retrieval remain limited and lag severely behind efforts in other languages. The survey covers: 1) general properties of the Arabic language; 2) some of the aspects of Arabic that affect retrieval; 3) Arabic processing necessary for effective Arabic retrieval; 4) Arabic retrieval in public IR evaluations; 5) specialized retrieval problems, namely Arabic-English CLIR, Arabic Document Image Retrieval, Arabic Social Search, Arabic Web Search, Question Answering, Image Retrieval, and Arabic Speech Search; 6) Arabic IR and NLP resources; and 7) open IR problems that require further attention.

DOI:10.1561/1500000031
ISBN: 978-1-60198-776-1 (paperback), 122 pp., $85.00
ISBN: 978-1-60198-777-8 (e-book, PDF), 122 pp., $120.00

Arabic Information Retrieval

Arabic ranks as the seventh largest language on the Internet, and it has been the fastest growing language over the last decade in terms of users. At this rate of growth, Arabic speakers should form the fourth largest user population on the Internet by 2020. Given these facts, it is not surprising that Arabic Information Retrieval (IR) has garnered significant attention. The main research interests have focused on retrieval of formal language, mostly in the news domain, with ad hoc retrieval, OCR document retrieval, and cross-language retrieval. The literature on other aspects of retrieval continues to be sparse or non-existent, though some of these aspects have been investigated by industry. Other aspects of Arabic retrieval that have received attention include document image retrieval, speech search, filtering, and social media and web search. However, efforts on these aspects of Arabic retrieval remain limited and lag severely behind efforts in other languages.

Arabic Information Retrieval reviews Arabic IR, including the nature of the Arabic language, the techniques used for pre-processing the language, the latest research in Arabic IR in different domains, and the open areas in Arabic IR. It covers general properties of the Arabic language, aspects of Arabic that affect retrieval, Arabic processing necessary for effective Arabic retrieval, Arabic retrieval in public IR evaluations, Arabic IR and NLP resources, and specialized retrieval problems such as Arabic-English CLIR, Arabic Document Image Retrieval, Arabic Social Search, Arabic Web Search, Question Answering, Image Retrieval, and Arabic Speech Search. Finally, it discusses open IR problems that require further attention.

 
INR-031