By Gerard Hoberg, University of Southern California, USA, hoberg@marshall.usc.edu | Asaf Manela, Washington University, USA, amanela@wustl.edu
We summarize the wide array of natural language processing (NLP) tools used in financial economics research. These tools empower researchers to incorporate rich but subjective textual data into advanced empirical analysis. NLP tools have pros and cons, and some are better suited than others to certain research agendas. Research using these tools has grown rapidly over the past ten years, and we document the major contributions in corporate finance, asset pricing, and beyond. These tools offer the flexibility to test hypotheses that could not be tested before their advent, while also improving the clarity of identification and the ability to separate competing hypotheses that purport to explain the same set of findings. Finally, we identify challenges and directions for future work.
Natural Language Processing (NLP), a branch of artificial intelligence, has become an increasingly influential tool in financial economics by enabling the systematic transformation of textual data into measurable economic variables. Recent advances, particularly the rise of large language models, have expanded research possibilities, revealing both unprecedented opportunities and notable limitations. The Natural Language of Finance documents the historical evolution and current state of the art of NLP, emphasizing its role in measuring previously inaccessible variables, enhancing precision, improving rigor, and strengthening interpretability. At the same time, the monograph highlights critical challenges, including biased corpora, black-box concerns, look-ahead bias, and risks from misaligned tool selection. To address methodological complexity, the authors introduce a framework based on researcher objectives, classifying projects into three Research Objective Categories (ROCs): targeted, holistic, and comparative. By mapping tools to objectives and disciplines, they illustrate how NLP reshapes corporate finance, asset pricing, and related fields. Ultimately, the authors balance benefits against costs, offering guidance for effective scholarly adoption.