Using Python for Text Analysis in Accounting Research

Vic Anand; Khrystyna Bochkay; Roman Chychyla; Andrew Leone

doi:10.1561/1400000062

Foundations and Trends® in Accounting > Vol 14 > Issue 3–4

By Vic Anand, University of Illinois at Urbana-Champaign, USA, vanand@illinois.edu | Khrystyna Bochkay, University of Miami, USA, kbochkay@bus.miami.edu | Roman Chychyla, University of Miami, USA, rchychyla@bus.miami.edu | Andrew Leone, Northwestern University, USA, andrew.leone@kellogg.northwestern.edu

Suggested Citation

Vic Anand, Khrystyna Bochkay, Roman Chychyla and Andrew Leone (2020), "Using Python for Text Analysis in Accounting Research", Foundations and Trends® in Accounting: Vol. 14: No. 3–4, pp 128-359. http://dx.doi.org/10.1561/1400000062

Publication Date: 03 Dec 2020

Book details

Paperback: ISBN 978-1-68083-760-5, 248 pp., $99.00

E-book (.pdf): ISBN 978-1-68083-761-2, 248 pp., $280.00

Table of contents:

1. Introduction

2. Installing Python on Your Computer

3. Jupyter Notebooks

4. A Brief Introduction to the Python Programming Language

5. Working with Tabular Data: The Pandas Package

6. Introduction to Regular Expressions

7. Dictionary-Based Textual Analysis

8. Quantifying Text Complexity

9. Sentence Structure and Classification

10. Measuring Text Similarity

11. Identifying Specific Information in Text

12. Collecting Data from the Internet

Acknowledgements

References

Using Python for Text Analysis in Accounting Research

Using Python for Text Analysis in Accounting Research provides faculty and PhD students in the social sciences with an interactive, step-by-step framework for analyzing spoken or written language. The goal is to demonstrate how textual analysis can enhance research by automatically extracting previously unknown information from voluminous disclosures, news articles, and social media posts. Materials are presented so that the reader can learn about a textual analysis concept or technique and then replicate it hands-on.

The monograph begins by showing how to install and use Python, a popular general-purpose programming language, and reviews Python's basic syntax, operators, data types, functions, and so on, so that readers can familiarize themselves with the programming environment first. It then discusses the Jupyter notebook, an open-source web application for creating, running, and testing Python code interactively, and introduces the Pandas package for working with tabular data, which helps researchers convert unstructured textual data into structured, tabular data. The authors next introduce regular expressions, which represent patterns for matching different elements in texts. They then proceed to discuss and code the different textual analysis methods used in accounting and finance studies. Finally, the monograph provides an overview of web scraping and file processing features in Python, with a focus on downloading EDGAR filings and identifying specific sections in them.

Taken together, the first five chapters of this monograph will help readers get started with Python and prepare for writing their own code.
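As a rough illustration of the workflow described above (matching text with a regular expression and turning the result into structured rows), here is a minimal sketch. It is not taken from the monograph: the word list, the sample texts, and the names `NEGATIVE_WORDS`, `WORD_PATTERN`, and `count_negative_words` are all hypothetical, and the tokenizer is deliberately simplistic.

```python
import re

# Hypothetical mini word list, for illustration only (not a published dictionary).
NEGATIVE_WORDS = {"loss", "decline", "impairment"}

# A simple tokenizer pattern: one or more ASCII letters in a row.
WORD_PATTERN = re.compile(r"[A-Za-z]+")


def count_negative_words(text):
    """Return the total word count and the count of words found in NEGATIVE_WORDS."""
    tokens = [token.lower() for token in WORD_PATTERN.findall(text)]
    n_negative = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return {"n_words": len(tokens), "n_negative": n_negative}


# Made-up sample documents standing in for disclosures or news articles.
documents = {
    "doc_a": "The company reported a loss and an impairment charge.",
    "doc_b": "Revenue grew in every segment.",
}

# One dictionary per document: unstructured text becomes structured rows.
rows = [
    {"doc_id": doc_id, **count_negative_words(text)}
    for doc_id, text in documents.items()
]

for row in rows:
    print(row)
```

Each entry in `rows` is one record per document, so the list could be passed to `pandas.DataFrame(rows)` to obtain a table with the columns `doc_id`, `n_words`, and `n_negative`.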

Supplementary information

Supplementary material | 1400000062_supp.zip (ZIP).

This file contains the supplementary material code referred to in the monograph.

DOI: 10.1561/1400000062_supp
