now publishers - An Overview of Language Models: Recent Developments and Outlook

APSIPA Transactions on Signal and Information Processing > Vol 13 > Issue 2

An Overview of Language Models: Recent Developments and Outlook

Chengwei Wei, University of Southern California, USA, chengwei@usc.edu, Yun-Cheng Wang, University of Southern California, USA, Bin Wang, National University of Singapore, Singapore, C.-C. Jay Kuo, University of Southern California, USA

Suggested Citation

Chengwei Wei, Yun-Cheng Wang, Bin Wang and C.-C. Jay Kuo (2024), "An Overview of Language Models: Recent Developments and Outlook", APSIPA Transactions on Signal and Information Processing: Vol. 13: No. 2, e101. http://dx.doi.org/10.1561/116.00000010

Publication Date: 12 Feb 2024

Keywords

Language model, Natural language processing, Pre-trained language model, Conventional language model

Open Access

This is published under the terms of CC BY-NC.

Abstract

Language modeling studies probability distributions over strings of text. It is one of the most fundamental tasks in natural language processing (NLP) and has been widely used in text generation, speech recognition, machine translation, and other applications. Conventional language models (CLMs) aim to predict the probability of linguistic sequences in a causal manner, while pre-trained language models (PLMs) cover broader concepts and can be used both for causal sequential modeling and for fine-tuning on downstream applications. PLMs have their own training paradigms (usually self-supervised) and serve as foundation models in modern NLP systems. This overview paper introduces both CLMs and PLMs from five aspects: linguistic units, architectures, training methods, evaluation methods, and applications. Furthermore, we discuss the relationship between CLMs and PLMs and shed light on future directions of language modeling in the pre-trained era.
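
As a rough illustration of what "predicting the probability of linguistic sequences in a causal manner" can mean, the sketch below scores a sentence with a count-based bigram model, multiplying the probability of each token given the one before it. The bigram choice, the toy corpus, the `<s>` and `</s>` boundary markers, and the function names are all assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def train_bigram(corpus):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        # Hypothetical boundary markers so the first and last words have context.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, curr)] += 1
    return context_counts, bigram_counts


def sequence_probability(sentence, context_counts, bigram_counts):
    """Approximate P(sentence) as a product of P(w_t | w_{t-1}).

    Uses unsmoothed maximum-likelihood estimates, so any unseen
    bigram drives the whole product to zero.
    """
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0
        prob *= bigram_counts[(prev, curr)] / context_counts[prev]
    return prob


# Toy corpus, invented for illustration only.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
context_counts, bigram_counts = train_bigram(corpus)
p = sequence_probability("the cat sat", context_counts, bigram_counts)
print(p)
```

The model conditions each token only on its left context, which is the causal, left-to-right factorization in its simplest form. A practical model would add smoothing or use a neural architecture so that unseen word pairs do not receive zero probability.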

DOI:10.1561/116.00000010

Related publications

Companion

APSIPA Transactions on Signal and Information Processing Special Issue - Pre-trained Large Language Models for Information Processing

Introduction
Types of Language Models
Linguistic Units
Architecture of Language Models
Pre-trained Language Models
Model Evaluation
Language Models in Text Generation
Efficient Models
Future Research Directions
Conclusion
References
