APSIPA Transactions on Signal and Information Processing, Vol. 11, Issue 2

An Application-Oriented Taxonomy on Spoofing, Disguise and Countermeasures in Speaker Recognition

Lantian Li, Center for Speech and Language Technologies, Beijing National Research Center for Information Science and Technology, Tsinghua University, China
Xingliang Cheng, Center for Speech and Language Technologies, Beijing National Research Center for Information Science and Technology, Tsinghua University, China
Thomas Fang Zheng, Center for Speech and Language Technologies, Beijing National Research Center for Information Science and Technology, Tsinghua University, China, fzheng@tsinghua.edu.cn
 
Suggested Citation
Lantian Li, Xingliang Cheng and Thomas Fang Zheng (2022), "An Application-Oriented Taxonomy on Spoofing, Disguise and Countermeasures in Speaker Recognition", APSIPA Transactions on Signal and Information Processing: Vol. 11: No. 2, e39. http://dx.doi.org/10.1561/116.00000017

Publication Date: 28 Dec 2022
© 2022 L. Li, X. Cheng and T. F. Zheng
 
Keywords
Speaker recognition, spoofing, disguise, countermeasures, application
 

Open Access

This is published under the terms of CC BY-NC.

In this article:
Introduction 
Our Taxonomy 
Countermeasures against Spoofing Attacks 
Countermeasures against Disguise Cheating 
Discussion 
Conclusions 
References 

Abstract

Speaker recognition aims to recognize the identity of a speaker. After decades of research, current speaker recognition systems achieve satisfactory performance and have been deployed in a wide range of practical applications. However, substantial evidence shows that these systems are susceptible to malicious fake actions in real applications. In response, the research community has developed dedicated countermeasures to defend against such actions. Several recent reviews and surveys describe the state of the art, but they generally rely on a canonical, technology-oriented taxonomy to categorize spoofing attacks and the corresponding countermeasures. This paper provides a new taxonomy from an application-oriented perspective and extends it to two major fake forms: spoofing attack and disguise cheating. The taxonomy starts from the applications of speaker recognition technology, e.g., access control, surveillance, and forensics, and then divides fake actions into two forms according to the application scenario: spoofing attack, which imitates the voice of an authorized speaker to gain access to the target system, and disguise cheating, which makes a speaker unrecognizable by altering his or her voice. For each fake form, finer-grained categories and related countermeasures are presented. Finally, this paper discusses future research directions and suggests that the research community should not only focus on the technical view but also connect with application scenarios.

DOI:10.1561/116.00000017

Companion

APSIPA Transactions on Signal and Information Processing Special Issue - Multi-Disciplinary Dis/Misinformation Analysis and Countermeasures: Articles Overview