now publishers - An analysis of speaker dependent models in replay detection

APSIPA Transactions on Signal and Information Processing, Vol. 9, Issue 1

An analysis of speaker dependent models in replay detection

Gajan Suthokumar, University of New South Wales, Australia AND Data61, CSIRO, Australia, g.suthokumar@unsw.edu.au; Kaavya Sriskandaraja, University of New South Wales, Australia; Vidhyasaharan Sethu, University of New South Wales, Australia; Eliathamby Ambikairajah, University of New South Wales, Australia AND Data61, CSIRO, Australia; Haizhou Li, National University of Singapore, Singapore

Suggested Citation

Gajan Suthokumar, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah and Haizhou Li (2020), "An analysis of speaker dependent models in replay detection", APSIPA Transactions on Signal and Information Processing: Vol. 9: No. 1, e14. http://dx.doi.org/10.1017/ATSIP.2020.9

Publication Date: 30 Apr 2020

Keywords

Speaker Dependent Models, Replay Attack, Spoofing Detection, Speaker Verification, Speaker Adapted Neural Networks

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:

Abstract

Most research on replay detection has focused on developing a stand-alone countermeasure that runs independently of a speaker verification system, training a single spoofed model and a single genuine model for all speakers. In this paper, we explore the potential benefits of adapting the back-end of a spoofing detection system towards the claimed target speaker. Specifically, we characterize and quantify speaker variability by comparing speaker-dependent and speaker-independent (SI) models of feature distributions for both genuine and spoofed speech. Following this, we develop an approach for implementing speaker-dependent spoofing detection using a Gaussian mixture model (GMM) back-end, where both the genuine and spoofed models are adapted to the claimed speaker. Finally, we develop and evaluate a speaker-specific neural network-based spoofing detection system in addition to the GMM-based back-end. Evaluations on the replay corpora BTAS2016 and ASVspoof2017 v2.0 reveal that the proposed speaker-dependent spoofing detection outperforms equivalent SI replay detection baselines on both datasets. Our experimental results show that the use of speaker-specific genuine models leads to a significant improvement (around 4% in terms of equal error rate (EER)), as previously shown, and that the addition of speaker-specific spoofed models adds a small further improvement (less than 1% in terms of EER).
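As an illustration of the speaker-dependent GMM back-end described above, the following is a minimal sketch and not the authors' implementation. It assumes diagonal-covariance GMMs, mean-only MAP adaptation with a relevance factor (a common choice in speaker verification, which the abstract does not confirm), and scoring by the average log-likelihood ratio between the adapted genuine and spoofed models. All function names and the `relevance` parameter are hypothetical.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian.
    X: (T, D); means, variances: (K, D); returns (T, K)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a diagonal GMM."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)
    m = log_p.max(axis=1, keepdims=True)  # log-sum-exp for stability
    return float(np.mean(m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of an SI GMM towards the frames in X.
    Assumption: weights and variances are left unchanged."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)       # responsibilities (T, K)
    n_k = post.sum(axis=0)                        # soft counts (K,)
    e_k = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]    # data-dependent mixing
    return alpha * e_k + (1.0 - alpha) * means

def spoof_llr(X, genuine_gmm, spoof_gmm):
    """Log-likelihood ratio; higher values favour genuine speech.
    Each GMM is a (weights, means, variances) tuple."""
    return gmm_avg_loglik(X, *genuine_gmm) - gmm_avg_loglik(X, *spoof_gmm)
```

In this sketch, a speaker-dependent system would call `map_adapt_means` once with the claimed speaker's genuine enrolment frames and once with that speaker's spoofed frames, each starting from the corresponding SI model, and then pass the two adapted models to `spoof_llr` for each test utterance.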

DOI:10.1017/ATSIP.2020.9

I. INTRODUCTION
II. ANALYSIS OF SPEAKER VARIABILITY
III. PROPOSED SPEAKER DEPENDENT SPOOFING DETECTION GMM BACKEND
IV. PROPOSED SPEAKER DEPENDENT DEEP NEURAL NETWORK BACKEND
V. DATABASES AND DATA PREPARATION
VI. FRONT-END FEATURES
VII. EXPERIMENTAL SETTING
VIII. RESULTS AND DISCUSSION
IX. CONCLUSION
