APSIPA Transactions on Signal and Information Processing > Vol 7 > Issue 1

Noise masking method based on an effective ratio mask estimation in Gammatone channels

Feng Bao, University of Auckland, New Zealand, fbao026@aucklanduni.ac.nz; Waleed H. Abdulla, University of Auckland, New Zealand
 
Suggested Citation
Feng Bao and Waleed H. Abdulla (2018), "Noise masking method based on an effective ratio mask estimation in Gammatone channels", APSIPA Transactions on Signal and Information Processing: Vol. 7: No. 1, e5. http://dx.doi.org/10.1017/ATSIP.2018.7

Publication Date: 15 May 2018
© 2018 Feng Bao and Waleed H. Abdulla
 
Keywords
CASA, Noise masking, Ratio mask estimation, Convex optimization
 

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:
I. INTRODUCTION 
II. THE PRINCIPLE OF THE PROPOSED METHOD 
III. EXPERIMENTS AND RESULTS 
IV. CONCLUSIONS 

Abstract

In computational auditory scene analysis, accurate estimation of the binary mask or ratio mask plays a key role in noise masking. An inaccurate estimate often introduces artifacts and temporal discontinuities in the synthesized speech. To overcome this problem, we propose a new ratio mask estimation method based on Wiener filtering in each Gammatone channel. To reconstruct the Wiener filter, we use the relationship between the speech and noise power spectra in each Gammatone channel to build the objective function for the convex optimization of the speech power. To improve estimation accuracy, the estimated ratio mask is further modified based on its adjacent time–frequency units and then smoothed by interpolating with the estimated binary masks. Objective tests, including signal-to-noise ratio improvement, spectral distortion, and intelligibility, together with a subjective listening test, demonstrate the superiority of the proposed method over the reference methods.
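
As a rough illustration of the masking idea summarized above, the following minimal Python sketch computes a Wiener-type ratio mask per time-frequency unit and blends it with a binary mask. It is not the authors' implementation: the speech and noise powers are assumed to be already known (the paper estimates the speech power by convex optimization, which is not reproduced here), the modification based on adjacent time-frequency units is omitted, and the function names, the 0 dB binary-mask threshold, and the interpolation weight `alpha` are hypothetical choices made only for this example.

```python
def wiener_ratio_mask(speech_power, noise_power, floor=1e-12):
    """Wiener-type gain per time-frequency unit: P_s / (P_s + P_n).

    Inputs are lists of rows, one row per Gammatone channel and one
    entry per frame. `floor` (an assumption) guards against division by zero.
    """
    return [
        [s / max(s + n, floor) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def binary_mask(speech_power, noise_power):
    """1.0 where speech power exceeds noise power, else 0.0.

    The 0 dB local-SNR threshold is an assumption for this sketch.
    """
    return [
        [1.0 if s > n else 0.0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def interpolate_masks(ratio, binary, alpha=0.5):
    """Linear blend of ratio and binary masks; `alpha` is a hypothetical weight."""
    return [
        [alpha * r + (1.0 - alpha) * b for r, b in zip(r_row, b_row)]
        for r_row, b_row in zip(ratio, binary)
    ]


# Toy data: 2 channels x 3 frames of (assumed known) power values.
speech = [[4.0, 1.0, 0.0], [9.0, 0.5, 2.0]]
noise = [[1.0, 1.0, 1.0], [1.0, 1.5, 2.0]]

ratio = wiener_ratio_mask(speech, noise)
smoothed = interpolate_masks(ratio, binary_mask(speech, noise))
```

In a complete system, the smoothed mask would be applied to the Gammatone-filtered noisy signal in each channel before resynthesis; that step is outside the scope of this sketch.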

DOI:10.1017/ATSIP.2018.7