APSIPA Transactions on Signal and Information Processing > Vol 14 > Issue 1

Improvement of Sound Quality in Visual Microphone by Manipulation of Focused Area

Hayata Nakano, Ritsumeikan University, Japan, is0516vr@ed.ritsumei.ac.jp; Yuting Geng, Ritsumeikan University, Japan; Kenta Iwai, Ritsumeikan University, Japan; Takanobu Nishiura, Ritsumeikan University, Japan
 
Suggested Citation
Hayata Nakano, Yuting Geng, Kenta Iwai and Takanobu Nishiura (2025), "Improvement of Sound Quality in Visual Microphone by Manipulation of Focused Area", APSIPA Transactions on Signal and Information Processing: Vol. 14: No. 1, e5. http://dx.doi.org/10.1561/116.20240087

Publication Date: 29 Apr 2025
© 2025 H. Nakano, Y. Geng, K. Iwai and T. Nishiura
 
Subjects
Audio signal processing, Enhancement
 
Keywords
Sound extraction, captured video, sound quality improvement, out-of-focused area removal, weighted phase variation
 

Open Access

This is published under the terms of CC BY-NC.

In this article:
Introduction 
Conventional sound extraction method in visual microphone 
Proposed sound extraction method for visual microphone 
Evaluation experiments 
Conclusion 
Acknowledgments 
References 

Abstract

This paper presents methods for improving the sound quality of the visual microphone by emphasizing the focused area in a captured video. When sound reaches an object, it causes the object’s surface to vibrate. The visual microphone can therefore extract sound by measuring the displacement of the object as captured by a camera. Because of the limited depth of field, some areas of the captured video may appear blurred (out-of-focused). In an out-of-focused area, it is difficult to measure the displacement of the vibrating object accurately, which may degrade the quality of the extracted sound. Conventional sound extraction methods, however, do not take the out-of-focused area into account. In this paper, we propose three methods that extract sound by concentrating on the focused area, where displacement can be measured more accurately than in the out-of-focused area. The proposed methods use out-of-focused area removal, weighted phase variation, and a combination of the two to emphasize the displacement measured in the focused area, based on a focal rate that represents the degree of focus. Experimental results show that the proposed methods improve the quality of the extracted sound compared with the conventional method.
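
The three strategies named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how the focal rate is computed, so everything here is an assumption for illustration only: the function names `focal_rate` and `extract_sound`, the block size, the threshold, and the use of the per-block variance of a discrete Laplacian (a common focus measure, swapped in for the paper's own definition). The array `variation` is a hypothetical stand-in for the local motion signal (for example, phase variation) measured in each block of each frame.

```python
import numpy as np


def focal_rate(frame, block=8):
    """Assumed focus measure, not the paper's definition.

    Returns the per-block variance of a discrete Laplacian of a
    grayscale float image, normalized so the sharpest block is 1.
    """
    p = np.pad(frame, 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1]
           + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * frame)
    h, w = frame.shape
    hb, wb = h // block, w // block
    blocks = lap[:hb * block, :wb * block].reshape(hb, block, wb, block)
    var = blocks.var(axis=(1, 3))
    peak = var.max()
    return var / peak if peak > 0 else np.zeros_like(var)


def extract_sound(variation, rate, threshold=None, weighted=True):
    """Combine per-block motion signals into one audio signal.

    variation: array of shape (T, hb, wb), hypothetical per-block signal.
    rate:      array of shape (hb, wb), focal rate in [0, 1].
    threshold: if given, blocks with rate below it are removed
               (out-of-focused area removal).
    weighted:  if True, remaining blocks are weighted by their rate
               (weighted variation); if False, they count equally.
    """
    weights = rate.astype(float) if weighted else np.ones_like(rate, dtype=float)
    if threshold is not None:
        weights = np.where(rate >= threshold, weights, 0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("no block passes the focus threshold")
    # Weighted average over the two spatial axes, one value per frame.
    return np.tensordot(variation, weights, axes=([1, 2], [0, 1])) / total
```

Under these assumptions, `extract_sound(v, r, threshold=0.5, weighted=False)` corresponds to removal only, `extract_sound(v, r)` to weighting only, and `extract_sound(v, r, threshold=0.5)` to both together.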

DOI:10.1561/116.20240087