now publishers - Combining acoustic signals and medical records to improve pathological voice classification

APSIPA Transactions on Signal and Information Processing > Vol 8 > Issue 1

Combining acoustic signals and medical records to improve pathological voice classification

Shih-Hau Fang, Yuan Ze University, and MOST Joint Research Center for AI Technology and All Vista Healthcare Innovation Center, Taiwan
Chi-Te Wang, Yuan Ze University, and MOST Joint Research Center for AI Technology and All Vista Healthcare Innovation Center, Taiwan AND Far Eastern Memorial Hospital, Taiwan AND University of Taipei, Taiwan
Ji-Ying Chen, Yuan Ze University, and MOST Joint Research Center for AI Technology and All Vista Healthcare Innovation Center, Taiwan AND University of Taipei, Taiwan
Yu Tsao, Research Center for Information Technology Innovation, Taiwan, yu.tsao@citi.sinica.edu.tw
Feng-Chuan Lin, Far Eastern Memorial Hospital, Taiwan AND University of Taipei, Taiwan

Suggested Citation

Shih-Hau Fang, Chi-Te Wang, Ji-Ying Chen, Yu Tsao and Feng-Chuan Lin (2019), "Combining acoustic signals and medical records to improve pathological voice classification", APSIPA Transactions on Signal and Information Processing: Vol. 8: No. 1, e14. http://dx.doi.org/10.1017/ATSIP.2019.7

Publication Date: 11 Jun 2019

Keywords

Pathological voice, Diseases classification, Acoustic signal, Medical record, Artificial intelligence

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:

Abstract

This study proposes two multimodal frameworks to classify pathological voice samples by combining acoustic signals and medical records. In the first framework, acoustic signals are transformed into static supervectors via Gaussian mixture models; a deep neural network (DNN) then combines the supervectors with the medical record and classifies the voice signals. In the second framework, acoustic features and medical data are each processed by a separate first-stage DNN; a second-stage DNN then combines the outputs of the first-stage DNNs and performs classification. Voice samples were recorded in a voice clinic of a tertiary teaching hospital and cover three common categories of vocal disease: glottic neoplasm, phonotraumatic lesions, and vocal paralysis. Experimental results demonstrated that the proposed framework yields significant improvements in accuracy and unweighted average recall (UAR) of 2.02–10.32% and 2.48–17.31%, respectively, compared with systems that use only acoustic signals or medical records. The proposed algorithm also provides higher accuracy and UAR than traditional feature-based and model-based combination methods.
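The abstract reports results as accuracy and UAR. As a minimal sketch of how the two metrics differ, the Python below computes both using the standard definitions: accuracy is the fraction of correct predictions, and UAR is the mean of the per-class recalls, so every class counts equally regardless of its size. The function names, class labels, and sample counts are illustrative assumptions, not data or code from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall; each class has equal weight."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return sum(hits[c] / n for c, n in support.items()) / len(support)


# Hypothetical, deliberately imbalanced toy labels (not from the paper).
y_true = ["phonotrauma"] * 6 + ["neoplasm"] * 2 + ["paralysis"] * 2
y_pred = ["phonotrauma"] * 6 + ["neoplasm", "phonotrauma"] + ["phonotrauma"] * 2

acc = accuracy(y_true, y_pred)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, UAR = {uar:.2f}")
```

In this toy case the classifier favours the majority class, so accuracy looks better than UAR. That gap is why UAR is commonly reported alongside accuracy when class sizes are unequal.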

DOI:10.1017/ATSIP.2019.7

I. INTRODUCTION
II. PATHOLOGICAL VOICE CLASSIFICATION FRAMEWORKS
III. EXPERIMENTS AND RESULTS
IV. CONCLUSION
