Machine Learning Approaches for Whisper to Normal Speech Conversion: A Survey

Marco A. Oliveira

doi:10.24840/2183-6493_008.002_0016

Published: Apr 28, 2022

Issue: Vol. 8 No. 2 (2022)

Keywords: Signal Processing, Machine Learning, Whispered Speech, Normal Speech, Voice Conversion, Speech Conversion

Marco A. Oliveira

Faculty of Engineering, University of Porto

https://orcid.org/0000-0002-3161-1109

Abstract

Whispered speech is a mode of speech that differs from normal speech in that it lacks a periodic component, namely the Fundamental Frequency that characterizes pitch, among other spectral and temporal differences. In recent years, much attention has been given to applying Machine Learning techniques to voice conversion tasks. Whisper-to-normal speech conversion is particularly challenging, however, especially with respect to Fundamental Frequency estimation. Based on the most recent literature, this survey assesses the state of the art in Machine Learning based whisper-to-normal speech conversion and identifies trends in both modeling and training approaches. The proposed solutions include Generative Adversarial Network based, Autoencoder based, and Bidirectional Long Short-Term Memory based frameworks, among other Deep Neural Network based architectures. In addition to Parallel versus Non-Parallel training, the survey covers time-alignment requirements and strategies, datasets, vocoder usage, and both objective and subjective evaluation metrics.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Author Biography

Marco A. Oliveira, Faculty of Engineering, University of Porto

Department of Electrical and Computer Engineering, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
