Time-Frequency Energy Features for Articulator Position Inference on Stop Consonants

Authors:
Sepulveda-Sepulveda, Alexander
Castellanos-Domínguez, German
Resource type:
Publication date:
2012
Institution:
Universidad EAFIT
Repository:
Repositorio EAFIT
Language:
eng
OAI Identifier:
oai:repository.eafit.edu.co:10784/14448
Online access:
http://hdl.handle.net/10784/14448
Keywords:
Acoustic-To-Articulatory Inversion
Gaussian Mixture Models
Articulatory Phonetics
Time-Frequency Features
Inversión Acústica A Articulación
Modelos De Mezcla Gaussiana
Fonética Articulatoria
Características De Frecuencia De Tiempo
Rights
License
Copyright (c) 2012 Alexander Sepulveda-Sepulveda, German Castellanos-Domínguez
Description
Summary: Acoustic-to-articulatory inversion offers new perspectives and interesting applications in the speech processing field; however, it remains an open issue. This paper presents a method to estimate the distribution of the articulatory information contained in the acoustics of stop consonants, whose parametrization is achieved by using the wavelet packet transform. The main focus is on measuring the relevant acoustic information, in terms of statistical association, for the inference of the position of the critical articulators involved in stop consonant production. The Kendall rank correlation coefficient is used as the relevance measure. The maps of relevant time–frequency features are calculated for the MOCHA–TIMIT database, from which stop consonants are extracted and analysed. The proposed method obtains a set of time–frequency components closely related to articulatory phenomena, which offers a deeper understanding of the relationship between the articulatory and acoustical phenomena. The relevant maps are tested in an acoustic-to-articulatory mapping system based on Gaussian mixture models, where it is shown that they are suitable for improving the performance of such systems on stop consonants. The method could be extended to other manner-of-articulation categories, e.g. fricatives, in order to adapt the present method to acoustic-to-articulatory mapping systems over whole speech.
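The relevance measure named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simple tau-a form of the Kendall coefficient (no tie correction), and it uses synthetic numbers in place of wavelet packet energies and MOCHA–TIMIT articulator trajectories. The names kendall_tau, relevance_map, features and position are hypothetical and chosen only for illustration.

```python
import random


def kendall_tau(x, y):
    """Kendall rank correlation, tau-a form: (concordant - discordant) / pairs.

    Assumption: no tie correction is applied, so tied pairs count as zero.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                score += 1   # concordant pair
            elif s < 0:
                score -= 1   # discordant pair
    return score / (n * (n - 1) / 2)


def relevance_map(features, articulator):
    """Return |tau| between each feature column and the articulator position.

    features: list of frames, each a list of time-frequency energy values.
    articulator: list with one articulator coordinate per frame.
    """
    n_features = len(features[0])
    return [
        abs(kendall_tau([frame[k] for frame in features], articulator))
        for k in range(n_features)
    ]


# Synthetic stand-in data (hypothetical, not taken from the paper):
# feature 0 follows the articulator position, feature 1 is pure noise.
random.seed(0)
position = [random.gauss(0, 1) for _ in range(200)]
features = [[p + random.gauss(0, 0.3), random.gauss(0, 1)] for p in position]

relevance = relevance_map(features, position)
print(relevance)
```

In this toy setting the first feature should score a clearly higher relevance than the second, which is the kind of ranking a relevance map is meant to expose. A rank-based measure such as this depends only on the ordering of values, so it does not assume a linear relation between acoustic energy and articulator position.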