Time-Frequency Energy Features for Articulator Position Inference on Stop Consonants
- Authors:
- Sepulveda-Sepulveda, Alexander
Castellanos-Domínguez, German
- Resource type:
- Publication date:
- 2012
- Institution:
- Universidad EAFIT
- Repository:
- Repositorio EAFIT
- Language:
- eng
- OAI Identifier:
- oai:repository.eafit.edu.co:10784/14448
- Online access:
- http://hdl.handle.net/10784/14448
- Keywords:
- Acoustic-To-Articulatory Inversion
Gaussian Mixture Models
Articulatory Phonetics
Time-Frequency Features
- Rights
- License
- Copyright (c) 2012 Alexander Sepulveda-Sepulveda, German Castellanos-Domínguez
Summary: Acoustic-to-articulatory inversion offers new perspectives and interesting applications in the speech processing field; however, it remains an open issue. This paper presents a method to estimate the distribution of the articulatory information contained in the acoustics of stop consonants, whose parametrization is achieved using the wavelet packet transform. The main focus is on measuring the relevant acoustic information, in terms of statistical association, for inferring the position of the critical articulators involved in stop consonant production. The Kendall rank correlation coefficient is used as the relevance measure. Maps of relevant time–frequency features are calculated for the MOCHA–TIMIT database, from which stop consonants are extracted and analysed. The proposed method obtains a set of time–frequency components closely related to the articulatory phenomenon, which offers a deeper understanding of the relationship between the articulatory and acoustical phenomena. The relevant maps are tested in an acoustic-to-articulatory mapping system based on Gaussian mixture models, where they are shown to be suitable for improving the performance of such systems on stop consonants. The method could be extended to other manner-of-articulation categories, e.g. fricatives, in order to adapt the present method to acoustic-to-articulatory mapping systems over whole speech.
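The relevance measure named in the summary can be illustrated with a minimal sketch. It assumes each observation is a flat list of time–frequency energy values paired with one articulator coordinate; the function names `kendall_tau_b` and `relevance_map` and the toy numbers are hypothetical and are not taken from the paper. The sketch uses the tie-corrected tau-b variant of the Kendall coefficient, which may differ from the exact variant the authors used, and it does not reproduce their wavelet packet parametrization.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b, tie-corrected) between two sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0 and dy == 0:
            continue  # pair tied in both variables: ignored by tau-b
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + tied_x_only)
                 * (concordant + discordant + tied_y_only))
    return 0.0 if denom == 0 else (concordant - discordant) / denom


def relevance_map(features, position):
    """Absolute Kendall tau between each feature column and the position."""
    n_features = len(features[0])
    return [
        abs(kendall_tau_b([row[k] for row in features], position))
        for k in range(n_features)
    ]


# Toy data (hypothetical): 5 observations, 3 time-frequency energy features.
features = [
    [0.1, 5.0, 2.0],
    [0.4, 4.0, 2.0],
    [0.5, 3.5, 2.0],
    [0.9, 1.0, 2.0],
    [1.2, 0.5, 2.0],
]
position = [1.0, 2.0, 3.0, 4.0, 5.0]  # e.g. one articulator coordinate
relevance = relevance_map(features, position)
```

In this toy case the first two features are perfectly monotonic in the position and the third is constant, so only the first two would be marked as relevant. In a setting like the one the summary describes, features with high relevance would then be passed to the Gaussian mixture model mapping stage.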