A small vocabulary database of ultrasound image sequences of vocal tract dynamics

This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow the visualization of the tongue upper contour during the speech production process. Acoustic data is composed of 30 short sentences that were acquired by a directional cardioid microphone. This database includes data from 17 young subjects (8 male and 9 female) from the Santander region in Colombia, who reported not having any speech pathology. © 2019 IEEE.
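For readers who plan to work with the concurrent recordings described above, the sketch below shows one way such material could be indexed programmatically. It is only an illustration: the directory layout, the file names (subject01/sentence03.avi with a matching .wav), and the index_database helper are assumptions, since this record does not describe how the files are organized.

    # Minimal sketch of how such a database might be indexed, assuming
    # (hypothetically) one folder per subject with per-sentence ultrasound
    # videos (*.avi) and audio (*.wav) sharing a base name, e.g.
    # subject01/sentence03.avi + subject01/sentence03.wav.
    from dataclasses import dataclass
    from pathlib import Path
    from typing import Iterator

    @dataclass
    class Utterance:
        subject: str      # e.g. "subject01"
        sentence: str     # e.g. "sentence03"
        ultrasound: Path  # ultrasound video of the tongue upper contour
        audio: Path       # concurrent acoustic recording

    def index_database(root: str) -> Iterator[Utterance]:
        """Yield every ultrasound/audio pair found under the database root."""
        for video in sorted(Path(root).glob("*/*.avi")):
            audio = video.with_suffix(".wav")
            if audio.exists():  # keep only complete video/audio pairs
                yield Utterance(
                    subject=video.parent.name,
                    sentence=video.stem,
                    ultrasound=video,
                    audio=audio,
                )

    if __name__ == "__main__":
        for utt in index_database("ultrasound_db"):
            print(utt.subject, utt.sentence, utt.ultrasound.name, utt.audio.name)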

Full description

Authors:
Castillo M.
Rubio F.
Porras D.
Contreras Ortiz, Sonia Helena
Sepúlveda A.
Resource type:
Conference paper
Publication date:
2019
Institution:
Universidad Tecnológica de Bolívar
Repository:
Repositorio Institucional UTB
Language:
eng
OAI Identifier:
oai:repositorio.utb.edu.co:20.500.12585/9154
Online access:
https://hdl.handle.net/20.500.12585/9154
Keywords:
Articulation
Speech
Tongue
Ultrasound
Data visualization
Database systems
Speech
Ultrasonics
Vision
Acoustic data
Acoustic speech
Articulatory data
Speech pathology
Speech production
Tongue
Ultrasound image sequences
Ultrasound videos
Image processing
Rights
restrictedAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
id UTB2_df815e99cde11322be7644c5b00825b4
network_acronym_str UTB2
dc.title.none.fl_str_mv A small vocabulary database of ultrasound image sequences of vocal tract dynamics
dc.subject.keywords.none.fl_str_mv Articulation
Speech
Tongue
Ultrasound
Data visualization
Database systems
Speech
Ultrasonics
Vision
Acoustic data
Acoustic speech
Articulatory data
Speech pathology
Speech production
Tongue
Ultrasound image sequences
Ultrasound videos
Image processing
description This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow the visualization of the tongue upper contour during the speech production process. Acoustic data is composed of 30 short sentences that were acquired by a directional cardioid microphone. This database includes data from 17 young subjects (8 male and 9 female) from the Santander region in Colombia, who reported not having any speech pathology. © 2019 IEEE.
publishDate 2019
dc.date.issued.none.fl_str_mv 2019
dc.date.accessioned.none.fl_str_mv 2020-03-26T16:33:04Z
dc.date.available.none.fl_str_mv 2020-03-26T16:33:04Z
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_c94f
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/conferenceObject
dc.type.hasversion.none.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.spa.none.fl_str_mv Conferencia
dc.identifier.citation.none.fl_str_mv 2019 22nd Symposium on Image, Signal Processing and Artificial Vision, STSIVA 2019 - Conference Proceedings
dc.identifier.isbn.none.fl_str_mv 9781728114910
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12585/9154
dc.identifier.doi.none.fl_str_mv 10.1109/STSIVA.2019.8730224
dc.identifier.instname.none.fl_str_mv Universidad Tecnológica de Bolívar
dc.identifier.reponame.none.fl_str_mv Repositorio UTB
dc.identifier.orcid.none.fl_str_mv 57209530567
57209536314
57209535982
57210822856
55340424500
dc.language.iso.none.fl_str_mv eng
dc.relation.conferencedate.none.fl_str_mv 24 April 2019 through 26 April 2019
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_16ec
dc.rights.uri.none.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/restrictedAccess
dc.rights.cc.none.fl_str_mv Atribución-NoComercial 4.0 Internacional
dc.format.medium.none.fl_str_mv Electronic resource
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Institute of Electrical and Electronics Engineers Inc.
dc.source.none.fl_str_mv https://www.scopus.com/inward/record.uri?eid=2-s2.0-85068073792&doi=10.1109%2fSTSIVA.2019.8730224&partnerID=40&md5=f3c96d8ebc49f846b1e99dafa00b746d
Scopus 2-s2.0-85068073792
dc.source.event.none.fl_str_mv 22nd Symposium on Image, Signal Processing and Artificial Vision, STSIVA 2019
bitstream.url.fl_str_mv https://repositorio.utb.edu.co/bitstream/20.500.12585/9154/1/MiniProdInv.png
bitstream.checksum.fl_str_mv 0cb0f101a8d16897fb46fc914d3d7043
bitstream.checksumAlgorithm.fl_str_mv MD5
repository.name.fl_str_mv Repositorio Institucional UTB
repository.mail.fl_str_mv repositorioutb@utb.edu.co
dc.description.sponsorship.none.fl_str_mv IEEE Colombia Section;IEEE Signal Processing Society Colombia Chapter;Universidad Industrial de Santander