A small vocabulary database of ultrasound image sequences of vocal tract dynamics

This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow the visualization of the tongue upper contour during the speech production process. Acoustic data is composed of 30 short sentences that were acquired by a directional cardioid microphone. This database includes data from 17 young subjects (8 male and 9 female) from the Santander region in Colombia, who reported not having any speech pathology. © 2019 IEEE.
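For readers who plan to work with the concurrent recordings described above, the sketch below shows one way such material could be indexed programmatically. It is only an illustration: the directory layout, the file names (subject01/sentence03.avi with a matching .wav), and the index_database helper are assumptions, since this record does not describe how the files are organized.

    # Minimal sketch of how such a database might be indexed, assuming
    # (hypothetically) one folder per subject with per-sentence ultrasound
    # videos (*.avi) and audio (*.wav) sharing a base name, e.g.
    # subject01/sentence03.avi + subject01/sentence03.wav.
    from dataclasses import dataclass
    from pathlib import Path
    from typing import Iterator

    @dataclass
    class Utterance:
        subject: str      # e.g. "subject01"
        sentence: str     # e.g. "sentence03"
        ultrasound: Path  # ultrasound video of the tongue upper contour
        audio: Path       # concurrent acoustic recording

    def index_database(root: str) -> Iterator[Utterance]:
        """Yield every ultrasound/audio pair found under the database root."""
        for video in sorted(Path(root).glob("*/*.avi")):
            audio = video.with_suffix(".wav")
            if audio.exists():  # keep only complete video/audio pairs
                yield Utterance(
                    subject=video.parent.name,
                    sentence=video.stem,
                    ultrasound=video,
                    audio=audio,
                )

    if __name__ == "__main__":
        for utt in index_database("ultrasound_db"):
            print(utt.subject, utt.sentence, utt.ultrasound.name, utt.audio.name)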

Full description

Authors:
Castillo M.
Rubio F.
Porras D.
Contreras Ortiz, Sonia Helena
Sepúlveda A.
Resource type:
Conference paper
Publication date:
2019
Institution:
Universidad Tecnológica de Bolívar
Repository:
Repositorio Institucional UTB
Language:
eng
OAI Identifier:
oai:repositorio.utb.edu.co:20.500.12585/9154
Online access:
https://hdl.handle.net/20.500.12585/9154
Keywords:
Articulation
Speech
Tongue
Ultrasound
Data visualization
Database systems
Speech
Ultrasonics
Vision
Acoustic data
Acoustic speech
Articulatory data
Speech pathology
Speech production
Tongue
Ultrasound image sequences
Ultrasound videos
Image processing
Rights
restrictedAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
id UTB2_df815e99cde11322be7644c5b00825b4
network_acronym_str UTB2
dc.title.none.fl_str_mv A small vocabulary database of ultrasound image sequences of vocal tract dynamics
dc.subject.keywords.none.fl_str_mv Articulation
Speech
Tongue
Ultrasound
Data visualization
Database systems
Speech
Ultrasonics
Vision
Acoustic data
Acoustic speech
Articulatory data
Speech pathology
Speech production
Tongue
Ultrasound image sequences
Ultrasound videos
Image processing
description This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow the visualization of the tongue upper contour during the speech production process. Acoustic data is composed of 30 short sentences that were acquired by a directional cardioid microphone. This database includes data from 17 young subjects (8 male and 9 female) from the Santander region in Colombia, who reported not having any speech pathology. © 2019 IEEE.
publishDate 2019
dc.date.issued.none.fl_str_mv 2019
dc.date.accessioned.none.fl_str_mv 2020-03-26T16:33:04Z
dc.date.available.none.fl_str_mv 2020-03-26T16:33:04Z
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_c94f
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/conferenceObject
dc.type.hasversion.none.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.spa.none.fl_str_mv Conferencia
dc.identifier.citation.none.fl_str_mv 2019 22nd Symposium on Image, Signal Processing and Artificial Vision, STSIVA 2019 - Conference Proceedings
dc.identifier.isbn.none.fl_str_mv 9781728114910
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12585/9154
dc.identifier.doi.none.fl_str_mv 10.1109/STSIVA.2019.8730224
dc.identifier.instname.none.fl_str_mv Universidad Tecnológica de Bolívar
dc.identifier.reponame.none.fl_str_mv Repositorio UTB
dc.identifier.orcid.none.fl_str_mv 57209530567
57209536314
57209535982
57210822856
55340424500
dc.language.iso.none.fl_str_mv eng
dc.relation.conferencedate.none.fl_str_mv 24 April 2019 through 26 April 2019
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_16ec
dc.rights.uri.none.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/restrictedAccess
dc.rights.cc.none.fl_str_mv Atribución-NoComercial 4.0 Internacional
dc.format.medium.none.fl_str_mv Electronic resource
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Institute of Electrical and Electronics Engineers Inc.
dc.source.none.fl_str_mv https://www.scopus.com/inward/record.uri?eid=2-s2.0-85068073792&doi=10.1109%2fSTSIVA.2019.8730224&partnerID=40&md5=f3c96d8ebc49f846b1e99dafa00b746d
Scopus 2-s2.0-85068073792
dc.source.event.none.fl_str_mv 22nd Symposium on Image, Signal Processing and Artificial Vision, STSIVA 2019
bitstream.url.fl_str_mv https://repositorio.utb.edu.co/bitstream/20.500.12585/9154/1/MiniProdInv.png
bitstream.checksum.fl_str_mv 0cb0f101a8d16897fb46fc914d3d7043
bitstream.checksumAlgorithm.fl_str_mv MD5
repository.name.fl_str_mv Repositorio Institucional UTB
repository.mail.fl_str_mv repositorioutb@utb.edu.co
dc.description.sponsorship.none.fl_str_mv IEEE Colombia Section;IEEE Signal Processing Society Colombia Chapter;Universidad Industrial de Santander