A small vocabulary database of ultrasound image sequences of vocal tract dynamics
This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow visualization of the upper tongue contour during speech production. The acoustic data comprise 30 short sentences acquired with a directional cardioid microphone. The database includes recordings from 17 young subjects (8 male, 9 female) from the Santander region of Colombia, none of whom reported any speech pathology. © 2019 IEEE.
- Authors:
- Castillo M.; Rubio F.; Porras D.; Contreras Ortiz, Sonia Helena; Sepúlveda A.
- Resource type:
- Conference paper
- Publication date:
- 2019
- Institution:
- Universidad Tecnológica de Bolívar
- Repository:
- Repositorio Institucional UTB
- Language:
- eng
- OAI identifier:
- oai:repositorio.utb.edu.co:20.500.12585/9154
- Online access:
- https://hdl.handle.net/20.500.12585/9154
- Keywords:
- Articulation
- Speech
- Tongue
- Ultrasound
- Data visualization
- Database systems
- Ultrasonics
- Vision
- Acoustic data
- Acoustic speech
- Articulatory data
- Speech pathology
- Speech production
- Ultrasound image sequences
- Ultrasound videos
- Image processing
- Rights
- restrictedAccess
- License
- http://creativecommons.org/licenses/by-nc-nd/4.0/
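The OAI identifier above can be used to harvest this record's full metadata over the OAI-PMH protocol. A minimal sketch of building a `GetRecord` request URL, assuming the repository exposes the usual DSpace endpoint at `/oai/request` (the endpoint path is an assumption, not stated in the record):

```python
from urllib.parse import urlencode

# Assumed OAI-PMH endpoint (typical DSpace layout; not stated in the record).
OAI_ENDPOINT = "https://repositorio.utb.edu.co/oai/request"

def get_record_url(identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request URL for one record identifier."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return f"{OAI_ENDPOINT}?{urlencode(params)}"

url = get_record_url("oai:repositorio.utb.edu.co:20.500.12585/9154")
print(url)
```

Fetching that URL returns the record as Dublin Core XML; `metadataPrefix=oai_dc` is the mandatory baseline format every OAI-PMH repository supports.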
| Field | Value |
|---|---|
| id | UTB2_df815e99cde11322be7644c5b00825b4 |
| oai_identifier_str | oai:repositorio.utb.edu.co:20.500.12585/9154 |
| network_acronym_str | UTB2 |
| network_name_str | Repositorio Institucional UTB |
| dc.title.none.fl_str_mv | A small vocabulary database of ultrasound image sequences of vocal tract dynamics |
| dc.subject.keywords.none.fl_str_mv | Articulation; Speech; Tongue; Ultrasound; Data visualization; Database systems; Ultrasonics; Vision; Acoustic data; Acoustic speech; Articulatory data; Speech pathology; Speech production; Ultrasound image sequences; Ultrasound videos; Image processing |
| description | This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow visualization of the upper tongue contour during speech production. The acoustic data comprise 30 short sentences acquired with a directional cardioid microphone. The database includes recordings from 17 young subjects (8 male, 9 female) from the Santander region of Colombia, none of whom reported any speech pathology. © 2019 IEEE. |
| dc.date.issued.none.fl_str_mv | 2019 |
| dc.date.accessioned.none.fl_str_mv | 2020-03-26T16:33:04Z |
| dc.date.available.none.fl_str_mv | 2020-03-26T16:33:04Z |
| dc.type.coarversion.fl_str_mv | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
| dc.type.coar.fl_str_mv | http://purl.org/coar/resource_type/c_c94f |
| dc.type.driver.none.fl_str_mv | info:eu-repo/semantics/conferenceObject |
| dc.type.hasversion.none.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.spa.none.fl_str_mv | Conference paper (Conferencia) |
| dc.identifier.citation.none.fl_str_mv | 2019 22nd Symposium on Image, Signal Processing and Artificial Vision, STSIVA 2019 - Conference Proceedings |
| dc.identifier.isbn.none.fl_str_mv | 9781728114910 |
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/20.500.12585/9154 |
| dc.identifier.doi.none.fl_str_mv | 10.1109/STSIVA.2019.8730224 |
| dc.identifier.instname.none.fl_str_mv | Universidad Tecnológica de Bolívar |
| dc.identifier.reponame.none.fl_str_mv | Repositorio UTB |
| dc.identifier.orcid.none.fl_str_mv | 57209530567; 57209536314; 57209535982; 57210822856; 55340424500 |
| dc.language.iso.none.fl_str_mv | eng |
| dc.relation.conferencedate.none.fl_str_mv | 24 April 2019 through 26 April 2019 |
| dc.rights.coar.fl_str_mv | http://purl.org/coar/access_right/c_16ec |
| dc.rights.uri.none.fl_str_mv | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
| dc.rights.accessrights.none.fl_str_mv | info:eu-repo/semantics/restrictedAccess |
| dc.rights.cc.none.fl_str_mv | Attribution-NonCommercial 4.0 International |
| dc.format.medium.none.fl_str_mv | Electronic resource |
| dc.format.mimetype.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Institute of Electrical and Electronics Engineers Inc. |
| dc.source.none.fl_str_mv | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85068073792&doi=10.1109%2fSTSIVA.2019.8730224&partnerID=40&md5=f3c96d8ebc49f846b1e99dafa00b746d (Scopus 2-s2.0-85068073792) |
| dc.source.event.none.fl_str_mv | 22nd Symposium on Image, Signal Processing and Artificial Vision, STSIVA 2019 |
| institution | Universidad Tecnológica de Bolívar |
| bitstream.url.fl_str_mv | https://repositorio.utb.edu.co/bitstream/20.500.12585/9154/1/MiniProdInv.png |
| bitstream.checksum.fl_str_mv | 0cb0f101a8d16897fb46fc914d3d7043 |
| bitstream.checksumAlgorithm.fl_str_mv | MD5 |
| repository.name.fl_str_mv | Repositorio Institucional UTB |
| repository.mail.fl_str_mv | repositorioutb@utb.edu.co |
| _version_ | 1814021790148591616 |
| spelling | Consolidated copy of the fields above, plus: Authors: Castillo M.; Rubio F.; Porras D.; Contreras Ortiz, Sonia Helena; Sepúlveda A. Sponsors: IEEE Colombia Section; IEEE Signal Processing Society Colombia Chapter; Universidad Industrial de Santander. |
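The record publishes an MD5 checksum (0cb0f101a8d16897fb46fc914d3d7043) for its bitstream, so a downloaded file can be verified by recomputing the digest and comparing. A minimal sketch (the helper names and the sample bytes are illustrative, not part of the record):

```python
import hashlib

def md5_hexdigest(data: bytes, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a byte string, feeding it in chunks
    as one would when streaming a large downloaded file."""
    h = hashlib.md5()
    for i in range(0, len(data), chunk_size):
        h.update(data[i:i + chunk_size])
    return h.hexdigest()

def verify_checksum(data: bytes, expected_md5: str) -> bool:
    """Compare the recomputed digest against the checksum published in the
    repository record (case-insensitive hex comparison)."""
    return md5_hexdigest(data) == expected_md5.lower()

# Illustrative check on known bytes (not the actual repository bitstream):
assert verify_checksum(b"hello", "5d41402abc4b2a76b9719d911017c592")
```

In practice one would pass the bytes fetched from `bitstream.url` and the checksum from `bitstream.checksum`; a mismatch indicates a corrupted or altered download.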