Dual-layer anti-spoofing and voice tone recognition system for deepfake audio in English

This research focuses on the development of a personalized voice spoofing detection model, specifically targeting audio generated through deepfake techniques such as text-to-speech (TTS) and voice conversion (VC). The growing sophistication of generative artificial intelligence models has facilitated the creation of falsified audio that is nearly indistinguishable from genuine speech to the human ear, posing a significant risk to individual security and privacy. In response to this threat, our project proposes an innovative solution: a dual-layer voice spoofing detection system. The first layer detects whether the audio is real or spoofed, while the second layer identifies whether the voice tone belongs to the legitimate user or someone else, adding an extra level of personalization and security. The first layer is built upon an existing pre-trained voice spoofing detection model that is fine-tuned with user-specific data. For the second layer, the same model is repurposed: instead of being trained for spoof detection, it is trained to recognize the user's specific vocal tone using real voice samples from multiple individuals with different vocal tones. To facilitate the adoption and use of this technology, we have developed a user-friendly tool. This tool allows users to provide just a few minutes of their voice recordings, after which it automatically generates deepfake audio and analyzes voice tones using the techniques mentioned above. These falsified recordings, together with the genuine ones, are used to train the detection models, adapting them specifically to the user's voice and tone. This dual-layer approach offers a robust and personalized solution for protecting the user's vocal identity against deepfake spoofing threats, not only by verifying the authenticity of the audio but also by ensuring that the tone of voice truly belongs to the legitimate user.
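In code terms, the dual-layer decision reduces to a conjunctive gate: an utterance is accepted only if layer 1 judges it genuine and layer 2 judges the tone to be the enrolled user's. The sketch below is a minimal illustration under stated assumptions, not the thesis's implementation: spoof_score_fn, tone_score_fn, and the 0.5 thresholds are hypothetical stand-ins for the fine-tuned spoof detector and the repurposed tone-recognition model.

# Minimal sketch (Python) of the dual-layer gate described in the abstract.
# The two scoring functions are hypothetical placeholders for the thesis's
# fine-tuned models; thresholds are illustrative, not taken from the work.
from dataclasses import dataclass

@dataclass
class Verdict:
    is_genuine_audio: bool    # layer 1: real vs. spoofed
    is_legitimate_user: bool  # layer 2: tone matches the enrolled user

    @property
    def accepted(self) -> bool:
        # Audio is accepted only if it passes both layers.
        return self.is_genuine_audio and self.is_legitimate_user

def verify(audio, spoof_score_fn, tone_score_fn,
           spoof_threshold=0.5, tone_threshold=0.5):
    p_real = spoof_score_fn(audio)  # probability the audio is genuine
    p_user = tone_score_fn(audio)   # probability the tone is the user's
    return Verdict(p_real >= spoof_threshold, p_user >= tone_threshold)

if __name__ == "__main__":
    audio = [0.0] * 16000  # one second of silence at 16 kHz, stand-in input
    v = verify(audio, spoof_score_fn=lambda a: 0.9, tone_score_fn=lambda a: 0.2)
    print(v.accepted)  # False: genuine audio, but not the enrolled user's tone

A rejection from either layer is sufficient to refuse the audio; the second check is what makes the system personalized rather than a generic deepfake detector.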

Authors:
Daza Díaz, Paula Cecilia
Torres Ramírez, Sofia
Resource type:
Undergraduate degree project
Publication date:
2024
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/75424
Online access:
https://hdl.handle.net/1992/75424
Keywords:
Voice Spoofing Detection
Personalized Voice Recognition
Vocal Tone Identification
Voice conversion
text-to-speech
Ingeniería
Rights
embargoedAccess
License
Attribution-NonCommercial-NoDerivatives 4.0 International
id UNIANDES2_41c155341c142ea93495ff21b0140b70
network_acronym_str UNIANDES2
dc.title.eng.fl_str_mv Dual-layer anti-spoofing and voice tone recognition system for deepfake audio in English
dc.title.alternative.spa.fl_str_mv Sistema de reconocimiento de tono de voz y anti-spoofing de doble capa para audios deepfake en inglés
dc.creator.fl_str_mv Daza Díaz, Paula Cecilia
Torres Ramírez, Sofia
dc.contributor.advisor.none.fl_str_mv Manrique Piramanrique, Rubén Francisco
dc.contributor.author.none.fl_str_mv Daza Díaz, Paula Cecilia
Torres Ramírez, Sofia
dc.contributor.researchgroup.none.fl_str_mv Facultad de Ingeniería::TICSw: Tecnologías de Información y Construcción de Software
dc.subject.keyword.eng.fl_str_mv Voice Spoofing Detection
Personalized Voice Recognition
Vocal Tone Identification
Voice conversion
text-to-speech
dc.subject.themes.none.fl_str_mv Ingeniería
description This research focuses on the development of a personalized voice spoofing detection model, specifically targeting audio generated through deepfake techniques such as text-to-speech (TTS) and voice conversion (VC). The growing sophistication of generative artificial intelligence models has facilitated the creation of falsified audio that is nearly indistinguishable from genuine speech to the human ear, posing a significant risk to individual security and privacy. In response to this threat, our project proposes an innovative solution: a dual-layer voice spoofing detection system. The first layer detects whether the audio is real or spoofed, while the second layer identifies whether the voice tone belongs to the legitimate user or someone else, adding an extra level of personalization and security. The first layer is built upon an existing pre-trained voice spoofing detection model that is fine-tuned with user-specific data. For the second layer, the same model is repurposed: instead of being trained for spoof detection, it is trained to recognize the user's specific vocal tone using real voice samples from multiple individuals with different vocal tones. To facilitate the adoption and use of this technology, we have developed a user-friendly tool. This tool allows users to provide just a few minutes of their voice recordings, after which it automatically generates deepfake audio and analyzes voice tones using the techniques mentioned above. These falsified recordings, together with the genuine ones, are used to train the detection models, adapting them specifically to the user's voice and tone. This dual-layer approach offers a robust and personalized solution for protecting the user's vocal identity against deepfake spoofing threats, not only by verifying the authenticity of the audio but also by ensuring that the tone of voice truly belongs to the legitimate user.
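Among the sources cited in this record's references is Coqui's XTTS-v2, one of the TTS systems used for deepfake generation. As a rough sketch of how a few minutes of enrollment audio could be turned into spoofed training examples with that library (the file paths and prompt sentences are illustrative assumptions; the record does not detail the exact generation pipeline):

# Sketch: generate voice-cloned (spoofed) training clips with Coqui XTTS-v2,
# one of the tools cited in this record. Paths and prompts are illustrative.
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

prompts = [
    "Please confirm the transfer to my account.",
    "My voice is my password, verify me.",
]
for i, text in enumerate(prompts):
    tts.tts_to_file(
        text=text,
        speaker_wav="enrollment/user_sample.wav",  # a few minutes of real user audio
        language="en",
        file_path=f"spoofed/clone_{i:03d}.wav",
    )

The resulting spoofed clips, paired with the genuine enrollment recordings, would form the user-specific training set that the description above outlines.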
publishDate 2024
dc.date.issued.none.fl_str_mv 2024-12-03
dc.date.accessioned.none.fl_str_mv 2025-01-15T14:25:06Z
dc.date.accepted.none.fl_str_mv 2025-01-15
dc.date.available.none.fl_str_mv 2026-01-31
dc.type.none.fl_str_mv Trabajo de grado - Pregrado
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/bachelorThesis
dc.type.version.none.fl_str_mv info:eu-repo/semantics/acceptedVersion
dc.type.coar.none.fl_str_mv http://purl.org/coar/resource_type/c_7a1f
dc.type.content.none.fl_str_mv Text
dc.type.redcol.none.fl_str_mv http://purl.org/redcol/resource_type/TP
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/1992/75424
dc.identifier.instname.none.fl_str_mv instname:Universidad de los Andes
dc.identifier.reponame.none.fl_str_mv reponame:Repositorio Institucional Séneca
dc.identifier.repourl.none.fl_str_mv repourl:https://repositorio.uniandes.edu.co/
dc.language.iso.none.fl_str_mv eng
dc.relation.references.none.fl_str_mv Zahra Khanjani, Gabrielle Watson, and Vandana P. Janeja. “How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey”. In: arXiv (2021). DOI: https://arxiv.org/abs/2111.14203.
Yu Xie, Zhiyao Zhang, and Yi Yang. “Siamese network with WAV2VEC feature for spoofing speech detection”. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2021. URL: https://doi.org/10.21437/interspeech.2021-847.
Nicolas M¨uller et al. “Does Audio Deepfake Detection Generalize?” In: Proc. Interspeech 2022. 2022, pp. 2783–2787. DOI: 10.21437/Interspeech.2022-108.
A. Author1 and B. Author2. “Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video”. In: arXiv (2022). DOI: https://arxiv.org/abs/2202.12883.
Jordan J. Bird and Ahmad Lotfi. “Real-time detection of AI-Generated Speech for deepfake voice conversion”. In: arXiv (2023). DOI: https://arxiv.org/pdf/2308.12734.
Hemlata Tak et al. “End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection”. In: arXiv (2021). DOI: https://arxiv.org/abs/2107.12710.
Xin Wang and Junichi Yamagishi. “A comparative study on recent neural spoofing countermeasures for synthetic speech detection”. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Vol. 6. 2021, pp. 4685–4689.
X. Liu et al. “ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild”. In: IEEE/ACM Transactions on Audio, Speech, and Language Processing 31 (2023). DOI: 10.1109/TASLP.2023.3285283.
Manel Rabhi, Spiridon Bakiras, and Roberto Di Pietro. “Audio-deepfake detection: Adversarial attacks and countermeasures”. In: Expert Systems with Applications 250 (2024), p. 123941. URL: https://doi.org/10.1016/j.eswa.2024.123941.
Zhizheng Wu et al. “ASVspoof 2015: The first automatic speaker verification spoofing and countermeasures challenge”. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2015, pp. 2037–2041.
Tayyab Arif et al. “Voice spoofing countermeasure for logical access attacks detection”. In: IEEE Access 9 (2021), pp. 162857–162868.
Heinrich Dinkel, Yanmin Qian, and Kai Yu. “Investigating raw wave deep neural networks for end-to-end speaker spoofing detection”. In: IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (2018), pp. 2002–2014.
ASVspoof Consortium. ASVspoof 2019 evaluation plan. 2019.
Hemlata Tak et al. “Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation”. In: Proc. Odyssey 2022 The Speaker and Language Recognition Workshop. 2022, pp. 141–147. URL: https://arxiv.org/abs/2202.12233.
Ruiqi He et al. “Raw PC-DARTS: Searching for convolutional neural architectures for speaker verification”. In: arXiv preprint arXiv:2107.12212 (2021). URL: https://arxiv.org/abs/2107.12212.
Qi Li. Speaker Authentication. Signals and Communication Technology. Springer, 2012. ISBN: 1-283-45117-4.
Sabu M. Thampi, Alexander Gelbukh, and Jayanta Mukhopadhyay. Advances in Signal Processing and Intelligent Recognition Systems. Advances in Intelligent Systems and Computing 264. Springer International Publishing, 2014. ISBN: 3-319-04960-7.
Elisabeth Zetterholm. Detection of Speaker Characteristics Using Voice Imitation. Ed. by Christian Müller. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 192–205. ISBN: 978-3-540-74122-0. DOI: 10.1007/978-3-540-74122-0_16. URL: https://doi.org/10.1007/978-3-540-74122-0_16.
Coqui TTS. coquiTTS v1. https://github.com/coqui-ai/TTS?tab=readme-ov-file. Accessed: August 30, 2024. 2023.
Galina Lavrentyeva et al. “STC Antispoofing Systems for the ASVspoof2019 Challenge”. In: arXiv preprint arXiv:1904.05576 (2019). URL: https://arxiv.org/abs/1904.05576.
Coqui. XTTS-v2. https://huggingface.co/coqui/XTTS-v2. Accessed: September 10, 2024. 2024.
fish-speech. fish-speech. https://github.com/fishaudio/fish-speech. Accessed: September 11, 2024. 2024.
TedManders. Deep Learning Challenge: Speaker Identification. https://github.com/TedManders/speaker-identification. Accessed: September 24, 2024. 2022.
vivekkr12. You Only Speak Once. https://github.com/Speaker-Identification/You-Only-Speak-Once. Accessed: September 25, 2024. 2020.
Leena Mary. “Significance of Prosody for Speaker, Language and Speech Recognition”. In: Extraction and Representation of Prosody for Speaker, Speech and Language Recognition. New York, NY: Springer New York, 2012, pp. 1–18. ISBN: 978-1-4614-1159-8. DOI: 10.1007/978-1-4614-1159-8_1. URL: https://doi.org/10.1007/978-1-4614-1159-8_1.
James Betker. “Better speech synthesis through scaling”. In: arXiv (2023). URL: https://arxiv.org/abs/2305.07243.
dc.rights.en.fl_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri.none.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/embargoedAccess
dc.rights.coar.none.fl_str_mv http://purl.org/coar/access_right/c_f1cf
dc.format.extent.none.fl_str_mv 39 pages
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidad de los Andes
dc.publisher.program.none.fl_str_mv Ingeniería de Sistemas y Computación
dc.publisher.faculty.none.fl_str_mv Facultad de Ingeniería
dc.publisher.department.none.fl_str_mv Departamento de Ingeniería de Sistemas y Computación
bitstream.url.fl_str_mv https://repositorio.uniandes.edu.co/bitstreams/0d143a55-63a2-46f4-809e-253b18b6f029/download
https://repositorio.uniandes.edu.co/bitstreams/89a82f01-4b9a-4aa9-92ea-6a23e8792134/download
https://repositorio.uniandes.edu.co/bitstreams/8d58c371-36e2-4482-bbe5-11036a53c3a3/download
https://repositorio.uniandes.edu.co/bitstreams/3c287c2f-1952-4f3b-a7eb-238d3e9d70bc/download
https://repositorio.uniandes.edu.co/bitstreams/d0744714-d1c6-433f-9d1f-e1df3ea74e4c/download
https://repositorio.uniandes.edu.co/bitstreams/1b364a71-8077-4fd4-8aaa-1105ed707a2e/download
https://repositorio.uniandes.edu.co/bitstreams/f1d31346-aa55-451a-a8d4-ed370897ee39/download
https://repositorio.uniandes.edu.co/bitstreams/4bb27e85-79f7-41ab-93a4-efea74f814cc/download
bitstream.checksum.fl_str_mv 1d42bd85a080766181ac4efdef4a40ac
318e349664cbad385f331c47a7869bc1
4460e5956bc1d1639be9ae6146a50347
ae9e573a68e7f92501b6913cc846c39f
a8955fd438417c21946813a372a085d2
530d320eb9a3b695b3582e4fc981c68e
f9fc4eae4100966bb8528a77984d1bf2
7dce52f55fc1598e16a3f46de7b71083
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio institucional Séneca
repository.mail.fl_str_mv adminrepositorio@uniandes.edu.co