Dual-Layer anti-spoofing and voice tone recognition system for deepfake audio in English
This research focuses on the development of a personalized voice spoofing detection model, specifically targeting audio generated through deepfake techniques such as text-to-speech (TTS) and voice conversion (VC). The growing sophistication of generative artificial intelligence models has facilitated the creation of falsified audio that is nearly indistinguishable to the human ear, posing a significant risk to individual security and privacy. In response to this threat, our project proposes an innovative solution: a dual-layer voice spoofing detection system. The first layer detects whether the audio is real or spoofed, while the second layer identifies whether the voice tone belongs to the legitimate user or someone else, adding an extra level of personalization and security. The system is built upon an existing pre-trained voice spoofing detection model that undergoes fine-tuning with user-specific data for the first layer. For the second layer, the same model is repurposed: instead of being trained for spoof detection, it is trained to recognize the user's specific vocal tone using real voice samples from multiple individuals with different vocal tones. To facilitate the adoption of this technology, we have developed a user-friendly tool. It allows users to provide just a few minutes of voice recordings, after which it automatically generates deepfake audio and analyzes voice tones using the techniques mentioned above. These falsified audio samples, along with the genuine recordings, are used to train the detection models, adapting them specifically to the user's voice and tone. This dual-layered approach offers a robust and personalized solution to protect the user's vocal identity against deepfake spoofing threats, not only by verifying the authenticity of the audio but also by ensuring that the tone of voice truly belongs to the legitimate user.
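The two-stage decision flow described in the abstract can be sketched as follows. This is a minimal illustration only: the class name, scoring callables, and thresholds are hypothetical stand-ins for the thesis's fine-tuned models, not code from the work itself.

```python
# Hypothetical sketch of the dual-layer verification flow: layer 1 rejects
# spoofed (TTS/VC) audio, layer 2 checks that genuine speech matches the
# enrolled user's vocal tone. Scoring functions are placeholders.
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class DualLayerVerifier:
    spoof_score: Callable[[Sequence[float]], float]    # layer 1: score that audio is genuine
    speaker_score: Callable[[Sequence[float]], float]  # layer 2: score that tone matches the user
    spoof_threshold: float = 0.5
    speaker_threshold: float = 0.5

    def verify(self, audio: Sequence[float]) -> str:
        # Layer 1: reject synthetic audio outright.
        if self.spoof_score(audio) < self.spoof_threshold:
            return "rejected: spoofed audio"
        # Layer 2: genuine speech must also belong to the legitimate user.
        if self.speaker_score(audio) < self.speaker_threshold:
            return "rejected: wrong speaker"
        return "accepted"

# Toy stand-ins for the two fine-tuned models (constant scores for illustration).
verifier = DualLayerVerifier(spoof_score=lambda a: 0.9, speaker_score=lambda a: 0.8)
print(verifier.verify([0.0] * 16000))  # prints: accepted
```

Only audio that passes both layers is accepted, which mirrors the abstract's point that authenticity of the signal and identity of the speaker are verified independently.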
- Authors: Daza Díaz, Paula Cecilia; Torres Ramírez, Sofia
- Resource type: Trabajo de grado de pregrado (undergraduate degree project)
- Publication date: 2024
- Institution: Universidad de los Andes
- Repository: Séneca: repositorio Uniandes
- Language: English (eng)
- OAI identifier: oai:repositorio.uniandes.edu.co:1992/75424
- Online access: https://hdl.handle.net/1992/75424
- Keywords: Voice Spoofing Detection; Personalized Voice Recognition; Vocal Tone Identification; Voice conversion; text-to-speech; Ingeniería (Engineering)
- Rights: embargoedAccess
- License: Attribution-NonCommercial-NoDerivatives 4.0 International
- Alternative title (Spanish): Sistema de reconocimiento de tono de voz y anti-spoofing de doble capa para audios deepfake en inglés
- Advisor: Manrique Piramanrique, Rubén Francisco
- Research group: Facultad de Ingeniería::TICSw: Tecnologías de Información y Construcción de Software
- Date issued: 2024-12-03
- Date accepted: 2025-01-15
- Date available (embargo ends): 2026-01-31
- Type: Trabajo de grado - Pregrado; info:eu-repo/semantics/bachelorThesis; info:eu-repo/semantics/acceptedVersion; http://purl.org/coar/resource_type/c_7a1f (Text)
- Identifiers: https://hdl.handle.net/1992/75424; instname:Universidad de los Andes; reponame:Repositorio Institucional Séneca; repourl:https://repositorio.uniandes.edu.co/
- Rights URI: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Access rights: info:eu-repo/semantics/embargoedAccess (http://purl.org/coar/access_right/c_f1cf)
- Extent: 39 pages
- Format: application/pdf
- Publisher: Universidad de los Andes
- Program: Ingeniería de Sistemas y Computación
- Faculty: Facultad de Ingeniería
- Department: Departamento de Ingeniería de Sistemas y Computación