Historical Ink: semantic shift detection for 19th century spanish
This paper explores the evolution of word meanings in 19th-century Spanish texts, with an emphasis on Latin American Spanish, using computational linguistics techniques. It addresses the Semantic Shift Detection (SSD) task, which is crucial for understanding linguistic evolution, particularly in his...
- Authors:
- Montes Buitrago, Tony Santiago
- Resource type:
- Undergraduate thesis
- Publication date:
- 2024
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: repositorio Uniandes
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/74556
- Online access:
- https://hdl.handle.net/1992/74556
- Keywords:
- Semantic Shift Detection; SSD; Old Spanish; BERT; Latin-America; corpus; sense; DWUG; Context Embedding; Engineering
- Rights
- embargoedAccess
- License
- Attribution-NonCommercial-NoDerivatives 4.0 International
| id | UNIANDES2_e50d7f9c4268ad49f9d1007b4119df96 |
|---|---|
| oai_identifier_str | oai:repositorio.uniandes.edu.co:1992/74556 |
| network_acronym_str | UNIANDES2 |
| network_name_str | Séneca: repositorio Uniandes |
| repository_id_str | |
| dc.title.eng.fl_str_mv | Historical Ink: semantic shift detection for 19th century spanish |
| dc.title.alternative.spa.fl_str_mv | Historical Ink: detección de cambios semánticos en el español del siglo XIX |
| title | Historical Ink: semantic shift detection for 19th century spanish |
| spellingShingle | Historical Ink: semantic shift detection for 19th century spanish Semantic Shift Detection SSD Old Spanish BERT Latin-America corpus sense DWUG Context Embedding Ingeniería |
| title_short | Historical Ink: semantic shift detection for 19th century spanish |
| title_full | Historical Ink: semantic shift detection for 19th century spanish |
| title_fullStr | Historical Ink: semantic shift detection for 19th century spanish |
| title_full_unstemmed | Historical Ink: semantic shift detection for 19th century spanish |
| title_sort | Historical Ink: semantic shift detection for 19th century spanish |
| dc.creator.fl_str_mv | Montes Buitrago, Tony Santiago |
| dc.contributor.advisor.none.fl_str_mv | Manrique Piramanrique, Rubén Francisco |
| dc.contributor.author.none.fl_str_mv | Montes Buitrago, Tony Santiago |
| dc.contributor.other.none.fl_str_mv | Manrique Gómez, Laura Viviana |
| dc.contributor.researchgroup.none.fl_str_mv | Facultad de Ingeniería |
| dc.subject.keyword.eng.fl_str_mv | Semantic Shift Detection SSD Old Spanish BERT Latin-America corpus sense DWUG Context Embedding |
| topic | Semantic Shift Detection SSD Old Spanish BERT Latin-America corpus sense DWUG Context Embedding Ingeniería |
| dc.subject.themes.spa.fl_str_mv | Ingeniería |
| description | This paper explores the evolution of word meanings in 19th-century Spanish texts, with an emphasis on Latin American Spanish, using computational linguistics techniques. It addresses the Semantic Shift Detection (SSD) task, which is crucial for understanding linguistic evolution, particularly in historical contexts. The study focuses on analyzing a set of Spanish target words. To achieve this, a 19th-century Spanish corpus is constructed, and a customizable pipeline for SSD tasks is developed. This pipeline helps find the senses of a word and measure their semantic change between two corpora, using BERT-like models fine-tuned on old Spanish texts for both the Latin American and the general Spanish cases. The results provide valuable insights into the cultural and societal shifts reflected in language changes over time. |
| publishDate | 2024 |
| dc.date.accessioned.none.fl_str_mv | 2024-07-16T15:42:58Z |
| dc.date.issued.none.fl_str_mv | 2024-07-14 |
| dc.date.accepted.none.fl_str_mv | 2024-07-16 |
| dc.date.available.none.fl_str_mv | 2025-08-15 |
| dc.type.none.fl_str_mv | Trabajo de grado - Pregrado |
| dc.type.driver.none.fl_str_mv | info:eu-repo/semantics/bachelorThesis |
| dc.type.version.none.fl_str_mv | info:eu-repo/semantics/acceptedVersion |
| dc.type.coar.none.fl_str_mv | http://purl.org/coar/resource_type/c_7a1f |
| dc.type.content.none.fl_str_mv | Text |
| dc.type.redcol.none.fl_str_mv | http://purl.org/redcol/resource_type/TP |
| format | http://purl.org/coar/resource_type/c_7a1f |
| status_str | acceptedVersion |
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/1992/74556 |
| dc.identifier.instname.none.fl_str_mv | instname:Universidad de los Andes |
| dc.identifier.reponame.none.fl_str_mv | reponame:Repositorio Institucional Séneca |
| dc.identifier.repourl.none.fl_str_mv | repourl:https://repositorio.uniandes.edu.co/ |
| url | https://hdl.handle.net/1992/74556 |
| identifier_str_mv | instname:Universidad de los Andes reponame:Repositorio Institucional Séneca repourl:https://repositorio.uniandes.edu.co/ |
| dc.language.iso.none.fl_str_mv | eng |
| language | eng |
| dc.rights.en.fl_str_mv | Attribution-NonCommercial-NoDerivatives 4.0 International |
| dc.rights.uri.none.fl_str_mv | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
| dc.rights.accessrights.none.fl_str_mv | info:eu-repo/semantics/embargoedAccess |
| dc.rights.coar.none.fl_str_mv | http://purl.org/coar/access_right/c_f1cf |
| rights_invalid_str_mv | Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ http://purl.org/coar/access_right/c_f1cf |
| eu_rights_str_mv | embargoedAccess |
| dc.format.extent.none.fl_str_mv | 13 páginas |
| dc.format.mimetype.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Universidad de los Andes |
| dc.publisher.program.none.fl_str_mv | Ingeniería de Sistemas y Computación |
| dc.publisher.faculty.none.fl_str_mv | Facultad de Ingeniería |
| dc.publisher.department.none.fl_str_mv | Departamento de Ingeniería de Sistemas y Computación |
| publisher.none.fl_str_mv | Universidad de los Andes |
| institution | Universidad de los Andes |
| bitstream.url.fl_str_mv | https://repositorio.uniandes.edu.co/bitstreams/27624d8b-2975-4845-9b31-892fdba68282/download https://repositorio.uniandes.edu.co/bitstreams/41c1c525-c197-4ad0-b609-8c970a427421/download https://repositorio.uniandes.edu.co/bitstreams/289f35ac-8e5c-40d6-b780-b7b05a75b6c2/download https://repositorio.uniandes.edu.co/bitstreams/046a9177-cde8-485a-8525-dae8087cb307/download https://repositorio.uniandes.edu.co/bitstreams/62e08ffb-5b83-4dfb-a88e-322f067c04e1/download https://repositorio.uniandes.edu.co/bitstreams/beab4d90-53f8-4e94-b79b-e2e5fb4ee18a/download https://repositorio.uniandes.edu.co/bitstreams/42f7225a-17d0-43a1-aaa8-858240ceb0d5/download https://repositorio.uniandes.edu.co/bitstreams/6c53217c-6928-4e35-a8b0-5eba947e2819/download |
| bitstream.checksum.fl_str_mv | 14643bbf32733d61027303bfb2771cf1 dc66a8bd9f817c822bf8f70b7e5ce8e6 ae9e573a68e7f92501b6913cc846c39f 4460e5956bc1d1639be9ae6146a50347 530d320eb9a3b695b3582e4fc981c68e 292c3660f1a758f0ed16741c1fd8db3b baee67be5f47a64c6b9a22d60407dfcf 6370bf27e3132054e65cdcd2ed6109c4 |
| bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 MD5 MD5 MD5 MD5 |
| repository.name.fl_str_mv | Repositorio institucional Séneca |
| repository.mail.fl_str_mv | adminrepositorio@uniandes.edu.co |
| _version_ | 1812133816578342912 |
| spelling | SSD_Old_Spanish_Paper.pdf: awaiting publication in ACL, which may take some time to appear. Kept private until 15 August 2025. |
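
The description above says the pipeline finds the senses of a word and measures their change between two corpora. As a rough illustration of that last step only, the sketch below compares how often each sense occurs in an older and a newer corpus using the Jensen-Shannon distance. This is an assumption made for illustration, not the measure confirmed by the record: the sense labels, the function names `sense_distribution` and `jensen_shannon_distance`, and the example data are all hypothetical, and in a real pipeline the labels would come from clustering contextual embeddings produced by a BERT-like model.

```python
import math
from collections import Counter


def sense_distribution(labels, senses):
    """Relative frequency of each sense among one corpus's usages of a word."""
    if not labels:
        raise ValueError("at least one usage is required")
    counts = Counter(labels)
    return [counts[sense] / len(labels) for sense in senses]


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (log base 2): 0 for identical, 1 for disjoint distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero probability contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, mixture) + kl(q, mixture)) / 2
    return math.sqrt(max(0.0, divergence))


# Hypothetical sense labels for one target word (illustrative data, not from the thesis).
old_labels = ["sense_a", "sense_a", "sense_a", "sense_b"]
new_labels = ["sense_a", "sense_b", "sense_b", "sense_b"]

senses = sorted(set(old_labels) | set(new_labels))
shift = jensen_shannon_distance(
    sense_distribution(old_labels, senses),
    sense_distribution(new_labels, senses),
)
print(f"semantic shift score: {shift:.3f}")
```

A score near 0 would suggest the word's senses are used in similar proportions in both corpora, while a score near 1 would suggest the dominant senses have changed almost completely.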