Sentiment Analysis of News Articles in Spanish Using Predicate Features

ABSTRACT: The automatic prediction of the course of action of agents involved in social or economic trends is a pressing challenge today. However, it is a difficult task because stance or opinion is often spread throughout long documents...

Authors: Tamayo Herrera, Antonio Jesús
Arias Londoño, Julián David
Quiróz Herrera, Gabriel Ángel
Burgos Herrera, Diego Alberto

Resource type: Research article

Publication date: 2019

Institution: Universidad de Antioquia

Repository: Repositorio UdeA

Language: eng

id	UDEA2_7b35b2fc96d3608c9d3e99849347e6e6
oai_identifier_str	oai:bibliotecadigital.udea.edu.co:10495/14178
network_acronym_str	UDEA2
network_name_str	Repositorio UdeA
repository_id_str
dc.title.spa.fl_str_mv	Sentiment Analysis of News Articles in Spanish Using Predicate Features
dc.title.alternative.spa.fl_str_mv	Análisis de sentimientos en artículos de prensa en español usando predicados como características / Analyse de sentiments dans des articles de presse en espagnol en utilisant des prédicats en tant que caractéristiques
title	Sentiment Analysis of News Articles in Spanish Using Predicate Features
dc.creator.fl_str_mv	Tamayo Herrera, Antonio Jesús; Arias Londoño, Julián David; Quiróz Herrera, Gabriel Ángel; Burgos Herrera, Diego Alberto
dc.contributor.author.none.fl_str_mv	Tamayo Herrera, Antonio Jesús; Arias Londoño, Julián David; Quiróz Herrera, Gabriel Ángel; Burgos Herrera, Diego Alberto
dc.subject.lcsh.none.fl_str_mv	Supervised learning (Machine learning); Dimension reduction (Statistics)
topic	Supervised learning (Machine learning); Dimension reduction (Statistics); Linguistic research; Syntax; Semantics; Computational linguistics; Semantic analysis; Machine learning (artificial intelligence)
dc.subject.unesco.none.fl_str_mv	Linguistic research; Syntax; Semantics; Computational linguistics
dc.subject.lemb.none.fl_str_mv	Semantic analysis; Machine learning (artificial intelligence)
dc.subject.lcshuri.none.fl_str_mv	http://id.loc.gov/authorities/subjects/sh94008290 http://id.loc.gov/authorities/subjects/sh2010000188
dc.subject.unescouri.none.fl_str_mv	http://vocabularies.unesco.org/thesaurus/concept12899 http://vocabularies.unesco.org/thesaurus/concept11611 http://vocabularies.unesco.org/thesaurus/concept13409 http://vocabularies.unesco.org/thesaurus/concept3411
description	ABSTRACT: The automatic prediction of the course of action of agents involved in social or economic trends is a pressing challenge today. However, it is a difficult task because stance or opinion is often spread throughout long, complex texts, such as news articles. The current study tests sentence predicates as features to automatically determine the writer's stance in news articles. We capture the semantics and stance of the text by encoding features such as the attribute of copulative sentences, the predicate of transitive sentences, adjectival phrases, and the section of the article. Under the assumption that these features are informative enough to model the semantics of the text, each word sequence is disambiguated and assigned a sentiment value using weighting rules. Different experiments were run using either SentiWordNet or ML-Senticon to determine the sentiment of words. Feature vectors are automatically built to populate a database that is tested using two machine learning algorithms. An efficiency of 69% was achieved using an SVM with a Gaussian kernel along with a feature selection strategy. This score outperformed the bag-of-words baseline by 12%. These results are promising considering that the sentiment analysis is performed on very complex texts written in Spanish.
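The abstract mentions two ingredients that a small example can make concrete: scoring predicate features against a polarity lexicon with weighting rules, and comparing the resulting feature vectors with a Gaussian kernel. The following is a minimal sketch under stated assumptions, not the authors' method. The lexicon entries, the weights, and the names `LEXICON`, `WEIGHTS`, `phrase_score`, and `feature_vector` are hypothetical, and the record does not describe the actual weighting rules, the word sense disambiguation step, the feature selection strategy, or the SVM training.

```python
import math

# Hypothetical polarity lexicon. The values are illustrative only and are
# not taken from SentiWordNet or ML-Senticon.
LEXICON = {"bueno": 0.6, "excelente": 0.9, "malo": -0.6, "grave": -0.7, "crisis": -0.8}

# Hypothetical weight per feature type. The record does not state the
# weighting rules actually used in the study.
WEIGHTS = {
    "copulative_attribute": 1.0,
    "transitive_predicate": 0.8,
    "adjectival_phrase": 0.5,
}


def phrase_score(words):
    """Mean polarity of the words found in the lexicon (0.0 if none match)."""
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def feature_vector(features):
    """One weighted score per feature type, in alphabetical order of type name."""
    return [WEIGHTS[kind] * phrase_score(features.get(kind, [])) for kind in sorted(WEIGHTS)]


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two toy "articles", each reduced to the words extracted per feature type.
doc_a = {"copulative_attribute": ["excelente"], "adjectival_phrase": ["bueno"]}
doc_b = {"copulative_attribute": ["grave"], "transitive_predicate": ["crisis"]}

vec_a = feature_vector(doc_a)
vec_b = feature_vector(doc_b)
similarity = gaussian_kernel(vec_a, vec_b)
```

A kernel value close to 1 means the two feature vectors are nearly identical, and a value close to 0 means they are far apart. An SVM with a Gaussian kernel builds its decision function from such pairwise similarities.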
publishDate	2019
dc.date.issued.none.fl_str_mv	2019
dc.date.accessioned.none.fl_str_mv	2020-05-05T22:51:49Z
dc.date.available.none.fl_str_mv	2020-05-05T22:51:49Z
dc.type.spa.fl_str_mv	info:eu-repo/semantics/article
dc.type.coarversion.fl_str_mv	http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.spa.fl_str_mv	http://purl.org/coar/resource_type/c_2df8fbb1
dc.type.redcol.spa.fl_str_mv	https://purl.org/redcol/resource_type/ART
dc.type.local.spa.fl_str_mv	Research article
format	http://purl.org/coar/resource_type/c_2df8fbb1
dc.identifier.issn.none.fl_str_mv	0120-3479
dc.identifier.uri.none.fl_str_mv	http://hdl.handle.net/10495/14178
dc.identifier.eissn.none.fl_str_mv	2539-3804
identifier_str_mv	0120-3479 2539-3804
url	http://hdl.handle.net/10495/14178
dc.language.iso.spa.fl_str_mv	eng
language	eng
dc.relation.ispartofjournalabbrev.spa.fl_str_mv	Lenguaje
dc.rights.*.fl_str_mv	Attribution-NonCommercial-NoDerivs 2.5 Colombia (CC BY-NC-ND 2.5 CO)
dc.rights.spa.fl_str_mv	info:eu-repo/semantics/openAccess
dc.rights.uri.*.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.rights.accessrights.spa.fl_str_mv	http://purl.org/coar/access_right/c_abf2
dc.rights.creativecommons.spa.fl_str_mv	https://creativecommons.org/licenses/by-nc-nd/4.0/
rights_invalid_str_mv	Attribution-NonCommercial-NoDerivs 2.5 Colombia (CC BY-NC-ND 2.5 CO) http://creativecommons.org/licenses/by-nc-nd/2.5/co/ http://purl.org/coar/access_right/c_abf2 https://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv	openAccess
dc.format.extent.spa.fl_str_mv	32
dc.format.mimetype.spa.fl_str_mv	application/pdf
dc.publisher.spa.fl_str_mv	Universidad del Valle, Escuela de Ciencias del Lenguaje
dc.publisher.place.spa.fl_str_mv	Cali, Colombia
institution	Universidad de Antioquia
bitstream.url.fl_str_mv	http://bibliotecadigital.udea.edu.co/bitstream/10495/14178/2/license_rdf http://bibliotecadigital.udea.edu.co/bitstream/10495/14178/4/TamayoAntonio_2019_SentimentAnalysisFeatures.pdf http://bibliotecadigital.udea.edu.co/bitstream/10495/14178/5/license.txt
bitstream.checksum.fl_str_mv	b88b088d9957e670ce3b3fbe2eedbc13 ec7ecd390a2800a688c3e74916d4eb14 8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5
repository.name.fl_str_mv	Repositorio Institucional Universidad de Antioquia
repository.mail.fl_str_mv	andres.perez@udea.edu.co
_version_	1837098174019272704
spelling	Tamayo Herrera, Antonio Jesús; Arias Londoño, Julián David; Quiróz Herrera, Gabriel Ángel; Burgos Herrera, Diego Alberto. Sentiment Analysis of News Articles in Spanish Using Predicate Features. Lenguaje, 47(2), 235-267. 2019. Universidad del Valle, Escuela de Ciencias del Lenguaje, Cali, Colombia. http://hdl.handle.net/10495/14178
