Celebrity profiling on Twitter using sociolinguistic features: notebook for PAN at CLEF 2019

Social networks have been revolutionary for celebrities because they allow them to reach a wider audience, far more frequently, than traditional media. These platforms enable them to improve, or sometimes damage, their careers through the construction of closer relationship...

Authors:
Resource type:
Publication date:
2019
Institution:
Universidad Tecnológica de Bolívar
Repository:
Repositorio Institucional UTB
Language:
eng
OAI Identifier:
oai:repositorio.utb.edu.co:20.500.12585/9190
Online access:
https://hdl.handle.net/20.500.12585/9190
Keywords:
Author profiling
Celebrity profiling
Computational linguistic
Natural language processing
Socio-linguistic feature
Twitter
User profiling
Classification (of information)
Computational linguistics
Decision making
Marketing
Natural language processing systems
Social networking (online)
Linguistic features
Linguistics
Rights
restrictedAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
id UTB2_be4f72eb9349744a8696fc3f24eff037
oai_identifier_str oai:repositorio.utb.edu.co:20.500.12585/9190
network_acronym_str UTB2
network_name_str Repositorio Institucional UTB
repository_id_str
dc.contributor.editor.none.fl_str_mv Cappellato L.
Ferro N.
Losada D.E.
Muller H.
description Social networks have been revolutionary for celebrities because they allow them to reach a wider audience, far more frequently, than traditional media. These platforms enable them to improve, or sometimes damage, their careers by building closer relationships with their fans and acquiring new ones. Indeed, social networks have promoted the emergence of a new type of celebrity that exists only in the digital world. Being able to characterize the celebrities who are more active on social networks, such as Twitter, offers an enormous opportunity to identify their real level of fame and their relevance to an age group, a specific gender, or an occupation. These facts may enrich decision making, especially in advertising and marketing. To this end, this paper presents a novel strategy for characterizing celebrity profiles on Twitter, based on generating socio-linguistic features from their posts that serve as input to a set of classifiers. Specifically, we produced four classifiers that describe the level of fame, the gender, the birth date, and the possible occupation of a celebrity. We obtained the training and test data sets as part of our participation in PAN 2019 at CLEF. The results of each classifier are reported, including an analysis of which features are more relevant, which classification techniques were more useful, and the final precision and recall results. © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
publishDate 2019
dc.date.issued.none.fl_str_mv 2019
dc.date.accessioned.none.fl_str_mv 2020-03-26T16:33:10Z
dc.date.available.none.fl_str_mv 2020-03-26T16:33:10Z
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_c94f
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/conferenceObject
dc.type.hasversion.none.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.spa.none.fl_str_mv Conference
status_str publishedVersion
dc.identifier.citation.none.fl_str_mv CEUR Workshop Proceedings; Vol. 2380
dc.identifier.issn.none.fl_str_mv 16130073
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12585/9190
dc.identifier.instname.none.fl_str_mv Universidad Tecnológica de Bolívar
dc.identifier.reponame.none.fl_str_mv Repositorio UTB
dc.identifier.orcid.none.fl_str_mv 57194828933
57202285682
57191078469
57203852380
8738428200
56986551200
dc.language.iso.none.fl_str_mv eng
language eng
dc.relation.conferencedate.none.fl_str_mv 9 September 2019 through 12 September 2019
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_16ec
dc.rights.uri.none.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/restrictedAccess
dc.rights.cc.none.fl_str_mv Attribution-NonCommercial 4.0 International
dc.format.medium.none.fl_str_mv Electronic resource
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv CEUR-WS
dc.source.none.fl_str_mv https://www.scopus.com/inward/record.uri?eid=2-s2.0-85070517749&partnerID=40&md5=fa41968a27e8ebc57402aac5c3de64c1
institution Universidad Tecnológica de Bolívar
dc.source.event.none.fl_str_mv 20th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2019
bitstream.url.fl_str_mv https://repositorio.utb.edu.co/bitstream/20.500.12585/9190/1/MiniProdInv.png
bitstream.checksum.fl_str_mv 0cb0f101a8d16897fb46fc914d3d7043
bitstream.checksumAlgorithm.fl_str_mv MD5
repository.name.fl_str_mv Repositorio Institucional UTB
repository.mail.fl_str_mv repositorioutb@utb.edu.co
_version_ 1814021638188957696
author Moreno-Sandoval L.G.
Puertas E.
Plaza-Del-Arco F.M.
Pomares-Quimbaya A.
Alvarado-Valencia, Jorge Andres
Alfonso Ureña-López L.