Bots and gender profiling on Twitter using sociolinguistic features: Notebook for PAN at CLEF 2019


Authors:
Resource type:
Publication date:
2019
Institution:
Universidad Tecnológica de Bolívar
Repository:
Repositorio Institucional UTB
Language:
eng
OAI Identifier:
oai:repositorio.utb.edu.co:20.500.12585/9191
Online access:
https://hdl.handle.net/20.500.12585/9191
Keywords:
Author profiling
Bots profiling
Computational linguistic
Gender profiling
Sociolinguistic
User profiling
Character recognition
Classification (of information)
Computational linguistics
Learning algorithms
Linguistics
Machine learning
Social aspects
Social networking (online)
Social sciences computing
Botnet
Rights
restrictedAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
Description
Summary: Software bots, or simply bots, are becoming increasingly common in social networks because malicious actors have found them useful for spreading false messages and rumors and even for manipulating public opinion. Although the text generated by users in social networks is a rich source of information that can be used to identify different aspects of its authors, being unable to recognize which users are truly human and which are not is a major drawback. In this work, we describe the properties of the multilingual classification model we submitted to PAN 2019, which distinguishes bots from humans and females from males. The solution extracts 18 features from each user's posts and, by applying a machine learning algorithm, obtained good performance results. © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).