A Genetic Clustering Algorithm for Automatic Text Summarization
- Authors:
- Suaréz Benjumea, Sebastian
- Resource type:
- Publication date:
- 2016
- Institution:
- Universidad Nacional de Colombia
- Repository:
- Universidad Nacional de Colombia
- Language:
- spa
- OAI Identifier:
- oai:repositorio.unal.edu.co:unal/57548
- Online access:
- https://repositorio.unal.edu.co/handle/unal/57548
http://bdigital.unal.edu.co/53848/
- Keywords:
- 0 General works / Computer science, information and general works
62 Engineering and allied operations / Engineering
Text mining
Genetic algorithm
Clustering algorithm
Automatic text summarization
Single document automatic text summarization
- Rights
- openAccess
- License
- Attribution-NonCommercial 4.0 International
Summary: Abstract. Automatic text summarization has become a relevant topic due to information overload. This automation aims to help humans and machines deal with the vast amount of text data (structured and unstructured) available on the web and deep web. This research presents SENCLUS, a novel approach for automatic extractive text summarization. Using a genetic clustering algorithm, SENCLUS clusters sentences as a close representation of the text's topics, using a fitness function based on redundancy and coverage, and applies a scoring function to select the most relevant sentences of each topic for the extractive summary. The approach was validated using the DUC2002 data set and ROUGE summary quality measures. The results show that the approach is competitive with state-of-the-art methods for extractive automatic text summarization.
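The idea described in the abstract (cluster sentences with a genetic algorithm, score the clustering by coverage and redundancy, then pick one sentence per cluster) can be illustrated with a minimal sketch. This is not the SENCLUS implementation: the record does not give the actual fitness or scoring formulas, so everything below is an assumption made for illustration. That includes the bag-of-words vectors, coverage as mean cosine similarity of each sentence to its cluster centroid, redundancy as mean similarity between centroids, the crossover and mutation operators, and all function names (`fitness`, `evolve`, `summarize`).

```python
import math
import random
from collections import Counter


def vectorize(sentence):
    # Assumption: plain bag-of-words term counts, no stemming or stop-word removal.
    return Counter(sentence.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroids(vectors, assignment, k):
    # A cluster centroid is the summed term counts of its member sentences.
    cents = [Counter() for _ in range(k)]
    for vec, c in zip(vectors, assignment):
        cents[c].update(vec)
    return cents


def fitness(vectors, assignment, k):
    # Hypothetical fitness: reward coverage, penalize redundancy.
    cents = centroids(vectors, assignment, k)
    coverage = sum(cosine(v, cents[c]) for v, c in zip(vectors, assignment)) / len(vectors)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    redundancy = sum(cosine(cents[i], cents[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return coverage - redundancy


def evolve(vectors, k, pop_size=30, generations=60, mutation_rate=0.1, seed=0):
    # Each individual is a list assigning every sentence to one of k clusters.
    rng = random.Random(seed)
    n = len(vectors)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: fitness(vectors, a, k), reverse=True)
        survivors = pop[: pop_size // 2]  # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [rng.randrange(k) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda a: fitness(vectors, a, k))


def summarize(sentences, k=2, seed=0):
    if len(sentences) <= k:
        return list(sentences)
    vectors = [vectorize(s) for s in sentences]
    best = evolve(vectors, k, seed=seed)
    cents = centroids(vectors, best, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(best) if a == c]
        if members:
            # Hypothetical scoring: the sentence closest to its cluster centroid.
            chosen.append(max(members, key=lambda i: cosine(vectors[i], cents[c])))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(list_of_sentences, k=3)` returns at most three sentences in their original order. A fixed `seed` keeps runs reproducible, which helps when comparing summaries with a measure such as ROUGE.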