Selecting electrical billing attributes: big data preprocessing improvements
- Authors:
- Viloria, Amelec
García Guiliany, Jesús Enrique
Orellano Llinás, Nataly
Hernandez-P, Hugo
Steffens Sanabria, Ernesto
Pineda, Omar
- Resource type:
- Journal article
- Publication date:
- 2020
- Institution:
- Corporación Universidad de la Costa
- Repository:
- REDICUC - Repositorio CUC
- Language:
- eng
- OAI Identifier:
- oai:repositorio.cuc.edu.co:11323/7786
- Online access:
- https://hdl.handle.net/11323/7786
https://doi.org/10.1007/978-981-15-3125-5_44
https://repositorio.cuc.edu.co/
- Keywords:
- Electric billing
Concave programming
Data mining
Electric service billing
- Rights
- openAccess
- License
- Attribution-NonCommercial-NoDerivatives 4.0 International
Summary: Attribute selection is an important data preprocessing step in knowledge discovery in databases. Its main objective is to eliminate irrelevant and/or redundant attributes so that the problem becomes computationally tractable without degrading the quality of the solution. Various techniques have been proposed, mainly following two approaches: wrapper and ranking. This article evaluates a novel approach proposed by Bradley and Mangasarian (Machine learning ICML. Morgan Kaufmann, San Francisco, CA, pp. 82–90, 1998 [1]), which uses concave programming to minimize both the classification error and the number of attributes required to perform the task. The technique is evaluated on the electric service billing database in Colombia. The results are compared against traditional techniques on four criteria: attribute reduction, processing time, discovered knowledge size, and solution quality.
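To make the ranking approach mentioned in the summary concrete, the following is a minimal sketch of a ranking-style attribute filter. It is not the concave programming method evaluated in the article, and it is not taken from the article itself: the scoring function (absolute Pearson correlation with a binary class label), the function names, and the toy billing attributes are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information about the label
    return cov / sqrt(var_x * var_y)


def rank_attributes(rows, labels, names):
    """Score every attribute independently and return (score, name) pairs, best first."""
    scored = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scored.append((abs(pearson(column, labels)), name))
    return sorted(scored, reverse=True)


def select_top_k(rows, labels, names, k):
    """Keep the names of the k highest-ranked attributes."""
    return [name for _, name in rank_attributes(rows, labels, names)[:k]]


if __name__ == "__main__":
    # Hypothetical toy records: (consumption_kwh, meter_id_last_digit, constant_flag)
    names = ["consumption_kwh", "meter_id_last_digit", "constant_flag"]
    rows = [[120, 3, 1], [450, 7, 1], [130, 7, 1], [500, 3, 1]]
    labels = [0, 1, 0, 1]  # hypothetical class, e.g. 1 = high-bill customer
    print(select_top_k(rows, labels, names, k=1))
```

A ranking filter of this kind scores each attribute in isolation, so it is cheap but cannot detect redundancy between attributes. A wrapper approach would instead evaluate candidate attribute subsets by training a classifier on each one.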