Selecting electrical billing attributes: big data preprocessing improvements

Authors:
Viloria, Amelec
García Guiliany, Jesús Enrique
Orellano Llinás, Nataly
Hernandez-P, Hugo
Steffens Sanabria, Ernesto
Pineda, Omar
Resource type:
Journal article
Publication date:
2020
Institution:
Corporación Universidad de la Costa
Repository:
REDICUC - Repositorio CUC
Language:
eng
OAI Identifier:
oai:repositorio.cuc.edu.co:11323/7786
Online access:
https://hdl.handle.net/11323/7786
https://doi.org/10.1007/978-981-15-3125-5_44
https://repositorio.cuc.edu.co/
Keywords:
Electric billing
Concave programming
Data mining
Electric service billing
Rights
openAccess
License
Attribution-NonCommercial-NoDerivatives 4.0 International
Description
Summary: Attribute selection is a key data preprocessing activity in knowledge discovery in databases. Its main objective is to eliminate irrelevant and/or redundant attributes so that problems become computationally tractable, without degrading the quality of the solution. Various techniques have been proposed, mainly following two approaches: wrapper and ranking. This article evaluates a novel approach proposed by Bradley and Mangasarian (Machine learning ICML. Morgan Kaufmann, San Francisco, CA, pp. 82–90, 1998 [1]), which uses concave programming to minimize both the classification error and the number of attributes required to perform the task. The technique is evaluated on the electric service billing database in Colombia. The results are compared against traditional techniques in terms of attribute reduction, processing time, size of the discovered knowledge, and solution quality.
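
To make the "ranking" approach mentioned in the summary concrete, the following is a minimal sketch and not the method evaluated in the article. It assumes numeric attributes and a binary class label, scores each attribute independently by its absolute Pearson correlation with the label, and keeps the top k. The function names, attribute names (`kwh_consumed`, `meter_id_digit`, `constant_fee`), and toy values are all hypothetical and were chosen only for illustration. The concave programming approach of Bradley and Mangasarian is not implemented here.

```python
from statistics import mean


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column cannot help separate the classes.
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def rank_attributes(rows, labels, names):
    """Return (attribute name, score) pairs sorted from most to least relevant."""
    columns = list(zip(*rows))  # transpose rows into one tuple per attribute
    scores = [(name, pearson_abs(col, labels)) for name, col in zip(names, columns)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def select_top_k(rows, labels, names, k):
    """Keep the names of the k highest-scoring attributes."""
    return [name for name, _ in rank_attributes(rows, labels, names)[:k]]


# Hypothetical toy data: these are NOT values from the billing database.
names = ["kwh_consumed", "meter_id_digit", "constant_fee"]
rows = [
    (120, 3, 10),
    (450, 7, 10),
    (130, 7, 10),
    (480, 3, 10),
]
labels = [0, 1, 0, 1]  # assumed binary class, e.g. low vs. high bill

ranking = rank_attributes(rows, labels, names)
selected = select_top_k(rows, labels, names, k=1)
print(selected)
```

A ranking method like this one scores attributes one at a time, so it is cheap but can miss redundancy between attributes. A wrapper approach would instead retrain a classifier on candidate attribute subsets and keep the subset with the best validation performance, at a higher computational cost.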