Data mining techniques and multivariate analysis to discover Patterns in university final researches

The aim of this study is to extract knowledge from the final research projects of the Mumbai University Science Faculty. Five classification models were applied: Support Vector Machines, Neural Networks, Decision Tree, Random Forest and Powering; considering the Experiment Design and Multivariate Analysis...

Authors:
Viloria, Amelec
Rodríguez López, Jorge
García Leyva, Diana Margarita
Vargas Mercado, Carlos
Hernández-Palma, Hugo
Orellano Llinas, Nataly
Arrozola David, Mónica
Velasquez Rodriguez, Javier
Resource type:
Journal article
Publication date:
2019
Institution:
Corporación Universidad de la Costa
Repository:
REDICUC - Repositorio CUC
Language:
eng
OAI Identifier:
oai:repositorio.cuc.edu.co:11323/5867
Online access:
https://hdl.handle.net/11323/5867
https://repositorio.cuc.edu.co/
Keywords:
Data mining education
Education indicators
Classification
Data mining techniques
Rights
openAccess
License
http://creativecommons.org/publicdomain/zero/1.0/
description The aim of this study is to extract knowledge from the final research projects of the Mumbai University Science Faculty. Five classification models were applied (Support Vector Machines, Neural Networks, Decision Tree, Random Forest and Powering) for the Experiment Design and Multivariate Analysis research lines. For the Experiment Design line, the most accurate model was Random Forest, with an overall accuracy of 71.48% (correct predictions as a share of all predictions). For the Multivariate Analysis line, there was no significant difference in overall accuracy among the models, which all scored around 97%.
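The overall accuracy reported in the description is the share of correct predictions out of all predictions. A minimal Python sketch of that calculation follows, assuming plain lists of labels; the function name `overall_accuracy` and the sample labels are hypothetical and are not taken from the study.

```python
def overall_accuracy(y_true, y_pred):
    """Return the share of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    # Count positions where the predicted label equals the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for illustration only; not data from the study.
true_labels = ["A", "B", "A", "A", "B", "B", "A", "B"]
predicted_labels = ["A", "B", "B", "A", "B", "A", "A", "B"]

accuracy = overall_accuracy(true_labels, predicted_labels)
print(f"{accuracy:.2%}")  # 6 of 8 correct -> 75.00%
```

Comparing models on this metric amounts to computing the same ratio for each model's predictions on the same held-out records.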
publishDate 2019
dc.date.issued.none.fl_str_mv 2019
dc.date.accessioned.none.fl_str_mv 2020-01-17T19:41:31Z
dc.date.available.none.fl_str_mv 2020-01-17T19:41:31Z
dc.type.spa.fl_str_mv Artículo de revista
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_2df8fbb1
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_6501
dc.type.content.spa.fl_str_mv Text
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/article
dc.type.redcol.spa.fl_str_mv http://purl.org/redcol/resource_type/ART
dc.type.version.spa.fl_str_mv info:eu-repo/semantics/acceptedVersion
format http://purl.org/coar/resource_type/c_6501
status_str acceptedVersion
dc.identifier.issn.spa.fl_str_mv 1877-0509
dc.identifier.uri.spa.fl_str_mv https://hdl.handle.net/11323/5867
dc.identifier.instname.spa.fl_str_mv Corporación Universidad de la Costa
dc.identifier.reponame.spa.fl_str_mv REDICUC - Repositorio CUC
dc.identifier.repourl.spa.fl_str_mv https://repositorio.cuc.edu.co/
dc.language.iso.none.fl_str_mv eng
language eng
dc.relation.ispartof.spa.fl_str_mv https://doi.org/10.1016/j.procs.2019.08.081
dc.rights.uri.spa.fl_str_mv http://creativecommons.org/publicdomain/zero/1.0/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.coar.spa.fl_str_mv http://purl.org/coar/access_right/c_abf2
rights_invalid_str_mv http://creativecommons.org/publicdomain/zero/1.0/
http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv openAccess
dc.publisher.spa.fl_str_mv Procedia Computer Science
institution Corporación Universidad de la Costa
bitstream.url.fl_str_mv https://repositorio.cuc.edu.co/bitstreams/99091639-0085-4bab-961a-cda889615a7e/download
https://repositorio.cuc.edu.co/bitstreams/d157718c-01d2-49c1-b15b-9c069ac6622d/download
https://repositorio.cuc.edu.co/bitstreams/60a60c90-bd26-4711-8c31-c5075a2d404c/download
https://repositorio.cuc.edu.co/bitstreams/feb08d6f-6dc9-4209-b2e9-120c2a64b88c/download
https://repositorio.cuc.edu.co/bitstreams/1904269e-9f2d-464f-9539-7db916dce609/download
bitstream.checksum.fl_str_mv 42fd4ad1e89814f5e4a476b409eb708c
48772c8ab623b41e1fb39c01f3ca816d
8a4605be74aa9ea9d79846c1fba20a33
d09181ae79096ada10bb9e8a96ffbae4
03aecfe28ecd42379d5a35f017b36e7e
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio de la Universidad de la Costa CUC
repository.mail.fl_str_mv repdigital@cuc.edu.co
_version_ 1811760738425896960