Diseño e Implementación de una Plataforma de Análisis de Datos en el Sector Educativo (Design and Implementation of a Data Analysis Platform in the Education Sector)
- Authors:
- Osorio Salcedo, Karen Paola
- Resource type:
- Publication date:
- 2016
- Institution:
- Universidad del Norte
- Repository:
- Repositorio Uninorte
- Language:
- spa
- OAI Identifier:
- oai:manglar.uninorte.edu.co:10584/5860
- Online access:
- http://hdl.handle.net/10584/5860
- Keywords:
- BigData
- data
- Rights
- License
- Universidad del Norte
Summary: In the educational sector, tests of student performance are administered constantly, yet the data they produce are rarely analyzed in depth. The goal of this project is to study and/or compare these data through a data analysis platform, using protocols that provide a sufficient basis and resources for decision-making. The tools used for development were the Apache Spark platform and the alternative software packages RapidMiner, Weka, and R. The solution implements the Apache Spark framework to configure and develop a strategic environment for data analysis. The project was divided into two phases. The first phase covered the hardware and software design of the data analysis platform. The second phase covered the design and implementation of the platform's data architecture. While designing and implementing the hardware and software infrastructure that supports the platform, the first tests were performed on virtual machines. The best environments for installing Apache Spark were VMware and a non-virtualized installation. The other options either could not handle the large volume of data to be used or were limited by the computer's capacity. Apache Spark was compared with common data mining applications and outperformed them in time and resource usage. Analyzing any type of data yields global or specific estimates that contribute to decision-making. Experimenting with these new technologies and comparing them with common ones shows how efficient and optimal the results can be when searching a data sample for similarities.