Source code analysis on student assignments using machine learning techniques

Abstract. To increase success in computer programming courses, it is important to understand the learning process and the common difficulties students face. Although several studies have investigated possible relationships between students' performance and self-regulated learning characteristics,...

Authors:
Castellanos Morales, Hugo Armando
Resource type:
Publication date:
2017
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa
OAI Identifier:
oai:repositorio.unal.edu.co:unal/60068
Online access:
https://repositorio.unal.edu.co/handle/unal/60068
http://bdigital.unal.edu.co/58004/
Keywords:
0 Generalidades / Computer science, information and general works
37 Educación / Education
Motivation
Learning strategies
Machine learning
Source code analysis
Self-regulation
Rights
openAccess
License
Attribution-NonCommercial 4.0 International
id UNACIONAL2_733d4fe13f3e16629c62ff19a86255f2
oai_identifier_str oai:repositorio.unal.edu.co:unal/60068
network_acronym_str UNACIONAL2
network_name_str Universidad Nacional de Colombia
repository_id_str
dc.title.spa.fl_str_mv Source code analysis on student assignments using machine learning techniques
dc.creator.fl_str_mv Castellanos Morales, Hugo Armando
dc.contributor.advisor.spa.fl_str_mv Gonzalez Osorio, Fabio Augusto (Thesis advisor)
dc.contributor.author.spa.fl_str_mv Castellanos Morales, Hugo Armando
dc.contributor.spa.fl_str_mv Restrepo Calle, Felipe
dc.subject.ddc.spa.fl_str_mv 0 Generalidades / Computer science, information and general works
37 Educación / Education
dc.subject.proposal.spa.fl_str_mv Motivation
Learning strategies
Machine learning
Source code analysis
Self-regulation
description Abstract. To increase success in computer programming courses, it is important to understand the learning process and the common difficulties students face. Although several studies have investigated possible relationships between students' performance and self-regulated learning characteristics, little attention has been given to the source code that students produce. Such source code might contain valuable information about their learning process, especially in a context where practical programming assignments are frequent and students write source code constantly during the course. This poses the following research questions: What is the relationship between the characteristics of students' source code and their performance in a computer programming course? What is the relationship between source code features and self-regulated learning characteristics (i.e., motivation and learning strategies) in a computer programming course? How can source code and self-regulated learning features predict students' performance? To answer these questions, a strategy is proposed to support correlation analysis among students' performance, motivation, use of learning strategies, and source code metrics in computer programming courses. A comprehensive case study is presented to evaluate the strategy. Additionally, an automatic grading tool for programming assignments was used, which made it easier to obtain the participants' source code for further automatic analysis. Moreover, self-regulated learning characteristics were collected using the Motivated Strategies for Learning Questionnaire (MSLQ). Results show that the main source code features significantly related to students' performance and self-regulated learning features are length-related metrics, with mainly positive correlations, and Halstead complexity measures, which are negatively correlated.
In light of these findings, students' source code can be better understood as an artifact for monitoring several characteristics related to self-regulated learning, course performance, and, in general, the learning process. More research in the area is required to verify whether these relationships could give computing educators new ways to identify and help students with problems.
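The abstract names two ingredients, Halstead complexity measures and correlation analysis, without showing how they are computed. The following is a minimal sketch under stated assumptions, not the thesis's actual pipeline: it uses the standard textbook Halstead definitions, and it assumes a Spearman rank correlation, since the record does not say which coefficient the author used. The function names `halstead_measures` and `spearman` are hypothetical, and tokenizing student code into operators and operands is assumed to happen elsewhere.

```python
import math


def halstead_measures(operators, operands):
    """Basic Halstead measures from lists of operator and operand tokens.

    Standard definitions: n1/n2 are distinct operators/operands,
    N1/N2 are total operators/operands.
    """
    n1, n2 = len(set(operators)), len(set(operands))
    N1, N2 = len(operators), len(operands)
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 > 0 else 0.0
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": difficulty * volume,
    }


def _average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (sx * sy)
```

In use, one would compute a metric such as `halstead_measures(...)["volume"]` for each submission and pass the list of metric values together with the corresponding grades to `spearman`. A negative coefficient would then correspond to the kind of negative relationship the abstract reports for Halstead measures.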
publishDate 2017
dc.date.issued.spa.fl_str_mv 2017
dc.date.accessioned.spa.fl_str_mv 2019-07-02T17:26:49Z
dc.date.available.spa.fl_str_mv 2019-07-02T17:26:49Z
dc.type.spa.fl_str_mv Graduate thesis - Master's
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/masterThesis
dc.type.version.spa.fl_str_mv info:eu-repo/semantics/acceptedVersion
dc.type.content.spa.fl_str_mv Text
dc.type.redcol.spa.fl_str_mv http://purl.org/redcol/resource_type/TM
status_str acceptedVersion
dc.identifier.uri.none.fl_str_mv https://repositorio.unal.edu.co/handle/unal/60068
dc.identifier.eprints.spa.fl_str_mv http://bdigital.unal.edu.co/58004/
dc.language.iso.spa.fl_str_mv spa
language spa
dc.relation.ispartof.spa.fl_str_mv Universidad Nacional de Colombia Sede Bogotá Facultad de Ingeniería Departamento de Ingeniería de Sistemas e Industrial Ingeniería de Sistemas
Ingeniería de Sistemas
dc.relation.references.spa.fl_str_mv Castellanos Morales, Hugo Armando (2017) Source code analysis on student assignments using machine learning techniques. Master's thesis, Universidad Nacional de Colombia - Sede Bogotá.
dc.rights.spa.fl_str_mv All rights reserved - Universidad Nacional de Colombia
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.license.spa.fl_str_mv Attribution-NonCommercial 4.0 International
dc.rights.uri.spa.fl_str_mv http://creativecommons.org/licenses/by-nc/4.0/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.mimetype.spa.fl_str_mv application/pdf
institution Universidad Nacional de Colombia
bitstream.url.fl_str_mv https://repositorio.unal.edu.co/bitstream/unal/60068/1/HugoA.CastellanosMorales.2017.pdf
https://repositorio.unal.edu.co/bitstream/unal/60068/2/HugoA.CastellanosMorales.2017.pdf.jpg
bitstream.checksum.fl_str_mv 958720ac1f644ac26d09f495da080d86
5e958048a40255885dc603945739a6b3
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
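The record lists an MD5 checksum for each bitstream, so a downloaded copy can be checked against it. The following is a minimal sketch, assuming the file has already been saved locally; the helper names `md5_of_file` and `matches_record` and any local path are hypothetical. It uses only Python's standard `hashlib` module. MD5 is suitable here for detecting accidental corruption, not for security purposes.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_hex):
    """Compare a local file's MD5 with the checksum listed in the record."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_record("HugoA.CastellanosMorales.2017.pdf", "958720ac1f644ac26d09f495da080d86")` would compare a local copy of the thesis PDF (the local path is an assumption) with the first checksum listed above.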
repository.name.fl_str_mv Repositorio Institucional Universidad Nacional de Colombia
repository.mail.fl_str_mv repositorio_nal@unal.edu.co
_version_ 1806886715388854272
description (Spanish-language abstract, translated) To improve student success in programming courses, it is important to understand the learning process and the common difficulties students face. Although many studies have investigated possible relationships between student performance and aspects of self-regulated learning, little attention has been paid to the source code students produce, which may contain valuable information about their learning process. This is especially true in contexts where practical programming activities are frequent and students write source code constantly throughout the course. This raises the following research questions: What is the relationship between the characteristics of students' source code and their performance in a computer programming course? What is the relationship between source code characteristics and self-regulated learning characteristics (motivation and learning strategies) in a computer programming course? How can source code and self-regulated learning characteristics predict student performance? To answer these questions, a strategy is presented for analyzing correlations among student performance, motivation, use of learning strategies, and source code metrics in computer programming courses. A comprehensive case study is presented to evaluate the proposed strategy using data collected from students. In addition, an automatic grading tool was used to evaluate the practical assignments, which made it easier to obtain students' source code for later analysis. Self-regulated learning characteristics were obtained using the Motivated Strategies for Learning Questionnaire Colombia (MSLQColombia). The results show that the main source code characteristics related to student performance and self-regulated characteristics are length metrics, which correlate positively, and Halstead complexity measures, which correlate negatively. Given these results, students' source code can be better understood as an artifact that can be used to monitor characteristics related to self-regulated learning, course performance, and, in general, the learning process. Further research is therefore needed to verify whether these relationships can give educators new tools to identify and help students with problems.