Bridging gaps in code generation with large language models
- Authors:
- Osorio Cálad, Juan José
- Resource type:
- Undergraduate thesis
- Publication date:
- 2025
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: Uniandes repository
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/75375
- Online access:
- https://hdl.handle.net/1992/75375
- Keywords:
- Bridging the Industry–Academia Gap
- Large Language Models
- Model Evaluation
- Code Generation
- Engineering
- Rights:
- openAccess
- License:
- Attribution 4.0 International
Summary: Large Language Models (LLMs) are transforming natural language processing and extending their impact to code generation. This thesis evaluates both academic and industrial LLMs, focusing on their ability to generate pragmatic, functional code for non-standalone functions, a critical aspect of real-world programming. To address gaps in performance, reproducibility, and applicability, this research introduces two key tools: the CoderEval-Prompt-Inference repository for structured evaluation and the huggingface_search repository for overcoming API limitations in model discovery. The evaluation framework leverages curated datasets to assess the correctness and utility of LLM outputs, emphasizing reproducible workflows and context-aware code generation. Challenges addressed include cleaning model outputs and ensuring their functionality under real-world constraints. Results reveal significant disparities between academic and industry models, offering insights into their suitability for practical use cases. By integrating GPU-based testing for scalability, this work establishes a robust pipeline for evaluating and deploying LLMs in software engineering. This research contributes to bridging the gap between academic innovation and industry application by enhancing model discovery, standardizing evaluation methods, and fostering collaboration across domains. Future efforts will focus on refining these tools and methodologies to further unlock the potential of LLMs in real-world software development.