Adaptive fine-tuning of LLMs with QLoRA adapters for enhanced understanding in cooperative multi-agent scenarios



Authors:
Gómez Barrera, Daniel Fernando
Resource type:
Undergraduate thesis
Publication date:
2024
Institution:
Universidad de los Andes
Repository:
Séneca: Uniandes repository
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/74837
Online access:
https://hdl.handle.net/1992/74837
Keywords:
Artificial Intelligence
Cooperative AI
Multi-agent scenarios
Machine learning
Natural language processing
NLP
LLM
Large Language Models
Engineering
Rights:
embargoedAccess
License:
Attribution-ShareAlike 4.0 International
Description:
Summary: This work explores fine-tuning Large Language Models (LLMs) with QLoRA adapters to enhance performance in cooperative multi-agent scenarios. Using the Melting Pot framework, the approach integrates multiple indicators of collective welfare and agent comprehension into a unified signal that guides the selection of training examples. Fine-tuning applied to quantized Llama-3B models improved stability and performance, particularly in reward acquisition and equality maintenance. Although the results quantitatively support the positive effect of fine-tuning on collective well-being and cooperativity, training remains heavily dependent on the model's original state, which limits the space of solutions and prevents the agents from reasoning explicitly about the common good.
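The "unified signal" described in the summary combines several welfare indicators into one score for ranking training examples. A minimal sketch of such a scoring scheme is shown below; the specific weights, the Gini-based equality measure, and the function names (`gini`, `unified_signal`) are illustrative assumptions, not the thesis's exact formulation.

```python
# Hypothetical sketch: blend collective reward and equality into a single
# score used to rank candidate episodes as fine-tuning examples.
# The weights and the Gini-based equality term are assumptions.

def gini(rewards):
    """Gini coefficient of per-agent rewards (0 = perfect equality)."""
    r = sorted(rewards)
    n = len(r)
    total = sum(r)
    if total == 0:
        return 0.0
    cum = sum((i + 1) * x for i, x in enumerate(r))
    return (2 * cum) / (n * total) - (n + 1) / n

def unified_signal(rewards, w_reward=0.7, w_equality=0.3):
    """Weighted blend of mean collective reward and reward equality."""
    collective = sum(rewards) / len(rewards)   # mean per-agent reward
    equality = 1.0 - gini(rewards)             # 1.0 = perfectly equal
    return w_reward * collective + w_equality * equality

# Rank candidate episodes: a higher signal marks a better training example.
episodes = [[5.0, 5.0, 5.0], [12.0, 1.0, 0.0], [6.0, 4.0, 5.0]]
ranked = sorted(episodes, key=unified_signal, reverse=True)
```

Under this scheme an episode with equal rewards outranks one with the same total split unequally, matching the summary's emphasis on equality maintenance alongside reward acquisition.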