Adaptive fine-tuning of LLMs with QLoRA adapters for enhanced understanding in cooperative multi-agent scenarios
- Authors:
- Gómez Barrera, Daniel Fernando
- Resource type:
- Undergraduate thesis
- Publication date:
- 2024
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: Uniandes repository
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/74837
- Online access:
- https://hdl.handle.net/1992/74837
- Keywords:
- Artificial Intelligence
- Cooperative AI
- Multi-agent scenarios
- Machine learning
- Natural language processing
- NLP
- LLM
- Large Language Models
- Engineering
- Rights:
- embargoedAccess
- License:
- Attribution-ShareAlike 4.0 International
Summary: This work explores fine-tuning of Large Language Models (LLMs) using QLoRA adapters to enhance performance in cooperative multi-agent scenarios. Using the Melting Pot framework and integrating multiple indicators of collective welfare and agent comprehension into a unified signal, the approach optimizes the selection of training examples. Fine-tuning applied to the quantized Llama-3B models improved stability and performance, particularly in reward acquisition and equality maintenance. Although quantitative results support the positive effects of fine-tuning on collective well-being and cooperativity, training remains heavily dependent on the model's initial state, which narrows the space of solutions and prevents agents from explicitly reasoning about the common good.
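To make the setup concrete, below is a minimal sketch of the kind of QLoRA pipeline the summary describes: loading a 4-bit quantized Llama checkpoint, attaching low-rank adapters, and ranking candidate training examples with a unified signal. The checkpoint name, LoRA hyperparameters, indicator names, and weights are illustrative assumptions, not the thesis's actual configuration.

```python
# Sketch only: all hyperparameters and names below are assumptions,
# not the configuration used in the thesis.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantization via bitsandbytes (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # adapter compute in bf16
)

# Placeholder checkpoint; the work fine-tunes a quantized Llama-3B variant.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    quantization_config=bnb_config,
    device_map="auto",
)

# Low-rank adapters: only these small matrices are trained,
# while the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Hypothetical unified signal: a weighted blend of episode-level
# indicators used to rank candidate training examples.
def unified_signal(metrics: dict[str, float],
                   weights: dict[str, float]) -> float:
    return sum(weights[k] * metrics[k] for k in weights)

# Example: weight collective reward, then equality, then comprehension.
score = unified_signal(
    {"collective_reward": 0.8, "equality": 0.6, "comprehension": 0.7},
    {"collective_reward": 0.5, "equality": 0.3, "comprehension": 0.2},
)
```

The key property this sketch illustrates is that QLoRA keeps the full model in 4-bit precision and updates only the adapter weights, which is what makes fine-tuning feasible on modest hardware for the multi-agent experiments described above.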