Adaptive fine-tuning of LLMs with QLoRA adapters for enhanced understanding in cooperative multi-agent scenarios
- Authors:
- Gómez Barrera, Daniel Fernando
- Resource type:
- Undergraduate thesis
- Publication date:
- 2024
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: Uniandes repository
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/74837
- Online access:
- https://hdl.handle.net/1992/74837
- Keywords:
- Artificial Intelligence
- Cooperative AI
- Multi-agent scenarios
- Machine learning
- Natural language processing
- NLP
- LLM
- Large Language Models
- Engineering
- Rights:
- embargoedAccess
- License:
- Attribution-ShareAlike 4.0 International
Summary: This work explores fine-tuning of Large Language Models (LLMs) using QLoRA adapters to enhance performance in cooperative multi-agent scenarios. Using the Melting Pot framework and integrating multiple indicators of collective welfare and agent comprehension into a unified signal, the approach optimizes the selection of training examples. Fine-tuning applied to the quantized Llama-3B models improved stability and performance, particularly in reward acquisition and equality maintenance. Although quantitative results support the positive effects of fine-tuning on collective well-being and cooperativity, training remains heavily dependent on the model's initial state, which narrows the space of solutions and prevents agents from explicitly reasoning about the common good.
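To make the setup concrete, below is a minimal sketch of the kind of QLoRA pipeline the summary describes: loading a 4-bit quantized Llama checkpoint, attaching low-rank adapters, and ranking candidate training examples with a unified signal. The checkpoint name, LoRA hyperparameters, indicator names, and weights are illustrative assumptions, not the thesis's actual configuration.

```python
# Sketch only: all hyperparameters and names below are assumptions,
# not the configuration used in the thesis.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantization via bitsandbytes (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # adapter compute in bf16
)

# Placeholder checkpoint; the work fine-tunes a quantized Llama-3B variant.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    quantization_config=bnb_config,
    device_map="auto",
)

# Low-rank adapters: only these small matrices are trained,
# while the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Hypothetical unified signal: a weighted blend of episode-level
# indicators used to rank candidate training examples.
def unified_signal(metrics: dict[str, float],
                   weights: dict[str, float]) -> float:
    return sum(weights[k] * metrics[k] for k in weights)

# Example: weight collective reward, then equality, then comprehension.
score = unified_signal(
    {"collective_reward": 0.8, "equality": 0.6, "comprehension": 0.7},
    {"collective_reward": 0.5, "equality": 0.3, "comprehension": 0.2},
)
```

The key property this sketch illustrates is that QLoRA keeps the full model in 4-bit precision and updates only the adapter weights, which is what makes fine-tuning feasible on modest hardware for the multi-agent experiments described above.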