Deep reinforcement learning for optimal gameplay in Street Fighter III: a resource-constrained approach

Authors:
Zambrano Huertas, Daniel Ernesto
Díaz Salamanca, Jhoan Sebastián
Resource type:
Undergraduate thesis
Publication date:
2023
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/70987
Online access:
https://hdl.handle.net/1992/70987
Keywords:
Deep Reinforcement Learning
Machine Learning
VideoGames
FightGames
Discrete Spaces
Constrained Resources
RL Agent Policies
Engineering
Rights:
openAccess
License:
https://repositorio.uniandes.edu.co/static/pdf/aceptacion_uso_es.pdf
Description
Summary: This bachelor’s thesis investigates the performance of reinforcement learning (RL) algorithms in the context of fighting games, specifically Street Fighter III Third Strike, under heavy resource constraints. The research focuses on four distinct RL algorithms: Proximal Policy Optimization (PPO), Asynchronous Advantage Actor-Critic (A3C), Advantage Actor-Critic (A2C), and Deep Q-Network (DQN). Each algorithm is trained under the same restricted resource conditions to allow a fair comparison of their final or estimated performances. Training combines batch training with a replay buffer for the off-policy DQN algorithm. The RL agents are trained with a reward function based on the game’s health values: damage inflicted on the opponent yields a positive reward, and damage suffered by the agent yields a negative reward. The primary objective of this research is to determine which RL algorithm performs best under resource constraints and to identify the optimal training conditions for each; a secondary focus is to explore strategies that could improve the algorithms’ performance by modifying the agent’s reward and, through it, the agent’s behavior. The research also explores the potential of developing a meta-agent that selects the best-performing agent based on the current game state, aiming to improve overall performance. The results of this project aim to contribute to the understanding and advancement of reinforcement learning in complex, dynamic, and discrete environments such as fighting games. The research also lays the groundwork for future investigations into the development of a meta-agent and the formulation of more effective reward functions.
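
The health-based reward described in the summary amounts to a difference of health values between consecutive game states: damage dealt to the opponent counts positively, damage taken counts negatively. The following is a minimal, hypothetical Python sketch of that idea; the function name and state keys (agent_hp, opponent_hp) are assumptions made for illustration and are not taken from the thesis code.

# Hypothetical sketch of the health-based reward described above.
# State keys and names are illustrative, not from the thesis implementation.
def health_reward(prev_state: dict, state: dict) -> float:
    """Reward = damage dealt to the opponent minus damage taken by the agent."""
    damage_dealt = prev_state["opponent_hp"] - state["opponent_hp"]  # opponent lost HP -> positive reward
    damage_taken = prev_state["agent_hp"] - state["agent_hp"]        # agent lost HP -> penalty
    return float(damage_dealt - damage_taken)

# Example: the agent lands a 12 HP hit while taking 5 HP of damage in the same step.
prev = {"agent_hp": 160, "opponent_hp": 160}
curr = {"agent_hp": 155, "opponent_hp": 148}
print(health_reward(prev, curr))  # 7.0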