Inverse reinforcement learning via stochastic mirror descent
- Authors:
- Leiva Montoya, Esteban
- Resource type:
- Undergraduate thesis
- Publication date:
- 2025
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: Uniandes repository
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/75575
- Online access:
- https://hdl.handle.net/1992/75575
- Keywords:
- Inverse optimization
- Inverse reinforcement learning
- Stochastic mirror descent
- Markov decision processes
- Mathematics
- Rights:
- openAccess
- License:
- Attribution 4.0 International
Summary: Inverse Reinforcement Learning (IRL) and Apprenticeship Learning (AL) are foundational problems in decision-making under uncertainty, where the goal is to infer cost functions and policies from observed behavior. In this thesis, we establish the equivalence between the inverse optimization framework for Markov decision processes (MDPs) and the apprenticeship learning formalism, showing that both approaches can be unified under a shared structure. We formulate IRL and AL as regularized min-max problems and develop an algorithm based on stochastic mirror descent (SMD) that offers theoretical convergence guarantees.
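The summary names stochastic mirror descent but the record gives no algorithmic detail. As a rough illustration of the kind of update SMD performs, below is a minimal sketch of stochastic mirror descent over the probability simplex with the negative-entropy mirror map (exponentiated-gradient steps). The names `smd_entropic` and `grad_oracle` are illustrative assumptions, not from the thesis; the actual regularized min-max algorithm developed there may differ in mirror map, iterate averaging, and step-size schedule.

```python
import numpy as np

def smd_entropic(grad_oracle, n, steps=1000, eta=0.1, seed=0):
    """Sketch of stochastic mirror descent on the probability simplex
    with the negative-entropy mirror map (exponentiated gradient).

    grad_oracle(x, rng) should return an unbiased stochastic gradient
    estimate of the objective at the point x.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n, 1.0 / n)          # start at the uniform distribution
    avg = np.zeros(n)                # running average of the iterates
    for t in range(1, steps + 1):
        g = grad_oracle(x, rng)      # unbiased stochastic gradient
        x = x * np.exp(-eta * g)     # mirror step in the dual (log) space
        x /= x.sum()                 # Bregman projection onto the simplex
        avg += (x - avg) / t         # averaged iterate carries the guarantee
    return avg

# Toy usage (hypothetical): minimize E[<c + noise, x>] over the simplex;
# the minimizer concentrates on the smallest entry of c.
c = np.array([0.3, 0.7, 0.1])
x_hat = smd_entropic(lambda x, rng: c + 0.05 * rng.standard_normal(3), n=3)
```

The multiplicative update is what the entropy mirror map buys: iterates stay strictly inside the simplex, and the projection step reduces to a simple renormalization.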