Stochastic gradient and stochastic approximation applied to Q-learning

Authors:
Ñungo Manrique, José Sebastián
Resource type:
Undergraduate thesis
Publication date:
2020
Institution:
Universidad de los Andes
Repository:
Séneca: Uniandes repository
Language:
spa
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/51295
Online access:
http://hdl.handle.net/1992/51295
Keywords:
Mathematical optimization
Convex functions
Iterative methods (Mathematics)
Stochastic approximation
Reinforcement learning (Machine learning)
Markov processes
Stochastic processes
Mathematics
Rights
openAccess
License
http://creativecommons.org/licenses/by-nc-sa/4.0/
Description
Summary: The project is motivated by the goal of proving the convergence of Q-learning, an algorithm for finite discrete-time Markov decision processes in which the transition probabilities and rewards are not fully known. What the algorithm seeks is to solve the optimality equations (Bellman's equations). With this purpose in mind, the project discusses four main topics:
1. Finite discrete-time Markov decision processes, the model of interest from the beginning (the Bellman optimality equation is recalled below).
2. Stochastic approximation (SA), the scheme that serves as the general framework for many algorithms, including Q-learning; under some premises the convergence of SA can be established (the standard update and step-size conditions are recalled below).
3. The stochastic gradient descent method, the main tool by which the convergence of the SA algorithm, and of many machine learning algorithms, can be established.
4. Reinforcement learning, the branch of machine learning in which the Q-learning algorithm is found (a tabular Q-learning sketch in Python follows the equations below).
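For reference (these formulas are standard and not quoted from the thesis itself), the optimality equations that Q-learning targets, for a finite discounted MDP with transition kernel p, reward r, and discount factor \(\gamma \in [0,1)\), read

\[
Q^*(s,a) \;=\; \sum_{s'} p(s' \mid s,a)\,\Bigl[\, r(s,a,s') + \gamma \max_{a'} Q^*(s',a') \,\Bigr], \qquad s \in S,\ a \in A.
\]

Q-learning approximates the fixed point \(Q^*\) from sampled transitions, without knowledge of p or r.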
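Likewise, the Q-learning update is the standard stochastic-approximation (Robbins-Monro) iteration. Writing \(\alpha_t\) for the step size, one observed transition \((s_t, a_t, r_t, s_{t+1})\) yields

\[
Q_{t+1}(s_t,a_t) \;=\; Q_t(s_t,a_t) + \alpha_t \Bigl[\, r_t + \gamma \max_{a'} Q_t(s_{t+1},a') - Q_t(s_t,a_t) \,\Bigr],
\]

and the convergence premises alluded to above are the usual step-size conditions \(\sum_t \alpha_t = \infty\) and \(\sum_t \alpha_t^2 < \infty\), together with every state-action pair being visited infinitely often.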
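A minimal tabular Q-learning sketch in Python, assuming a hypothetical Gym-style environment object `env` exposing reset() and step(action); the environment interface, the function name q_learning, and the 1/N(s,a) step-size choice are illustrative assumptions, not taken from the thesis:

import numpy as np

def q_learning(env, n_states, n_actions, episodes=5000,
               gamma=0.95, epsilon=0.1):
    """Tabular Q-learning with Robbins-Monro step sizes.

    `env` is assumed (hypothetically) to expose reset() -> state and
    step(action) -> (next_state, reward, done), Gym-style.
    """
    Q = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))  # per-pair visit counts

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration keeps every pair visited
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(Q[state]))

            next_state, reward, done = env.step(action)

            # alpha_t = 1 / N(s, a) satisfies sum alpha = inf,
            # sum alpha^2 < inf for each visited pair
            visits[state, action] += 1
            alpha = 1.0 / visits[state, action]

            # stochastic-approximation step toward the Bellman target
            target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
            Q[state, action] += alpha * (target - Q[state, action])

            state = next_state
    return Q

The 1/N(s,a) schedule is only one admissible choice; any step sizes meeting the Robbins-Monro conditions above serve the same role in the convergence argument.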