Reinforcement Learning (CentraleSupelec M2 2020-2021)

Overview
Optimal control, stochastic and structured bandits, model-based MDP, planning, deep reinforcement learning.
Teaching assistant for practical sessions.
-
1. MDP
Introduction to Markov Decision Processes, Bellman operators and control.
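For reference, the Bellman operators mentioned here are usually written as follows for a discounted MDP. The notation is an assumption and may differ from the one used in the course: reward $r$, transition kernel $P$, discount factor $\gamma \in [0, 1)$, deterministic policy $\pi$, and value function $V$.

```latex
(T^{\pi} V)(s) = r(s, \pi(s)) + \gamma \sum_{s'} P(s' \mid s, \pi(s)) \, V(s'),
\qquad
(T^{*} V)(s) = \max_{a} \Big[ r(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \, V(s') \Big].
```

Both operators are $\gamma$-contractions in the sup norm, so each has a unique fixed point: $V^{\pi}$ for $T^{\pi}$ and the optimal value function $V^{*}$ for $T^{*}$.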
-
2. Bandits
Introduction to stochastic and structured bandits.
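As an illustration of a stochastic bandit algorithm, here is a minimal sketch of UCB1 on Bernoulli arms. The session description does not say which algorithms are covered, so the choice of UCB1, the function name `ucb1`, and the arm means are all assumptions made for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return the number of pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: pull each arm once.
            arm = t - 1
        else:
            # Pick the arm maximising empirical mean + exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated Bernoulli reward (hypothetical environment).
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Two hypothetical arms with success probabilities 0.2 and 0.8.
counts = ucb1([0.2, 0.8], horizon=2000)
print(counts)
```

With a gap this large between the two arms, most pulls should go to the second arm, and the number of pulls of the first arm grows only logarithmically with the horizon.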
-
3. Planning
Planning in bandits: pure exploration and best arm identification. Planning in MDPs: Monte Carlo Tree Search.
-
4. Deep Reinforcement Learning
Deep RL, DQN.