Reinforcement Learning (CentraleSupelec M2 2020-2021)

Optimal control, stochastic and structured bandits, model-based MDP, planning, deep reinforcement learning.

Teaching assistant for practical sessions.

  • 1. MDP

    Introduction to Markov Decision Processes, Bellman operators and control.

  • 2. Bandits

    Introduction to stochastic and structured bandits.

  • 3. Planning

    Planning in bandits: pure exploration, best arm identification. Planning in MDPs: Monte Carlo Tree Search.

  • 4. Deep Reinforcement Learning

    Deep RL, Deep Q-Networks (DQN).
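
As an illustration of the Bellman operators mentioned in session 1, here is a minimal sketch of value iteration, which repeatedly applies the Bellman optimality operator until it reaches a fixed point. The two-state MDP, the discount factor and all names (`P`, `R`, `GAMMA`, `bellman_optimality`, `value_iteration`) are assumptions made up for this example. They are not taken from the course material.

```python
# Hypothetical toy MDP with 2 states and 2 actions (illustration only).
# P[s][a] is a list of (probability, next_state) pairs; R[s][a] is the reward.
GAMMA = 0.9  # discount factor (assumed value)

P = {
    0: {0: [(1.0, 0)], 1: [(1.0, 1)]},
    1: {0: [(1.0, 0)], 1: [(1.0, 1)]},
}
R = {
    0: {0: 0.0, 1: 0.0},
    1: {0: 0.0, 1: 1.0},
}


def bellman_optimality(V):
    """Apply the Bellman optimality operator T to a value function V."""
    return {
        s: max(
            R[s][a] + GAMMA * sum(p * V[s2] for p, s2 in P[s][a])
            for a in P[s]
        )
        for s in P
    }


def value_iteration(tol=1e-8):
    """Iterate V <- T V until the sup-norm change falls below tol."""
    V = {s: 0.0 for s in P}
    while True:
        TV = bellman_optimality(V)
        if max(abs(TV[s] - V[s]) for s in P) < tol:
            return TV
        V = TV


V_star = value_iteration()
print(V_star)
```

Because the operator is a contraction with factor `GAMMA` in the sup norm, the loop converges from any starting point. In this toy MDP the only reward comes from staying in state 1, so the values approach 1 / (1 - 0.9) = 10 in state 1 and 0.9 × 10 = 9 in state 0.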