Reinforcement Learning (CentraleSupelec M2 2020-2021)

Overview

Optimal control, stochastic and structured bandits, model-based MDP, planning, deep reinforcement learning.

Role: teaching assistant for the practical sessions.

1. MDP

Introduction to Markov Decision Processes, Bellman operators and control.

2. Bandits

Introduction to stochastic and structured bandits.

3. Planning

Planning in bandits: pure exploration and best-arm identification. Planning in MDPs: Monte Carlo Tree Search.

4. Deep Reinforcement Learning

Deep RL, Deep Q-Networks (DQN).

Last updated on Mar 23, 2021