Model-based

Model-based methods for Markov decision processes (MDPs): value iteration when the transition probabilities and rewards are known, and UCRL algorithms when they must be estimated from observations.
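As a rough illustration of the known-model case, here is a minimal sketch of value iteration. It assumes a finite MDP with a discount factor `gamma`. The function name `value_iteration`, the array layout of `P` and `R`, and the two-state example are illustrative assumptions, not taken from the source.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration for a finite, discounted MDP with a known model.

    P[s][a][s2] is the probability of moving from state s to state s2
    under action a; R[s][a] is the expected immediate reward.
    (This layout is an assumption made for the sketch.)
    """
    n_states = len(P)
    n_actions = len(P[0])

    def q_value(s, a, V):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[s][a] + gamma * sum(P[s][a][s2] * V[s2] for s2 in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup applied to every state.
        V_new = [max(q_value(s, a, V) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy


# Hypothetical two-state example: state 1 pays reward 1 forever;
# in state 0, action 0 stays put and action 1 moves to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # transitions from state 0
    [[0.0, 1.0], [0.0, 1.0]],  # transitions from state 1 (absorbing)
]
R = [
    [0.0, 0.0],  # rewards in state 0
    [1.0, 1.0],  # rewards in state 1
]
V, policy = value_iteration(P, R, gamma=0.9)
```

With `gamma = 0.9`, the values approach 9 for state 0 and 10 for state 1, and the greedy policy picks action 1 in state 0. When the model is unknown, UCRL-style algorithms replace `P` and `R` with estimates built from observed transitions and plan optimistically within confidence bounds around those estimates; that part is not sketched here.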

Last updated on Feb 9, 2022