Model-based
Model-based MDP: value iteration when the transition probabilities and rewards are known, UCRL algorithms when they are estimated from observations.
Model-based MDP: value iteration when the transition probabilities and rewards are known, UCRL algorithms when they are estimated from observations.