Statistical theory for reinforcement learning: Oracle inequalities, Markov chains, and stochastic approximation: Neyman seminar

Neyman Seminar
Jan 26, 2022, 4:00 PM - 5:00 PM | Evans Hall | Zoom ID: 97648161149 (no passcode) | Happening As Scheduled
Wenlong Mou, Berkeley EECS

Abstract:
Dynamic programming provides a formalism for making near-optimal decisions in sequential settings. A central task in approximate dynamic programming and reinforcement learning is to estimate the solution to the Bellman fixed point equation. Given the large state spaces that arise in practice, function approximation plays an essential role, and the resulting projected Bellman equations...