Day and Time:
2023-06-07 (Wednesday) 14:00 - 15:00
Reinforcement learning (RL) is an active subarea of machine learning and has been developed predominantly for Markov decision processes (MDPs) in discrete time with discrete state and action spaces. In this talk, I will report on recent developments in RL theory in continuous time and state space, with possibly continuous action space, focusing on the exploratory formulation and policy evaluation. The theory employs stochastic relaxed control to obtain explicit optimal stochastic policies for exploration, which are instrumental for function parameterization in learning, and derives martingale conditions that lead naturally to policy evaluation algorithms.