Bilkent University
Department of Computer Engineering


Convergence of Reinforcement Learning Algorithms with Function Approximation


Dr. Semih Çaycı

Computer Engineering Department
Bilkent University

Abstract: Reinforcement learning (RL) algorithms, equipped with function approximation and entropy regularization, have achieved impressive empirical success in large-scale decision-making problems. However, a concrete theoretical understanding of these methods remains elusive due to the underlying nonconvex optimization landscape and complex exploration dynamics. In this talk, I will present my research on the convergence of widely used reinforcement learning algorithms with function approximation for large state spaces. The first part of the talk focuses on entropy-regularized natural policy gradient (NPG) with linear function approximation for direct policy optimization. I will show that this method achieves global optimality at a fast O(1/T) convergence rate, up to a function approximation error, under minimal assumptions, and that this rate can be improved to linear convergence under standard regularity conditions on the function approximation scheme. The second part of the talk focuses on RL algorithms with neural network approximation. The main result is that entropy-regularized natural actor-critic, which employs neural networks for both policy parameterization and policy evaluation, can learn an optimal policy with sharp sample complexity and overparameterization bounds for any given target error. The representation power of neural networks and entropy regularization, which encourages exploration, play key roles in these results. Collectively, these results establish sharp convergence guarantees for RL algorithms and shed light on the critical roles of function approximation and regularization in practice.

Bio: Semih Çaycı is a Postdoctoral Fellow at the Coordinated Science Laboratory at the University of Illinois at Urbana-Champaign. His research interests broadly lie in machine learning, with a focus on reinforcement learning, deep learning theory, and online learning. He received his PhD in Electrical and Computer Engineering from the Ohio State University in 2020. He obtained his MSc from Bilkent University and his BSc from Bogazici University, both in Electrical and Electronics Engineering.


DATE: 19 November 2021, Friday @ 15:30 (Zoom)