Recent & Upcoming Talks

2024

Experiment Planning with Function Approximation.

We study the problem of experiment planning with function approximation in contextual bandit problems. In settings where there is a …

The Dissimilarity Dimension: Sharper Bounds for Optimistic Algorithms.

The principle of Optimism in the Face of Uncertainty (OFU) is one of the foundational algorithmic design choices in Reinforcement …

2023

RLHF: Reinforcement Learning with Once-per-Episode Feedback

Despite reinforcement learning’s remarkable success in several application and simulation domains, research in the field has …

Learning Systems in Adaptive Environments: Theory, Algorithms and Design

February/March 2023

2022

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity.

Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications, but in practice …

On the Statistical Complexity of Batch Learning: Theory and Algorithms

In this work, we develop the technique of optimism regularization, a simple way of inducing optimistic predictions in neural network models. I show …