A Tale of Two Algorithmic Principles: Optimism and Posterior Sampling.

Abstract

Optimism in the face of uncertainty and posterior sampling are two algorithmic principles that drive many no-regret algorithms for multi-armed bandits, contextual bandits, and reinforcement learning. In this talk, we delve deeper into the theory of optimistic and posterior sampling algorithms. First, we introduce the dissimilarity dimension, a statistical dimension that yields sharper analyses of optimistic algorithms than the eluder dimension in function approximation settings. Next, we introduce the Decision Pretrained Transformer, a way to meta-learn posterior sampling strategies from data using autoregressive architectures. Finally, we explore the fundamental limits of optimistic and posterior sampling strategies, uncovering new insights into their performance and constraints.

Date
Dec 10, 2024 12:00 AM
Event
IEOR/DRO Seminar, Columbia University
Location
Columbia University in the City of New York
Aldo Pacchiano
Assistant Professor / Visiting Scientist

My research interests include online learning, reinforcement learning, deep RL, and fairness.