How to Learn Sequential Decision Making Algorithms from Data.

Abstract

Large transformer models pretrained on diverse data exhibit impressive in-context learning, performing new tasks without explicit training. In this talk, we explore their capabilities in decision-making settings such as bandits and Markov decision processes. We introduce the Decision-Pretrained Transformer (DPT), a simple supervised pretraining approach where a transformer predicts optimal actions from an in-context dataset of past interactions. DPT enables in-context reinforcement learning, including online exploration and offline conservatism, without explicit training for these behaviors. Remarkably, it generalizes beyond the pretraining distribution and adapts to new task structures. Theoretically, DPT approximates Bayesian posterior sampling, yielding provable regret guarantees and faster learning than the algorithms used to generate its training data. Our results highlight a simple yet powerful method for equipping transformers with strong decision-making abilities.
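To make the pretraining recipe concrete, the sketch below illustrates the supervised objective on a toy Bernoulli bandit: tasks are sampled, a random behavior policy generates the in-context dataset, and the transformer is trained with cross-entropy to predict each task's optimal action. The `TransformerPolicy` class, the task distribution, and all hyperparameters are assumptions made for illustration only; this is not the talk's implementation.

```python
# A minimal sketch of DPT-style supervised pretraining on a toy Bernoulli
# bandit. The task distribution, the TransformerPolicy architecture, and the
# hyperparameters here are illustrative assumptions, not the talk's setup.
import torch
import torch.nn as nn

NUM_ARMS, CTX_LEN, D_MODEL = 5, 50, 64


class TransformerPolicy(nn.Module):
    """Maps an in-context dataset of (action, reward) pairs to action logits."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(NUM_ARMS + 1, D_MODEL)  # one-hot action + reward
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, NUM_ARMS)

    def forward(self, context):  # context: (batch, CTX_LEN, NUM_ARMS + 1)
        h = self.encoder(self.embed(context))
        return self.head(h.mean(dim=1))  # logits for the predicted optimal action


def sample_pretraining_batch(batch_size=32):
    """Sample bandit tasks, roll out a uniform behavior policy, label with the optimal arm."""
    means = torch.rand(batch_size, NUM_ARMS)   # a task = a vector of arm means
    optimal = means.argmax(dim=1)              # supervision target: the optimal action
    actions = torch.randint(NUM_ARMS, (batch_size, CTX_LEN))
    rewards = torch.bernoulli(means.gather(1, actions))
    context = torch.cat(
        [nn.functional.one_hot(actions, NUM_ARMS).float(), rewards.unsqueeze(-1)], dim=-1
    )
    return context, optimal


model = TransformerPolicy()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(1_000):  # supervised pretraining: predict the optimal action from context
    context, optimal = sample_pretraining_batch()
    loss = loss_fn(model(context), optimal)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

At deployment, the same model can be conditioned on its own growing interaction history, which is where the in-context exploration and offline conservatism described in the abstract emerge.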

Date
Apr 10, 2025
Event
KAUST Rising Stars Workshop
Location
King Abdullah University of Science and Technology
Aldo Pacchiano
Assistant Professor / Visiting Scientist

My research interests include online learning, reinforcement learning, deep RL, and fairness.