A Theoretical Framework for Partially Observed Reward-States in RLHF

Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari

January 2025

Type: Conference paper

Publication: 13th International Conference on Learning Representations