Aldo Pacchiano
Jonathan Lee
Latest
Estimating Optimal Policy Value in General Linear Contextual Bandits
Dueling RL: Reinforcement Learning with Trajectory Preferences
Near Optimal Policy Optimization via REPS
Accelerated Message Passing for Entropy-Regularized MAP Inference
Online Model Selection for Reinforcement Learning with Function Approximation
Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference