The Dissimilarity Dimension

The Statistical Complexity of Optimistic Algorithms

Overview

For a time the Eluder dimension has been used to provide bounds for optimistic algorithms in function approximation regimes. We introduce the dissimilarity dimension that provides us with tighter bounds. The dissimilarity dimension is a combinatorial statistical dimension that can be used to define the statistical complexity of optimistic algorithms in structured bandit problems. This statistical dimension’s definition is based on defining maximal sequences that achieve good historical fit while also satisfying what we call a large self evalution property that is connected with optimism. Using a graph theoretic argument, we prove the regret of optimistic least squares achieves a regret bound depending on this statistical dimension that is sharper than the typical eluder dimension bound.

Paper:

Avatar
Aldo Pacchiano
Assistant Professor / Visiting Scientist

My research interests include online learning, Reinforcement Learning, Deep RL and Fairness.

Related