Regret Bound Balancing and Elimination for Model Selection in Bandits and RL