Difference between revisions of "Reinforcement Learning"

Line 8:
  ===Git Repos===
  * [https://github.com/bgalbraith/bandits, basic] (softmax, UCB, epsilon-greedy)
- * [https://github.com/david-cortes/contextualbandits, intermediate] (more algorithms)
+ * [https://github.com/david-cortes/contextualbandits, intermediate] (more algorithms, contextual bandits)

Revision as of 17:24, 9 July 2019

Multi-Armed Bandit Examples


Git Repos

  • basic (softmax, UCB, epsilon-greedy)
  • intermediate (more algorithms, contextual bandits)
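
For readers who want a feel for one of the listed strategies before opening the repos, the following is a minimal sketch of epsilon-greedy on a Bernoulli bandit. It is not taken from either repository; the function name `epsilon_greedy`, its parameters, and the example arm probabilities are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def epsilon_greedy(true_means, steps=1000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: hypothetical success probability of each arm.
    Returns (counts, estimates): pulls per arm and estimated mean reward per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward drawn from the chosen arm's assumed probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


if __name__ == "__main__":
    counts, estimates = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
```

With a small `epsilon`, most pulls should end up on the arm with the highest estimated reward, while the occasional random pull keeps the other estimates from going stale. Softmax and UCB replace the arm-selection step with a different rule; the linked repos cover those.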