Difference between revisions of "Reinforcement Learning"

===Multi-Armed Bandit Examples===
 
* [https://www.analyticsvidhya.com/blog/2018/09/reinforcement-multi-armed-bandit-scratch-python/ Click Through Rate: Random, UCB]
 
* [https://www.spotx.tv/resources/blog/developer-blog/introduction-to-multi-armed-bandits-with-applications-in-digital-advertising/ Digital Advertising] (Epsilon-greedy and Thompson sampling)
 
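The tutorials above walk through epsilon-greedy action selection. As a minimal sketch of the idea (the arm probabilities, function name, and parameters below are illustrative, not taken from the linked articles): with probability epsilon pick a random arm, otherwise pick the arm with the best running mean reward.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy bandit sketch on hypothetical Bernoulli arms.

    true_means: assumed per-arm reward probabilities (illustrative only).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)        # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental mean update avoids storing the reward history
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return counts, values, total_reward
```

Thompson sampling (also covered in the Digital Advertising link) replaces the explore/exploit branch with a draw from a per-arm posterior, so exploration shrinks automatically as evidence accumulates.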
===Git Repos===
* [https://github.com/bgalbraith/bandits basic] (softmax, UCB, epsilon-greedy)
* [https://github.com/david-cortes/contextualbandits intermediate] (more algorithms)
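Both repos above implement UCB. As a rough standalone sketch of UCB1 (arm probabilities and names here are illustrative, not taken from either repo): play each arm once, then always pull the arm maximising mean reward plus the confidence bonus sqrt(2 ln t / n_a).

```python
import math
import random

def ucb1(true_means, steps=1000, seed=1):
    """UCB1 sketch: optimism in the face of uncertainty.

    true_means: assumed per-arm Bernoulli reward probabilities
    (hypothetical values for demonstration).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for t in range(1, steps + 1):
        if t <= n_arms:
            arm = t - 1  # initialise: play each arm once
        else:
            # mean estimate plus exploration bonus; rarely-pulled arms
            # get a large bonus and are revisited
            arm = max(
                range(n_arms),
                key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values
```

Unlike epsilon-greedy, UCB1 needs no exploration-rate parameter; the log-growing bonus handles the explore/exploit trade-off on its own.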

Revision as of 17:24, 9 July 2019
