Reinforcement Learning

==Multi-Armed Bandit Examples==
 
* [https://www.analyticsvidhya.com/blog/2018/09/reinforcement-multi-armed-bandit-scratch-python/ Click Through Rate: Random, UCB]
 
* [https://www.spotx.tv/resources/blog/developer-blog/introduction-to-multi-armed-bandits-with-applications-in-digital-advertising/ Digital Advertising] (Epsilon-greedy and Thompson sampling; see the sketch below)
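
The linked posts cover epsilon-greedy, UCB, and Thompson sampling on click-through data. For quick orientation, here is a minimal epsilon-greedy sketch on simulated click-through rates; the arm probabilities, the epsilon value, and the horizon are illustrative assumptions, not values taken from the articles.

<syntaxhighlight lang="python">
# Minimal epsilon-greedy sketch on simulated click-through rates.
# All numbers below (CTRs, epsilon, horizon) are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

true_ctr = np.array([0.02, 0.035, 0.05])  # assumed per-ad click probabilities
epsilon = 0.1                             # assumed exploration rate
n_rounds = 10_000

counts = np.zeros(len(true_ctr))          # times each ad was shown
clicks = np.zeros(len(true_ctr))          # clicks collected per ad

for _ in range(n_rounds):
    if rng.random() < epsilon:
        arm = int(rng.integers(len(true_ctr)))        # explore: random ad
    else:
        # exploit: ad with the highest empirical CTR so far
        est = np.divide(clicks, counts,
                        out=np.zeros_like(clicks), where=counts > 0)
        arm = int(np.argmax(est))
    counts[arm] += 1
    clicks[arm] += rng.random() < true_ctr[arm]       # simulated Bernoulli click

print("pulls per arm:", counts)
print("estimated CTRs:", clicks / np.maximum(counts, 1))
</syntaxhighlight>

Thompson sampling and UCB change only how the arm is picked each round (sampling from a Beta posterior, or adding an exploration bonus to the empirical mean); the simulation loop stays the same.
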
 
==Image Ranking==
* [https://medium.com/idealo-tech-blog/using-deep-learning-to-automatically-rank-millions-of-hotel-images-c7e2d2e5cae2 Hotel Image Ranking]
  
  

==Git Repos==

* basic (softmax, UCB, epsilon-greedy; see the UCB sketch below)
* intermediate (more algorithms, contextual bandits)
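
The basic repo lists UCB among its algorithms; for reference, a minimal UCB1 sketch in the same simulated click-through setting as above (the click probabilities and horizon are again assumed, not taken from the repos):

<syntaxhighlight lang="python">
# Minimal UCB1 sketch: play each arm once, then pick the arm maximising
# empirical mean + sqrt(2 * ln(t) / pulls). All numbers are assumed.
import math
import random

random.seed(0)
true_ctr = [0.02, 0.035, 0.05]   # assumed per-arm click probabilities
n_rounds = 10_000

counts = [0] * len(true_ctr)
clicks = [0.0] * len(true_ctr)

for t in range(1, n_rounds + 1):
    if t <= len(true_ctr):
        arm = t - 1              # initialisation: play each arm once
    else:
        ucb = [clicks[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
               for a in range(len(true_ctr))]
        arm = ucb.index(max(ucb))
    counts[arm] += 1
    clicks[arm] += 1.0 if random.random() < true_ctr[arm] else 0.0

print("pulls per arm:", counts)
print("estimated CTRs:", [clicks[a] / counts[a] for a in range(len(true_ctr))])
</syntaxhighlight>
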


==Literature==