Difference between revisions of "Reinforcement Learning"

From Wiki2
Line 28:
  * [https://arxiv.org/pdf/1706.06978.pdf Zhou et al. 2018] (Alibaba Group, Deep Interest Network, Click Through Rate Prediction)
  * [https://medium.com/@vermashresth/a-primer-on-deep-reinforcement-learning-frameworks-part-1-6c9ab6a0f555 RL Frameworks]
- * [https://arxiv.org/pdf/1802.09756.pdf Real Time Bidding] (Distributed Coordinated Multi-agent reinforcement learning) [https://chemoinformatician.co.uk/images/RTB_multi-agent.png RTB image]
+ * [https://arxiv.org/pdf/1802.09756.pdf Real Time Bidding] (Distributed Coordinated Multi-agent reinforcement learning)
+ [https://chemoinformatician.co.uk/images/RTB_multi-agent.png RTB image]
+ * [https://rise.cs.berkeley.edu/blog/scaling-multi-agent-rl-with-rllib/ Berkeley Multi-agent RL]

Revision as of 15:01, 10 July 2019

Multi-Armed Bandit Examples


Image Ranking

Multi-Agent Learning

Extra

Git Repos

Literature

RTB image