Supplemental Material
- Richard S. Sutton and Andrew G. Barto. 2015. Reinforcement Learning: An Introduction.
- Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, and Ed Chi. 2018. Top-K Off-Policy Correction for a REINFORCE Recommender System. (2018). https://doi.org/10.1145/3289600.3290999
- Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep Neural Networks for YouTube Recommendations. Proceedings of the 10th ACM Conference on Recommender Systems - RecSys ’16 (2016), 191–198. https://doi.org/10.1145/2959100.2959190
- Criteo. 2020. Criteo 1TB Click Logs dataset. https://ailab.criteo.com/download-criteo-1tb-click-logs-dataset/
- James Edwards and David Leslie. 2018. Diversity as a Response to User Preference Uncertainty. In Statistical Data Science. WORLD SCIENTIFIC (EUROPE), 55–68. https://doi.org/10.1142/9781786345400_0004
- Simen Eide, David S. Leslie, and Arnoldo Frigessi. 2021. Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling. (2021), 1–30. https://doi.org/10.21203/rs.3.rs-525958/v1
- Balázs Hidasi, Massimo Quadrana, Alexandros Karatzoglou, and Domonkos Tikk. 2016. Parallel Recurrent Neural Network Architectures for Feature-rich Session-based Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems - RecSys ’16. 241–248. https://doi.org/10.1145/2959100.2959167
- Yifan Hu, Yehuda Koren, and Chris Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets. In 2008 Eighth IEEE International Conference on Data Mining. IEEE, 263–272. https://doi.org/10.1109/ICDM.2008.22
- Eugene Ie, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng Tze Cheng, Tushar Chandra, and Craig Boutilier. 2019. SlateQ: A tractable decomposition for reinforcement learning with recommendation sets. In IJCAI International Joint Conference on Artificial Intelligence (2019), 2592–2599. https://doi.org/10.24963/ijcai.2019/360
- Eugene Ie, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng Tze Cheng, Morgane Lustman, Vince Gatto, Paul Covington, Jim McFadden, Tushar Chandra, and Craig Boutilier. 2019. Reinforcement learning for slate-based recommender systems: A tractable decomposition and practical methodology. arXiv preprint (May 2019). http://arxiv.org/abs/1905.12767
- Eugene Ie, Chih Wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, and Craig Boutilier. 2019. RecSim: A configurable simulation platform for recommender systems. arXiv (2019), 1–23.
- Thorsten Joachims, Adith Swaminathan, and Maarten de Rijke. 2018. Deep learning with logged bandit feedback. In International Conference on Learning Representations. http://www.joachims.org/banditnet/
- Tor Lattimore and Csaba Szepesvári. 2019. Bandit Algorithms. Technical Report.
- James McInerney, Benjamin Lacker, Samantha Hansen, Karl Higley, Hugues Bouchard, Alois Gruson, and Rishabh Mehrotra. 2018. Explore, Exploit, and Explain: Personalizing Explainable Recommendations with Bandits. (2018). https://doi.org/10.1145/3240323.3240354
- Navid Rekabsaz, Oleg Lesota, Markus Schedl, Jon Brassey, and Carsten Eickhoff. 2021. TripClick: The Log Files of a Large Health Web Search Engine. Association for Computing Machinery. http://arxiv.org/abs/2103.07901
- Jacopo Tagliabue, Ciro Greco, Jean-Francis Roy, Bingqing Yu, Patrick John Chia, Federico Bianchi, and Giovanni Cassani. 2021. SIGIR 2021 E-Commerce Workshop Data Challenge. Association for Computing Machinery. http://arxiv.org/abs/2104.09423
- Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Dawei Yin, Yihong Zhao, and Jiliang Tang. 2018. Deep Reinforcement Learning for List-wise Recommendations. (2018).