Autosoft Journal


RLCF: A collaborative filtering approach based on reinforcement learning with sequential ratings



We present RLCF, a novel collaborative filtering approach that accounts for the dynamics of user ratings. RLCF applies reinforcement learning to the sequence of ratings: we first formalize the collaborative filtering problem as a Markov Decision Process, and then learn the relationship between temporal sequences of user ratings using Q-learning. Experiments demonstrate the feasibility of our approach and a tight relationship between past and current ratings. We also propose an ensemble learning scheme for RLCF and demonstrate its improved performance.
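To make the idea concrete, the following is a minimal, hypothetical sketch of Q-learning applied to a rating sequence. The state/action encoding (state = previous rating, action = predicted next rating), the reward (negative absolute prediction error), and all hyperparameters are illustrative assumptions, not the paper's actual formulation.

```python
import random
from collections import defaultdict

# Illustrative only: tabular Q-learning over one user's rating sequence.
ACTIONS = [1, 2, 3, 4, 5]          # possible star ratings
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)             # Q[(state, action)] -> estimated value

def choose_action(state):
    """Epsilon-greedy action selection."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def train(rating_sequence, episodes=200):
    """Learn transitions between consecutive ratings via Q-learning."""
    for _ in range(episodes):
        for prev, nxt in zip(rating_sequence, rating_sequence[1:]):
            action = choose_action(prev)
            reward = -abs(action - nxt)     # closer prediction -> higher reward
            best_next = max(Q[(nxt, a)] for a in ACTIONS)
            Q[(prev, action)] += ALPHA * (reward + GAMMA * best_next
                                          - Q[(prev, action)])

def predict(state):
    """Greedy prediction of the next rating given the previous one."""
    return max(ACTIONS, key=lambda a: Q[(state, a)])

random.seed(0)
train([4, 4, 5, 3, 4, 5, 5, 4])
```

An ensemble variant, as suggested in the abstract, could train several such learners (e.g., with different hyperparameters) and combine their predictions.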



Total Pages: 6
Pages: 439-444



Volume: 23
Issue: 3
Year: 2016

References


Altman, N. S. "An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression." The American Statistician 46.3 (1992): 175-185. Crossref. Web.

Bellman, Richard. "A Markovian Decision Process." Journal of Mathematics and Mechanics 6.5 (1957): 679-684.

Bengio, Y. "Learning Deep Architectures for AI." Foundations and Trends® in Machine Learning 2.1 (2009): 1-127. Crossref. Web.

Harper, F. Maxwell, and Joseph A. Konstan. "The MovieLens Datasets." ACM Transactions on Interactive Intelligent Systems 5.4 (2015): 1-19. Crossref. Web.

Hinton, Geoffrey E. "Training Products of Experts by Minimizing Contrastive Divergence." Neural Computation 14.8 (2002): 1771-1800. Crossref. Web.

Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A Fast Learning Algorithm for Deep Belief Nets." Neural Computation 18.7 (2006): 1527-1554. Crossref. Web.

Jannach, Dietmar, et al. Recommender Systems: An Introduction. Cambridge University Press, 2010.

Kaelbling, Leslie Pack, Michael L. Littman, and Andrew W. Moore. "Reinforcement Learning: A Survey." Journal of Artificial Intelligence Research 4 (1996): 237-285.

Puterman, Martin L. "Chapter 8 Markov Decision Processes." Stochastic Models (1990): 331-434. Crossref. Web.

Stewart, G. W. "On the Early History of the Singular Value Decomposition." SIAM Review 35.4 (1993): 551-566. Crossref. Web.

Sutton, Richard S., ed. Reinforcement Learning. Springer, 1992.

Szepesvári, Csaba. Algorithms for Reinforcement Learning. Morgan & Claypool, 2010.

Thurstone, L. L. "A Law of Comparative Judgment." Psychological Review 34.4 (1927): 273-286. Crossref. Web.

Wang, Fei-Yue, Huaguang Zhang, and Derong Liu. "Adaptive Dynamic Programming: An Introduction." IEEE Computational Intelligence Magazine 4.2 (2009): 39-47. Crossref. Web.

Wang, Xinxi et al. "Exploration in Interactive Personalized Music Recommendation." ACM Transactions on Multimedia Computing, Communications, and Applications 11.1 (2014): 1-22. Crossref. Web.

Watkins, Christopher J. C. H., and Peter Dayan. "Q-Learning." Machine Learning 8.3-4 (1992): 279-292. Crossref. Web.

Xu, Xin, Lei Zuo, and Zhenhua Huang. "Reinforcement Learning Algorithms with Function Approximation: Recent Advances and Applications." Information Sciences 261 (2014): 1-31. Crossref. Web.

Zou, Tengfei et al. "An Effective Collaborative Filtering Via Enhanced Similarity and Probability Interval Prediction." Intelligent Automation & Soft Computing 20.4 (2014): 555-566. Crossref. Web.


ISSN PRINT: 1079-8587
ISSN ONLINE: 2326-005X
DOI PREFIX: 10.31209
LEGACY DOI PREFIX: 10.1080/10798587 (with Taylor & Francis)
IMPACT FACTOR: 0.652 (2017/2018)


Journal: 1995-Present


TSI Press
18015 Bullis Hill
San Antonio, TX 78258 USA
PH: 210 479 1022
FAX: 210 479 1048