Recommendation System using Collaborative Filtering and Recurrent Neural Network
Author: Fu-ze Zhong
Email: [email protected]
School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China.
ABSTRACT
The behavior of a user in an e-commerce system can be modeled as a time series, and RNNs perform well on sequence models. An RNN can be used to predict time-series data, so the behavior pattern of a user over a period of time can be mined with an RNN model to produce recommendations. This paper proposes a recommendation model based on both the user's static data and the user's dynamic data, combining a deep recurrent neural network (Deep-RNN) with collaborative filtering (CF) to produce recommendations. The model captures how the user interacts with the system and makes his/her purchases.
I built a deep recurrent neural network model to solve the real-time recommendation problem. The network tracks how users interact with the recommendation server. Each hidden layer models the item combinations the user has accessed and the user's purchase pattern. To reduce the DRNN processing cost, the network only records a limited number of states. Once the user's behavior is updated, the model adjusts itself and refreshes the recommendation results. As user behavior accumulates, DRNN training yields better recommendation results. The CF (Collaborative Filtering) method has shown its effectiveness in many practical applications. It captures the correlation between users and items and reveals the common interests of users: users sharing the same interests tend to purchase the same set of items. My DRNN model is a good complement to the CF method. If the user follows an old purchase pattern, the CF method will produce good recommendations, while the RNN model can effectively predict the purchase pattern of a particular user. I integrated the recurrent neural network with collaborative filtering, and my model shows a significant improvement over the previous recommendation service.
INTRODUCTION
Collaborative Filtering (CF) recommendations are widely used in recommendation systems. They are based on correlations between users and items and predict the probability that a user will purchase a particular item. The assumption is that users sharing similar purchase histories are likely to purchase the same set of items.
Although the CF method works well in some cases, I found that it does not provide accurate real-time recommendations, because its model is built from stale data and lacks customization. When a user enters the system, I can collect his/her basic information and obtain a complete profile of the user's preferences for the CF algorithm, but all of this information is stale data.
In addition to these basic properties, there is a special type of dynamic property: the timestamp. Timestamps come from the recommendation server, which records the current user's browsing history, so each user has a long list of item records.
This list actually records how the user interacts with the system and makes purchases, which was not available to previous CF recommendations.
A more accurate real-time recommender is needed that not only considers the user's preference history but also considers the user's short-term behavior patterns, flexibly captures hotspots and points of interest, and can adjust its recommendation results during the recommendation process.
Based on the current viewing history and the user's interests, the system should provide recommendations and continuously refine them by guessing what the user really wants to buy.
Here are some challenges:
- Amazon has thousands of items. If each item represents a state, the input to the predictive model is a huge vector indicating which item was accessed.
- Each user's timestamp list can be thought of as a combination of a random number of states, which makes learning more complex.
- The model uses the user's timestamps (the item records in the user's purchase list) as the training set, because they track how the user interacts with the server.
- The training process has to be updated constantly: as long as the server is online, we obtain more records from the user, which can be used to further optimize the model, and continuing to train on the user's new records yields better recommendations. That is, the recommendation model should be updated in real time to reflect the user's new purchase patterns.
To address these challenges, in this paper, we use a Deep Recurrent Neural Network (DRNN) to model the user's browsing behavior and provide real-time recommendation services. The DRNN consists of multiple hidden layers, each with a time feedback loop. The input layer represents the timestamp at which the user accessed an item, and through the training process the parameters of the hidden layers come to represent combinations of items the user has browsed.
My DRNN model is used to track the user's access pattern. The items predicted by the recommendation system are displayed in the UI to guide the user toward his/her desired item. The purpose of the DRNN model is to return real-time recommendations; ideally the user reaches the desired item along the shortest path, guided by this feedback.
My DRNN model is based on a sliding-window approach that maintains a limited number of states. As the learning process continues, old states are replaced by new states. Determining the correct window size is a critical task: a larger window causes excessive computational overhead, while a smaller window can hurt prediction accuracy.
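As an illustration, here is a minimal sketch of such a bounded state window; the window size `W` and the per-item state encoding are assumptions chosen for the example, not values taken from this paper.

```python
from collections import deque

W = 50  # hypothetical window size; too large is costly, too small hurts accuracy

class StateWindow:
    """Keeps only the most recent W item states for one user session."""
    def __init__(self, size=W):
        self.states = deque(maxlen=size)  # old states are dropped automatically

    def push(self, item_id):
        self.states.append(item_id)

    def as_sequence(self):
        return list(self.states)

# usage: feed each newly browsed item into the window, then hand the
# truncated sequence to the DRNN for the next prediction step
window = StateWindow()
for item in [101, 205, 333]:
    window.push(item)
print(window.as_sequence())
```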
I use a collaborative filtering algorithm that accepts the user’s purchase history as input and generates a prediction of the probability of purchasing the item. CF combines with DRNN to produce the final result. DRNN can be implemented on the recommendation server. The DRNN model updates the neural network parameters as the user continues to interact with the server. The server generates a list of new recommended items based on the recommended results from the combination of DRNN and CF. The server then responds to the customer and displays the new recommendation results.
Here are some contributions:
- Many studies model static data such as user ratings or historical purchase records, which reflect the user's historical interests but are not time-sensitive. For example, a user may be identified by the system as a lover of all kinds of books, yet the series of browsing records right before the user's purchase indicates that he is selecting food.
- Collaborative filtering can continuously update the rating matrix to keep user preferences current, but it cannot mine the real-time behavior patterns of users. Therefore, the DRNN method models the user's behavior as a time series to select recommendations for the user over a period of time. Combining different areas into a single learning model can help improve the quality of recommendations in all areas. The recommendation system I created models the user's static data and dynamic data jointly to produce the recommendation results.
- The two algorithms are integrated to produce the final prediction. My results on a real dataset show that the combined CF and DRNN approach significantly outperforms the previous CF (Collaborative Filtering) approach.
OVERVIEW OF THE MODULE
The rest of this article is organized as follows. Section II gives an overview of the recommendation module. Section III covers the details of the CF and DRNN models and how to combine the results of the two algorithms. Section IV presents the experimental results. Section V concludes the paper.
Personalized recommendations are a key feature of improving the user experience in an e-commerce system. Many companies collect user purchase history and apply CF algorithms to generate recommendation results for each user during the offline process. When the user logs in, we push the recommendation to him/her. In order to catch up with new buying trends, the CF algorithm is called periodically to update the recommendation results with new log data. Unfortunately, the accuracy (the probability that the user ultimately buys the recommended item) is low, and the off-line method cannot find the latest purchase pattern.
To address this challenge, I developed a new recommendation module, the DRNN model, to find a better solution. Figure 1 shows the recommendation model architecture I designed. The user's request is sent to the recommendation server, which can obtain profiles of and associations between users and items. Timestamps are used as an input to the DRNN model to generate real-time predictions. The DRNN model works together with the CF model.
In other words, we consider the user’s current interests and past interests.
Finally, the server returns the requested list of items to the user by presenting the recommendation results. As the user continues to interact with the server (and generates more requests), the model improves our predictions. Users are expected to find their items from the recommendation results with a higher probability.
COLLABORATIVE FILTERING AND DEEP RECURRENT NEURAL NETWORK
After the response is completed, the model obtains the true outcome of its prediction, that is, whether the user purchased the items recommended by the system, and the model can then be adjusted with these new training samples. I create an index on each row's user ID, item ID, and timestamp. For a specific user $u$, its profile is generated as follows:
Let $I_u$ be the specific user's list of item records.
Let $t_{last}$ be the timestamp of the last item record of $I_u$.
$\Delta$ is a predefined timeout threshold (e.g., 30 minutes).
Denote the specific user's list of item records as $I_u = \{r_1, r_2, \dots, r_n\}$, where $r_i$ is the $i$-th item record.
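For illustration, here is a minimal sketch of one way the timeout threshold $\Delta$ could be used to segment a user's record list into sessions; the record format `(item_id, timestamp)` and the helper name are assumptions made for this example rather than details taken from the system.

```python
from datetime import timedelta

TIMEOUT = timedelta(minutes=30)  # assumed timeout threshold (Delta)

def build_sessions(item_records):
    """Split one user's time-ordered (item_id, timestamp) records into sessions.

    A new session starts whenever the gap to the previous record
    exceeds the timeout threshold.
    """
    sessions, current = [], []
    last_ts = None
    for item_id, ts in sorted(item_records, key=lambda r: r[1]):
        if last_ts is not None and ts - last_ts > TIMEOUT:
            sessions.append(current)
            current = []
        current.append(item_id)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions
```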
The user's history records are used as the input for the DRNN model to adjust the weights and biases of the neural network. To reduce the learning cost, I use SGD to update the parameters.
ITEM-BASED COLLABORATIVE FILTERING
The recommendation system essentially solves the problem of information overload by connecting users with information when the user's needs are not explicit. On the one hand, it helps users find information that is valuable to them; on the other hand, it exposes information to the users who are interested in it, achieving a win-win situation for information consumers and information producers (the meaning of "information" here can be very broad, such as news, movies, and commodities, collectively referred to as items).
Collaborative filtering is mainly divided into neighborhood-based models and latent factor models. Among the neighborhood-based algorithms, item-based CF is widely used. The main idea is that "users who like item A mostly also like item B", using group wisdom to generate an item recommendation list by mining users' historical operation logs.
The principle is to make recommendations by comparing the data of one user with that of other users; concretely, the similarity between two users' data is computed. The design of the similarity function must satisfy the three requirements of a metric space, namely non-negativity, symmetry, and the triangle inequality. Commonly used similarity measures are the Euclidean distance, the Pearson correlation coefficient, and the cosine similarity.
The basic idea of user-based CF is: if user A likes item a, user B likes items a, b, and c, and user C likes a and c, then user A is considered similar to users B and C because they all like a; since users who like a also like c, c is recommended to user A. The algorithm uses a nearest-neighbor search to find a set of neighbors of a user, whose preferences are similar to the user's, and predicts the user's preference according to those neighbors.
There are two major problems with the user-based algorithm: 1. Data sparsity. A large e-commerce recommendation system generally has a huge number of items, and a user may buy less than 1% of them, so the overlap between the items purchased by different users is low and the algorithm may fail to find a user's neighbors, i.e., users with similar preferences. 2. Scalability. The cost of the nearest-neighbor computation grows with the number of users and items, making it unsuitable when the amount of data is large.
The basic idea of item-based CF is to calculate the similarity between items in advance based on the historical preference data of all users, and then recommend to the user the items similar to those the user already likes. Taking the previous example, items a and c are very similar because users who like a also like c; since user A likes a, c is recommended to user A.
Because item-item similarities are relatively stable, they can be computed offline in advance and stored in a table. At recommendation time, the table is looked up to compute the user's predicted scores, which solves both of the above problems simultaneously.
Detailed process of the item-based algorithm:
1. Similarity calculation: the item-based algorithm first computes the similarity between items. There are several ways to compute it:
(1) Cosine-based similarity: the similarity between two items is the cosine of the angle between their rating vectors.
(2) Correlation-based similarity: the Pearson-r correlation between the two items' rating vectors is computed as
$$sim(i,j) = \frac{\sum_{u \in U}(R_{u,i}-\bar{R}_{i})(R_{u,j}-\bar{R}_{j})}{\sqrt{\sum_{u \in U}(R_{u,i}-\bar{R}_{i})^{2}} \, \sqrt{\sum_{u \in U}(R_{u,j}-\bar{R}_{j})^{2}}}$$
2. Prediction: weighted sum. The ratings user u has given to the items similar to item i are weighted by the similarity of each such item to item i, and the weighted sum is divided by the sum of the similarities, giving the predicted score of user u for item i (both steps are illustrated below).
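In the usual item-based CF form, with $N(i)$ denoting the items rated by user $u$ that are most similar to item $i$, this weighted-sum prediction reads:

$$P_{u,i} = \frac{\sum_{j \in N(i)} sim(i,j)\, R_{u,j}}{\sum_{j \in N(i)} \left| sim(i,j) \right|}$$

Below is a minimal numpy sketch of both steps (using the cosine variant of step 1) over a user-item rating matrix; the matrix layout (rows are users, columns are items), the zero-for-missing convention, and the variable names are assumptions made for this illustration.

```python
import numpy as np

def item_similarity(R, i, j):
    """Cosine similarity between the rating vectors of items i and j.

    R is a (num_users x num_items) matrix with 0 for missing ratings.
    """
    vi, vj = R[:, i], R[:, j]
    denom = np.linalg.norm(vi) * np.linalg.norm(vj)
    return vi.dot(vj) / denom if denom > 0 else 0.0

def predict_rating(R, u, i, top_k=20):
    """Weighted-sum prediction of user u's rating for item i."""
    rated = np.flatnonzero(R[u] > 0)                      # items u has rated
    sims = np.array([item_similarity(R, i, j) for j in rated])
    order = np.argsort(-sims)[:top_k]                     # keep the most similar items
    sims, ratings = sims[order], R[u, rated[order]]
    denom = np.abs(sims).sum()
    return (sims * ratings).sum() / denom if denom > 0 else 0.0

# toy usage: 3 users x 4 items
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 5, 4]], dtype=float)
print(predict_rating(R, u=1, i=1))   # predicted rating of user 1 for item 1
```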
Collaborative filtering also faces challenges. Item-based collaborative filtering was tried on the dataset, but as shown in the "EXPERIMENT" section, it did not perform well in the tests. In particular:
- Due to the sparsity of the data, the most similar items often end up being items that share only a single common user.
- Due to the user cold-start problem, in a real-world scenario a new user has no records on the recommendation server, and collaborative filtering cannot give a personalized recommendation result for such a user.
DEEP RECURRENT NEURAL NETWORK
Recurrent neural network is a general term covering both time-recurrent neural networks and structurally recursive neural networks; RNN usually refers to the time-recurrent kind.
An RNN can handle training samples that are continuous input sequences of varying length, such as time-based sequences; for example, an RNN can model the time series of items a user generates while interacting with the e-commerce system.
I use the user's request behavior to promptly recommend products he/she may be interested in: the information left on the recommendation server as the user browses the website is used to explore the user's behavior pattern and recommend the products he is most interested in, shortening the user's decision-making process. The records generated by the user are modeled as a time series, and the posterior probability of the page the user will click next is predicted from the browsing data generated in a short time, completing the real-time recommendation process.
Basic RNN
Assume that the sequence index $t$ runs from $1$ to $\tau$. For any sequence index $t$, the corresponding input in the sequence is $x^{(t)}$, and the hidden state of the model at index $t$ is $h^{(t)}$, which is determined by $x^{(t)}$ and the hidden state $h^{(t-1)}$ at index $t-1$. At any sequence index $t$ there is a corresponding model prediction output $\hat{y}^{(t)}$. From the prediction output $\hat{y}^{(t)}$ and the true output $y^{(t)}$ of the training sequence, the loss function $L^{(t)}$ is obtained, the error is backpropagated, and the model parameters are trained by gradient descent.
- $x^{(t)}$ represents the input of the training sample at sequence index $t$. Similarly, $x^{(t-1)}$ and $x^{(t+1)}$ represent the inputs at sequence indices $t-1$ and $t+1$.
- $h^{(t)}$ represents the hidden state of the model at sequence index $t$. $h^{(t)}$ is determined by $x^{(t)}$ and $h^{(t-1)}$.
- $o^{(t)}$ represents the output of the model at sequence index $t$. $o^{(t)}$ is determined only by the model's current hidden state $h^{(t)}$.
- $L^{(t)}$ represents the loss function of the model at sequence index $t$.
- $y^{(t)}$ represents the true output of the training sample sequence at sequence index $t$.
- $U$, $W$, and $V$: these three matrices are the linear relationship parameters of the model. They are shared across the entire RNN, which is very different from a DNN. Precisely because they are shared, they embody the "recurrent feedback" idea of the RNN model.
Forward propagation:
For any sequence index $t$, the hidden state $h^{(t)}$ is obtained from $x^{(t)}$ and $h^{(t-1)}$:
$$h^{(t)} = \sigma(z^{(t)}) = \sigma(Ux^{(t)} + Wh^{(t-1)} + b)$$
where $\sigma$ is the activation function of the RNN and $b$ is the bias.
The output of the model at sequence index $t$ is:
$$o^{(t)} = Vh^{(t)} + c$$
The final prediction output at sequence index $t$ is:
$$\hat{y}^{(t)} = \sigma(o^{(t)})$$
where the output activation function is generally softmax.
Through the loss function $L^{(t)}$, such as the log-likelihood loss, we can quantify the loss of the model at the current position, i.e., the difference between $\hat{y}^{(t)}$ and $y^{(t)}$.
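For concreteness, here is a minimal numpy sketch of one forward step under these definitions (tanh hidden activation and softmax output); all dimensions and parameter values are assumed for illustration only.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())          # shift for numerical stability
    return e / e.sum()

def rnn_forward_step(x_t, h_prev, U, W, V, b, c):
    """One RNN step: h(t) = tanh(U x(t) + W h(t-1) + b), y(t) = softmax(V h(t) + c)."""
    h_t = np.tanh(U @ x_t + W @ h_prev + b)
    o_t = V @ h_t + c
    y_hat_t = softmax(o_t)
    return h_t, y_hat_t

# tiny usage example with assumed dimensions: 5 items (one-hot input), 4 hidden units
M, E = 5, 4
rng = np.random.default_rng(0)
U, W, V = rng.normal(size=(E, M)), rng.normal(size=(E, E)), rng.normal(size=(M, E))
b, c = np.zeros(E), np.zeros(M)
x_t = np.eye(M)[2]                   # one-hot vector for item index 2
h_t, y_hat = rnn_forward_step(x_t, np.zeros(E), U, W, V, b, c)
print(y_hat)                         # predicted purchase probabilities over the 5 items
```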
Backward propagation:
The parameters $U$, $W$, $V$, $b$, and $c$ are shared at all positions in the sequence, so backpropagation updates the same parameters. Assume the output activation is softmax and the hidden-layer activation is tanh.
For an RNN, since there is a loss function at each position in the sequence, the final loss $L$ is:
$$L = \sum_{t=1}^{\tau} L^{(t)}$$
The gradients with respect to $V$ and $c$ are:
$$\frac{\partial L}{\partial c} = \sum_{t=1}^{\tau} \hat{y}^{(t)} - y^{(t)}, \qquad \frac{\partial L}{\partial V} = \sum_{t=1}^{\tau} (\hat{y}^{(t)} - y^{(t)}) (h^{(t)})^{T}$$
For the gradients of the loss with respect to $W$, $U$, and $b$ at a certain sequence position $t$, backpropagation must be carried out step by step. Define the gradient of the loss with respect to the hidden state at sequence index $t$ as:
$$\delta^{(t)} = \frac{\partial L}{\partial h^{(t)}}$$
After obtaining $\delta^{(t)}$, the gradient expressions for $W$, $U$, and $b$ are:
$$\frac{\partial L}{\partial W} = \sum_{t=1}^{\tau} \mathrm{diag}\!\left(1-(h^{(t)})^{2}\right)\delta^{(t)} (h^{(t-1)})^{T}, \quad \frac{\partial L}{\partial U} = \sum_{t=1}^{\tau} \mathrm{diag}\!\left(1-(h^{(t)})^{2}\right)\delta^{(t)} (x^{(t)})^{T}, \quad \frac{\partial L}{\partial b} = \sum_{t=1}^{\tau} \mathrm{diag}\!\left(1-(h^{(t)})^{2}\right)\delta^{(t)}$$
Deep RNN
To improve the quality of the recommendation results, more hidden layers can be stacked to improve the prediction accuracy.
The input is a list of items $\{v_1, v_2, \dots, v_n\}$ denoting a specific user's history, where the user viewed $v_i$ before $v_{i+1}$. Each item is considered a state of the user's behavior, so the RNN becomes a state machine of the e-commerce system, simulating how users interact with the recommendation system.
The input layer of the RNN consists of $n$ states. Each state refers to one item. More specifically, for item $v_i$ we generate an M-element one-hot vector $V_i$ (M is the number of items): the element corresponding to $v_i$ is set to 1 and all other elements are set to 0. Vector $V_i$ is used as the input of the $i$-th state in the input layer.
Each RNN state includes L hidden layers, each with E neurons (assume all layers use the same number of neurons). Within a state, the neurons in the $i$-th layer are connected to the neurons in the $(i+1)$-th layer. Between two consecutive states, the neurons in the $i$-th layer of state $t$ are connected to the neurons in the $i$-th layer of state $t+1$.
Finally, the output layer is the probability vector $V_{out} = (p_1, p_2, \dots, p_M)$, where $p_i$ indicates the probability of purchasing the $i$-th item. The last hidden layer of each state is connected to that state's output layer, so when the system accepts a new state, the recommendation system can return a real-time prediction. Typically, the recommendation service returns the top-k items with the greatest probabilities to the user as the prediction.
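A minimal PyTorch sketch of such a stacked RNN over one-hot item states is given below; the layer sizes, the embedding-free one-hot input, and the top-k selection are illustrative assumptions rather than the exact configuration used in this paper.

```python
import torch
import torch.nn as nn

class DRNNRecommender(nn.Module):
    def __init__(self, num_items, hidden_size=128, num_layers=2):
        super().__init__()
        # stacked (deep) RNN: num_layers hidden layers per state
        self.rnn = nn.RNN(input_size=num_items, hidden_size=hidden_size,
                          num_layers=num_layers, nonlinearity='tanh',
                          batch_first=True)
        self.out = nn.Linear(hidden_size, num_items)   # last hidden layer -> output layer

    def forward(self, x):
        # x: (batch, seq_len, num_items) one-hot item states
        h, _ = self.rnn(x)
        return self.out(h)                             # logits per state

# usage: predict top-k next items from a short browsing sequence
M = 1000                                # assumed catalogue size
model = DRNNRecommender(num_items=M)
seq = torch.zeros(1, 3, M)
for step, item in enumerate([10, 57, 999]):
    seq[0, step, item] = 1.0            # one-hot encode each viewed item
probs = torch.softmax(model(seq)[0, -1], dim=-1)
topk = torch.topk(probs, k=5).indices   # top-5 recommended item indices
print(topk.tolist())
```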
DRNN with IBCF
The CF (Collaborative Filtering) method has shown its effectiveness in many practical applications. It captures the correlation between users and items and reveals the common interests of users: users sharing the same interests tend to purchase the same set of items. My DRNN model is a good complement to the CF method. If the user follows an old purchase pattern, the CF method will produce good recommendations, while the RNN model can effectively predict the purchase pattern of a particular user. Therefore, I combine the CF method with the DRNN.
The combined output, Possibility(t), estimates the probability of a purchase at state t.
DRNN and CF are combined and trained jointly using SGD (Stochastic Gradient Descent). In this way, there is no need to explicitly specify which model (the CF model or the RNN model) is more important.
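One plausible way to realize this joint output, sketched below, is to concatenate the CF score vector with the DRNN hidden state and let a single learned output layer weigh them; this particular fusion scheme is an assumption made for illustration, not necessarily the exact formulation used in this paper.

```python
import torch
import torch.nn as nn

class JointDRNNCF(nn.Module):
    """DRNN hidden state fused with item-based CF scores before the output layer."""
    def __init__(self, num_items, hidden_size=128, num_layers=2):
        super().__init__()
        self.rnn = nn.RNN(input_size=num_items, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        # the output layer sees both the DRNN state and the CF score vector,
        # so SGD learns how much weight each source deserves
        self.out = nn.Linear(hidden_size + num_items, num_items)

    def forward(self, x, cf_scores):
        # x: (batch, seq_len, num_items); cf_scores: (batch, num_items)
        h, _ = self.rnn(x)
        last_h = h[:, -1, :]                            # DRNN state at the current step
        joint = torch.cat([last_h, cf_scores], dim=-1)  # fuse dynamic and static signals
        return torch.softmax(self.out(joint), dim=-1)   # Possibility(t) over all items
```

Training such a module end to end with SGD and a cross-entropy loss on the next purchased item lets the learned weights decide how much the CF and DRNN signals each contribute.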
EXPERIMENT
Description of Datasets
This dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014.
This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
- user review data: duplicate items removed, sorted by user
- product review data: duplicate items removed, sorted by product
- ratings and timestamps: same as above, in CSV form without reviews or metadata
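A minimal sketch of loading the ratings-only CSV with pandas follows; the column order (user, item, rating, timestamp) and the file name are assumptions based on the dataset description and should be checked against the actual files.

```python
import pandas as pd

# assumed column order for the ratings-only CSV (no header row in the raw file)
cols = ["user_id", "item_id", "rating", "timestamp"]
ratings = pd.read_csv("ratings_Books.csv", names=cols)   # hypothetical file name

# order each user's records by time so they form the per-user timestamp sequence
ratings = ratings.sort_values(["user_id", "timestamp"])
print(ratings.head())
```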
In the experiment, the training set and the test set are divided per user: for each specific user, part of the timestamp sequence is selected as the test set, and the rest is used as the training set.
Comparison Experiments
Item-based CF alleviates these problems in systems with more users than items. The item-item model uses the rating distribution per item rather than per user. When the number of users is greater than the number of items, each item tends to have more ratings than each user, so the average rating of an item usually does not change quickly. This yields a more stable rating distribution in the model, so the model does not need to be rebuilt frequently. When a user consumes and rates an item, items similar to that item are selected from the existing model and added to the user's personalized recommendations.
Slope One is a member of the family of item-based collaborative filtering algorithms designed to reduce model overfitting. It can be considered the simplest form of non-trivial rating-based item-based collaborative filtering. Its simplicity makes it particularly easy to implement efficiently, its accuracy is usually comparable to more complex and computationally intensive algorithms, and it is also used as a building block to improve other algorithms.
Slope One is easy to implement and maintain, all aggregated data is easy to interpret, and the algorithm is easy to implement and test. It is real-time and can be updated at runtime: a new rating should have an immediate impact on the prediction results. Queries are answered efficiently (at the cost of possibly more storage), and it makes low demands on first-time visitors: a user with few rated items should still get valid recommendations. Compared with the most accurate methods, Slope One is competitive; the algorithm is simple and efficient, and its effect is not bad.
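For reference, the weighted Slope One prediction in its standard published form, where $S_{j,i}$ is the set of users who rated both items $j$ and $i$ and $R(u)$ is the set of items rated by user $u$, is:

$$\mathrm{dev}_{j,i} = \sum_{v \in S_{j,i}} \frac{r_{v,j} - r_{v,i}}{|S_{j,i}|}, \qquad P_{u,j} = \frac{\sum_{i \in R(u)\setminus\{j\}} \left(\mathrm{dev}_{j,i} + r_{u,i}\right) |S_{j,i}|}{\sum_{i \in R(u)\setminus\{j\}} |S_{j,i}|}$$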
DRNN with IBCF performs better. Collaborative filtering can continuously update the rating matrix to keep user preferences current, but it cannot mine the real-time behavior patterns of users. The DRNN method therefore models the user's behavior as a time series to select recommendations for the user at a given time. Combining different areas into a single learning model can help improve the quality of recommendations in all areas. The recommendation system I created models the user's static data and dynamic data jointly to produce the recommendation results. The two algorithms are integrated to produce the final prediction. My results on a real dataset show that the combined CF and DRNN approach significantly outperforms the previous CF approach.
In the DRNN model, a user profile is represented as a sequence of timestamps, denoting the path from the first item to the last item.
The DRNN model extracts the common purchase patterns of users and tries to predict which item the user will want to purchase next. A user profile may consist of an arbitrary number of timestamps, while the DRNN model can only maintain a limited number of states. The number of states affects model accuracy: if the window length is too short, truncation not only loses data but may also fail to capture the user's behavior pattern; if the window is too long, vanishing or exploding gradients are likely to occur, and the user's point of interest may well change over such a long period.
Effect of the Length of Hidden Layers
Parameter Setting
CONCLUSION AND FUTURE WORK
In this paper, I present a real-time recommendation approach using a Deep Recurrent Neural Network model. My approach is designed for the Amazon public product data (http://jmcauley.ucsd.edu/data/amazon/).
In the DRNN model, a user profile is represented as a sequence of timestamps, denoting the path from the first item to the last item.
The DRNN model extracts the common purchase patterns of users and tries to predict which item the user will want to purchase next. A user profile may consist of an arbitrary number of timestamps, while the DRNN model can only maintain a limited number of states. The number of states affects model accuracy: if the window length is too short, truncation not only loses data but may also fail to capture the user's behavior pattern; if the window is too long, the user's interest may change over that period.
Finally, in the recommendation model, I also implement collaborative filtering to model users' histories. The two algorithms are integrated to produce the final prediction. My results on a real dataset show that the combined CF and DRNN approach significantly outperforms the previous CF (Collaborative Filtering) approach.
However, the combined DRNN and CF model still has a shortcoming: it cannot solve the user cold-start problem. In the future, I will introduce multi-view learning to address user cold start, extending the model to learn from item characteristics and user features in different domains. Rich user behavior allows the model to learn relevant user behavior patterns and provide useful recommendations for users who have no interaction with the service yet, because similar users have enough search and browsing history. Combining different domains into a single learning model helps to improve the quality of recommendations in all areas and yields a more compact and richer user feature vector.