RecCoNN

By
Akhil Babu Manam
Ashok Govinda Gowda
Kishan Hulkodu Sheshagiri
Shiva Kumar Pentyala

Introduction

Many of the world's recommender systems are based on collaborative filtering techniques.
The basic idea of these techniques is that people who share similar preferences in the past tend to have similar choices in the future.
However, collaborative filtering techniques suffer from a sparsity problem: the number of items rated by each user is very small relative to the total number of items.
We aim to reduce the sparsity problem and improve the quality of recommendations.

We use the Yelp dataset to compare our methods against approaches such as matrix factorization. The Yelp dataset is very sparse, and traditional methods such as matrix factorization involve a lot of computation on sparse data. Current recommender systems use collaborative filtering, content-based filtering, or hybrid methods on the ratings that users explicitly give to each item, but such systems are not robust to sparsity in the ratings data.
It has been shown that reviews written by users can reveal some information on the customers’ buying and rating behavior, and also reviews written for items may contain information on their features and properties.
So we believe that a large amount of information exists in reviews written by users. This source of information has been ignored by most of the current recommender systems while it can potentially alleviate the sparsity problem and improve the quality of recommendations.
In January 2017, Zheng et al. proposed a deep model (DeepCoNN) that learns item properties and user behaviors jointly from review text. However, we noticed that the model does not use any other user or item profile attributes.

Consider the Yelp Dataset


While there are many attributes in the dataset, most collaborative filtering techniques consider only the user ratings.

However, other attributes may be more relevant in restaurant recommender systems. For example: Is the time of ordering important? Is the sequence in which a user orders important? Is the positivity or negativity of user comments important? In what ways can the location of the restaurant be used to improve recommendation quality?

We capture the contextual information in the review text as well as the restaurant attributes in order to improve the quality of recommendations.

RELATED WORK

DeepCoNN

Fig: Architecture of DeepCoNN (image taken from the paper cited below)
  • The base model is taken from "Joint Deep Modeling of Users and Items Using Reviews for Recommendation" by Lei Zheng, Vahid Noroozi, and Philip S. Yu.
  • The model is a review-based recommender system: it uses the reviews that customers have written about items to recommend new items.
  • All reviews written by a user are concatenated, and each word is represented by its word2vec embedding. The reviews written for each item are embedded in the same way.
  • Two CNN-based networks are trained, one on the user reviews and one on the item reviews. In each network, a fully connected layer after the CNN produces the user features or the item features, respectively.
  • Since these two sets of features are not in the same space, the parameters of a shared layer that connects the user and item features are then learned.
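The input preparation described above can be sketched as follows. This is a minimal illustration, not the DeepCoNN authors' code: the review records, the two-dimensional vectors, and the helper names `build_documents` and `embed` are all invented for the example, and a real setup would load pretrained word2vec vectors instead of the toy dictionary.

```python
from collections import defaultdict

# Toy review records: (user_id, item_id, review_text). Invented for illustration.
reviews = [
    ("u1", "i1", "great tacos"),
    ("u1", "i2", "slow service"),
    ("u2", "i1", "tacos were cold"),
]

# Toy 2-dimensional "embeddings" standing in for pretrained word2vec vectors.
embeddings = {
    "great": [0.9, 0.1], "tacos": [0.2, 0.7], "slow": [-0.5, 0.3],
    "service": [0.1, 0.4], "were": [0.0, 0.0], "cold": [-0.6, -0.2],
}
UNKNOWN = [0.0, 0.0]  # fallback vector for out-of-vocabulary words


def build_documents(records):
    """Concatenate all reviews per user and per item into one token list each."""
    user_docs, item_docs = defaultdict(list), defaultdict(list)
    for user_id, item_id, text in records:
        tokens = text.lower().split()
        user_docs[user_id].extend(tokens)
        item_docs[item_id].extend(tokens)
    return dict(user_docs), dict(item_docs)


def embed(tokens):
    """Map a token list to a matrix with one embedding row per word."""
    return [embeddings.get(token, UNKNOWN) for token in tokens]


user_docs, item_docs = build_documents(reviews)
user_matrix = embed(user_docs["u1"])  # would feed the user network
item_matrix = embed(item_docs["i1"])  # would feed the item network
```

Each matrix has one row per word of the concatenated document, which is the shape a convolutional layer over word embeddings expects.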

Goal

  • The DeepCoNN model uses the review text to recommend items to users but does not take other attributes into consideration.
  • Context information, such as cuisine availability and parking availability, is present in the Yelp dataset but is not used by DeepCoNN.
  • In this project, we explored ways to add this context information, either by appending it to the input of the existing model or by adding layers that feed it into the recommendation process so that it affects the predicted rating.
  • We have developed two models: DeepCoNN+attr and RecCoNN.
  • DeepCoNN+Attr

    DeepCoNN+Attr.jpg
    Fig: Architecture of DeepCoNN+attr
  • The architecture of this model is the same as that of DeepCoNN.
  • The only change is in the way the input is given to the model.
  • Attribute information is appended to the review text and converted to word2vec embeddings during training.
  • Example: if a user has written the review “ABC … XYZ” for an item that has the attributes {Parking : True, Delivery : True}, then the input to the model for that user-item pair is “ABC … XYZ Parking True Delivery True”.
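The example above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `append_attributes` is hypothetical, and the assumption is that attributes are appended as plain "Name Value" tokens in dictionary order.

```python
def append_attributes(review_text, attributes):
    """Append 'Name Value' tokens for each attribute to the review text."""
    attribute_tokens = []
    for name, value in attributes.items():
        attribute_tokens.extend([name, str(value)])
    return " ".join([review_text] + attribute_tokens)


review = "ABC … XYZ"
item_attributes = {"Parking": True, "Delivery": True}

model_input = append_attributes(review, item_attributes)
print(model_input)  # ABC … XYZ Parking True Delivery True
```

Because the attributes become ordinary tokens, they pass through the same embedding lookup as the review words and no change to the network is needed.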
  • RecCoNN

    DeepCoNN++.jpg
    Fig: Architecture of RecCoNN
    • In this work, we proposed and evaluated two recommender systems. The first, DeepCoNN+attr, uses the same DeepCoNN model but appends user and item attributes to the reviews of each user or item before training. The second, RecCoNN, uses additional user-profile attribute information and trains two parallel neural networks coupled in their last layers.
    • One of the networks learns user behaviors from the reviews written by the user, together with additional user profile features. The other network learns item properties from the reviews written for the item.
    • The following attributes are used in RecCoNN: Alcohol, OutdoorSeating, RestaurantsDelivery, RestaurantsTakeOut, BusinessParking/Valet and the cuisine data for profiling users.
    • A shared layer is then introduced on top to couple the two networks. The shared layer lets the latent factors learned for users and items interact with each other by mapping them to a common feature space.
    • As in MF techniques, user and item latent factors can then interact with each other to predict the corresponding rating. In our experiments on the Yelp dataset, RecCoNN outperforms our baseline system DeepCoNN in most of the settings reported under Results.
    • We used word2vec embeddings pretrained on GoogleNews (300 dimensions) and also GloVe embeddings (100 dimensions). We also found that word embeddings help capture the semantic meaning of review text, by comparing RecCoNN with a variant that uses bag-of-words or TF-IDF representations for reviews.
    • Finally, we conducted experiments by sampling the Yelp data by US state and investigated the impact of the number of reviews. We used sampling because of limited computational resources.
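One way to picture how profile attributes could enter the user network is sketched below. This is an assumption made for illustration, not a description of the actual RecCoNN implementation: the profile is taken to be the fraction of a user's reviewed restaurants that have each attribute, every attribute is treated as a boolean, and the resulting vector is concatenated with the text-derived user features before the shared layer. The function names and the sample data are hypothetical.

```python
# Attribute names as listed above (cuisine data omitted for brevity).
ATTRIBUTES = [
    "Alcohol", "OutdoorSeating", "RestaurantsDelivery",
    "RestaurantsTakeOut", "BusinessParking/Valet",
]


def user_profile(visited_restaurants):
    """Fraction of the user's reviewed restaurants that have each attribute."""
    if not visited_restaurants:
        return [0.0] * len(ATTRIBUTES)
    total = len(visited_restaurants)
    return [
        sum(1 for r in visited_restaurants if r.get(name)) / total
        for name in ATTRIBUTES
    ]


def combine(text_features, profile_features):
    """Concatenate review-derived features with profile features."""
    return list(text_features) + list(profile_features)


# Two invented restaurants reviewed by one user (attributes simplified to booleans).
visited = [
    {"Alcohol": True, "OutdoorSeating": False, "RestaurantsDelivery": True,
     "RestaurantsTakeOut": True, "BusinessParking/Valet": False},
    {"Alcohol": True, "OutdoorSeating": True, "RestaurantsDelivery": False,
     "RestaurantsTakeOut": True, "BusinessParking/Valet": False},
]

profile = user_profile(visited)
# Placeholder values standing in for the output of the user CNN.
user_vector = combine([0.12, -0.40, 0.33], profile)
```

The combined vector is what a coupling layer would receive on the user side; the item side would supply its own feature vector from the item network.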

    Discussions

    • In our experiments, RecCoNN obtained a lower MSE than DeepCoNN for Arizona and Pennsylvania, and a slightly higher MSE for North Carolina (see Results).
    • This suggests that additional attribute information can help RecCoNN alleviate the sparsity problem and improve on DeepCoNN.
    • Training on each US state's data separately gives a 30% improvement in MSE compared to training on the full data. This suggests that localizing the review data matters, and we plan to incorporate the location of the restaurants in our training.

    Attacks

    Attacks can happen when bots create new accounts and write bad reviews (nuke attacks) or good reviews (push attacks) for targeted products.
    • Text analysis: if bots try to manipulate ratings by duplicating a specific type of review, text analysis can measure the similarity between reviews, and a confidence score can be assigned to each review.
    • Attributes such as the usefulness of a review can be estimated using feedback from other users.
    Bots can also create reviews that mimic a particular user profile, designed to resemble a specific group of people (for example, Chinese restaurant visitors or Indian food lovers), and thereby influence the recommendations for that group.
    • Network analysis: we can rely only on people who have higher weights in the network embedding, which are difficult for bots to achieve.
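The text-analysis defence can be illustrated with a simple similarity check. The sketch below uses Jaccard similarity over word sets, which is one possible choice and not necessarily the measure intended above; the function names, the sample reviews, and the threshold of 0.7 are assumptions for the example.

```python
def jaccard_similarity(text_a, text_b):
    """Jaccard similarity between the word sets of two reviews."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def flag_near_duplicates(reviews, threshold=0.7):
    """Return index pairs of reviews whose similarity reaches the threshold."""
    flagged = []
    for i in range(len(reviews)):
        for j in range(i + 1, len(reviews)):
            if jaccard_similarity(reviews[i], reviews[j]) >= threshold:
                flagged.append((i, j))
    return flagged


sample_reviews = [
    "worst food ever do not go here",
    "worst food ever do not go there",
    "lovely patio and friendly staff",
]

suspicious_pairs = flag_near_duplicates(sample_reviews)
print(suspicious_pairs)  # [(0, 1)]
```

The pairwise loop is quadratic in the number of reviews, so a real system would need a more scalable approach; the sketch only shows how a per-review confidence signal could be derived from similarity.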

    Results

    Mean Absolute Error (MAE)

    State            Baseline (DeepCoNN)   DeepCoNN+attr   RecCoNN
    Arizona          0.9374                0.9917          0.9502
    North Carolina   0.8495                0.8706          0.8475
    Pennsylvania     0.8252                0.8263          0.8232

    Mean Square Error (MSE)

    State            Baseline (DeepCoNN)   DeepCoNN+attr   RecCoNN
    Arizona          1.1831                1.204           1.181
    North Carolina   1.0872                1.12            1.0929
    Pennsylvania     1.0625                1.0578          1.0502
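For reference, the two metrics reported above can be computed as in the following sketch. The ratings are invented toy values, not data from our experiments.

```python
def mean_absolute_error(true_ratings, predicted_ratings):
    """Average of the absolute differences between true and predicted ratings."""
    errors = [abs(t - p) for t, p in zip(true_ratings, predicted_ratings)]
    return sum(errors) / len(errors)


def mean_squared_error(true_ratings, predicted_ratings):
    """Average of the squared differences between true and predicted ratings."""
    errors = [(t - p) ** 2 for t, p in zip(true_ratings, predicted_ratings)]
    return sum(errors) / len(errors)


# Toy star ratings and model predictions, for illustration only.
true_ratings = [5, 3, 4, 1]
predicted_ratings = [4.5, 3.5, 4.0, 2.0]

mae = mean_absolute_error(true_ratings, predicted_ratings)
mse = mean_squared_error(true_ratings, predicted_ratings)
print(mae, mse)  # 0.5 0.375
```

MSE penalizes large errors more heavily than MAE, which is why the two tables can rank the models differently for the same state.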

    Need for Network Embedding

    Traditional recommendation methods (e.g., matrix factorization) mainly aim to learn an effective prediction function for characterizing user-item interaction records (e.g., the user-item rating matrix). With the rapid development of web services, various kinds of auxiliary data (side information) have become available in recommender systems. Although auxiliary data is likely to contain useful information for recommendation, it is difficult to model and use this heterogeneous and complex information in recommender systems. It is even more challenging to develop a relatively general approach that can model such varying data across different systems or platforms.

    Heterogeneous Information Network Embedding

    As a promising direction, the heterogeneous information network (HIN), consisting of multiple types of nodes and links, has been proposed as a powerful information modeling method. Because of its flexibility in modeling data heterogeneity, HIN has been adopted in recommender systems to characterize rich auxiliary data. During our literature review, we came across multiple movie recommendation papers that model their data as HINs. A HIN contains multiple types of entities connected by different types of relations. Under the HIN-based representation, the recommendation problem can be considered as a similarity search task over the HIN.

    HIN.jpg
    Fig: Heterogeneous Information Network Embedding

    HIN is a more general model that contains more comprehensive relations among objects and much richer semantic information. It is challenging to develop effective methods for HIN-based recommendation, both in extracting and in exploiting the information from HINs. Most HIN-based recommendation methods rely on path-based similarity, which cannot fully mine the latent structure features of users and items.

    Due to limited resources, we restricted our work to the basics of network embedding. We studied the structure and topology of the Yelp social network and found evidence of a small-world network: a small subset of users act as hubs, and the main connected component has a diameter of 6. Lastly, we generated the network graph and produced some visualizations.

    Based on the visualizations, we observed that Yelp essentially has two classes of users:
    • Class 1 - users who use Yelp purely to look for food and entertainment and do not engage much in the social networking.
    • Class 2 - users who exhibit power-law behavior, have many friends, and treat Yelp as a social network, relying mainly on their network of friends for restaurant recommendations, along with other interactions.

    Observations

    Graph             Number of Nodes   Number of Edges   Average Degree   Diameter
    Top 10 Users      5859              11057             3.7744           4
    Top 100 Users     10928             39759             7.2765           5
    Top 1000 Users    17104             94994             11.1078          6
    All Users Graph   30255             151516            10.0159          Not available (disconnected components)
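The statistics in the table can be computed as in the sketch below. The average degree of an undirected graph is 2 × edges / nodes, and the diameter is the longest shortest path, found here with a breadth-first search from every node. The small path graph and the function names are invented for the example; for a disconnected graph the function returns `None`, in line with the "Not available" entry above.

```python
from collections import deque


def average_degree(num_nodes, num_edges):
    """Average degree of an undirected graph: each edge adds two endpoints."""
    return 2 * num_edges / num_nodes


def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def diameter(graph):
    """Longest shortest path, or None if the graph is disconnected."""
    longest = 0
    for source in graph:
        distances = bfs_distances(graph, source)
        if len(distances) < len(graph):
            return None  # some node is unreachable from this source
        longest = max(longest, max(distances.values()))
    return longest


# Toy undirected friendship graph as adjacency sets: a - b - c - d.
toy_graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(diameter(toy_graph))                    # 3
print(round(average_degree(5859, 11057), 4))  # 3.7744
```

The second call uses the node and edge counts of the "Top 10 Users" row and reproduces its average degree.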
    Fig: Network visualizations for the Top 10, Top 20, Top 50, Top 100, Top 500, and Top 1000 Users graphs

    Future Work

    To embed HINs, we would like to use a random walk strategy to generate meaningful node sequences for network embedding. The nodes of the HIN are then characterized by low-dimensional vectors, i.e., embeddings. Instead of relying on explicit path connections, we would like to encode useful information from HINs in latent vectors. The learned node embeddings would first be transformed by a set of fusion functions; once we obtain the representations for user u and item i, we would like to integrate them into our RecCoNN model.
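The random walk step could look like the following sketch. It uses uniform random walks over a tiny, invented HIN with user, business, and city nodes; this is an assumption made for illustration, since the walk strategy has not been chosen yet and HIN methods often constrain walks by node type. The node names and function names are hypothetical.

```python
import random


def random_walk(graph, start, length, rng):
    """Walk from start, moving to a uniformly chosen neighbor at each step."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Generate several walks from every node, with a fixed seed for repeatability."""
    rng = random.Random(seed)
    walks = []
    for node in sorted(graph):
        for _ in range(walks_per_node):
            walks.append(random_walk(graph, node, length, rng))
    return walks


# Toy HIN as adjacency lists; the node-type prefixes are for readability only.
toy_hin = {
    "user:u1": ["biz:b1", "user:u2"],
    "user:u2": ["biz:b1", "biz:b2", "user:u1"],
    "biz:b1": ["user:u1", "user:u2", "city:phoenix"],
    "biz:b2": ["user:u2", "city:phoenix"],
    "city:phoenix": ["biz:b1", "biz:b2"],
}

walks = generate_walks(toy_hin, walks_per_node=2, length=5)
```

The resulting node sequences play the role that sentences play for word2vec: a skip-gram style model trained on them would produce one embedding per node.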

    References

    • L. Zheng, V. Noroozi, and P. S. Yu. Joint deep modeling of users and items using reviews for recommendation. CoRR, abs/1701.04783, 2017.
    • C. Shi, B. Hu, W. X. Zhao, and P. S. Yu. Heterogeneous information network embedding for recommendation.
    • H. Wang, Y. Lu, and C. Zhai. Latent aspect rating analysis on review text data: A rating regression approach. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10, pages 783–792, New York, NY, USA, 2010. ACM.
    • C. Zhang, L. Yu, Y. Wang, C. Shah, and X. Zhang. Collaborative User Network Embedding for Social Recommender Systems, pages 381–389.
    • J. McAuley and J. Leskovec. Hidden factors and hidden topics: Understanding rating dimensions with review text. In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys ’13, pages 165–172, New York, NY, USA, 2013. ACM.
    • G. Ling, M. R. Lyu, and I. King. Ratings meet reviews, a combined approach to recommend. In RecSys, 2014.
    • Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, Aug. 2009.

    Links