RecCoNN

By

Akhil Babu Manam

Ashok Govinda Gowda

Kishan Hulkodu Sheshagiri

Shiva Kumar Pentyala

Introduction

Many of the world's recommender systems are based on collaborative filtering techniques.
The basic idea of these techniques is that people who shared similar preferences in the past tend to make similar choices in the future.
However, collaborative filtering techniques suffer from a sparsity problem: the number of items rated by each user is very small relative to the total number of items.
We aim to reduce the sparsity problem and improve the quality of recommendations.
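To make the sparsity problem concrete, the following minimal sketch measures the fraction of user-item pairs that have no rating. The function name `rating_sparsity` and the toy data are illustrative assumptions, not part of the project code.

```python
def rating_sparsity(ratings, num_users, num_items):
    """Return the fraction of user-item pairs that have no rating."""
    # Count each (user, item) pair once, even if it appears several times.
    observed = len({(user, item) for user, item, _ in ratings})
    return 1.0 - observed / (num_users * num_items)


# Toy data (hypothetical): 3 users, 4 items, only 3 of 12 possible ratings.
toy_ratings = [("u1", "i1", 5), ("u2", "i3", 2), ("u3", "i1", 4)]
sparsity = rating_sparsity(toy_ratings, num_users=3, num_items=4)
print(f"sparsity = {sparsity:.2f}")
```

On real review data the same ratio is typically much closer to 1, which is what makes purely rating-based methods struggle.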

We use the Yelp dataset to compare our methods against methods such as matrix factorization. The Yelp dataset is very sparse, and traditional methods such as matrix factorization involve a lot of computation on sparse datasets. Current recommender systems use collaborative filtering, content-based filtering, or hybrid methods on the ratings that users explicitly give to each item, but such systems are not robust to sparsity in the ratings data.
It has been shown that reviews written by users can reveal information about customers’ buying and rating behavior, and that reviews written for items may contain information about their features and properties.
We therefore believe that a large amount of information exists in user reviews. Most current recommender systems ignore this source of information, even though it can potentially alleviate the sparsity problem and improve the quality of recommendations.
In January 2017, Lei Zheng et al. proposed a deep model (DeepCoNN) that learns item properties and user behaviors jointly from review text. However, we noticed that it neglects other attribute information specific to user and item profiles.

Consider the Yelp Dataset

While there are many attributes in the dataset, most collaborative filtering techniques consider only the user ratings.

However, other attributes may be more relevant to a restaurant recommender system. For example: Is the time of ordering important? Is the sequence in which a user orders important? Is the positivity or negativity of user comments important? In what ways can the location of the restaurant be used to improve recommendation quality?

We capture the contextual information in the review text as well as the restaurant attributes in order to improve the quality of recommendations.

Related Work

DeepCoNN

The base model is taken from the paper Joint Deep Modeling of Users and Items Using Reviews for Recommendation by Lei Zheng, Vahid Noroozi, and Philip S. Yu.

Link to the Paper

This model is a review-based recommender system: it uses the reviews that customers have written about items to recommend new items.

For each user, the model concatenates all of the user's reviews and represents each word with its word2vec embedding. Similarly, it builds an embedded representation of all the reviews written for each item.

Two CNN-based networks are trained, one for users and one for items. In each network, a fully connected layer after the CNN produces the user features or the item features, respectively.
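The feature extraction described above can be sketched in plain Python as a convolution over word windows, a ReLU, max-over-time pooling, and a fully connected layer. This is a simplified illustration with hand-picked toy weights; the function names, window size, and dimensions are assumptions and do not reproduce the actual DeepCoNN configuration.

```python
def conv_max_pool(embeddings, filters, biases, window):
    """Convolve each filter over word windows, apply ReLU, then max-pool over time.

    embeddings: list of word vectors (lists of floats), one per word.
    filters: list of flat weight lists, each of length window * embedding_dim.
    """
    pooled = []
    for weights, bias in zip(filters, biases):
        best = 0.0  # ReLU outputs are never negative, so 0.0 is a safe floor
        for start in range(len(embeddings) - window + 1):
            flat = [x for vec in embeddings[start:start + window] for x in vec]
            activation = sum(w * x for w, x in zip(weights, flat)) + bias
            best = max(best, activation)
        pooled.append(best)
    return pooled


def dense(inputs, weight_rows, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weight_rows, biases)]


# Toy document of three words with 2-dimensional embeddings (hypothetical).
doc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filters = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
pooled = conv_max_pool(doc, filters, biases=[0.0, 0.0], window=2)
user_features = dense(pooled, weight_rows=[[1.0, 1.0]], biases=[0.5])
print(pooled, user_features)
```

The same two functions would be applied to the item-side text with a separate set of weights, giving the item features.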

Since these features would not be in the same space, the model then learns the parameters of a shared layer that connects the user and item features.
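One way to picture such a shared layer is a second-order factorization machine (FM) applied to the concatenated user and item features, which is how we understand the DeepCoNN paper to couple the two networks. The sketch below is an illustration under that assumption; the weights and feature values are toy numbers, and `fm_predict` is a hypothetical helper name.

```python
def fm_predict(z, w0, w, V):
    """Second-order factorization machine score for the feature vector z.

    w0: global bias, w: per-feature linear weights,
    V: one latent vector per feature; pair weights are dot products of these.
    """
    linear = sum(wi * zi for wi, zi in zip(w, z))
    pairwise = 0.0
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            interaction = sum(a * b for a, b in zip(V[i], V[j]))
            pairwise += interaction * z[i] * z[j]
    return w0 + linear + pairwise


# Toy outputs of the user network and the item network (hypothetical values).
user_features = [1.0, 2.0]
item_features = [0.5]
z = user_features + item_features  # concatenation feeds the shared layer

score = fm_predict(
    z,
    w0=3.0,
    w=[0.1, 0.2, 0.3],
    V=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
)
print(f"predicted rating = {score:.2f}")
```

The pairwise term is what lets a user feature and an item feature interact directly, similar in spirit to the dot product in matrix factorization.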

Goal

The DeepCoNN model uses the review text to recommend items to users but does not take other attributes into consideration.

Context-based information, such as cuisine availability and parking availability, is present in the Yelp dataset but has not been taken into consideration.

In this project, we have worked on ways to add this context information, either through the input to the existing model or through additional layers that feed the information into the recommendation process and so affect the predicted rating.

We have developed two models: DeepCoNN+attr and RecCoNN.

DeepCoNN+attr

Fig: Architecture of DeepCoNN+attr

The architecture of this model is the same as that of DeepCoNN.

The change is in the way the input is given to the model.

Attribute information is appended to the input text and converted to word2vec embeddings during training.

Example: if a user has written the review “ABC … XYZ” for an item that has the attributes {Parking : True, Delivery : True}, then the input to the model for that user-item pair is “ABC … XYZ Parking True Delivery True”.
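The input construction in the example above can be sketched as a small helper. The name `append_attributes` is hypothetical, and the sketch assumes attributes arrive as a name-to-value dictionary whose order should be preserved.

```python
def append_attributes(review_text, attributes):
    """Append 'name value' tokens for each attribute to the review text."""
    tokens = [f"{name} {value}" for name, value in attributes.items()]
    return " ".join([review_text] + tokens)


# Placeholder review text standing in for a real review.
augmented = append_attributes("ABC XYZ", {"Parking": True, "Delivery": True})
print(augmented)
```

The augmented string is then tokenized and embedded exactly like an ordinary review, so no change to the network itself is needed.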

RecCoNN

Fig: Architecture of RecCoNN

In this work, we have proposed and evaluated two recommender systems. The first uses the same DeepCoNN model but appends user and item attributes to the reviews of each user or item before training. The second model (RecCoNN) uses the additional, user-profile-specific attribute information and trains two parallel neural networks that are coupled in the last layers.

One of the networks focuses on learning user behavior from the reviews written by the user together with the additional user profile features, and the other network learns item properties from the reviews written for the item.

The following attributes are used in RecCoNN: Alcohol, OutdoorSeating, RestaurantsDelivery, RestaurantsTakeOut, BusinessParking/Valet and the cuisine data for profiling users.
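One possible way to turn these attributes into user profile features is sketched below. It is an assumption for illustration, not necessarily the encoding used in RecCoNN: each business is encoded as a binary vector over the listed attributes, and a user's profile is the average over the businesses that user reviewed. Cuisine data is left out for brevity, and the toy values are booleans, whereas real Yelp attribute values may need normalizing first.

```python
ATTRIBUTES = [
    "Alcohol",
    "OutdoorSeating",
    "RestaurantsDelivery",
    "RestaurantsTakeOut",
    "BusinessParking/Valet",
]


def encode_business(attrs):
    """Binary vector: 1.0 where the attribute is present and truthy."""
    return [1.0 if attrs.get(name) else 0.0 for name in ATTRIBUTES]


def user_profile(reviewed_businesses):
    """Average the attribute vectors of the businesses a user reviewed."""
    if not reviewed_businesses:
        return [0.0] * len(ATTRIBUTES)
    vectors = [encode_business(attrs) for attrs in reviewed_businesses]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical user who reviewed two restaurants.
profile = user_profile([
    {"Alcohol": True, "RestaurantsDelivery": True},
    {"Alcohol": True, "OutdoorSeating": True},
])
print(profile)
```

A vector like this could be concatenated with the text features of the user network before the shared layer.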

A shared layer is then introduced on top to couple these two networks. The shared layer lets the latent factors learned for users and items interact with each other by mapping them to a common feature space.

As in MF techniques, the user and item latent factors interact with each other to predict the corresponding rating. In our experiments on the Yelp dataset, RecCoNN outperforms our baseline system DeepCoNN in two of the three states on each metric (see Results).

We used word2vec embeddings pretrained on GoogleNews (300 d) and also GloVe embeddings (100 d). Furthermore, by comparing RecCoNN with variants that use bag-of-words or TF-IDF representations of the reviews, we found that word embeddings help to capture the semantic meaning of the review text.

Finally, we conducted experiments by sampling the Yelp data by US state and investigated the impact of the number of reviews. We used sampling because of limited computational resources.

Discussions

Experimental results showed that RecCoNN obtains a lower MSE than DeepCoNN in Arizona and Pennsylvania, although not in North Carolina.

This suggests that RecCoNN can alleviate the sparsity problem and can outperform DeepCoNN with the help of the additional attribute information.

Training on each US state's data separately provides a 30% improvement in MSE compared to training on the full data. This indicates the impact of localizing the review data, and we plan to incorporate the location of the restaurants in our training.

Attacks

Attacks can happen when bots create new accounts and write bad reviews (nuke attacks) or good reviews (push attacks) for targeted products.

Text analysis: if bots try to manipulate ratings by duplicating a specific type of review, text analysis can measure the similarity between reviews, and a confidence score can be established for each review.
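As a minimal sketch of this kind of text analysis, the snippet below flags pairs of reviews whose word sets overlap heavily, using Jaccard similarity. The choice of Jaccard similarity, the 0.8 threshold, and the sample reviews are assumptions for illustration; embedding-based similarity would be a natural alternative.

```python
def jaccard(text_a, text_b):
    """Jaccard similarity between the lowercase word sets of two texts."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def flag_near_duplicates(reviews, threshold=0.8):
    """Return index pairs of reviews whose similarity meets the threshold."""
    flagged = []
    for i in range(len(reviews)):
        for j in range(i + 1, len(reviews)):
            if jaccard(reviews[i], reviews[j]) >= threshold:
                flagged.append((i, j))
    return flagged


# Hypothetical reviews: the first two look like copies of each other.
reviews = [
    "worst food ever do not go",
    "worst food ever do not go here",
    "lovely patio and friendly staff",
]
suspicious = flag_near_duplicates(reviews)
print(suspicious)
```

Reviews that appear in many flagged pairs could be given a lower confidence before they reach the recommender.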

Attributes such as the usefulness of a review can be estimated using feedback from other users.

Bots may also create reviews that mimic a particular user profile, designed to resemble a specific group of people (for example, Chinese restaurant visitors or Indian food lovers), in order to influence the recommendations for that group.

Using network analysis, we can suggest only people who have higher weights in the network embedding, which bots have difficulty achieving.

Results

**Mean Absolute Error (MAE)**
| State | Baseline (DeepCoNN) | DeepCoNN+attr | RecCoNN |
| --- | --- | --- | --- |
| Arizona | 0.9374 | 0.9917 | 0.9502 |
| North Carolina | 0.8495 | 0.8706 | 0.8475 |
| Pennsylvania | 0.8252 | 0.8263 | 0.8232 |

**Mean Square Error (MSE)**
| State | Baseline (DeepCoNN) | DeepCoNN+attr | RecCoNN |
| --- | --- | --- | --- |
| Arizona | 1.1831 | 1.204 | 1.181 |
| North Carolina | 1.0872 | 1.12 | 1.0929 |
| Pennsylvania | 1.0625 | 1.0578 | 1.0502 |
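For reference, the two metrics reported above can be computed as in the following sketch. The ratings shown are toy values chosen for illustration, not data from our experiments.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all rating pairs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over all rating pairs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy star ratings (hypothetical).
actual = [5, 3, 4, 1]
predicted = [4, 3, 2, 2]
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
print(f"MAE = {mae}, MSE = {mse}")
```

Because MSE squares each error, it penalizes a few large misses more heavily than MAE does, which is why the two metrics can rank models differently.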

Need for Network Embedding

Traditional recommendation methods (e.g., matrix factorization) mainly aim to learn an effective prediction function for characterizing user-item interaction records (e.g., the user-item rating matrix). With the rapid development of web services, various kinds of auxiliary data (side information) have become available in recommender systems. Although auxiliary data is likely to contain useful information for recommendation, it is difficult to model and utilize this heterogeneous and complex information in recommender systems. Furthermore, it is even more challenging to develop a relatively general approach to modeling such varying data across different systems or platforms.

Heterogeneous Information Network Embedding

As a promising direction, the heterogeneous information network (HIN), consisting of multiple types of nodes and links, has been proposed as a powerful information modeling method. Due to its flexibility in modeling data heterogeneity, the HIN has been adopted in recommender systems to characterize rich auxiliary data. During our literature review, we found multiple movie recommendation papers that model their data as HINs. An HIN contains multiple types of entities connected by different types of relations. Under the HIN-based representation, the recommendation problem can be considered a similarity search task over the HIN.

Fig: Heterogeneous Information Network Embedding

The HIN is a more general model that contains more comprehensive relations among objects and much richer semantic information. It is challenging to develop effective methods for HIN-based recommendation, in terms of both extracting and exploiting the information in HINs. Most HIN-based recommendation methods rely on path-based similarity, which cannot fully mine the latent structural features of users and items.
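To illustrate what a path-based similarity looks like, the sketch below computes a PathSim-style score along the user-business-user meta-path. PathSim is one well-known measure of this kind and is used here only as an example; the user and business names are hypothetical, and each user is assumed to be linked to a business at most once.

```python
def pathsim(user_x, user_y, user_businesses):
    """PathSim along the user-business-user meta-path with binary links.

    The number of path instances between two users equals the number of
    businesses they share; a user's self-paths equal their business count.
    """
    businesses_x = user_businesses[user_x]
    businesses_y = user_businesses[user_y]
    shared = len(businesses_x & businesses_y)
    total = len(businesses_x) + len(businesses_y)
    return 2 * shared / total if total else 0.0


# Hypothetical review links between users and businesses.
user_businesses = {
    "alice": {"b1", "b2", "b3"},
    "bob": {"b2", "b3"},
    "carol": {"b4"},
}
sim_alice_bob = pathsim("alice", "bob", user_businesses)
sim_alice_carol = pathsim("alice", "carol", user_businesses)
print(sim_alice_bob, sim_alice_carol)
```

A score like this only counts explicit shared paths, which is the limitation noted above: two users with no business in common get a similarity of zero even if their tastes are close.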

Due to a lack of resources, we restricted our work to the basics of network embedding. We studied the structure and topology of the Yelp social network and found evidence of a small-world network: a small subset of users act as hubs, and the main connected component has a diameter of 6. Lastly, we generated the network graph and produced some visualizations.

Based on the visualizations, we observed that Yelp essentially has two classes of users:

Class 1 - uses Yelp purely to look for food/entertainment but does not engage much in the social networking.

Class 2 - exhibits the power law, has many friends, and treats Yelp as a social network, relying mainly on their network of friends to recommend restaurants, along with other interactions.

Observations

| Graph | Number of Nodes | Number of Edges | Average Degree | Diameter |
| --- | --- | --- | --- | --- |
| Top 10 Users | 5859 | 11057 | 3.7744 | 4 |
| Top 100 Users | 10928 | 39759 | 7.2765 | 5 |
| Top 1000 Users | 17104 | 94994 | 11.1078 | 6 |
| All Users Graph | 30255 | 151516 | 10.0159 | Not available (disconnected components) |
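The statistics in this table can be reproduced in principle with a short sketch like the one below. The average degree of an undirected graph is 2E/N, which matches the values listed above, and the diameter is found with a breadth-first search from every node. The toy graph and function names are assumptions; this is not the code used to produce the table.

```python
from collections import deque


def average_degree(num_nodes, num_edges):
    """Average degree of an undirected graph: each edge touches two nodes."""
    return 2 * num_edges / num_nodes


def eccentricity(adjacency, source):
    """Largest BFS distance from source, or None if some node is unreachable."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    if len(distance) < len(adjacency):
        return None
    return max(distance.values())


def diameter(adjacency):
    """Longest shortest path in the graph, or None if it is disconnected."""
    longest = 0
    for node in adjacency:
        ecc = eccentricity(adjacency, node)
        if ecc is None:
            return None
        longest = max(longest, ecc)
    return longest


# Toy friendship graph (hypothetical): a path a - b - c - d.
friends = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(average_degree(5859, 11057), diameter(friends))
```

Running a search from every node is slow on graphs of tens of thousands of nodes, and the diameter is undefined once the graph has disconnected components, which is why no value is listed for the full user graph.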