Recommender Systems: How Does Amazon Make Money?

A Recommender System (RS) consists of two basic entities: users and items, where users provide their opinions (ratings) about items. We denote the set of users by U = {u_1, u_2, ..., u_M}, where the number of users of the system is |U| = M, and the set of items being recommended by I = {i_1, i_2, ..., i_N}, with |I| = N. We can represent each element of the user space U and the item space I with a profile. A user's profile is usually defined by characteristics such as age, gender, and geographical location; in simple cases, however, it is just a unique user identifier (ID). Similarly, each item is represented by its characteristics; in a book recommender system, for example, each book can be represented by its author, topic, year of release, etc.

Recommender systems store the history of each user's interactions with the system: for example, purchase history, the types of items bought together, and ratings. Most recommender systems require users to rate some items before they can recommend unknown ones; for example, in the Netflix movie recommender system, a new user has to rate some movies at registration in order to get proper recommendations. Users will have rated some, but not all, of the items. An example of a rating matrix is given below:

Figure 1: User-Item Rating (Feedback) Matrix

Typically, the ratings are defined on a subset of I x U and not on the whole space. The task of the recommender system is then to extrapolate the ratings, by means of a utility function, to the whole space I x U in order to make recommendations. There are different ways to extrapolate the utility function over the whole I x U space: data mining and machine learning algorithms, approximation theory, and various heuristics can all be used for prediction.
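To make the idea of a partially filled rating space concrete, here is a minimal sketch in Python. The user names, item names, and ratings are made up for illustration (they are not taken from Figure 1), and the item-mean predictor is only one very simple heuristic for extrapolating a rating, not a method any particular system is known to use.

```python
# Hypothetical ratings on a 1-5 scale, keyed by (user, item).
# A pair that is missing from the dict has simply not been rated yet.
users = ["u1", "u2", "u3"]
items = ["i1", "i2", "i3", "i4"]
ratings = {
    ("u1", "i1"): 5, ("u1", "i3"): 2,
    ("u2", "i1"): 4, ("u2", "i2"): 1, ("u2", "i4"): 3,
    ("u3", "i2"): 5,
}


def known_fraction(ratings, users, items):
    """Fraction of the whole user-item space that actually has a rating."""
    return len(ratings) / (len(users) * len(items))


def unrated_pairs(ratings, users, items):
    """All (user, item) pairs the system still has to predict."""
    return [(u, i) for u in users for i in items if (u, i) not in ratings]


def predict_item_mean(ratings, item):
    """Naive heuristic: predict the item's mean rating, else the global mean."""
    item_ratings = [r for (_, i), r in ratings.items() if i == item]
    if item_ratings:
        return sum(item_ratings) / len(item_ratings)
    return sum(ratings.values()) / len(ratings)


print(known_fraction(ratings, users, items))
print(unrated_pairs(ratings, users, items))
print(predict_item_mean(ratings, "i1"))
```

The heuristic ignores who is asking, so every user gets the same prediction for an item; the collaborative filtering approaches described later personalize this step.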

Defining Users' and Items' Profiles:

The main building blocks of a recommender system, i.e. users and items, need to be modelled in such a way that recommendation algorithms can exploit them. Recommender systems usually get initial information about users when they first register with the system. The simplest way is to create an empty user profile, which is updated as the system gathers the user's feedback. This method, however, cannot recommend any items until it has gathered some information about the user's preferences. An alternative approach is for the user to create a profile manually. The user might need to provide their interests (e.g. the types of domain they are interested in), demographic information (e.g. age, gender, etc.), and geographical information (e.g. country). Another approach, used by the MovieLens movie recommender system and the iLike music recommender system (ilike.com), requires the user to provide ratings on a predefined set of items. For example, when a new user registers with the iLike web-site, the system presents them with a list of artists they need to rate before getting recommendations.

Figure 2: User feedback in Amazon (in terms of the number of stars)

After getting the initial information, the system maintains the user's profile as they provide feedback. The feedback can be explicit or implicit. Explicit feedback, where the user provides their opinions about certain items, can be positive or negative and usually comes in the form of ratings. Rating scales can be continuous or discrete, although most recommender systems use discrete scales. Explicit feedback can also be gathered by allowing users to write comments and opinions about certain items. With implicit feedback, the user's interaction with the item is observed; examples include web usage mining (e.g. time spent on a web page), analysing listening/watching habits in a media player (e.g. on YouTube the system might store how a user plays, re-plays, skips, and stops videos), and observing the transaction history on an e-commerce website (e.g. items purchased or returned by a user).

An item's profile can be defined in different ways: (1) by extracting features (or metadata) about the item, (2) by using the ratings provided by users on that item, (3) by using domain-specific ontologies, and (4) by using demographic (category) information about items.

The Most Famous Recommendation Algorithm: Collaborative Filtering:

The most famous and simplest type of algorithm used to make recommendations is Collaborative Filtering. Collaborative Filtering recommends items by taking into account the taste of users (in terms of their item preferences), under the assumption that users will be interested in items that users similar to them have rated highly. Examples of these systems include Amazon and Ringo. Collaborative filtering recommender systems are based on the assumption that people who agreed in the past will agree in the future too. There are three main steps in predicting whether a user will like an item they have not seen, purchased, or rated before:

In the first step, users rate some items they have experienced previously.
In the second step, the profile of the active user (the user for whom the recommendations are computed) is matched against the profiles of the other users in the system. This yields a set of similar users, also called the neighbours of the active user.
In the last step, predictions are made for items that the active user has not rated, based on the ratings provided by their nearest neighbours. Finally, these items are presented to the active user in a suitable order.

Example Based on Figure 1: User Musi has not seen the movie "The Godfather" and is in a dilemma: should he rent it or not? Only two users, Hamza and Adam, have already seen this movie. Musi knows that Hamza has the same taste in movies as he does, since both of them liked "Troy" and disliked "Forrest Gump". He also knows that Adam's taste is quite opposite to his, since Adam liked the movies he disliked (i.e. "Forrest Gump") and vice versa. Considering this, he asks for Hamza's opinion, discards (or acts opposite to) Adam's opinion, and decides accordingly. Note that Fahime has exactly the same taste as Musi; however, her opinion cannot be taken into account, as she has not rated "The Godfather".
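The three steps and Musi's dilemma can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric ratings below are hypothetical (the actual values of Figure 1 are not reproduced here), Pearson correlation is one common choice of similarity measure, and the mean-centred weighted average is one common prediction formula. Every user who rated the target movie is treated as a neighbour, so no top-k selection is done.

```python
from math import sqrt

# Hypothetical 1-5 ratings chosen to match the story above;
# the real numbers in Figure 1 may differ.
ratings = {
    "Musi":   {"Troy": 5, "Forrest Gump": 1},
    "Hamza":  {"Troy": 5, "Forrest Gump": 2, "The Godfather": 5},
    "Adam":   {"Troy": 1, "Forrest Gump": 5, "The Godfather": 1},
    "Fahime": {"Troy": 5, "Forrest Gump": 1},
}


def pearson(a, b):
    """Pearson correlation between two users over their co-rated items."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
        sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0


def predict(ratings, active, item):
    """Mean-centred, similarity-weighted prediction for an unrated item."""
    mine = ratings[active]
    my_mean = sum(mine.values()) / len(mine)
    num = den = 0.0
    for other, theirs in ratings.items():
        # Users who have not rated the item (e.g. Fahime) cannot contribute.
        if other == active or item not in theirs:
            continue
        sim = pearson(mine, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        # A negative similarity flips the neighbour's opinion.
        num += sim * (theirs[item] - their_mean)
        den += abs(sim)
    return my_mean + num / den if den else my_mean


print(pearson(ratings["Musi"], ratings["Hamza"]))
print(pearson(ratings["Musi"], ratings["Adam"]))
print(predict(ratings, "Musi", "The Godfather"))
```

With these made-up numbers, Hamza's similarity to Musi comes out strongly positive and Adam's strongly negative, so Adam's low rating of "The Godfather" actually pushes the prediction up, which is the "acts opposite to Adam's opinion" behaviour described above.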

Figure 3: Amazon Recommendations

What Recommendation Algorithm Amazon Uses:

The approach described above is called User-Based Collaborative Filtering (UBCF). Amazon uses a slightly different approach, Item-Based Collaborative Filtering (IBCF), which selects the most similar items rather than the most similar users. The steps in IBCF are as follows:

In the first step, all items rated by an active user are retrieved.
In the second step, the similarity between the target item and each of the retrieved items is computed, and a set of the most similar items is selected.
In the last step, the prediction for the target item is made by computing the weighted average of the active user's ratings on the most similar items.
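The IBCF steps above can be sketched as follows. The ratings are hypothetical, and plain cosine similarity over co-rating users is an assumption made for brevity; other measures (for example adjusted cosine) are also commonly used, and nothing here is claimed to be Amazon's actual implementation.

```python
from math import sqrt

# Hypothetical 1-5 ratings; "T" is the target item that u1 has not rated.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2, "T": 5},
    "u3": {"A": 1, "B": 2, "C": 5, "T": 1},
    "u4": {"A": 5, "C": 1, "T": 4},
}


def item_similarity(ratings, item_a, item_b):
    """Cosine similarity between two items over users who rated both."""
    pairs = [(r[item_a], r[item_b]) for r in ratings.values()
             if item_a in r and item_b in r]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_items(ratings, active, target, k=2):
    """Steps 1 and 2: retrieve the active user's items, keep the k most similar."""
    rated = ratings[active]
    scored = [(item_similarity(ratings, target, i), i)
              for i in rated if i != target]
    scored.sort(reverse=True)
    return scored[:k]


def predict(ratings, active, target, k=2):
    """Step 3: similarity-weighted average of the active user's own ratings."""
    neighbours = most_similar_items(ratings, active, target, k)
    den = sum(sim for sim, _ in neighbours)
    if den == 0:
        return None  # nothing to base a prediction on
    return sum(sim * ratings[active][i] for sim, i in neighbours) / den


print(most_similar_items(ratings, "u1", "T"))
print(predict(ratings, "u1", "T"))
```

Note that the prediction uses only the active user's own ratings; the other users matter only through the item-to-item similarities.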

The rationale behind using IBCF rather than UBCF is that the similarities between items can be computed in an off-line stage, and that the item set is relatively non-volatile: on most e-commerce web-sites, the item set changes far less often than the user set (new users keep registering over time).
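The off-line/on-line split can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the purchase data is invented, similarity is cosine over binary purchase sets, and the function names `build_similarity_table` and `recommend` are hypothetical. The point is only that the expensive item-to-item table is built once, off-line, while serving a recommendation is a cheap lookup.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase histories (implicit feedback: bought or not).
purchases = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "pen"},
    "u3": {"book", "pen"},
    "u4": {"lamp", "desk"},
}


def build_similarity_table(purchases, top_k=2):
    """Off-line stage: for every item, store its top_k most similar items."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    table = {}
    for a in buyers:
        scored = []
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:
                # Cosine similarity between the two binary buyer vectors.
                sim = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
                scored.append((sim, b))
        scored.sort(reverse=True)
        table[a] = scored[:top_k]
    return table


def recommend(table, owned, n=3):
    """On-line stage: only table lookups, no similarity computation."""
    scores = defaultdict(float)
    for item in owned:
        for sim, other in table.get(item, []):
            if other not in owned:
                scores[other] += sim
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


table = build_similarity_table(purchases)  # rebuilt periodically, off-line
print(recommend(table, {"pen"}))           # works even for a brand-new user
```

Because `recommend` needs only the items a user owns, a user who registered a minute ago can be served from the existing table, whereas a user-based approach would first have to compare them against every other user.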

In fact, Amazon uses a fairly simple approach to make huge money by giving personalized recommendations to users, but the truth is that it works very well!

In the next post I will show what sort of algorithms Google News, Netflix, and other web giants are using to make huge money by giving users personalized recommendations.

Follow me on Twitter: https://twitter.com/musiali007

Arshad Ali said…

Musi bhai,
I didn't know you have an interest in this kind of stuff. I am quite interested to know more about this and am wondering if you have thought about putting into a model what you have said above? I mean, can this (users, items and interest) be expressed into a relationship based on knowledge formalism? I am not working on something similar but the theme could be related to my research which is about conceiving and developing an ontology based on a formal model which could be used for automatic annotation and subsequently knowledge acquisition. will be interesting to learn more from you about your interest in the topics on your blog at some point.
Arshad

13 October 2012 at 03:13

DRMag said…

Thanks for the comment,

This area is very saturated. Ontological recommender systems have been around for the last 10 years. Amazon uses a sort of ontology to recommend items to users.

For more info, we can have some discussion at some point

13 October 2012 at 11:34

DrMAG's Blog
