Monday, May 24, 2010

An Analysis of Amazon Reviews

I have often wondered how Amazon lists customer reviews on its web site, so I spent a while looking at one which I wrote a few months ago and reported here. What got me thinking was that my review had the most votes, but it was a critical review, and after a few weeks Amazon went and hid it in the detritus of its reviews.

Let me start with the data (from How Markets Fail):



















Now I wondered how they rank their reviews given this information. For example, which of these matter: the total number of votes, the rating of the book, the comments, the age of the review, and the like? So I did a small analytical study.

Let me plot a few of the statistics:



















The above is the rating versus ranking on the site. Clearly there is no correlation here.



















Then I plotted the percent of positive votes versus the ranking by Amazon. There seems to be some correlation, yet it is not totally obvious.



















Then I plotted the percent who thought negatively of the review, as above. Now it is clear that there is a strong correlation: if a review did not help, it was ranked lower than if it helped.



















I then considered if there were any comments. There were comments but they did not seem to play any role.



















Finally I looked at the number of days since each review was posted. For the top 14 reviews there seemed to be little correlation. By the way, mine was ranked 14th. However, for those after mine the correlation is quite strong.
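The eyeball comparisons above can be put on a number by computing a correlation coefficient between the Amazon ranking and each statistic, once for the top reviews and once for the rest. What follows is only a minimal sketch of that idea, not the calculation I actually ran. The function names `pearson` and `split_correlation` are made up for illustration, and the numbers in the usage lines are invented placeholders, not the data from this post.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_correlation(ranks, values, cutoff):
    """Correlate values with rank separately for rank <= cutoff and rank > cutoff."""
    top = [(r, v) for r, v in zip(ranks, values) if r <= cutoff]
    rest = [(r, v) for r, v in zip(ranks, values) if r > cutoff]
    top_corr = pearson([r for r, _ in top], [v for _, v in top])
    rest_corr = pearson([r for r, _ in rest], [v for _, v in rest])
    return top_corr, rest_corr

# Invented placeholder numbers (NOT the review data from this post).
ranks = list(range(1, 11))
days_posted = [30, 5, 42, 11, 25, 50, 60, 70, 80, 90]
print(split_correlation(ranks, days_posted, cutoff=5))
```

A coefficient near zero corresponds to the "no correlation" plots, and one near plus or minus one to the strong ones. Splitting at a cutoff mirrors the observation that the top 14 reviews behave differently from those ranked below them.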

I then developed a model to determine a least squares estimate of the weights. I started as follows:



















which leads to the following estimator:



















Now I applied this and found that the negative evaluation, namely that a review did not help them decide, dominated the selection metric. However, it was a very poor fit. Perhaps the missing data point, namely whether the customer bought after reading the review, was a factor. I also suspect that the publisher may have influence as well, yet that would be impossible to determine.
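For readers who want to try a fit of this kind themselves, here is a minimal sketch of a linear least squares estimate of the weights. It assumes the simplest possible model, in which the rank is a weighted sum of the statistics plus a constant, and it solves the normal equations directly. That assumption is mine for the sake of the example and may differ in detail from the model described above. The names `solve_linear` and `fit_weights` are hypothetical, and the sample rows are invented, not the review data.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the statistics are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_weights(features, ranks):
    """Least squares fit of ranks ~ intercept + weights . features.

    features: one row per review, e.g. [rating, pct_negative, days].
    Returns (weights, r_squared); weights[0] is the intercept.
    """
    X = [[1.0] + [float(v) for v in row] for row in features]
    k = len(X[0])
    # Normal equations: (X^T X) w = X^T r
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xtr = [sum(row[i] * r for row, r in zip(X, ranks)) for i in range(k)]
    w = solve_linear(xtx, xtr)
    predicted = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    mean_rank = sum(ranks) / len(ranks)
    ss_res = sum((r - p) ** 2 for r, p in zip(ranks, predicted))
    ss_tot = sum((r - mean_rank) ** 2 for r in ranks)
    return w, 1.0 - ss_res / ss_tot

# Invented placeholder rows (NOT the review data from this post).
features = [[5, 10, 40], [4, 20, 35], [2, 35, 60], [5, 50, 20], [3, 70, 90], [1, 80, 75]]
ranks = [1, 2, 3, 4, 5, 6]
weights, r_squared = fit_weights(features, ranks)
print(weights, r_squared)
```

The largest weight, after scaling the statistics to comparable units, points to the dominant factor, and a low `r_squared` is what a "very poor fit" looks like in this setup.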

This analysis is interesting in that a less than flattering review, as mine was in the body but not in the numbers, may get read and pressure placed upon Amazon. Perhaps Amazon just wants sales no matter what? One cannot really tell what is happening, other than to conclude that Amazon does not use the metrics I looked at.

I think this is an interesting way to look at who "controls" what we are reading.