In April 2009, Google published a draft research paper, “Predicting the Present with Google Trends,” by Google’s Chief Economist Hal Varian and Decision Support Engineering Analyst Hyunyoung Choi. The paper is available for download in an April 2 posting by Varian and Choi on the Google Research Blog that has stirred a lot of commentary.
The paper describes using search query volumes to predict economic activity – such as Ford car sales. The authors contend that incorporating relevant search query volume in their models yields considerable accuracy improvements over standard autoregressive models. Note that the authors are focused on very short-term economic prediction.
If this approach really works, the search data (publicly available through Google Insight) could be a boon to near-term forecasting. But does it work? My colleague Udo Sglavo, Solution Architect in the SAS Technology Global Practice, attempted to reproduce the paper’s results, and found some problems with the authors’ claims. Here are Udo’s findings:
Guest Blogger: Udo Sglavo, SAS Technology Global Practice
1. The authors write: “We are not claiming that Google Trends data help predict the future. Rather we are claiming that Google Trends may help in predicting the present. For example, the volume of queries on a particular brand of automobile during the second week in June may be helpful in predicting the June sales report for that brand, when it is released in July.”
• To me it does not matter what they call it – I think it is still a forecast, as it is about predicting the future. In the first example (predicting Ford sales) the sales data is monthly, while the Google Trends data is weekly. What the example illustrates is that one could use Google’s trend data (in fact, the first week of a given month) to increase the accuracy of the monthly prediction. In other words, the trend data is used as a leading indicator for sales. Even though the two series are measured at different frequencies (monthly vs. weekly), the authors are still using a single week to predict the month that is still to come. In their example: use trend information from the first week of September 2008 to predict September 2008 sales.
• I have tried to replicate the Ford sales example. However, using SAS Forecast Server, I have not managed to outperform the exponential smoothing model (ESM) when the models are compared on in-sample performance, which is what the authors suggest doing. Unfortunately, I cannot draw the same conclusion as the authors, namely that the transfer model (using an input) is superior to a smoothing model (without an input), even when flagging July 2005 as an outlier (as suggested by the authors). I’d be interested to hear whether somebody else is able to replicate their findings.
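For readers who want to try this kind of baseline without SAS Forecast Server, here is a minimal sketch of simple exponential smoothing with an in-sample, one-step-ahead error measure. It is not the ESM that Forecast Server selected, which can include trend and seasonal components. The function names, the grid of smoothing weights, and the `sales` figures are all made up for illustration and are not Ford sales data.

```python
def ses_one_step(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    forecasts[t] is the forecast of series[t] made from earlier values only;
    the level is initialized at the first observation.
    """
    level = series[0]
    forecasts = [level]
    for value in series[:-1]:
        level = alpha * value + (1.0 - alpha) * level
        forecasts.append(level)
    return forecasts


def mae(actual, forecast, skip=1):
    """Mean absolute error, skipping the first point (its error is trivially 0)."""
    pairs = list(zip(actual, forecast))[skip:]
    return sum(abs(a - f) for a, f in pairs) / len(pairs)


def best_alpha(series, grid):
    """Pick the smoothing weight with the lowest in-sample one-step MAE."""
    return min(grid, key=lambda a: mae(series, ses_one_step(series, a)))


# Hypothetical monthly figures, for illustration only.
sales = [100, 104, 99, 110, 108, 115, 111, 120, 118, 125, 123, 130]
grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

alpha = best_alpha(sales, grid)
print("alpha:", alpha, "in-sample MAE:", mae(sales, ses_one_step(sales, alpha)))
```

A model with an input series would be scored on the same one-step-ahead errors, so that the two are compared on equal terms.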
2. As Mike has stated earlier, there are lots of misconceptions about forecasting accuracy – and sometimes even researchers are not fully aware of them. For example:
• The authors draw many of their conclusions from statistics based on in-sample data – not out-of-sample data. This is not recommended by Armstrong’s Standards & Practices for Forecasting (13.26).
• In chapter 2.1 the authors claim: “Note that the R-squared moves from 0.6206 (Model 0) to 0.7852 (Model 1) to 0.7696 (Model 2).” However, R-squared should not be used to compare the accuracy of forecasting models (see Standards & Practices for Forecasting, 13.28, or the Forecasting Principles website).
• The authors use standard regression diagnostic plots to make their points – this does not seem correct to me.
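The in-sample versus out-of-sample point can be illustrated with a small experiment. The sketch below fits two least-squares models to a synthetic series: a base model on the lagged value, and an extended model that adds an input that is pure noise. Everything here is an assumption for illustration (the series, the 48/12 split, the helper names); it is not the authors’ model or data.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients via the normal equations (intercept first)."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, rows):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]


def r_squared(actual, fitted):
    mean = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mae(actual, fitted):
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


random.seed(0)  # fixed seed so the run is repeatable

# Synthetic AR(1)-style series; NOT the paper's data.
y = [10.0]
for _ in range(60):
    y.append(5.0 + 0.5 * y[-1] + random.gauss(0.0, 1.0))
lagged, target = y[:-1], y[1:]
extra = [random.gauss(0.0, 1.0) for _ in target]  # input unrelated to target

split = 48  # first 48 points for fitting, last 12 held out
base_rows = [[l] for l in lagged]
ext_rows = [[l, z] for l, z in zip(lagged, extra)]

beta_base = ols(base_rows[:split], target[:split])
beta_ext = ols(ext_rows[:split], target[:split])

r2_base = r_squared(target[:split], predict(beta_base, base_rows[:split]))
r2_ext = r_squared(target[:split], predict(beta_ext, ext_rows[:split]))
mae_base = mae(target[split:], predict(beta_base, base_rows[split:]))
mae_ext = mae(target[split:], predict(beta_ext, ext_rows[split:]))

print(f"in-sample R-squared: base={r2_base:.4f} extended={r2_ext:.4f}")
print(f"holdout MAE:         base={mae_base:.4f} extended={mae_ext:.4f}")
```

Because the base model is nested inside the extended one, the in-sample R-squared of the extended model can never be lower, even though the extra input carries no information. Only the holdout error says anything about forecasting accuracy.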
Our SAS colleague Terry Woodfield, Statistical Services Specialist, also weighed in, arguing there is a fundamental flaw in the behavioral model underlying the Google approach:
Guest Blogger: Terry Woodfield, SAS Education and Training
The premise is appealing, but if true, could we not make millions in the stock market? The behavioral model suggests that a person visits the Web because of interest in making a purchase, learns neat stuff, and then goes out to make the purchase. The negative side is not mentioned. Consider….
Before the Internet, I go to an automobile dealership because I am interested in making a purchase. I succumb to high pressure sales tactics, and I make a purchase.
In the Internet age, I first go to the Internet when I am interested in making a purchase. From the Internet, I learn that the vehicle I am interested in: (1) harms the environment, (2) has a terrible safety record, (3) has a terrible frequency of repair record, (4) costs more than anticipated, and (5) angers God. I choose not to go to the dealership, thus preventing an opportunity to be persuaded by the evil, devious salesperson. Because of the Internet, I DO NOT make a purchase.
Consider also that while I am browsing porn sites on the Internet, I am not watching slick television ads that might convince me to buy a light duty truck. (Hypothetically browsing porn sites for the purpose of discussion, of course.)
If the Internet has a positive effect on sales, I benefit from Google data. If the Internet has a negative effect on sales, I benefit from Google data. If the Internet has both positive and negative effects that cancel each other out, I do not benefit from Google data.
Where does this leave us? If we grant that adding the Google Trends data improves the authors’ seasonal autoregressive model, we are left with Udo’s finding that in this case the exponential smoothing model (that doesn’t use the Google data) is simpler and performs better! Of course, we can’t draw any broad-brush conclusions based on just the few examples exhibited in this paper. However, this reminds us that when it comes to forecasting (and not just fitting models to history), simpler is often better -- and simpler is always preferred.