I am noticing a trend. At the ASSA meetings in January (where economics, sociology and finance academics and practitioners gather to discuss their research) I was surprised to see how much “machine learning” was trending with economists. The session “Machine Learning Methods in Economics and Econometrics,” with papers by Susan Athey (Microsoft and Stanford) and Pat Bajari (Amazon and University of Washington), was one of the most popular at the conference. Both authors are joining a group of economists who are cautiously dipping their toes into the area of predictive modeling known as machine learning (ML). While ML tools have become increasingly popular with computer scientists, only recently have economists embraced the value of some of these methods.
The second piece of evidence suggesting that machine learning is trending with economists came from a conference I recently attended at the National Academy of Sciences called “Drawing Causal Inference from Big Data.” The more than 400 attendees from academia, government and industry heard papers from top academics working to merge the field of statistical inference or causality with the tools typically used with “big data.” Here are my highlights from the conference, with recordings of the talks included as links.
- Michael Jordan spoke about the intersection of statistical computing and inference. He reviewed the literature on inference under constraints. My favorite part was the concept of the “Bag of Little Bootstraps,” a method for assessing the quality of estimators computed on a large dataset.
- Judea Pearl spoke about the importance of the causality story with “big data.” For example, “subjects of the big data system (patients for instance) will attempt to pull causality from the users of big data (doctors)”… so correlation won’t be enough for long. He spent a lot of time talking about the problems of observational data and causality (with a lot of reliance on DAGs).
- Thomas Richardson spoke about using observational pharmaceutical data to inform efficacy.
- David Heckerman from Microsoft talked about personalized medicine based on genetic information. He spent some time explaining his work, which he calls FaST-LMM.
- Bernhard Schölkopf explained how causal models can help machine learning and gave an overview of his research in this area.
- Susan Athey’s talk was about using trees to improve causal inference. This talk was my favorite, because it was an overview of many different ML methods that can assist an economist in model specification, and it covered a lot on cross-validation and heterogeneous treatment effects.
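For readers curious about the “Bag of Little Bootstraps” idea from Michael Jordan’s talk, here is a minimal sketch of how it is usually described: draw a few small subsets of the data, resample each subset back up to the full sample size, measure the spread of the estimator within each subset, and average those spreads. The function name `blb_standard_error`, its parameters and their default values are illustrative assumptions, not anything taken from the talk, and the sketch estimates only a standard error.

```python
import random
import statistics


def blb_standard_error(data, estimator=statistics.fmean, n_subsets=5,
                       gamma=0.6, n_resamples=30, seed=0):
    """Rough Bag of Little Bootstraps estimate of an estimator's standard error.

    All names and defaults here are hypothetical choices for illustration.
    """
    rng = random.Random(seed)
    n = len(data)
    # Each "little" subset holds about n**gamma points, far fewer than n.
    b = max(2, int(n ** gamma))

    subset_errors = []
    for _ in range(n_subsets):
        # Draw one small subset without replacement.
        subset = rng.sample(data, b)
        # Resample n points (the full sample size) from the small subset,
        # so each resample only ever touches b distinct values. Large-scale
        # implementations store multinomial counts instead of n copies.
        estimates = [estimator(rng.choices(subset, k=n))
                     for _ in range(n_resamples)]
        # Spread of the estimator within this subset.
        subset_errors.append(statistics.stdev(estimates))

    # Average the per-subset quality assessments.
    return statistics.fmean(subset_errors)
```

Calling `blb_standard_error(values)` on a list of numbers returns an approximate standard error for the sample mean; passing a different `estimator` (say, `statistics.median`) assesses that statistic instead.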
The third piece of evidence for this trend is a shameless plug for an upcoming talk I am giving. At an event hosted by NABE, Big Data Analytics at Work: New Tools for Corporate and Industry Economics, Patrick Hall, Senior Machine Learning Scientist at SAS, and I will give economists an introduction to the methods and technology needed to get started with ML methods. We plan to discuss many of the methods that can be used to glean insight from large data sets, whether they are long or wide. There are very few seats left for the conference, which is June 16-17, so if you haven’t already done so, sign up now! It will be an excellent introduction to both methods and potential applications. I will make sure to post our slides after the conference. Let’s see if the trend toward machine learning in economics continues!