Machine Learning in Economics… A Fad?

Over the holiday weekend (in the United States), The Economist ran an article titled “Economists are Prone to Fads, and the Latest is Machine Learning”. As I am currently taking a class on ‘Big Data for Economists’, this article piqued my interest. The article included a chart visualizing some recent trends within economics, but a fad implies short-term enthusiasm that ultimately dies off. The chart doesn’t show any decay in the use (erm… mentions in NBER working-paper abstracts) of these methodological innovations. In fact, all of these methods, besides DSGE, are continuing to trend upwards. So are these methods a fad or are they here to stay?


The article makes the (correct) observation that these methodological innovations are sometimes misused by economists. It cites the recent paper by Angus Deaton and Nancy Cartwright to show that RCTs are sometimes run without a good understanding of the systematic differences between the treatment and control groups, or without an underlying theoretical model to be tested. The result is research that may not be very helpful for policymakers, business owners, or those who want to learn a little bit about how the world works. I accept this critique, and I’d expect many others do too, but it is nowhere near a reason to give up on RCTs as one available research method within the toolkit of a social scientist. A much more nuanced discussion of this topic seems to be forthcoming in Tim Ogden’s book Experimental Conversations.

Additionally, the article doesn’t say much about the quasi-experimental methods (difference-in-differences and regression discontinuity) displayed in the chart above. For a number of reasons these methods seem potentially less faddish than lab or field experiments. Of course, these methods have been misused as well, but it seems wrong to claim that they are at all similar to fashion trends. On the contrary, like fixed/random effects models or instrumental variables, these methods can be powerful tools for learning about how X causes Y.

Finally, machine learning and big data. A lot of the conversations that surround the use of machine learning and big data in economics are riddled with buzzwords, but each of these buzzwords describes something that I think is a valuable addition to the toolkit of economists. Take “big data”: the term can mean a number of things, from simply working with an extraordinarily large dataset (i.e. large N or K) to working with unorganized or unruly data (e.g. data from online interactions or data from images). Or take “machine learning”: this term encompasses everything from text analysis to regression trees to random forests to satellite image analysis. Most of the time machine learning is used with big data, and the combinations of possibilities are nearly endless. Many machine learning and/or big data methods simply make it possible to analyze data of this size or messiness at all, and therefore seem more like an innovation that is here to stay than a fad.

As Hal Varian’s excellent 2014 JEP article describes, data analysis in econometrics can be broken down into four categories: (1) prediction, (2) summarization, (3) estimation, and (4) hypothesis testing. Machine learning is primarily concerned with prediction (i.e. just one part of the task of an empirical economist).

Improved methods of prediction could prove important for the work of many empirical economists, especially when machine learning is used in conjunction with other empirical methodologies popular in economics today. For example, machine learning methods can help with targeting pro-poor policies. Improved targeting or diagnosis in the context of a field experiment could increase the statistical power of an experiment. Another example is using machine learning in conjunction with a propensity score matching exercise, which could improve the efficiency of the estimation of causal effects. With this distinction between prediction and the other tasks in mind, I think it is clear that the methods of machine learning and big data are not a fad, but instead are worthwhile innovations for almost any economist today.
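To make the propensity score example a bit more concrete, here is a minimal sketch of how a prediction step can feed a matching step. Everything in it is an assumption for illustration: the data are simulated, the function names (simulate, fit_propensity, naive_difference, matched_att) are hypothetical, and a hand-rolled logistic regression stands in for whatever machine learning method one would actually use to predict treatment. The prediction step estimates each unit's probability of treatment; the matching step pairs each treated unit with the control whose predicted probability is closest.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=1000, true_effect=2.0, seed=0):
    """Hypothetical data: x raises both the odds of treatment and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        t = 1 if rng.random() < sigmoid(x) else 0
        y = true_effect * t + 3.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows


def fit_propensity(rows, lr=0.5, epochs=300):
    """Prediction step: fit P(T=1 | x) = sigmoid(a + b*x) by gradient descent.

    A stand-in for any learner that predicts treatment status.
    """
    a, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, t, _y in rows:
            err = sigmoid(a + b * x) - t
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def naive_difference(rows):
    """Raw treated-minus-control mean outcome, ignoring confounding by x."""
    treated = [y for _x, t, y in rows if t == 1]
    control = [y for _x, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def matched_att(rows, a, b):
    """Matching step: nearest-neighbor match (with replacement) on the score."""
    scored = [(sigmoid(a + b * x), t, y) for x, t, y in rows]
    controls = [(p, y) for p, t, y in scored if t == 0]
    gaps = []
    for p, t, y in scored:
        if t == 1:
            _p_match, y_match = min(controls, key=lambda c: abs(c[0] - p))
            gaps.append(y - y_match)
    return sum(gaps) / len(gaps)


rows = simulate()
a, b = fit_propensity(rows)
print("naive difference:", round(naive_difference(rows), 2))
print("matched estimate:", round(matched_att(rows, a, b), 2))
```

In this simulated setup the true effect is set to 2.0, and the naive comparison is contaminated by x while the matched comparison is much less so. The sketch says nothing about standard errors or about how a more flexible learner would behave on real data.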

[P.S. Before I was able to post this, Noah Smith wrote about some similar thoughts on his own blog.]
