
Researchers are using Google to forecast economic activity

Anna Simoni
CNRS Director of Research at CREST and Professor of Econometrics and Statistics at ENSAE (IP Paris)
Key takeaways
  • For some years now, Google's search data has been used to monitor or forecast economic activity.
  • This data, available on a weekly basis, responds to a need for speed, since traditional indicators such as GDP take longer to become available.
  • Google searches are an interesting indicator of economic health as they provide real-time information on Google users' perception of the economy and their willingness to consume.
  • Indicators from Google are particularly relevant in times of crisis, as they react quickly to changes in the economy.

Why have researchers and institutions, such as the OECD, turned to Google data to forecast countries’ economic activity? What needs does it address?

Usually, to make macroeconomic forecasts, we use data from central banks or statistical institutes such as INSEE. This data is very informative, but it is not immediately available. That’s why people are interested in other data sources that can provide information in real time.

If an economic policymaker needs to boost the economy, for example, they need to know what the current economic situation is. This is not possible using official data alone. GDP data is published quarterly, on average one and a half months after the end of the quarter in question. It is therefore impossible to adjust economic policies instantly. The idea of using alternative sources, of which Google is one, is really to address this problem of delayed official data.

What Google tools are used in this macroeconomic forecasting work?

There are two types of data from Google: Google Trends and Google Search. The primary source of both databases is the same, however: the words typed into the Google search engine. Most researchers use Google Trends, a web page that everyone has access to. The data corresponds to search trends by country and by category (entertainment, business, health, science, sports), and Google assigns each search keyword to a category.

Google Search, on the other hand, provides data sets from internet searches, made available by Google and given to the European Central Bank. The two databases are constructed differently: Google Trends looks at the volume of searches, while Google Search gives information on the change in volume. In my study, we worked with the Google Search data.
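To make the distinction between a volume and a change in volume concrete, here is a minimal sketch in Python. It is an illustration only: the interview does not describe how either database is actually built, so the function name `relative_changes`, the week-over-week definition of "change" and the sample numbers are all assumptions.

```python
def relative_changes(volumes):
    """Convert a series of search volumes into period-over-period relative changes.

    Assumption for illustration: "change in volume" is taken to mean
    (current - previous) / previous. The real data set may be defined differently.
    """
    changes = []
    for previous, current in zip(volumes, volumes[1:]):
        if previous == 0:
            # A relative change is undefined when the previous volume is zero.
            changes.append(None)
        else:
            changes.append((current - previous) / previous)
    return changes


# Hypothetical weekly search volumes for one category (made-up numbers).
weekly_volume = [100, 110, 99, 99]
print(relative_changes(weekly_volume))
```

A series of levels and a series of changes carry the same underlying information, but the second one shows turning points directly, which is what matters when tracking a fast-moving economy.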

So, the idea is to predict economic health by analysing what Google users type into the search engine. Where did this idea come from and why is this data relevant?

The first papers on the subject were published by Hal Varian, chief economist at Google. This data is fairly new: it has been available since 2004, and I started my project using Google data in 2016. The intuition behind using this data for macroeconomic forecasting is that Google searches can be seen as a summary of how people perceive the economy. If the economy is healthy, people tend to search for culture, travel and so on. On the other hand, if there are problems with unemployment, there will be more job-related searches.

How effective is this data in predicting a country’s economic activity? Is it useful for predicting periods of growth and recession?

What I have observed in my research is that these tools are particularly useful in times of crisis. During the crisis of 2008–2009, for example, Google data anticipated economic activity well because it responds to change more quickly than official data.

However, the data from Google has very little correlation with GDP. Except in times of crisis, official information is still more informative. It is also essential to make a pre-selection, as there are about 300 categories per country. Using them all can muddy the estimation process. Before making a forecast, it is therefore necessary to select the Google categories most correlated with GDP. If this is done, the results can be very informative, even for stable periods and when no official information is available.
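The pre-selection step described above can be sketched as follows. This is a hedged illustration rather than the method used in the study: the interview only says that the most correlated categories are selected, so ranking by absolute Pearson correlation, the helper names `pearson` and `preselect`, the category names and all the numbers are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def preselect(categories, gdp_growth, keep):
    """Return the names of the `keep` categories most correlated with GDP growth.

    Assumption: categories are ranked by absolute correlation, so a strongly
    negative relationship (e.g. job searches) counts as informative too.
    """
    ranked = sorted(
        categories,
        key=lambda name: abs(pearson(categories[name], gdp_growth)),
        reverse=True,
    )
    return ranked[:keep]


# Made-up quarterly figures, for illustration only.
gdp_growth = [0.4, 0.1, -0.3, 0.2, 0.5]
categories = {
    "leisure": [1.0, 0.2, -0.8, 0.5, 1.1],
    "job_search": [-0.5, 0.0, 0.9, -0.2, -0.7],
    "weather": [0.3, -0.3, 0.3, -0.3, 0.0],
}

selected = preselect(categories, gdp_growth, keep=2)
print(selected)
```

With roughly 300 categories per country and only a few dozen quarters of GDP data, a filter of this kind reduces the number of predictors before a forecasting model is estimated.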

Which search categories are most useful for forecasting economic activity?

The most correlated categories are often consumer-related, such as leisure and entertainment. This is easily explained: during times of economic stability, people are more inclined to buy. We should also consider categories related to social networks. People may be more or less active in their use of social networks, depending on the state of the economy: getting information on platforms or consulting sites like LinkedIn to find job offers, for example.

What are the advantages and limitations of Google data compared to official data?

The main advantage over official data is speed. Our reading of the economy is almost instantaneous: we listen to the news, we see that there is a war or a political crisis, we react immediately and adapt our behaviour. Industries, however, take much longer to adapt to an economic crisis; it doesn’t happen overnight. Most economic actors are slower to react.

The main limitation is that this data is difficult to use. In my study we tried a number of methods, and some of them did not work at all. For example, the method of pre-selecting search categories only works during periods of stability: in times of crisis, you should not pre-select.

How do you see the future of economic forecasting, in terms of data sources? Do you think that the use of Google will continue to grow?

I don’t imagine that any one data source will become better than others: we will continue to use several sources and several models. Depending on the economic context, we will have better forecasts with certain data. What we need to do now, in addition to further automating the methods we have put in place, is to compare the forecasting performance of Google data against other alternative sources of data, such as newspaper articles that follow economic and financial news: this is what I am trying to do in my research.

Interview by Sirine Azouaoui
