
Researchers are using Google to forecast economic activity

Anna Simoni
CNRS Director of Research at CREST and Professor of Econometrics and Statistics at ENSAE (IP Paris)
Key takeaways
  • For some years now, Google's search data has been used to monitor or forecast economic activity.
  • This data, available on a weekly basis, responds to a need for speed, since traditional indicators such as GDP take longer to become available.
  • Google searches are an interesting indicator of economic health as they provide real-time information on Google users' perception of the economy and their willingness to consume.
  • Indicators from Google are particularly relevant in times of crisis, as they react quickly to changes in the economy.

Why have researchers and institutions, such as the OECD, turned to Google data to forecast countries’ economic activity? What needs does it address?

Usually, to make macroeconomic forecasts, we use data from central banks or statistical institutes such as INSEE. This data is very informative, but it is not immediately available. That’s why people are interested in other data sources that can provide information in real time.

If an economic policymaker needs to boost the economy, for example, they need to know what the current economic situation is. This is not possible using only the official data. GDP data is published quarterly, on average one and a half months after the end of the quarter in question. It is therefore impossible to adjust economic policies instantly. The idea of using alternative sources, of which Google is one, is really to address this problem of delayed official data.

What Google tools are used in this macroeconomic forecasting work?

There are two types of data from Google: Google Trends and Google Search. Both draw on the same primary source: words typed into the Google search engine. Most researchers use Google Trends, a web page that everyone has access to. The data corresponds to search trends by country and by category (entertainment, business, health, science, sports). Google assigns each search keyword to a category.

Google Search, on the other hand, provides data sets from internet searches, made available by Google and given to the European Central Bank. The two databases are constructed differently: Google Trends looks at the volume of searches, while Google Search gives information on the change in volume. In my study, we worked with the Google Search data.
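
To make the distinction between a volume and a change in volume concrete, here is a minimal Python sketch. The weekly figures are invented, and the percentage-change formula is only an assumption for illustration: the interview does not say how either Google data set is actually computed.

```python
def pct_change(levels):
    """Period-over-period percentage change of a series of positive levels.

    Hypothetical helper: one simple way to turn a volume series
    into a change-in-volume series.
    """
    return [100.0 * (curr - prev) / prev for prev, curr in zip(levels, levels[1:])]


# Invented weekly search volumes for one category (a level series).
weekly_volume = [80, 100, 90, 99]

# The matching change series has one fewer observation than the level series.
weekly_change = pct_change(weekly_volume)
print(weekly_change)
```

A level series shows how much people search for a topic; a change series shows whether that interest is rising or falling, which is often the more useful signal when tracking turning points.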

So, the idea is to predict economic health by analysing what Google users type into the search engine. Where did this idea come from and why is this data relevant?

The first papers on the subject were published by Hal Varian, chief economist at Google. This data is fairly new: it has been available since 2004, and I started my project using Google data in 2016. The intuition behind using this data for macroeconomic forecasting is that Google searches can be seen as a summary of how people perceive the economy. If the economy is healthy, people tend to search for things such as culture and travel. On the other hand, if there are problems with unemployment, there will be more job-related searches.

How effective is this data in predicting a country’s economic activity? Is it useful for predicting periods of growth and recession?

What I have observed in my research is that these tools are particularly useful in times of crisis. During the crisis of 2008–2009, for example, Google data anticipated economic activity well because it responds to change more quickly than official data.

However, the data from Google has very little correlation with GDP. Except in times of crisis, official information is still more informative. It is also essential to make a pre-selection, as there are about 300 categories per country. Using them all can make the estimation process less clear. Before making a forecast, it is therefore necessary to select the Google categories most correlated with GDP. If this is done, the results can be very informative, even for stable periods, and when no official information is available.
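
The pre-selection step described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used in the study: the ranking rule (absolute Pearson correlation, keep the top k), the function names and all the numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / (sd_x * sd_y)


def preselect(categories, gdp_growth, k):
    """Names of the k categories most correlated (in absolute value) with GDP growth."""
    scores = {name: abs(pearson(series, gdp_growth)) for name, series in categories.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented quarterly GDP growth and three invented search-category series.
gdp_growth = [0.5, 0.3, -1.2, -0.4, 0.6, 0.7]
categories = {
    "travel": [1.0, 0.8, -2.0, -0.5, 1.1, 1.3],     # moves with GDP
    "jobs": [-0.2, 0.1, 1.5, 0.6, -0.3, -0.4],      # moves against GDP
    "weather": [0.1, -0.1, 0.1, -0.1, 0.1, -0.1],   # essentially unrelated
}

selected = preselect(categories, gdp_growth, k=2)
print(selected)
```

Ranking on the absolute value keeps categories that move against GDP, such as job-related searches, as well as those that move with it. With about 300 categories per country, the same idea would simply be applied to a much larger dictionary.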

Which search categories are most useful for forecasting economic activity?

The most correlated categories are often consumer-related, such as leisure and entertainment. This is easily explained: during times of economic stability, people are more inclined to buy. We should also consider categories related to social networks. People may be more or less active in their use of social networks, depending on the state of the economy: getting information on platforms or consulting sites like LinkedIn to find job offers, for example.

What are the advantages and limitations of Google data compared to official data?

The main advantage over official data is speed. People’s view of the economy updates almost instantaneously: we listen to the news, we see that there is a war or a political crisis, we react immediately, we adapt our behaviour. Industries, however, take much longer to adapt to an economic crisis; it doesn’t happen overnight. Most economic actors are slower to react.

The main limitation is that this data is difficult to use. In my study we tried a number of methods, and some of them did not work at all. For example, the method of pre-selecting search categories only works during periods of stability: in times of crisis, you should not pre-select.

How do you see the future of economic forecasting, in terms of data sources? Do you think that the use of Google will continue to grow?

I don’t imagine that any one data source will become better than others: we will continue to use several sources and several models. Depending on the economic context, we will have better forecasts with certain data. What we need to do now, in addition to further automating the methods we have put in place, is to compare the forecasting performance of Google data with that of other alternative sources of data, such as newspaper articles that follow economic and financial news. This is what I am trying to do in my research.

Interview by Sirine Azouaoui
