Researchers are using Google to forecast economic activity

Anna Simoni
CNRS Director of Research at CREST and Professor of Econometrics and Statistics at ENSAE (IP Paris)
Key takeaways
  • For some years now, Google's search data has been used to monitor or forecast economic activity.
  • This data, available on a weekly basis, responds to a need for speed, as traditional indicators such as GDP take longer to become available.
  • Google searches are an interesting indicator of economic health as they provide real-time information on Google users' perception of the economy and their willingness to consume.
  • Indicators from Google are particularly relevant in times of crisis, as they react quickly to changes in the economy.

Why have research­ers and insti­tu­tions, such as the OECD, turned to Google data to fore­cast coun­tries’ eco­nom­ic activ­ity? What needs does it address? 

Usu­ally, to make mac­roe­co­nom­ic fore­casts, we use data from cent­ral banks or stat­ist­ic­al insti­tutes such as INSEE. This data is very inform­at­ive, but it is not imme­di­ately avail­able. That’s why people are inter­ested in oth­er data sources that can provide inform­a­tion in real-time. 

If an eco­nom­ic poli­cy­maker needs to boost the eco­nomy, for example, they need to know what the cur­rent eco­nom­ic situ­ation is. This is not pos­sible using only the offi­cial data. GDP data is pub­lished quarterly, on aver­age one and a half months after the end of the quarter in ques­tion. It is there­fore impossible to adjust eco­nom­ic policies instantly. The idea of using altern­at­ive sources, of which Google is one, is really to address this prob­lem of delayed offi­cial data. 

What Google tools are used in this mac­roe­co­nom­ic fore­cast­ing work?

There are two types of data from Google: Google Trends and Google Search. The primary source of both data­bases is the same, how­ever: words typed into the Google search engine. Most research­ers use Google Trends: a web page that every­one has access to. The data cor­res­ponds to search trends by coun­try and by cat­egory (enter­tain­ment, busi­ness, health, sci­ence, sports). Google assigns the search keyword to a category. 


Google Search, on the other hand, provides data sets from internet searches, made available by Google and given to the European Central Bank. The two databases are constructed differently: Google Trends looks at the volume of searches, while Google Search gives information on the change in volume. In my study, we worked with the Google Search data. 

So, the idea is to pre­dict eco­nom­ic health by ana­lys­ing what Google users type into the search engine. Where did this idea come from and why is this data relevant? 

The first papers on the subject were published by Hal Varian, chief economist at Google. This data is fairly new: it has been available since 2004, and I started my project using Google data in 2016. The intuition behind using this data for macroeconomic forecasting is that Google searches can be seen as a summary of how people perceive the economy. If the economy is healthy, people tend to search for culture, travel, etc. On the other hand, if there are problems with unemployment, there will be more job-related searches. 

How effect­ive is this data in pre­dict­ing a coun­try’s eco­nom­ic activ­ity? Is it use­ful for pre­dict­ing peri­ods of growth and recession?

What I have observed in my research is that these tools are par­tic­u­larly use­ful in times of crisis. Dur­ing the crisis of 2008–2009, for example, Google data anti­cip­ated eco­nom­ic activ­ity well because it is more respons­ive to change, com­pared to offi­cial data.

However, Google data has very little correlation with GDP. Except in times of crisis, official information is still more informative. It is also essential to make a pre-selection, as there are about 300 categories per country. If you use them all, it can make the estimation process less clear. Before making a forecast, it is therefore necessary to select the Google categories most correlated with GDP. If this is done, the results can be very informative, even for stable periods, and when no official information is available. 
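As a rough illustration of what such a pre-selection step could look like, here is a minimal Python sketch. It ranks categories by the absolute value of their Pearson correlation with GDP growth and keeps the top few. The ranking criterion, the function names (`pearson`, `preselect_categories`), the category names and all the numbers are assumptions made for this example; the interview does not specify the exact selection method used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no usable signal
    return cov / sqrt(var_x * var_y)


def preselect_categories(categories, gdp_growth, top_k):
    """Return the top_k category names ranked by |correlation| with GDP growth.

    Hypothetical helper: the absolute value is used so that categories that
    move against GDP (for example job searches) are kept as well.
    """
    scores = {
        name: abs(pearson(series, gdp_growth))
        for name, series in categories.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]


# Made-up quarterly numbers, for illustration only (not real data).
gdp_growth = [0.5, 0.4, -1.2, -0.3, 0.6, 0.7]
categories = {
    "travel": [1.0, 0.8, -2.0, -0.5, 1.1, 1.3],     # moves with GDP
    "jobs": [-0.2, -0.1, 1.5, 0.6, -0.4, -0.5],     # moves against GDP
    "weather": [0.3, -0.3, 0.3, -0.3, 0.3, -0.3],   # mostly noise
}

selected = preselect_categories(categories, gdp_growth, top_k=2)
print(selected)
```

With around 300 categories per country, a filter of this kind reduces the number of predictors before any forecasting model is estimated.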

Which search categories are most useful for forecasting economic activity? 

The most cor­rel­ated cat­egor­ies are often con­sumer-related, such as leis­ure and enter­tain­ment. This is eas­ily explained: dur­ing times of eco­nom­ic sta­bil­ity, people are more inclined to buy. We should also con­sider cat­egor­ies related to social net­works. People may be more or less act­ive in their use of social net­works, depend­ing on the state of the eco­nomy: get­ting inform­a­tion on plat­forms or con­sult­ing sites like Linked­In to find job offers, for example. 

What are the advant­ages and lim­it­a­tions of Google data com­pared to offi­cial data? 

The main advantage over official data is speed. People's perception of the economy is almost instantaneous: we listen to the news, we see that there is a war or a political crisis, we react immediately, we adapt our behaviour. Industries, however, take much longer to adapt to an economic crisis; it doesn't happen overnight. Most economic actors are slower to react. 

The main limitation is that this data is difficult to use. In my study we tried a number of methods, and some of them did not work at all. For example, pre-selecting search categories only works during periods of stability: in times of crisis, you should not pre-select. 

How do you see the future of eco­nom­ic fore­cast­ing, in terms of data sources? Do you think that the use of Google will con­tin­ue to grow? 

I don't imagine that any one data source will become better than others: we will continue to use several sources and several models. Depending on the economic context, we will have better forecasts with certain data. What we need to do now, in addition to further automating the methods we have put in place, is to compare the forecasting performance of Google data with that of other alternative data sources, such as newspaper articles that follow economic and financial news: this is what I am trying to apply in my research.

Interview by Sirine Azouaoui
