Covid-19 death rates are higher than reported

Nicolas Chopin
Professor of data science and machine learning at the ENSAE Centre for Research in Economics and Statistics*

Since the beginning of the pandemic, the world has turned its attention to Covid-19 data. The figures most often cited are the counts of new cases and of deaths. Yet neither of these figures is completely reliable. During the lockdown in March 2020, Nicolas Chopin, professor of data science and statistics at ENSAE, took it upon himself to look deeper into the figures provided by the French authorities. As others have done around the world, he found that true Covid-19 mortality rates were likely to be higher than those reported.

During the lockdown, you looked at the pandemic data differently. What did you find?

At the beginning of the pandemic everyone was talking about Covid-19 and I wanted to be helpful in some way. I’m a data scientist, so I sought out open data. Originally, I wanted to compare the evolution of the pandemic between countries. But very quickly I was confronted with the reality that this isn’t possible. Testing strategies varied enormously between nations – especially at the beginning of the pandemic – so it was obvious that the number of new cases was an unreliable basis for comparison.

Instead, I looked at the number of deaths. Access to the data was actually very easy in France because the two main statistics institutes, Santé Publique France (SPF) and the Institut National de la Statistique et des Études Économiques (INSEE), provide precise data on a daily basis. As the pandemic progressed, they also steadily improved the data made available.

Nevertheless, these figures didn’t quite match up either. What was missing from the data?

First, the SPF provides extensive data coming from hospitals. However, this is limited to hospital-related information and so only covers patients treated in hospital. Second, the INSEE releases exhaustive data on deaths that have occurred since March 2020, but these data contain no information on the cause of death. So on its own, each data set is difficult to interpret. Together, however, they provide valuable insight into the progression of the pandemic in France.

I started by comparing the number of deaths provided by the INSEE in previous years to those of this year. Of course, there was an increase: this is the excess mortality, meaning the number of extra deaths compared to the number of expected deaths. And it was 60% higher on average than the recorded Covid-19 deaths provided by the SPF. It is likely that these additional deaths are mostly Covid-19 deaths in retirement homes for the elderly. But they could also be due to related causes, such as lack of treatment for other health problems because of the crisis.
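To make the comparison concrete, here is a minimal Python sketch of this kind of calculation. The weekly figures are made-up placeholders, not INSEE or SPF data, and the baseline (a simple average of the same week in previous years) is an assumption: the interview does not say which baseline was used. The function names are hypothetical.

```python
from statistics import mean


def excess_deaths(deaths_this_year, deaths_previous_years):
    """Observed deaths minus expected deaths (mean of previous years)."""
    expected = mean(deaths_previous_years)
    return deaths_this_year - expected


def excess_to_reported_ratio(excess, reported_covid_deaths):
    """Number of excess deaths per reported Covid-19 death."""
    if reported_covid_deaths <= 0:
        raise ValueError("reported_covid_deaths must be positive")
    return excess / reported_covid_deaths


# Hypothetical numbers for one week in one region (illustration only).
all_cause_2020 = 1800
all_cause_same_week_earlier = [1000, 1020, 980]  # assumed baseline years
hospital_covid_deaths = 500

excess = excess_deaths(all_cause_2020, all_cause_same_week_earlier)
ratio = excess_to_reported_ratio(excess, hospital_covid_deaths)
print(f"excess deaths: {excess:.0f}, ratio: {ratio:.2f}")
```

With these placeholder inputs the ratio comes out at 1.6, which is what "60% higher than the recorded Covid-19 deaths" means in ratio form.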

Later, I analysed the same data separately for men and women. Here, I found that the factor was 1.56 for men and 2.4 for women. This makes sense, as there are three times more women than men in retirement homes on average across the country. These institutions are much less well equipped to collect and share data, hence the discrepancy in the numbers.
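One way such a per-sex factor could be estimated is sketched below in Python. The interview does not state the estimation method; a least-squares line through the origin, fitted to one point per region and week, is assumed here purely for illustration. The records are invented, and `slope_through_origin` is a hypothetical helper name.

```python
from collections import defaultdict


def slope_through_origin(points):
    """Least-squares slope b for y = b * x, with no intercept."""
    sxx = sum(x * x for x, _ in points)
    if sxx == 0:
        raise ValueError("need at least one non-zero x value")
    sxy = sum(x * y for x, y in points)
    return sxy / sxx


# Hypothetical records: (sex, hospital Covid-19 deaths, excess deaths),
# one per region and week. The values are invented for illustration.
records = [
    ("men", 100, 150), ("men", 200, 300), ("men", 50, 75),
    ("women", 80, 160), ("women", 120, 240), ("women", 40, 80),
]

# Group the (x, y) points by sex, then fit one slope per group.
by_sex = defaultdict(list)
for sex, hospital_deaths, excess in records:
    by_sex[sex].append((hospital_deaths, excess))

factors = {sex: slope_through_origin(pts) for sex, pts in by_sex.items()}
for sex, factor in factors.items():
    print(f"{sex}: factor {factor:.2f}")
```

A factor above 1 for a group means that its excess deaths exceed its recorded hospital Covid-19 deaths; the larger the factor, the larger the share of deaths missing from the hospital counts.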

Excess mortality versus Covid-19 deaths in hospital during weeks 13 to 15 of 2020. Each dot corresponds to one week, one region, and one sex (men or women).

In the end, did this give you a way to compare French data with that of other countries, as you had originally hoped?

I contacted colleagues of mine in Italy, Spain, England and Germany. But no other country could provide me with up-to-date data like ours. The English equivalent of the SPF, Public Health England (PHE), struggled to keep up to date, so much less data was available. In France, we have encountered problems like this in the past. During the summer heat wave in 2003, many elderly people lost their lives. At the time, there was little mortality data in the public domain, and it was funeral directors who sounded the alarm as they were inundated with funerals. As a result, the data is now readily available.

Your work was only published on your blog, but it still made an impact. Can you tell us more about what happened?

An interesting point is that many people could have done it, not just me. The analysis that I ran is basic enough to have been done by an undergraduate student. Yet it highlights how much of data science is about having the right data set to study in the first place. Others came to the same conclusion about how to analyse the data, independently of my analyses. Specialised journalists at the New York Times ran a story on excess mortality, which was also picked up by The Guardian and Le Monde.

For my part, I was invited to a round table at the Académie des Sciences along with other mathematicians who worked on different aspects of the pandemic. Also, another data scientist in the USA, Gaurav Sood, contacted me. He went further with the analyses for his own blog, calculating the average number of years of life lost per death in different countries. The idea behind this is that many people might say “the ones who die are old and they would have died anyway”. On the contrary, he showed that those who died of Covid-19 lost 9 years on average.
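A figure like this can be understood as a death-weighted average of remaining life expectancy at the age of death. The Python sketch below shows that arithmetic only; it is not Gaurav Sood's actual method, and the age groups, death counts and life expectancies are invented placeholders rather than real life-table or Covid-19 data.

```python
def average_years_lost(deaths_by_age, remaining_life_expectancy):
    """Death-weighted mean of remaining life expectancy at age of death."""
    total_deaths = sum(deaths_by_age.values())
    if total_deaths == 0:
        raise ValueError("no deaths recorded")
    total_years = sum(
        count * remaining_life_expectancy[age_group]
        for age_group, count in deaths_by_age.items()
    )
    return total_years / total_deaths


# Invented placeholder values (illustration only).
deaths_by_age = {"60-69": 10, "70-79": 30, "80+": 60}
remaining_life_expectancy = {"60-69": 18.0, "70-79": 10.0, "80+": 5.0}

avg = average_years_lost(deaths_by_age, remaining_life_expectancy)
print(f"average years of life lost per death: {avg:.1f}")
```

The point of the weighting is that even when most deaths occur in the oldest group, the average stays well above zero, because people who reach an advanced age still have several years of expected life ahead of them.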

There are very few countries where the data on the number of cases are completely reliable, except perhaps Germany. In certain countries with a lack of testing, like Bolivia, the only way to follow the pandemic is via excess mortality. In France, we are somewhere in the middle. We have data for the number of cases, but excess mortality can serve as a guide to a truer picture for comparison. In the USA, the Covid-19 case figures are not well measured. This is a politically sensitive issue. For me, making data like this open and transparent is part of a healthy democratic society.

Do your findings on excess mortality still hold for the second wave?

Not all the data are available yet: the second wave is still under way and all-cause mortality data are made available with a two-week delay. However, on the basis of the data available so far, it seems to me that the phenomenon observed during the first wave is now less marked. In other words, the difference between Covid-19 mortality observed in hospital, on the one hand, and excess mortality on the other, is still there but is smaller. At this stage, any explanation is necessarily very hypothetical, but it may simply be that retirement homes are now better able to cope with Covid-19.

To find out more, you can visit Nicolas Chopin’s blog.

Interview by James Bowers

Contributors

Nicolas Chopin

Professor of data science and machine learning at the ENSAE Centre for Research in Economics and Statistics*

Nicolas Chopin is the author or co-author of one book and more than 60 research papers on topics such as computational statistics, Bayesian inference, and probabilistic machine learning. He is a fellow of the IMS (Institute of Mathematical Statistics), a member and former secretary of the research section of the RSS (Royal Statistical Society), and a current or former associate editor for Annals of Statistics, Biometrika, Journal of the Royal Statistical Society, Statistics and Computing, and Statistical Methods & Applications.
*CREST: a joint research unit of CNRS, École Polytechnique - Institut Polytechnique de Paris, ENSAE Paris - Institut Polytechnique de Paris, GENES
