Last year’s Day of the Dead marked a grim milestone. On November 1, the global death toll from the Covid-19 pandemic passed five million, official data suggested. It has now reached 5.5 million. But that figure is a significant underestimate.
Records of excess mortality, a metric that involves comparing all deaths recorded with those expected to occur, show many more people than this have died in the pandemic.
Working out how many more is a complex research challenge. It is not as simple as just counting up each country’s excess mortality figures. Some official data in this regard are flawed, scientists have found. And more than 100 countries do not collect reliable statistics on expected or actual deaths at all, or do not release them in a timely manner.
Demographers, data scientists and public-health experts are striving to narrow the uncertainties for a global estimate of pandemic deaths. These efforts, from both academics and journalists, use methods ranging from satellite images of cemeteries to door-to-door surveys and machine-learning computer models that try to extrapolate global estimates from available data.
Among the modellers, the World Health Organization (WHO) is still working on its first global estimate, but the Institute for Health Metrics and Evaluation in Seattle, Washington, offers daily updates of its own modelled results, as well as projections of how quickly the global toll might rise.
And one of the highest-profile attempts to model a global estimate has come from the news media. The Economist magazine in London has used a machine-learning approach to produce an estimate of 12 million to 22 million excess deaths – or between two and four times the pandemic’s official toll so far.
The uncertainty in this estimate is a discrepancy the size of the population of Sweden. “The only fair thing to present at this point is a very wide range,” says Sondre Ulvund Solstad, a data scientist who leads The Economist’s modelling work. “But as more data come in, we are able to narrow it.”
The scramble to calculate a global death toll while the pandemic continues is an exercise that combines sophisticated statistical modelling with rapid-fire data gathering. Everyone involved knows any answer they provide will be provisional and imprecise. But they feel it is important to try. They want to acknowledge the true size and cost of the human tragedy of Covid-19 and they hope to counter misleading claims prompted by official figures such as China’s count of just under 5,000 Covid-19 deaths.
According to some estimates of excess deaths, the Covid-19 pandemic is the largest since the 1918-20 H1N1 influenza pandemic when scaled to 2020 populations.
Death and taxes are famously the only certainties in life, but countries account for each of them in vastly different ways. Even superficially similar places can have varying approaches to recording Covid-19 deaths. Early in the pandemic, countries such as the Netherlands counted only those individuals who died in hospital after testing positive for the coronavirus SARS-CoV-2.
Neighbouring Belgium included deaths in the community and everyone who died after showing symptoms of the disease, even if they weren’t diagnosed.
That is why researchers quickly turned to excess mortality as a proxy measure of the pandemic’s toll. Excess-death figures are seemingly easy to calculate: compare deaths during the pandemic with the average recorded over the previous five years or so. But even in wealthy countries with comprehensive and sophisticated systems to report deaths, excess-mortality figures can be misleading.
That’s because the most obvious way to calculate them can fail to account for changes in population structure.
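The simple calculation described above can be sketched in a few lines of Python. This is only an illustration: the death counts are hypothetical numbers chosen to mimic an ageing population, and the function name `naive_excess_deaths` is ours, not taken from any statistical office.

```python
from statistics import mean


def naive_excess_deaths(observed, previous_years):
    """Observed deaths minus the plain average of earlier years."""
    expected = mean(previous_years)
    return observed - expected


# Hypothetical annual death counts that rise steadily as the population ages
# (illustrative values only, not real data).
previous_years = [900_000, 910_000, 920_000, 930_000, 940_000]
observed_2020 = 985_000

excess = naive_excess_deaths(observed_2020, previous_years)
print(f"Naive excess deaths: {excess:,.0f}")
```

With these made-up figures the five-year average is 920,000, so the naive excess is 65,000. Because deaths were already rising by 10,000 a year, a baseline that continued that trend would expect more deaths in 2020 and report a smaller excess.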
“We should be careful about this issue, because looking at the average raw data is really flawed,” says Giacomo De Nicola, a statistician at Ludwig Maximilian University of Munich, Germany.
When De Nicola and colleagues worked on a 2021 study to calculate excess mortality caused by the pandemic in Germany, they found that comparing deaths to average mortality in previous years consistently underestimated the number of expected deaths, and so overstated excess deaths.
The reason was a rise in annual national mortality, driven in part by a surge in the number of people aged 80 and above – a generation too young to fight and die in the Second World War.
The difference for Germany is significant. Press-released raw data from the German statistical office last year reported five per cent more deaths in 2020 compared with 2019. But after taking the age structure into account, De Nicola’s group reduced this to just one per cent.
“Due to the lack of a generally accepted method for age-adjustment, I’m pretty certain this issue extends to many more countries,” he says.
Some demographers agree. “It concerns me that some so-called excess-deaths estimates by national statistical offices just use an average of the past five years of deaths as the expected deaths. In ageing populations, this is unlikely to be the best estimate,” says Tom Wilson, a demographer at the University of Melbourne, Australia.
Responding to De Nicola’s work, Felix zur Nieden, a demographer at Germany’s statistical office, says he agrees that raw numbers should be adjusted to take age structure and other subtleties into account.
More-sophisticated analyses adjust the expected deaths baseline to account for such biases, for example by raising the number of expected deaths as a population ages.
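One common way to make such an adjustment is to apply each age group's historical death rate to the current population of that group. The sketch below assumes that approach. It is not the method used by De Nicola's group or by any statistical office, and every number, age band and function name in it is hypothetical.

```python
def expected_deaths_age_adjusted(baseline_rates, current_population):
    """Sum of (historical death rate x current population) over age groups."""
    return sum(
        baseline_rates[group] * current_population[group]
        for group in current_population
    )


# Hypothetical baseline period: population and deaths per age group.
baseline_population = {"0-59": 60_000_000, "60-79": 18_000_000, "80+": 4_000_000}
baseline_deaths = {"0-59": 120_000, "60-79": 360_000, "80+": 400_000}

# Hypothetical pandemic year: only the oldest group has grown.
current_population = {"0-59": 60_000_000, "60-79": 18_000_000, "80+": 5_000_000}
observed_deaths = 1_000_000

# Age-specific death rates observed in the baseline period.
baseline_rates = {
    group: baseline_deaths[group] / baseline_population[group]
    for group in baseline_population
}

naive_expected = sum(baseline_deaths.values())
adjusted_expected = expected_deaths_age_adjusted(baseline_rates, current_population)

naive_excess = observed_deaths - naive_expected
adjusted_excess = observed_deaths - adjusted_expected

print(f"Naive excess:        {naive_excess:,.0f}")
print(f"Age-adjusted excess: {adjusted_excess:,.0f}")
```

In this toy example, one million extra people aged 80 and above raise the expected deaths from 880,000 to about 980,000, so the apparent excess shrinks from 120,000 to roughly 20,000 once the older population is taken into account.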
Probably the most comprehensive of these excess-mortality estimates come from Ariel Karlinsky, an economist at the Hebrew University of Jerusalem in Israel, and Dmitry Kobak, a data scientist at the University of Tübingen, Germany.
Since January 2021, Karlinsky and Kobak have produced a regularly updated database of all-cause mortality before and during the pandemic (2015–21) from as many sources and for as many places as possible — currently some 116 countries and territories.
The bulk of the information in the database, called the World Mortality Dataset (WMD), comes from official death statistics collected and published by national offices and governments. The duo then works with these data to estimate excess mortality, including trying to take into account death tolls associated with armed conflict, natural disasters and heatwaves.
For example, they assumed that 4,000 lives were lost in both Armenia and Azerbaijan during the 2020 Nagorno-Karabakh war.
Karlinsky, who previously worked on health economics, recognized that even the best epidemiological models were based on official reported Covid-19 numbers that, for many places, were clearly too low or missing entirely. “Many people had been throwing around their conjectures about excess mortality without basing it on data,” he says.
In many cases, Karlinsky and Kobak’s estimates of excess deaths diverge significantly from Covid-19 mortality statistics released by governments. Russia, for instance, reported more than 300,000 Covid-19 deaths by the end of 2021, but is likely to have exceeded 1 million excess deaths in that time (see ‘Excess deaths’).
- A Wired report