It’s So Hard to Know Who’s Dying of COVID-19—and When

It can take days for each death to be recorded in official statistics. “Nowcasting” estimates the actual occurrence of deaths, and the true peak of the pandemic.

David Adam
May 18, 2020

ABOVE: © ISTOCK.COM, SUDOK1

Every day for seven weeks since the end of March, the British government published a comparison of COVID-19 deaths in different countries. Last week, it stopped. A government spokesperson claimed that’s because ministers “varied the content and format” of the information they present to the public, according to news reports. Critics point out this need for variety only arose after the line on the graph to indicate cumulative UK deaths climbed above that of every other European country.

According to the most recent tallies from May 12, 32,692 people in the UK had lost their lives to the disease. That compares with 30,911 in Italy, 26,920 in Spain, 26,643 in France, and 7,667 in Germany. In the United States the recorded toll on that date was 81,779 deaths.

Those data look precise but they are anything but. Epidemiologists are fond of saying that all models are wrong but some are useful. The same is true for mortality statistics. Public health officials have long battled with inconsistent and delayed reporting of deaths, even within different regions of the same country. And that long-standing issue now poses a problem for researchers trying to track the pandemic and understand its implications.

“Every country’s vital registration system is different and no country’s vital registration system is perfect,” says Mark Hayward, a sociologist at the University of Texas at Austin who advises the US Centers for Disease Control and Prevention on mortality statistics. “They all have their own built-in quirks and lags.”

The problems start with identifying the cause of death and attributing it—or not—to COVID-19. Countries such as the Netherlands count only those individuals who died in the hospital after testing positive for the virus. Neighboring Belgium includes deaths in the community and everyone who died after showing symptoms of the disease, even if they didn’t undergo a diagnostic PCR test.

To be included in a country’s national count, each death must be registered locally and then recorded in a more centralized accounting system. Or systems. Different places do this in their own way too. At the start of the outbreak, UK government figures presented only deaths recorded electronically in the National Health Service (NHS) hospital network. At the end of April, they were extended to include a separate set of figures, collected by the Office for National Statistics, of community deaths, chiefly in care homes.

Reports of community deaths tend to take longer to collate than those from hospitals do. “In the United States they are based on counts from counties that are provided,” Hayward says. There are around 3,000 counties in the US. “You can imagine the different types of counties and how fast they can get mortality information, collect it, code it, and report it. It takes a while to roll this information up.”

In some national systems, the delay is because reports are still sent in by fax or post.

Sometimes it can take several days to identify and inform family members, who tend to be told first, before the death report is processed. And in some countries—England but not Scotland, for example—any coroner-led inquiries opened to determine cause of death, common when a healthcare worker dies, must reach a conclusion before the death is officially recorded. Such a procedure can take weeks, months, even years.

What many people don’t realize, says Sheila Bird, a statistician at the MRC Biostatistics Unit at the University of Cambridge, is that these delays introduce a significant lag in mortality data. So, reported deaths for a given day or week—say, the week ending today—don’t typically reveal the number of people who died in the past seven days. They show only the number of people whose deaths were recorded that week. And in some cases these people died weeks or even months before.

“It is potentially important for some occupations that we may be undercounting the number of deaths,” Bird tells The Scientist—specifically, work-related deaths that would be referred to a coroner.

For some research, the delay—while not ideal—does not prevent useful analysis of mortality data. Maher El Chaar, a bariatric surgeon at St. Luke’s University Health Network in Pennsylvania, has used death statistics from New York City to compare fatalities among COVID-19 patients from different ethnic groups. He found that African Americans and Hispanics had the highest death rates, which he says is linked to a higher occurrence of obesity in those communities.

“They may not be the most up-to-date figures and in our paper we said those figures are true up to a certain date,” he says. “That makes some comparisons very difficult. But in our study the way they were collecting the data was consistent among the different ethnic groups. However, we could not take those numbers and compare them to the same ethnic groups in different states or different countries.”

Nowcasting points to the true peak of COVID-19 deaths

Rolling now-casts of hospitalized deaths in England with 95 percent error bands from data released May 15. The sums of announced deaths are indicated with black crosses. 
Sheila Bird, University of Cambridge and Bent Nielsen, University of Oxford 

The lag in mortality data poses more of a problem for policy makers who want to, say, ease or enforce lockdown restrictions at short notice based on real-time information on rising or falling death rates. (Confirmed new cases of the disease aren’t much use for that either, because those figures are heavily influenced by the numbers of tests carried out from week to week.)

At the end of April, researchers in the UK and Germany re-analysed hospital deaths in England for that month by the date of actual death, not recorded death. They found that 10 percent of the deaths took longer than five days to appear in the official statistics, and that for many days only a fraction of the deaths reported for a 24-hour period actually happened in that timeframe. For example, 801 people died of COVID-19 in English hospitals on April 8, but only 140 of those deaths were reported quickly enough to be included in the daily deaths announced for that day.

David Leon, an epidemiologist at the London School of Hygiene and Tropical Medicine who worked on that study, says, “You can draw some very misleading conclusions if you don’t take into account that most of the data is by date of [death] registration and not date of [death] occurrence.” Data arranged by date of actual death show a distinct peak and decline in hospital deaths after April 8, according to Leon’s analysis—a feature obscured in the official figures that show instead a far fuzzier plateau, with ups and downs.
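To see how counting by report date can blur a peak, consider a small sketch in Python. Everything in it is hypothetical: the daily death counts, the delay pattern, and the function names `make_records` and `tally` are invented for illustration and are not drawn from Leon's analysis or from any official data. It tallies the same set of deaths twice, once by the day each death occurred and once by the day it was reported.

```python
from collections import Counter
from datetime import date, timedelta


def make_records():
    """Build hypothetical (date_of_death, date_reported) pairs. Not real data."""
    start = date(2020, 4, 6)
    deaths_per_day = [4, 6, 8, 6, 4]   # invented: rises to a peak, then falls
    delays = [0, 1, 1, 2, 4]           # invented reporting delays in days, cycled
    records = []
    for offset, n_deaths in enumerate(deaths_per_day):
        died = start + timedelta(days=offset)
        for i in range(n_deaths):
            delay = delays[i % len(delays)]
            records.append((died, died + timedelta(days=delay)))
    return records


def tally(records):
    """Count the same deaths by occurrence date and by report date."""
    by_occurrence = Counter(died for died, _ in records)
    by_report = Counter(reported for _, reported in records)
    return by_occurrence, by_report


records = make_records()
by_occurrence, by_report = tally(records)

for day in sorted(set(by_occurrence) | set(by_report)):
    print(day, "occurred:", by_occurrence[day], "reported:", by_report[day])
```

With these made-up inputs, the count by date of death peaks at 8 on April 8, while the count by report date peaks a day later at 7 and trails on for three extra days. The total is the same in both views; only the shape changes.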

Behind the scenes, Bird says, policy makers and their advisers will be using statistical tricks to account for the lag in the death data to present a more reliable, real-time indication of fatalities. A common technique is called nowcasting, which uses the typical statistical distribution of time lags in past recording of deaths to add expected delayed reports to a daily figure.
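The basic idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method used by Bird's group or by government advisers: it assumes the pattern of past reporting delays still holds, it scales up each day's count by the share of deaths expected to have been reported so far, and it produces no error bands. The function names and all the numbers are hypothetical.

```python
def delay_cdf(delay_counts):
    """Cumulative share of deaths reported within d days of death.

    delay_counts[d] is the number of past deaths that were reported
    exactly d days after they occurred (hypothetical historical data).
    """
    total = sum(delay_counts)
    if total == 0:
        raise ValueError("need at least one past report to estimate delays")
    cdf = []
    running = 0
    for count in delay_counts:
        running += count
        cdf.append(running / total)
    return cdf


def nowcast(reported_so_far, days_since_death, cdf):
    """Estimate the eventual death count for one day of occurrence."""
    # Assumption: past the longest delay seen so far, reporting is complete.
    if days_since_death >= len(cdf):
        return float(reported_so_far)
    share_reported = cdf[days_since_death]
    if share_reported == 0:
        raise ValueError("no past deaths were reported this quickly")
    return reported_so_far / share_reported


# Invented history: of 100 past deaths, 20 were reported the same day,
# 40 after one day, 20 after two, 10 after three, and 10 after four.
cdf = delay_cdf([20, 40, 20, 10, 10])

# Invented current figure: 30 deaths reported so far for yesterday.
estimate = nowcast(30, 1, cdf)
expected_late_reports = estimate - 30
print(round(estimate), round(expected_late_reports))
```

Here 60 percent of past deaths were reported within one day, so 30 reports for yesterday scale up to an estimate of about 50 deaths, with roughly 20 reports still expected to arrive. A production nowcast would also need to quantify uncertainty and to check whether the delay pattern is shifting, for instance over weekends and public holidays.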

In the UK, the government has not published the results of such real-time death estimates. But Bird’s group calculates and posts online its own version of nowcasted daily English hospital deaths. “It’s a response to the fact that these are such important data they ought to be in the public domain,” she says. “And the version that is in the public domain ought to be statistically competent.”