Infographic: Meet R, the Shaky Metric Guiding Pandemic Forecasts

The basic reproductive number R0, along with the more malleable effective reproductive number Re, is a centerpiece of most of the epidemiological models informing government responses to COVID-19.

Katarina Zimmer
Jul 25, 2020


The reproductive number R describes the average number of individuals that a person infected with a particular pathogen infects. It depends on how that pathogen is transmitted as well as how often people come into contact with each other—factors that could vary depending on a pathogen’s strain and on the time and location of an outbreak. Scientists typically distinguish between R0, the basic reproductive number that describes disease transmission at the very beginning of an outbreak in a fully susceptible population, and Re, the effective reproductive number that describes transmission once measures such as social distancing or vaccination campaigns have been introduced. Re is typically much lower than R0.

The Scientist Staff


Researchers across the world have developed countless epidemiological models to project the future of the COVID-19 pandemic, and the effect of different public health policies on the spread of the causative virus, SARS-CoV-2. Most, but not all, models being used today give the two versions of R—R0 and Re—a central role. The basic reproductive number R0 describes the spread of a disease at the beginning of an outbreak, and Re, an “effective” version of the metric, describes spread later on.

Statistical models

Statistical techniques can predict the likely trajectory of an outbreak based on observed data. For example, an early iteration of a model developed by the University of Washington’s Institute for Health Metrics and Evaluation (IHME), which helped inform the White House’s response to the pandemic, worked by characterizing the curves of death numbers in Wuhan and a number of European cities, and projecting those curves onto US data.

various ©; The Scientist Staff

Relationship with R: Such models don’t typically use R as an input, but they are sometimes used to make quick estimates of R.
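One way such a quick estimate might be made is sketched below. This is an illustration only, not the IHME method: it assumes case counts grow exponentially early in an outbreak, fits that growth rate by least squares on the log of the counts, and converts it to R with the approximation R ≈ 1 + r × D, which holds only under simple SIR assumptions (D is the average infectious period). The function names and the example counts are hypothetical.

```python
import math


def fit_growth_rate(daily_counts):
    """Return the least-squares slope of log(count) versus day index."""
    if len(daily_counts) < 2 or any(c <= 0 for c in daily_counts):
        raise ValueError("need at least two strictly positive counts")
    logs = [math.log(c) for c in daily_counts]
    n = len(logs)
    mean_day = (n - 1) / 2  # mean of the day indices 0, 1, ..., n-1
    mean_log = sum(logs) / n
    covariance = sum((d - mean_day) * (y - mean_log) for d, y in enumerate(logs))
    variance = sum((d - mean_day) ** 2 for d in range(n))
    return covariance / variance


def quick_r_estimate(growth_rate, infectious_days):
    """Hypothetical helper: R ~ 1 + r * D, an assumption tied to simple SIR dynamics."""
    return 1.0 + growth_rate * infectious_days


# Made-up counts growing at a rate of 0.2 per day (not real data).
example_counts = [10 * math.exp(0.2 * day) for day in range(10)]
rate = fit_growth_rate(example_counts)
print(round(rate, 3), round(quick_r_estimate(rate, infectious_days=5), 2))
```

Because the fit sees only the shape of the curve, the estimate is sensitive to the assumed infectious period and says nothing about how contact rates will change.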

Performance: Statistical techniques can be useful for making very short-term predictions, but they do not capture the dynamics of disease transmission or the changes in contact rates that result from social distancing measures. Likely for these reasons, early predictions of the IHME model were off. Since early May, IHME has been using a new “hybrid” model that combines statistical and susceptible-infectious-recovered (SIR) modeling techniques.

Susceptible-Infectious-Recovered (SIR) models

SIR models subdivide populations into compartments such as “susceptible,” “infectious,” or “recovered,” and sometimes other compartments such as “exposed but not yet infectious,” “asymptomatic,” or “dead.” Data on cases, hospitalizations, or deaths can inform estimations of the sizes of those compartments, and equations describe the speed at which people move from one compartment to the next.   

various ©; The Scientist Staff

Relationship with R: SIR models calculate R using several parameters, including the probability of infection, contact rate, and the period over which an individual is infectious. Once calculated, R helps determine how quickly susceptible people become infected, and thus shapes how fast a disease spreads across a population.
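The relationship described above can be sketched in a few lines of Python. This is a minimal illustration rather than any research group's model: it assumes only three compartments, a fixed time step, and an R computed as the product of the three parameters named in the text. All function names and parameter values are hypothetical.

```python
def r_from_parameters(transmission_prob, contact_rate, infectious_days):
    """R = infection probability per contact * contacts per day * days infectious."""
    return transmission_prob * contact_rate * infectious_days


def simulate_sir(population, initial_infected, transmission_prob,
                 contact_rate, infectious_days, days, dt=0.1):
    """Step a basic SIR model forward with fixed time steps (Euler method)."""
    beta = transmission_prob * contact_rate  # new infections per infectious person per day
    gamma = 1.0 / infectious_days            # fraction of infectious people recovering per day
    s = population - initial_infected
    i = initial_infected
    r = 0.0
    history = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Assumed values: 5% infection chance per contact, 10 contacts a day, 5 infectious days.
print(r_from_parameters(0.05, 10, 5))
baseline = simulate_sir(1000, 1, 0.05, 10, 5, days=100)
# Cutting contacts to 2 a day, as a stand-in for social distancing, lowers R below 1.
distanced = simulate_sir(1000, 1, 0.05, 2, 5, days=100)
print(max(row[2] for row in baseline), max(row[2] for row in distanced))
```

In this sketch, lowering the contact rate is the only lever for an intervention; with R above 1 the infectious compartment grows before it shrinks, and with R below 1 it only shrinks.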

Performance: SIR-type models capture the fundamental dynamics of disease transmission and the effects of public health interventions, but they are often criticized for ignoring differences in contact rates across a population. More-refined SIR-type models, however, do account for varying contact rates, and some correctly predicted the fade-out of the SARS-CoV-2 outbreak in Wuhan earlier this year.

Agent-based models

Agent-based models simulate individuals—or “agents”—interacting in various social settings and can estimate the spread of disease as these agents come into contact with others. Such simulations are often based on activity surveys, census data, de-identified mobile phone location data, and information from public transportation or airlines.

various ©; The Scientist Staff

Relationship with R: Some researchers compute R separately and then plug it into their agent-based models, while others use these models to generate estimates of R and predict how R changes based on different interventions. In both cases, agent-based models typically calculate R per agent, unlike SIR-type models that calculate R over whole populations or demographics.
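The per-agent view of R can be illustrated with a toy simulation. The sketch below is a deliberate simplification, not a model any group uses: it assumes agents meet uniformly at random rather than in realistic social settings, and it estimates R by counting how many others each agent infects. All names and parameter values are hypothetical.

```python
import random

SUSCEPTIBLE, INFECTIOUS, RECOVERED = 0, 1, 2


def simulate_agents(n_agents=500, contacts_per_day=8, transmission_prob=0.05,
                    infectious_days=5, days=60, n_seed=5, seed=0):
    """Simulate random daily contacts and record infections caused by each agent."""
    rng = random.Random(seed)
    state = [SUSCEPTIBLE] * n_agents
    days_left = [0] * n_agents
    secondary = [0] * n_agents  # number of people each agent has infected
    for agent in rng.sample(range(n_agents), n_seed):
        state[agent] = INFECTIOUS
        days_left[agent] = infectious_days
    for day in range(days):
        # Agents infected today are not in this list, so they start spreading tomorrow.
        infectious_today = [a for a in range(n_agents) if state[a] == INFECTIOUS]
        for agent in infectious_today:
            for contact in range(contacts_per_day):
                other = rng.randrange(n_agents)
                if (other != agent and state[other] == SUSCEPTIBLE
                        and rng.random() < transmission_prob):
                    state[other] = INFECTIOUS
                    days_left[other] = infectious_days
                    secondary[agent] += 1
            days_left[agent] -= 1
            if days_left[agent] == 0:
                state[agent] = RECOVERED
    return state, secondary


def estimate_r(state, secondary):
    """Average infections caused by agents whose infectious period has ended."""
    finished = [secondary[a] for a in range(len(state)) if state[a] == RECOVERED]
    return sum(finished) / len(finished) if finished else 0.0


state, secondary = simulate_agents(seed=1)
print(estimate_r(state, secondary))
```

The estimate averages over the whole simulated outbreak, so it falls as susceptible agents run out; changing `contacts_per_day` for some or all agents is one way to mimic an intervention such as staying at home.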

Performance: Several research groups prefer using agent-based models because they can simulate human behavior more accurately than SIR-type models and can predict how individuals’ decisions, such as staying at home, lead to collective or aggregate behavior, and thereby affect disease spread. However, such models require a lot of detailed data about human movement, and an enormous amount of computing power.
