STEMMING THE SPREAD: People wear face masks in Mexico City as a precaution against the 2009 swine flu pandemic that began in April of that year. © ELIANA APONTE/REUTERS/CORBIS

Two weeks into 2013, the governor of New York declared a statewide public-health emergency. Only halfway through the flu season, almost 20,000 cases of influenza had already been reported in the state, a fourfold increase over the previous winter. To tackle the epidemic, Governor Andrew Cuomo increased the availability of vaccines and allowed pharmacists to vaccinate people under 18.

As well as protecting individuals, better vaccination coverage can also help curb disease transmission. But there are still many outstanding questions regarding how infections like flu actually spread. Where do most transmissions occur? What dictates an individual’s risk of infection? Which groups should we target with control measures?

For many pathogens, social contacts are an important component of disease transmission. From...


In recent years, researchers have begun to unravel how infections spread through a population. Plummeting sequencing costs have made it possible to sequence viral and microbial genomes from infected patients, and new survey methods are providing estimates of social interactions. Using these sources of information, epidemiologists are beginning to build a detailed picture of how social contact patterns influence outbreaks. Combining theoretical tools with large-scale data collection, these projects are providing valuable clues about how diseases spread, and how they might be stopped.

The 20/80 rule

PLOS MED, 5(3): e74. doi:10.1371/journal.pmed.0050074, 2008

Much of the early work on disease transmission via social networks focused on sexually transmitted diseases. When HIV/AIDS emerged in the 1980s, there was an urgent need to find out how such infections travelled through the population as a result of sexual encounters. Researchers therefore used questionnaires to piece together networks of sexual partners, asking participants to list their contacts. One of the things that stood out was how uneven individual social networks were: while most people had few partners, some had lots.

High-risk individuals often play an important role in an outbreak. As well as being more likely to pick up infection, they are also more likely to pass it on. The degree to which people mix with those who are similar to themselves—known as assortative mixing—can also influence the speed and duration of an epidemic. Results from mathematical models suggest that highly assortative networks cause infection to spread more quickly initially, whereas less-assortative mixing, with more interactions between different risk groups, results in slower spread but in a larger overall epidemic.

Early studies of HIV transmission indicated that sexual-contact networks generally follow a “20/80 rule”—20 percent of cases are responsible for 80 percent of transmission. Therefore, targeting treatment at high-risk groups could in theory help control the spread of such infections. However, given limited resources, it can be challenging to work out how much focus should be placed on different risk groups. In particular, we still need to fully understand the effects of over- or under-targeting certain groups.
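To make the rule concrete, here is a minimal sketch of how one might check it against data. The function name `share_from_top` and the counts in `cases` are hypothetical, invented purely for illustration; they are not taken from any of the studies discussed here.

```python
def share_from_top(secondary_cases, top_fraction=0.2):
    """Fraction of all onward transmission caused by the most infectious cases."""
    ranked = sorted(secondary_cases, reverse=True)
    # Number of cases that make up the top slice (at least one).
    n_top = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


# Hypothetical numbers of people infected by each of 10 cases.
cases = [9, 7, 1, 1, 1, 1, 0, 0, 0, 0]
print(share_from_top(cases))  # 0.8
```

In this made-up example, the two most infectious cases out of ten account for 16 of the 20 onward infections, which is the 20/80 pattern.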

Schools and pandemics

As early as the 19th century, physicians noted that the distinctive cyclic occurrence of measles epidemics might be the result of school terms: not only were many children susceptible to infection, they also gathered together every day in a close-contact environment. When mathematical models of disease became popular in the 1980s, researchers began to study how the high number of contacts between children might shape an outbreak. However, there was very little information on transmission rates between different age groups when these models first emerged.

The problem was one of dimension. Suppose we have a model in which the population is divided into two groups: children and adults. There are four ways an infection could spread in such a model: a child could infect another child; an adult could give the disease to a child; a child could infect an adult; or infection could pass between two adults. If we had three age groups instead of just two—for example, children 5 and under, middle school students, and adults—we would need to account for nine possible routes and rates of transmission. For n groups, we have n × n unknown rates.
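The counting argument can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `transmission_routes` and the group labels are made up for the example.

```python
from itertools import product


def transmission_routes(groups):
    """List every (infector, infectee) pairing between the given groups."""
    return list(product(groups, repeat=2))


two = transmission_routes(["child", "adult"])
three = transmission_routes(["young child", "student", "adult"])
print(len(two), len(three))  # 4 9
```

Each pairing is a separate rate that a model would have to specify, which is why the number of unknowns grows as the square of the number of groups.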


Unfortunately, most age-stratified sources of data that gave clues about transmission rates—such as the proportion of people that test positive for infection—were one-dimensional: the status of each age group was summarized by a single value. Given n age groups, researchers therefore had to estimate the n × n rates of transmission that might occur between these groups from only n pieces of empirical information. This inevitably meant making some strong assumptions about how different demographics interact.
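As an illustration of what such an assumption can look like, the sketch below uses "proportionate mixing", a standard simplification in which the rate between two groups depends only on how active each group is overall. It is shown as an example of the general idea, not as the specific assumption any particular early study made. The function name, the activity values, and the equal group sizes are all assumptions for the example.

```python
def proportionate_mixing(activity):
    """Build an n x n matrix of relative rates from n per-group activity levels.

    Assumes equal-sized groups: group i spreads its contacts across the other
    groups in proportion to how active each of those groups is.
    """
    total = sum(activity)
    return [[a_i * a_j / total for a_j in activity] for a_i in activity]


# Hypothetical activity levels: children three times as active as adults.
rates = proportionate_mixing([3.0, 1.0])
print(rates)  # [[2.25, 0.75], [0.75, 0.25]]
```

Two numbers generate all four rates, but only by ruling out any preference for mixing within one's own group, which is exactly the kind of structure that contact surveys later made it possible to measure.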

That changed in 2008, when the results of a major European Union–funded project were published. The aim of the POLYMOD study¹ was to find out how individuals in eight different European countries mixed with each other. Such information would improve mathematical models of disease spread, and therefore help public-health agencies explore different interventions. During the study, researchers recruited more than 7,000 participants from a wide range of age groups, giving them contact diaries to find out whom they met. Each participant recorded their contacts during a single randomly assigned day, noting which were physical in nature (such as kissing or handshakes) and which were conversational, involving a two-way conversation of at least three words.

The diary method was pioneered in the late 1990s. Until then, nobody had demonstrated that questionnaires could be both a practical way of gathering information and a method that could produce data relevant to the spread of respiratory infections. Building on this work, POLYMOD was the first large-scale survey to quantify how different populations mix. Surveys were filled out by 7,290 participants of different ages who recorded the characteristics of their interactions with a total of 97,904 secondary contacts. As well as producing some interesting cultural observations—for example, Germans had the fewest daily contacts, with an average of 8, whereas Italians had the most, with almost 20—the results showed how the nature of social encounters can vary. Around 70 percent of all daily contacts lasted longer than an hour, while half of the interactions reported in schools involved physical contact. This was the sort of information epidemiologists had long wanted; it meant that simulations of diseases such as measles could finally include detailed, real-world data on social-contact patterns.
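Overall, the figures above work out to roughly 13.4 reported contacts per participant (97,904 divided by 7,290). The sketch below shows, in a deliberately simplified form, how diary entries could be turned into a table of average daily contacts between groups. The function `mean_contact_matrix`, the two-group split, and the three toy diaries are hypothetical; the real survey used finer age bands and more careful weighting.

```python
from collections import defaultdict


def mean_contact_matrix(diaries, groups):
    """Average daily contacts with each group, per participant group.

    diaries: list of (participant_group, [contact_group, ...]), one day each.
    """
    totals = {g: defaultdict(int) for g in groups}
    participants = defaultdict(int)
    for group, contacts in diaries:
        participants[group] += 1
        for contact_group in contacts:
            totals[group][contact_group] += 1
    return {
        g: {h: totals[g][h] / participants[g] for h in groups}
        for g in groups
        if participants[g]  # skip groups with no diaries returned
    }


# Hypothetical one-day diaries from two children and one adult.
diaries = [
    ("child", ["child", "child", "child", "adult"]),
    ("child", ["child", "adult"]),
    ("adult", ["adult", "adult", "child"]),
]
matrix = mean_contact_matrix(diaries, ["child", "adult"])
print(matrix["child"])  # {'child': 2.0, 'adult': 1.0}
```

Unlike the one-number-per-group data available earlier, each row here records whom a group meets, not just how many people.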

Initial simulations based on the POLYMOD data suggested that schoolchildren would have the highest risk of infection during a respiratory virus epidemic. The risk would be lower for adults and preschool children, and the chance of picking up infection as a result of social contacts would be smallest in the oldest age groups studied. Simulations also implied that the spread of infection would be reduced during a school vacation.
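A toy version of this kind of simulation is sketched below. It is not the model used in the POLYMOD analysis. The contact numbers, the per-contact transmission probability `q`, the recovery rate `gamma`, and the reduction in child-to-child contacts during a holiday are all invented for illustration. The model is a basic susceptible-infectious-recovered scheme stepped forward in small time increments.

```python
GROUPS = ["schoolchildren", "adults", "older adults"]

# Hypothetical mean daily contacts: CONTACTS[a][b] is the number of people in
# group b met each day by one person in group a.
CONTACTS = [
    [10.0, 6.0, 1.0],
    [2.0, 5.0, 1.0],
    [1.0, 3.0, 1.0],
]

# Same matrix with fewer child-to-child contacts (assumed holiday effect).
HOLIDAY = [row[:] for row in CONTACTS]
HOLIDAY[0][0] = 3.0


def attack_rates(contacts, q=0.07, gamma=0.5, seed=1e-3, days=365, dt=0.1):
    """Fraction of each group infected by the end of a simple SIR epidemic."""
    n = len(contacts)
    s = [1.0 - seed] * n  # susceptible fraction of each group
    i = [seed] * n        # infectious fraction of each group
    for _ in range(int(days / dt)):
        # Daily infection risk for group a: contacts weighted by prevalence.
        force = [q * sum(contacts[a][b] * i[b] for b in range(n)) for a in range(n)]
        new_inf = [force[a] * s[a] * dt for a in range(n)]
        s = [s[a] - new_inf[a] for a in range(n)]
        i = [i[a] + new_inf[a] - gamma * i[a] * dt for a in range(n)]
    return [1.0 - x for x in s]


for name, term, holiday in zip(GROUPS, attack_rates(CONTACTS), attack_rates(HOLIDAY)):
    print(f"{name}: term-time {term:.2f}, holiday contacts {holiday:.2f}")
```

With these made-up inputs, the group with the most contacts ends up with the largest share infected, and cutting child-to-child contacts lowers the total. That matches the qualitative pattern described above, although the actual numbers carry no meaning.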

With children appearing to drive many disease epidemics, governments have previously considered closing all schools as a possible way of controlling a major outbreak. However, such measures could have severe economic costs owing to the expense of extra day care, or of parents missing work to stay home with their children. It is therefore important to understand how effective closures might be. Would sending children home at the first signs of a serious outbreak actually ease the burden on hospitals?

INTERACTIVE NETWORKS: A map of social interactions between French schoolchildren aged 6–12, who were outfitted with proximity-sensing devices in a similar manner to the high school students in the Stanford study mentioned below. Each colored circle represents a pupil, with the color denoting their grade (1st through 5th) and class. Grey circles represent teachers and the size of the circle indicates that person’s number of interactions. The thickness of the grey line between two individuals shows the length of time they were in contact.
PLOS ONE, 6(8): e23176. doi:10.1371/journal.pone.0023176, 2011.

In 2011, a study led by researchers from Warwick University in the United Kingdom used the POLYMOD data to estimate how much school closures would reduce influenza transmission, and how that might affect demand for intensive-care beds in different regions of the U.K.² They found that although coordinated school closures might reduce the number of fully occupied intensive-care units during the peak of the epidemic, the variation in demand for each bed meant that even widespread school closures would probably not be enough to stop demand for beds from surpassing capacity in many hospitals. Moreover, closures would need to be carefully timed and coordinated to have any chance of reducing the need for intensive-care beds. During the 2009 influenza pandemic, the British government decided not to close schools—a decision supported by the researchers’ results.

By collecting data on social contacts, studies such as the POLYMOD project have made it possible to see how different age groups contribute to an outbreak. Although it has long been presumed that children and schools drive epidemics, social surveys have now quantified just how important they are. The resultant data can be used to make better predictions about the effects of control measures.

Electronic sensors

Whereas diaries rely on the diligence and memory of survey respondents, another approach is to use wireless proximity sensors to track interactions. (See this page for a visual representation of social connections from a similar sensor study.) In 2010, a group led by Marcel Salathé of Stanford University kitted out students and teachers in a US high school with electronic sensors, which automatically recorded information on individuals’ locations, as well as their distances from other people.³ Analysis of the movements of nearly 800 participants—94 percent of the total school population—found that most had repeated but brief interactions, and that the closeness and number of contacts did not vary significantly between individuals. This suggests that within environments such as schools, people behave similarly to particles of gas, bumping into each other randomly without any particular individuals dominating the social network. Although the POLYMOD data suggest there is much variation in contact patterns across the population as a whole, the Stanford researchers noted that when looking at communities such as high schools, it might be reasonable for disease modelers to assume that everyone mixes homogeneously.

The sensor study was the first time the number of close interactions in a school had been precisely quantified. The results, which came from a typical school day, showed there were more than 760,000 separate occasions when participants were within 3 meters of each other. This provided an estimate of how much close interaction—and, hence, opportunity for infection—there is among school-age individuals. As with any method, however, there are limitations to the sensor approach. In particular, the sensors only record information about people who participate. To be effective, studies must therefore be limited to specific communities, such as schools and hospitals, that can ensure a high participation rate.

Relating contacts to infection

INFECTION PATHWAYS: Equine influenza virus (EIV) transmission mapped from viral sequences collected from 48 horses. Each horse is represented by a numbered circle and colored according to the training yard to which the horse belonged. The size of the circle represents the degree of mixed infection, as measured by the genetic diversity among different viruses found within that horse. Arrows show the probable path of the infection: the thickness of the line between two horses indicates how similar the viruses in the two animals were, and hence how much evidence there is that one infected the other.
PLOS PATHOGENS, 8(12):e1003081. doi:10.1371/journal.ppat.1003081, 2012.
Contact surveys and sensors tell us a lot about how people interact, but we also need to know which of these interactions are important for disease transmission. Should we assume that airborne diseases like flu are generally passed on during close contacts that involve touching or kissing? Or is a sneeze or cough or even a conversation sufficient? Analyzing blood samples in combination with data from sensors or surveys can help us answer these questions.

Antibody levels against a particular flu strain provide an indication of who has previously been infected with that strain (or one closely related to it). We can then look at antibody measurements in different age groups, and compare them with the values predicted from flu models using social-contact data. Together with my colleague Julia Gog, I recently compared antibody levels against a number of different influenza viruses with predictions from simulations using the POLYMOD data. Based on our results, transmission by physical contacts appears to capture observed patterns better than disease spread through conversations.⁴

In recent years, researchers have also started to use sequence data to reconstruct transmission routes. Pathogens such as HIV and influenza evolve within a host, and by sequencing viruses isolated from patients it is possible to establish who might have given the infection to whom. Last year, research led by the University of Glasgow used this approach to trace the path of disease transmission during an outbreak of flu in horses.⁵ As well as discovering frequent mixed infections, with certain animals infected by viruses from several different horses, the researchers noted the benefits of combining sequence data with traditional epidemiological methods: using only epidemiological data could have led to an underestimate of the size of an outbreak. (See illustration on this page.) Sequencing techniques can be useful even if transmission occurred years earlier. In 2008, evolutionary biologists based at the University of Edinburgh managed to reconstruct the contact networks that drove HIV outbreaks during the 1990s.⁶
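The basic intuition, that more similar sequences point to a more direct link, can be sketched in a few lines. This is a toy illustration and not the method used in the Glasgow or Edinburgh studies: it assumes one aligned sequence per host, ignores mixed infections and within-host diversity, and simply picks the closest sequence sampled earlier. The function names, host labels, sampling days, and sequences are all invented.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def likely_sources(samples):
    """Map each host to the earlier-sampled host with the most similar sequence.

    samples: list of (host, sample_day, sequence).
    Ties are broken alphabetically by host name, an arbitrary choice.
    """
    sources = {}
    for host, day, seq in samples:
        earlier = [(hamming(seq, s), h) for h, d, s in samples if d < day]
        if earlier:
            sources[host] = min(earlier)[1]
    return sources


# Hypothetical hosts, sampling days, and short aligned sequences.
samples = [
    ("A", 1, "ACGTACGT"),
    ("B", 3, "ACGTACGA"),
    ("C", 5, "ACGAACGA"),
]
print(likely_sources(samples))  # {'B': 'A', 'C': 'B'}
```

Here host C differs from B at one position but from A at two, so B is the more plausible source. Real analyses weigh this kind of genetic evidence alongside sampling dates and contact information.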


Such methods can also help reveal transmission routes while an outbreak is still occurring. In 2011, three babies picked up a MRSA infection in a hospital in Cambridge, England. Curious to see whether the cases were related, geneticists at the nearby Wellcome Trust Sanger Institute decided to sequence samples of the superbug taken from these patients.⁷ In addition to samples from the three babies, they also looked at samples taken from 12 other MRSA patients over the previous 6 months. The results showed that the new cases, as well as eight of the earlier ones, had been part of a single outbreak. Soon afterwards, the entire ward was sterilized, and the infections appeared to end. Two months later, however, there was another MRSA case. Wanting to find out where it had come from, the researchers screened more than 150 hospital staff members, and sequenced any MRSA bacteria found. Eventually the team came across someone who, despite having no symptoms, was a carrier of the same MRSA strain that had circulated through the hospital. Once the staff member had received treatment, the disease subsided. It was the first time genome sequencing had been used to stop an infectious disease outbreak in a hospital, and the project highlighted the value of sequence data in uncovering paths of transmission.

Future work

There is still more left to discover about how diseases spread through social networks. Contact diaries and sensors have helped us uncover how populations mix, but for diseases such as influenza, we need to pin down how these contacts affect actual transmission. We must also establish the proportion of transmission that comes from social contacts, and how much comes from other sources, such as airborne viruses or pathogens left on surfaces touched by infected individuals. Contact with animals is also likely to be important during the emergence of a novel infection. For example, the H7N9 virus that has recently infected a number of people in China was of avian origin. Understanding precisely what types of social encounter influence a person’s risk of infection—and by how much—would help improve predictions about the behavior of epidemics. Such information could also be useful for future studies, ensuring that surveys focus on the most important kinds of interaction.

Another remaining puzzle is the effect of immunity. For many diseases, from flu to dengue fever, prior exposure can change a person’s risk of getting—and passing on—a subsequent infection. As well as having different patterns of social contacts, different age groups will have varying infection histories, and therefore different levels of immunity. Unless we account for preexisting immunity in the population, it can be difficult to work out which parts of the social network are responsible for infection as a result of their contact patterns, and which cause transmission because they lack immunity.

Outbreaks can also change people’s behavior. Although we have information about how people mix together on a normal day, we currently know little about how a major epidemic might affect interactions. For example, people might take time off work with illness, or to look after family. One of the few studies to examine this issue, via a survey during the 2009 influenza pandemic, suggested that people make far fewer contacts when sick, and that this number decreases further the more serious the symptoms are.⁸ The next step will be to explore these changes in a larger, more detailed study.

If we want to understand disease outbreaks such as the flu epidemic earlier this year in New York State, we must understand how pathogens are transmitted. In addition to uncovering the precise relationship between social networks and transmission, we must also work out how immunity and behavior change over time. The increasing availability of social contact data, as well as sequence and serological data, should enable us to make progress on both problems over the coming years. Such work could prove crucial in a future outbreak: if we can find out more details about how diseases such as influenza spread, it might help us work out how they can be controlled. 

Adam Kucharski is a researcher in the MRC Centre for Outbreak Analysis and Modelling at Imperial College London. His work looks at the relationship between disease transmission, immunity, and social structure.


  1. J. Mossong et al., “Social contacts and mixing patterns relevant to the spread of infectious diseases,” PLOS Med, 5:e74, 2008.
  2. T. House et al., “Modelling the impact of local reactive school closures on critical care provision during an influenza pandemic,” Proc R Soc B, 278:2753-60, 2011.
  3. M. Salathé et al., “A high-resolution human contact network for infectious disease transmission,” PNAS, 107:22020-25, 2010.
  4. A.J. Kucharski, J.R. Gog, “The role of social contacts and original antigenic sin in shaping the age pattern of immunity to seasonal influenza,” PLOS Comput Biol, 8:e1002741, 2012.
  5. J. Hughes et al., “Transmission of equine influenza virus during an outbreak is characterized by frequent mixed infections and loose transmission bottlenecks,” PLOS Path, 8:e1003081, 2012.
  6. F. Lewis et al., “Episodic sexual transmission of HIV revealed by molecular phylodynamics,” PLOS Med, 5:e50, 2008.
  7. S.R. Harris et al., “Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study,” Lancet Infect Dis, 13:130-36, 2013.
  8. K.T. Eames et al., “The impact of illness and the impact of school closure on social contact patterns,” Health Technol Assess, 14:267-312, 2010.
