In the past week, researchers and journalists have scrambled to map the spread of H7N9 bird flu through China to identify its source and highlight at-risk areas. Mapping is a common response to outbreaks, especially of new diseases, but some scientists believe it must become a more proactive part of disease control.
Such mapping techniques have been used since the turn of the 19th century to track outbreaks in real time and understand their causes. In 1854, for example, John Snow penned a famous diagram of a cholera outbreak in London, pinpointing a water pump as the source.
Despite this long history, however, efforts to plot the locations of infectious diseases still tend to be reactive rather than proactive. And while local outbreaks are regularly and thoroughly mapped, the broader landscape is far murkier. According to a team of scientists led by Simon Hay from the University of Oxford in the United Kingdom, only 4 percent of important infectious diseases have been comprehensively mapped at a global scale. The rest are plagued by patchy data.
“We have very little information on where in the world diseases are,” said David Pigott, who is part of Hay’s team at Oxford. Such information is crucial when it comes to planning surveillance, risk assessments, vaccination programs, and outbreak responses. “[For example], if you get cases outside of a known distribution, you can rapidly see if there’s a genuine range expansion or a misdiagnosis,” said Pigott. “It’s such an integral part of disease control.”
The team audited existing maps for 174 infectious diseases of clinical importance. Following a huge systematic review, they scored the maps for each disease according to how much of the known global range is covered and the quality of the data—whether they were up-to-date and whether they relied on accurate measures like molecular diagnostics or GPS coordinates, rather than unverified expert opinion.
“It’s a very impressive study,” said Tom Koch, an expert on medical maps from the University of British Columbia, who was not involved in the study. “It brings a whole mass of data together and presents a portrait from which we can do interesting work.”
With a score of 75 out of 100 considered a passing grade, only 7 diseases met that criterion, including dengue fever, monkeypox, and two types of malaria. Most infections, including some intensely studied diseases like HIV, failed to meet the benchmark because of a trade-off between quality and scale. “There was really detailed clinical data where someone had gone to a village and done a map at a small scale,” said Pigott. “But maps that did cover the world were of lower quality and relied on an expert saying, ‘I know it’s in this country.’”
And even the highest-scoring diseases have room for improvement. After an exhaustive review of the whereabouts of dengue fever, recently published in Nature, Hay’s team concluded that there are 390 million infections every year, more than three times the number estimated by the World Health Organization.
Despite shortfalls like this, Hay and his colleagues are optimistic. They argue that technology can help to plug the gaps in our maps in the future, and they point to several untapped sources of data. For example, both PubMed and GenBank, which collect biomedical literature and gene sequences respectively, contain geospatial information for the majority of diseases that the team reviewed. And social networks like Twitter can provide invaluable real-time clues about spreading symptoms and illnesses, often tagged with geographical information. During the 2009 outbreak of H1N1 swine flu, for example, Twitter predicted outbreaks 1 or 2 weeks ahead of traditional surveillance measures.
However, Koch cautions that disease data are not as freely available as the team suggests. In some cases, “privacy concerns and the proprietary attitudes of governments have made it harder for us to get some types of data, such as mortality data from an outbreak,” he said. “[John] Snow would never have got his data today.”
John Brownstein from Boston Children’s Hospital, a member of Hay’s team, faced these problems during his PhD work on West Nile virus and Lyme disease. “I struggled because governments or researchers wouldn’t share their information,” he said. “But there was all this incredible knowledge on the web being discussed through professional networks or news media.”
To collate those rich but disparate data sources, Brownstein created HealthMap, a website that automatically monitors, organizes, and maps information on infectious diseases from unconnected sources. These include Google News, mailing lists like ProMED Mail, and bulletins from organizations such as the World Health Organization. Another site called BioCaster, developed by Nigel Collier at the National Institute of Informatics and international collaborators, works along similar lines.
Hay’s team believes that the problem now is not a lack of data but a deluge of it. Sites like HealthMap and BioCaster are already using learning algorithms to filter online sources for information relevant to infections. They are also using crowdsourcing tools that ask online volunteers to check if flagged social media chatter actually relates to the disease of interest.
These solutions are not panaceas, however. Brownstein emphasizes the need to build regional contacts to get the right data in the first place. “The local aspect is critical,” he said. “Our team is mining Weibo [a prominent Chinese social network] for information on H7N9. The things we’ve been able to get from that information are unbelievable,”—such as a few reports of H7N9 cases that emerged well before they were officially reported. “This wasn’t available during SARS.”