University of Pittsburgh researcher Seth Weinberg first got in touch with personal genetics company 23andMe to talk about data in September 2015. He and Pittsburgh colleague John Shaffer were studying the genetic factors underlying earlobe attachment: whether that fleshy part at the bottom of the ear hangs loose or is fixed to the side of the head. By a few years ago, they and their collaborators had assembled genetic data from around 10,000 volunteers and identified six loci in the human genome related to earlobe variation.

During the project, Weinberg recalled a 2010 research paper from 23andMe that had analyzed genetic data from several thousand customers who had consented to be involved in research and had entered information about themselves online. “This paper was about a bunch of different traits,” Weinberg tells The Scientist. “One of them was earlobe attachment.”

Weinberg reached out to see whether 23andMe...

The data that 23andMe had available by then, it turned out, included the genetic details and self-reported earlobe features from nearly 65,000 customers. The paper that resulted from that collaboration, published yesterday (November 30) in the American Journal of Human Genetics, not only replicates the six loci highlighted by the Pittsburgh researchers’ original 10,000-person cohort, it contributes another 43.

“Our data confirmed all of their findings, and included additional findings because our sample size was quite a bit larger,” says 23andMe principal scientist and statistical geneticist David Hinds. “The confirmation of their earlier findings gave them some confidence we were really measuring the same thing.”

The results implicated several transcription factor genes whose disruption is already known to be involved in certain developmental disorders. “This fits in with this greater theme that when a certain gene is severely disrupted, it can lead to a syndrome that has manifestations including in the ear,” says Shaffer. “But more-subtle genetic variation in and around that gene—those regulatory variants are what help shape interpersonal variation in morphology.”

The study highlights the genetic complexity of a morphological trait that has historically been considered relatively simple. It also becomes the 87th scientific article listed on 23andMe’s publication page since the company’s first appearance in the scientific literature seven years ago. Now a little more than a decade old, 23andMe advertises a customer base of more than 2 million people from all over the world—85 percent of whom have agreed to lend their personal data to 23andMe’s research team—and signs suggest that more and more researchers are taking notice.

A research company from the start

Founded in Mountain View, California, in 2006, 23andMe got its start by offering to analyze customers’ DNA from a spit sample for a few hundred dollars, and return personalized health-related interpretations, including risk estimates for various diseases. The model attracted some serious investment from companies such as Google, which had poured around $7 million into the fledgling company by 2009. (At the time, 23andMe cofounder Anne Wojcicki was married to Google cofounder Sergey Brin.)

In 2013, though, the company got into trouble with the US Food and Drug Administration (FDA) for providing health-based information without regulatory clearance. Since then, 23andMe has made a point of working with the FDA, and in April this year became the first company to achieve regulatory approval for a direct-to-consumer test offering health risk information, in this case for 10 conditions, including Parkinson’s and celiac diseases, for a cost of $199.

Throughout that trajectory, 23andMe has kept a firm focus on using its data set for scientific and medical research, generally to a greater extent than its competitors. That’s not to say it’s the only company pursuing research—its main rival is personal genetic testing company Ancestry.com, which has around 5 million customers. Earlier this year, for instance, Ancestry.com contributed data to a study on the post-colonial population structure of North America. And the smaller, 800,000-person database collected from personal tests marketed by National Geographic has been used to inform research on migration paths.

But 23andMe is by far the most prominent scientific research player among those with consumer databases. “The sheer numbers give you a very powerful approach to scientific discovery,” says Robert Green, director of the Genomes to People translational genomics program at Brigham and Women’s Hospital, the Broad Institute, and Harvard Medical School. “I think they realized that early on, and that was a very conscious part of the plan. It continues to be a huge part of what they’re doing.”

As the data set has grown, so has the number of people in 23andMe’s research team, which now comprises 42 people, more than half of whom joined since 2015. So too has interest from the greater scientific community. While the company has a number of collaborations with industry—an ongoing project with Pfizer, for example, aims to pinpoint the causes of lupus—increasingly, academic groups are getting in on the act. Researchers approach the company “interested in replicating a finding, [or] seeking out additional data to make their findings stronger,” says Hinds, who joined 23andMe in 2009.

That interest is such that 23andMe has instituted a more “formal process” for soliciting academic research proposals directly, Hinds notes. The company now holds biannual meetings to decide which projects to pursue. “We get anywhere from 30 to 50 proposals, twice a year,” he says, adding that the team tries to select the projects in which 23andMe’s data can provide the most benefit. “We have more than 50 active collaborations at some point in that process.”

For most of those collaborations, 23andMe’s in-house research team contributes analyses (the raw data never leaves the company without customers’ explicit consent, Hinds says), and helps researchers answer questions on topics as complicated as the genetic basis of depression. This fall alone, in addition to this week’s paper, the company has contributed statistical power to genetic studies investigating susceptibility to restless leg syndrome, spontaneous preterm birth, Parkinson’s disease, and a host of infectious diseases including everything from chickenpox to bacterial meningitis.

What it’s like to collaborate with 23andMe

Certainly, the volume of data that 23andMe has at its disposal offers a large-scale approach that’s simply not feasible for many individual research projects. The company’s most recent contribution to Parkinson’s disease research, for instance, involved customer data from nearly 6,500 Parkinson’s patients and more than 300,000 controls.

But there’s more to the company’s data set than just the number of people signed up to the service, says John Perry, a human geneticist at the University of Cambridge who has worked with 23andMe to investigate genetic links between age at puberty and susceptibility to cancer and other diseases. An ongoing online relationship with customers allows 23andMe’s team to collect data in a much more dynamic way than what is feasible for many academic research groups, he explains.

“They can just, ad hoc, ask any questions they want,” Perry says. Compared to a researcher handing out brief questionnaires to a group of study participants at a research site, 23andMe’s researchers have “more flexibility in what they can ask, when they can ask it—you basically get a lot from the 23andMe population that you would never get from somewhere else.”

Of course, that flexibility comes with limitations. Researchers can’t necessarily validate the self-reported data they are receiving, Perry notes, and many traits—including disease-related features such as blood levels of particular metabolites—may be impossible for participants to observe, let alone report, accurately without medical testing.

But the value of such self-reported data sets is perceived more highly than it used to be, in part thanks to the success of 23andMe’s research contributions, notes Weinberg. “Historically, people were very skeptical you’d be able to collect data in this relatively simplistic way and still yield the results,” he says. “But I think they’ve proven again and again that you can do that. There is strength in numbers.”

23andMe’s research team has also made a habit of being flexible in its partnerships, according to collaborators who spoke to The Scientist. Anna Shcherbina, a PhD student in cardiologist Euan Ashley’s lab at Stanford University, notes that 23andMe modified its analyses to meet her requirements for a study on the physiological effects of time spent sitting down. She needed categorical data on how often people were standing at work—“always,” “sometimes,” “never,” for example—not how many hours they’d spent standing or sitting, as the company had asked its customers. “They modified their query system to do that,” Shcherbina says.

Weinberg and Shaffer had a similarly positive experience. “We found them to be very good partners in this endeavor,” agrees Weinberg. “They’ve been relatively flexible and accommodating. . . . They didn’t present us with any undue demands. I would definitely work with them again.”

Privacy concerns and popularity

Last Sunday, Senator Chuck Schumer (D-NY) singled out genetic testing companies such as 23andMe for what he described as unclear privacy policies, and called for more scrutiny of the industry as a whole. “Here’s what many customers don’t realize, that their sensitive information can end up in the hands of unknown third-party companies,” he told a news conference. (A 23andMe spokesman took to NBC News that evening to explain that the company does not sell or share data without customers’ explicit, optional consent.)

Customers don’t appear to share Schumer’s concerns. The company’s home DNA test was one of Amazon’s five best-selling items on Black Friday this year. Its database is continuing to grow, and researchers who have worked with 23andMe suspect that the company’s research model has staying power. “This company is collecting all this self-reported data, [and] can outsource some of the science to leaders in the field,” says Shaffer, who has also worked with 23andMe to study the genetics of appendicitis. “Through collaborations they’re able to say, ‘Hey, we have this great and growing resource, we’d like to team up with you to help make the best science possible.’ I think that’s a model, really, for the future for how genetics can be done.”

Green, whose work helped show that customers of personal genetic testing companies have a higher comprehension of their results than some governmental organizations had feared, takes a similar view. “We have to hold companies to certain standards in terms of quality, validation, accurate measuring—all of those things,” he says. “But once we do that, I think we should really celebrate the creativity and innovation that goes into these new models.”
