A study published Monday (August 30) by the National Bureau of Economic Research finds that new ideas in biomedical research are less likely to spread when they are generated by women and minorities than when generated by men.

The authors of the study, which was not peer-reviewed, used a computational technique called natural language processing to scan titles and abstracts in MEDLINE for novel one-, two-, or three-word phrases originating in biomedical research papers published between 1980 and 2008. The researchers ranked these phrases by the total number of mentions they received in the year when they first appeared and analyzed the top 0.1 percent of phrases for each year to assess whether each represented an actual new idea or scientific innovation.
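As a rough illustration of the phrase-ranking step described above, the Python sketch below finds each phrase's debut year and ranks new phrases by how many abstracts mention them in that year. It is not the authors' code: the function names, the whitespace tokenizer, and the choice to count each abstract once are all assumptions made for illustration, and the earliest year in any corpus will trivially treat every phrase as new.

```python
from collections import Counter


def ngrams(text, max_n=3):
    """Yield every one-, two-, and three-word phrase in a text (naive whitespace split)."""
    words = text.lower().split()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def top_new_phrases(records, top_fraction=0.001):
    """records: iterable of (year, title_or_abstract_text) pairs.

    Returns {year: [phrases]} holding the most-mentioned phrases that first
    appeared in that year. top_fraction=0.001 mirrors the article's 0.1 percent.
    """
    first_year = {}   # phrase -> year it first appeared
    mentions = {}     # debut year -> Counter of debut-year mentions
    for year, text in sorted(records, key=lambda r: r[0]):
        for phrase in set(ngrams(text)):  # assumption: count each abstract once
            debut = first_year.setdefault(phrase, year)
            if debut == year:
                mentions.setdefault(year, Counter())[phrase] += 1
    result = {}
    for year, counts in mentions.items():
        keep = max(1, int(len(counts) * top_fraction))
        result[year] = [phrase for phrase, _ in counts.most_common(keep)]
    return result
```

In the study, the top-ranked phrases were then assessed to see whether each represented an actual new idea, a step this sketch does not attempt.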

“We tracked specific ideas by using the information in the title and abstract in the biomedical literature, so that we know which specific idea travels from one scientist to another scientist, from one location to another location,” says coauthor Wei Cheng, an economist at East China University of Science and Technology. 

After homing in on a list of top innovations (which included developments such as “polymerase chain reaction” and “HIV/AIDS”), Cheng and her former PhD advisor, Ohio State University economist Bruce Weinberg, analyzed the social networks of the authors of the papers that coined the innovative phrases (whom they called innovators), determining who was a collaborator, a collaborator of a collaborator, and so on.

From this, they determined a pool of potential adopters of the new idea: people who had published in similar areas or who had similar MeSH (medical subject headings) terms in their publications. They then calculated how the likelihood of a potential adopter picking up the idea related to factors such as the adopter’s network distance to the innovator and the innovators’ race or gender.
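The idea of network distance (collaborator, collaborator of a collaborator, and so on) can be pictured as counting steps in a coauthorship graph. The sketch below is a hypothetical illustration using a breadth-first search, not the method from the paper; the function names and the input format (a list of author lists, one per paper) are assumptions.

```python
from collections import deque
from itertools import combinations


def build_coauthor_graph(papers):
    """papers: iterable of author-name lists. Returns {author: set of coauthors}."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def network_distances(graph, innovators):
    """Breadth-first search outward from the innovators.

    Returns {author: steps}, where 1 is a direct collaborator and 2 is a
    collaborator of a collaborator. Authors with no path are omitted.
    """
    dist = {a: 0 for a in innovators if a in graph}
    queue = deque(dist)
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist
```

With distances like these in hand, one could compare adoption rates among potential adopters at one step, two steps, and so on from the innovating team.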

Ideas generated by teams of mostly male innovators were mentioned just over one percent more frequently in subsequent titles and abstracts than ideas generated by mostly female teams over a period of five years after they were first published. “We find that women’s ideas are less adopted,” says Cheng, in part because “women are not as well-connected in networks as men,” at least when it comes to short-range connections. Cheng and Weinberg found that teams of mostly female innovators had fewer close network connections—people one to five steps away. The analysis shows that ideas are more likely to be adopted when the potential adopter is close to the innovator, so if women have fewer short-range connections, says Cheng, that could be bringing down the total adoption rate for their ideas.  

Even when smaller networks are corrected for, she says, women’s ideas were less likely to be adopted, including by other women. For closer members of their networks, women were more likely to adopt ideas from women than from men, but as the distance from the innovator increased, women were actually less likely to adopt new ideas from other women than from men.

“So when we just increase the proportion of female researchers in biomedical [research], it does not actually increase the overall adoption rates for an idea from a female innovator,” says Cheng. 

Their analysis also shows that Black and Hispanic innovators have fewer close-range network connections than white and Asian innovators, and that ideas with a substantial number of Black or Hispanic authors were less likely than average to be mentioned by researchers more than one step away (ideas from teams with some Hispanic authors had the highest adoption rate of any group at one step away, however). 

Although early-career researchers had fewer network connections, their ideas were more likely to be adopted than those of mid- or late-career researchers.

According to Weinberg, one reason he and Cheng focused on idea spread among biomedical scientists (instead of, say, geologists or astronomers) was that the questions biomedical scientists study are intrinsically tied to human health and often pertain to specific populations.

“We think about researchers as frequently focusing on conditions that are relevant for their groups,” he says. Thus, the identity of the innovator and the nature of the idea could have substantial effects on what kinds of medical conditions get attention. 

“Women’s health conditions are more studied by female researchers,” says Cheng. “So when the research conducted by female scientists is getting less attention, that might lead to some disparity in terms of female and male health conditions.” 

Some limitations

The authors note a few caveats to their study. For one thing, says Weinberg, race and gender were determined by algorithms from authors’ names and were not self-reported—meaning that they weren’t always accurate. He adds that the algorithms they used weren’t set up to account for gender fluidity and only included a handful of racial and ethnic categories. 

Weinberg also notes that their study wasn’t designed to determine causality, since it was not a controlled experiment—rather, it could only find associations. Cheng says that the current study is part of a larger project they’re working on in which they hope to get a more solid grasp on possible causality. 

According to University of Colorado Boulder statistician Aaron Clauset, who was not involved in the study, it’s hard to really know why someone chooses to adopt one idea over another without directly asking them in an interview, but since that approach wouldn’t be feasible at the scale used in the paper, he says the computational approach the authors took was reasonable. 

“Being able to quantify the flow of ideas at [the scale of the new analysis] is just plain difficult,” he says.

Even though we can’t directly see the full picture of bias in idea adoption, he says we can see the shadows it casts in the data the researchers gathered. “The more shadows you can put together, the better reconstruction you can get at the phenomenon,” says Clauset.

To explore more of these “shadows” and their impact on idea diffusion, Clauset suggests analyzing other forms of social networks in scientific communities, such as who went to graduate school together or who acknowledges whom at the end of their papers. 

Quantitative researcher Mathijs De Vaan of the University of California, Berkeley, says he’d like to see validation of the new ideas the analysis extracted via natural language processing. Sometimes, he says, it can be difficult even for him to unpack the novelty of a paper just by reading the abstract, “let alone a machine that doesn’t have the knowledge I have as a researcher.” But if validated, he says the concept of using a machine to identify new ideas from scientific abstracts is “really cool” and would be a “huge innovation in innovation research.”

“Inequality is encoded into our social networks”

Even with its limitations, the study shows that “in some ways, inequality is encoded into our social networks,” says University of California, Berkeley, computational social scientist Douglas Guilbeault, who was not involved in the work. 

“If you can identify clear inequalities by gender and race in terms of network structure, it creates new opportunities for network-based interventions,” he says.

These could include institutions reshuffling research teams to have more race or gender balance, as well as promoting more diverse teams, says Guilbeault, such as by awarding them grants or helping researchers from underrepresented groups make connections at meetings. 

“I guess the self-help tip is to be aware of your implicit biases,” says Weinberg, especially when reviewing proposals, papers, or CVs. As scientists, he says, “our stock-in-trade is being at the frontier of knowledge.” And challenging one’s biases when a great idea is on the line “is a way of getting closer and staying closer to the frontier of knowledge.”