Can Publication Records Predict Future PIs?

Researchers present a tool that uses a scientist’s PubMed data to estimate the probability of becoming a principal investigator in academia.

Jun 2, 2014
Tracy Vence

The odds of a scientist becoming an academic principal investigator (PI) can be predicted with publication data, according to Lucas Carey from Spain’s Pompeu Fabra University and his colleagues. The team developed an online tool, dubbed PIPredictor, which uses a machine-learning approach to analyze a user’s PubMed data and has already churned out more than 800 career-success estimates. Carey and his colleagues describe their tool in Current Biology today (June 2).

“We show that becoming a research professor is highly predictable, [and] we analyze the features that are predictive” of success, Carey told The Scientist in an e-mail. The algorithm can predict who might become a PI and how long it could take for them to do so with an area under the curve (AUC), a measurement of accuracy, of 0.83 and 0.38, respectively.
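For readers unfamiliar with the metric, AUC can be read as the probability that a model gives a randomly chosen positive case (here, someone who became a PI) a higher score than a randomly chosen negative case. The sketch below computes it that way for a handful of made-up scores; the function name, labels, and numbers are illustrative assumptions and are not taken from the PIPredictor paper or its data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. became a PI), 0 for a negative case.
    scores: the model's predicted score for each case.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: four scientists, two of whom became PIs.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = auc(example_labels, example_scores)
print(example_auc)
```

A value of 0.5 corresponds to random guessing and 1.0 to a perfect ranking.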

While he was a postdoc in Eran Segal’s lab at the Weizmann Institute of Science in Rehovot, Israel, Carey and two then-PhD students, David van Dijk and Ohad Manor, decided to apply the machine learning-based approaches they were using to predict molecular mechanisms from gene expression data to guess who among them might one day become a PI. “There was quite a bit of debate at the beginning over if it would work, or if becoming a PI would turn out to be entirely non-predictable,” said Carey. “I was quite skeptical that we would be able to predict much, and we are all quite surprised [by] how predictable the entire process turned out to be.”

While having papers in Nature and Science can certainly help, it turns out that high-profile publications are not the only factors that determine whether an early-career scientist will one day lead her own academic lab, the team found. Rather, it’s the total number of publications, the impact factors of the journals in which they’re published, and whether each paper meets or exceeds the average number of citations for a given manuscript in that journal that seem to matter most. In other words, quantity and quality count. Overall, the researchers noted, higher h-indices—metrics that attempt to quantify the productivity and impact of a scientist’s publications—are predictive of a greater chance of academic career success, lending support to a concept first proposed in 2012 by Rehabilitation Institute of Chicago’s Daniel Acuna and his colleagues in Nature.
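The h-index mentioned above has a simple standard definition: a scientist has an h-index of h if h of her papers have each been cited at least h times. A minimal sketch of that calculation follows; the function name and citation counts are invented for illustration, and the code is not part of the PIPredictor model.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for one scientist's five papers.
example_h = h_index([10, 8, 5, 4, 3])
print(example_h)
```

Because the metric rewards a body of consistently cited work, one blockbuster paper alone does not raise it much: a record of `[25, 8, 5, 3, 3]` scores lower than the example above.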

“However, both the scientist’s gender and the rank of their university are also of importance, suggesting that non-publication features play a statistically significant role in the academic hiring process,” Carey and his colleagues wrote in their paper. The researchers found that, given the same publication record and all else being equal, male authors are more likely to become PIs than their female counterparts. Their model controls for both gender and institution rank.

Randall Ribaudo, CEO and cofounder of the career training firm SciPhD.com, spent five years as a PI at the National Cancer Institute’s Laboratory of Immune Cell Biology in Bethesda, Maryland, before moving on to work in industry. According to PIPredictor, he currently has a 59 percent chance of becoming a PI, Ribaudo told The Scientist. He questioned whether the team’s tool could account for factors such as the academic job market, which has changed considerably during the last few decades. “Over the past 20 years, hiring for tenure-track positions has gone down a lot,” he wrote in an e-mail. “If [the authors] are using longitudinal data and not considering that downward trend over time, the better statistical likelihoods in earlier years could be giving artificially high numbers.”

Paula Stephan, who studies the scientific workforce at Georgia State University, agreed. “The times are changing, and with that, the underlying probability of becoming a PI,” she wrote in an e-mail to The Scientist.

Stephan added that while the model is “based on a well-constructed bibliometric database,” publication records alone cannot account for “factors that reflect the scientist’s ability to produce the type of research that may be funded, such as innovativeness [and] creativity.”

Even so, in an increasingly competitive job market, it could be beneficial for a young scientist to get a feel for where her publication record stands. “The tool provides one benchmark that early career scientists can use to see their relative position based on publication metrics and reputation of university,” said Stephan. “I suspect that many young scientists already have a good idea of where they are, and some a more accurate idea than this [tool] can provide within their narrowly defined research world.”

D. van Dijk et al., “Publication metrics and success on the academic job market,” Current Biology, 24(11), 2014.