Modified from © The Scientist staff

Australia’s government drug safety watchdog sounded the alarm about the oral antifungal agent terbinafine in 1996. The drug, sold under the brand name Lamisil by pharma giant Novartis, had come onto the market in 1993 for the treatment of fungal skin infections and thrush. But three years later, the agency had received a number of reports of suspected adverse reactions, including 11 reports of liver toxicity. By 2008, three deaths from liver failure and 70 liver reactions had been pinned on oral terbinafine.

Researchers in Canada identified the biochemical culprit behind terbinafine’s liver toxicity—a compound called TBF-A that appeared to be a metabolite of terbinafine—in 2001. Clinicians quickly learned to monitor and manage this potential toxicity during treatment, but no one could work out how the compound actually formed in the liver, or experimentally reproduce its synthesis from terbinafine in the lab.

Then, in 2018, graduate student Na Le Dang at Washington University in St. Louis hit upon a way to use artificial intelligence (AI)—specifically, a machine learning algorithm—to work out the possible biochemical pathways terbinafine takes when it is metabolized by the liver. Trained on large numbers of known metabolic pathways, the algorithm had learned the most likely outcomes when different types of molecules were broken down in the organ. With that information, it was able to identify what no human could: that the metabolism of terbinafine to TBF-A was a two-step process.

It’s really our ambition to automate as much as possible, so the chemists are just focusing on the much higher level, the difficult problems, the strategy.

—Adrian Schreyer, Exscientia

Two-step metabolites are much more difficult than direct metabolites to detect experimentally, which is likely why this potentially lethal outcome wasn’t flagged until after the product was on the market, says S. Joshua Swamidass, a physician scientist and computational biologist at Washington University and Dang’s supervisor. The discovery not only shed light on a long-standing biochemical mystery, but showed how the use of AI could more broadly aid drug discovery and development.

Given enough data, machine learning algorithms can identify patterns, and then use those patterns to make predictions or classify new data much faster than any human. “A lot of the questions that are really facing drug development teams are no longer the sorts of questions that people think that they can handle from just sorting through data in their heads,” Swamidass says. “There’s got to be some sort of systematic way of looking at large amounts of data . . . to answer questions and to get insight into how to do things.”

AI is well positioned to handle the complexity of the rules that must be applied to understand these data, too, says Regina Barzilay, a computer scientist at MIT and a scientific advisor for drugmaker Janssen, a subsidiary of Johnson & Johnson. “When we study chemistry, we definitely study a lot of rules and we understand the mechanism, but sometimes they’re really, really complex,” she says. “If [a] machine is provided with a lot of data, and the problem is formulated correctly, it has a chance to capture patterns which humans may not be able to capture.”

Machine learning couldn’t have matured at a better time for the pharmaceutical industry, says Shahar Keinan, cofounder and chief scientific officer at Cloud Pharmaceuticals, a company focusing on AI–based drug discovery. She argues that the steady decline in the number of new drug targets, new mechanisms, and novel first-in-class drugs coming to market each year is an indication that the current system of drug discovery is insufficient to meet modern challenges. “Right now, you need to do more work to get to those kind of first-in-class [drugs],” she says. “The way to overcome that . . . is to find something new in what [data] we have, and this is where artificial intelligence will come in.”

The pharmaceutical industry and its investors seem to agree. Last year Cloud Pharmaceuticals entered into a drug discovery collaboration with pharma giant GlaxoSmithKline. And UK-based BenevolentAI, which describes its mission as “accelerating the journey from data to medicine” in areas ranging from drug discovery to clinical development, recently earned an eyebrow-raising multibillion-dollar valuation.

Using artificial intelligence to design better drugs

More than 450 medicines worldwide have been withdrawn from the market post-approval in the last half century as a result of adverse reactions, with liver toxicity the most common side effect. But the metabolism of compounds by organs such as the liver is extremely complex and, as in the case of terbinafine, difficult to anticipate.

This is exactly the sort of problem that machine learning can help solve—and the data are already available to help that process. For example, the US federal government’s Tox21 program, a collaboration among the Environmental Protection Agency, the National Institutes of Health, and the Food and Drug Administration, maintains a large data set of molecules and their toxicity against key human proteins—perfect fodder for AI to digest in search of patterns of association between structure, properties, function, and possible toxic effects, Keinan says.

Cloud Pharmaceuticals is one company that makes use of these data as part of their workflow. “Now you can train a machine learning algorithm on this data set, then a new molecule comes along, and you can just use your prediction to say, ‘Is this molecule toxic or not?’” says Keinan.
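The workflow Keinan describes can be sketched in a few lines of Python. Everything below is invented for illustration—the eight binary “structural features,” the toy rule that drives toxicity, and the data set sizes bear no relation to the real Tox21 data—but the shape of the task is the same: train a classifier on labeled molecules, then ask it whether a new molecule looks toxic.

```python
import math
import random

random.seed(0)

# Hypothetical stand-in for a Tox21-style data set: each "molecule" is a
# vector of 8 binary structural features, and in this toy rule a molecule
# is toxic only when features 0 and 3 occur together (not real chemistry).
def make_molecule():
    feats = [random.randint(0, 1) for _ in range(8)]
    toxic = 1 if (feats[0] and feats[3]) else 0
    return feats, toxic

train = [make_molecule() for _ in range(300)]
test = [make_molecule() for _ in range(100)]

# Logistic-regression classifier trained by full-batch gradient descent.
w = [0.0] * 8
b = 0.0
lr = 0.5
for _ in range(1000):
    gw = [0.0] * 8
    gb = 0.0
    for feats, toxic in train:
        z = sum(wi * xi for wi, xi in zip(w, feats)) + b
        p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of toxicity
        err = p - toxic
        for i, xi in enumerate(feats):
            gw[i] += err * xi
        gb += err
    w = [wi - lr * gi / len(train) for wi, gi in zip(w, gw)]
    b -= lr * gb / len(train)

def predict(feats):
    z = sum(wi * xi for wi, xi in zip(w, feats)) + b
    return 1 if z > 0 else 0

# Held-out accuracy: "a new molecule comes along" and the model answers.
accuracy = sum(predict(f) == t for f, t in test) / len(test)
```

Because the toy toxicity rule is learnable from the features, the trained model classifies unseen molecules accurately; real toxicity prediction involves far noisier labels and far richer molecular representations.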

As well as identifying potential toxicities, machine learning algorithms could predict how a candidate molecule will respond to different physical and chemical environments, and so help drug developers understand how that molecule might behave in various tissues in the human body.

If a machine is provided with a lot of data, and the problem is formulated correctly, it has a chance to capture patterns which humans may not be able to capture.

—Regina Barzilay, MIT

A group led by physical chemist Scott Hopkins of the University of Waterloo, in collaboration with researchers at Pfizer, has been training an algorithm to do just that, using data on 89 small-molecule drug candidates obtained by performing a type of spectrometry that measures how quickly molecules absorb or lose water.

“If our drug molecule absorbs a lot of water very quickly and doesn’t give it up, it tells us that that drug is going to be very soluble in water,” Hopkins says. “It’s going to dissolve easily in your stomach and enter your bloodstream fairly quickly.” After the algorithm learned the associations between certain molecular structures and solubility from those 89 molecules, it was able to accurately predict the key properties of a similar molecule, the team reported at the end of last year.
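The statistical core of such a prediction can be as simple as a least-squares fit. The sketch below invents a linear relationship between water-uptake rate and solubility for 89 hypothetical molecules—the Waterloo team’s real measurements and model are more sophisticated—then uses the fitted line to predict a structurally similar newcomer.

```python
import random

random.seed(1)

# Invented stand-in for the 89-molecule data set: each candidate has a
# water-uptake rate (as measured by spectrometry) and a known solubility.
# The linear relationship and noise level are fabricated for illustration.
n = 89
uptake = [random.uniform(0.1, 5.0) for _ in range(n)]
solubility = [2.0 * u + 0.5 + random.gauss(0.0, 0.3) for u in uptake]

# Ordinary least-squares fit: solubility ≈ slope * uptake + intercept.
mean_u = sum(uptake) / n
mean_s = sum(solubility) / n
slope = (sum((u - mean_u) * (s - mean_s) for u, s in zip(uptake, solubility))
         / sum((u - mean_u) ** 2 for u in uptake))
intercept = mean_s - slope * mean_u

# Predict the solubility of a new, similar molecule from its uptake rate.
new_uptake = 3.2
predicted = slope * new_uptake + intercept
```

With 89 training points and modest noise, the fitted slope recovers the underlying relationship closely, which is why the team could extrapolate to a similar molecule with confidence.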

Although screening for potential toxicities and for biochemical properties are essential sub-tasks in the process of drug development, a more tantalizing question is whether AI could suggest the structure of a new therapeutic molecule from scratch.

At Maryland-based biotech Insilico Medicine, CEO Alexander Zhavoronkov and colleagues are using a type of algorithm called a generative adversarial network to help develop entirely novel small-molecule organic compounds aimed at treating everything from cancer to metabolic disease to neurodegenerative disorders. This algorithm consists of two deep neural networks pitted against one another, Zhavoronkov explains.

The first deep neural network has the challenge of coming up with outputs—molecular structures—in response to a series of inputs, namely the desired functional and biochemical characteristics of those structures, such as solubility, targets, or bioavailability. The other deep neural network has the job of critiquing those outputs. 

“They engage in this adversarial game,” Zhavoronkov says. “They basically kind of compete with each other . . . and in this process, after many, many iterations, they learn to generate something new.”
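The adversarial game can be illustrated with a deliberately tiny example. Here each “molecule” is reduced to a single number, real examples cluster around 3.0, the generator is a single parameter, and the discriminator is one logistic unit—nothing like Insilico’s deep networks, but the same push-and-pull drives the generator toward the real distribution. All numbers are invented.

```python
import math
import random

random.seed(2)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

theta = 0.0        # generator parameter: it outputs theta plus noise
w, b = 0.0, 0.0    # discriminator parameters (one logistic unit)
lr = 0.05
history = []

for step in range(4000):
    real = random.gauss(3.0, 0.5)            # a "real" example
    fake = theta + random.gauss(0.0, 0.5)    # a generated example

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    d_real = sigmoid(w * real + b)
    d_fake = sigmoid(w * fake + b)
    w -= lr * ((d_real - 1.0) * real + d_fake * fake)
    b -= lr * ((d_real - 1.0) + d_fake)

    # Generator step: nudge theta so the discriminator rates fakes as real.
    d_fake = sigmoid(w * fake + b)
    theta -= lr * (d_fake - 1.0) * w
    history.append(theta)

# After many iterations the generator's output distribution should sit
# near the real one; average the last steps to smooth out SGD noise.
theta_avg = sum(history[-500:]) / 500
```

The two updates never share a goal: the discriminator’s loss rewards telling real from fake, the generator’s rewards fooling it, and the equilibrium of that competition is a generator whose outputs match the real data.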

Insilico Medicine recently synthesized its first molecule from this process in partnership with China-based pharmaceutical company WuXi AppTec. The two companies have now entered into a collaboration agreement to test other candidate molecules designed by the adversarial network approach for multiple orphan drug targets.

Other companies, such as Exscientia, a UK-based, AI-driven drug discovery company, are bringing machine learning to bear on the entire process of drug discovery. Exscientia is building a platform that aims to mimic the decision making of a medicinal chemist while also learning from human medicinal chemists’ input. “It’s really our ambition to automate as much as possible, [so] the chemists are just focusing on the much higher level, the difficult problems, the strategy,” says Adrian Schreyer, chief technology officer at the company. This means the biological targets and the molecules designed to bind them will all be influenced by the outputs of their AI platform.

But there is still a human hand on the tiller. The algorithm “is fairly autonomous in terms of generating compounds, [but] the final selection is done by chemists,” Schreyer says. This is partly because the patent system demands that a human be named as the inventor, but also because the selections that the chemists make from the AI-generated options can be fed back into the algorithm to fine-tune its decision making for the next round. “If you see twenty to thirty compounds and [humans] pick ten . . . by virtue of the decision making you can see what’s preferable, you can use that process itself to build a machine learning model.”
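One simple way to turn such picks into a model—a sketch, not Exscientia’s actual method—is a nearest-centroid score: average the descriptors of the compounds the chemists chose, then rank the next batch by similarity to that average. The descriptors, batch sizes, and the preference the chemists express below are all invented.

```python
import random

random.seed(3)

# A batch of 30 candidate compounds, each summarized by 5 numeric
# descriptors (hypothetical: e.g. size, lipophilicity, polarity...).
compounds = [[random.uniform(0, 1) for _ in range(5)] for _ in range(30)]

# Suppose the chemists consistently favor a high first descriptor and
# pick the 10 compounds that score best on it.
picked = sorted(compounds, key=lambda c: c[0], reverse=True)[:10]

# The "model": the centroid (feature-wise mean) of the chosen compounds.
centroid = [sum(c[i] for c in picked) / len(picked) for i in range(5)]

def preference_score(compound):
    # Higher when the compound resembles the chemists' past picks
    # (negative squared distance to the centroid).
    return -sum((x - m) ** 2 for x, m in zip(compound, centroid))

# Rank the next batch of AI-generated candidates by learned preference.
next_batch = [[random.uniform(0, 1) for _ in range(5)] for _ in range(20)]
ranked = sorted(next_batch, key=preference_score, reverse=True)
```

The centroid ends up high in the descriptor the chemists favored, so the ranking surfaces compounds resembling past picks first—the same feed-the-decisions-back idea, in miniature.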

Exscientia’s approach has generated considerable interest from the pharmaceutical industry, with Roche investing up to $67 million in a drug-discovery collaboration with the company. Another recent financing round has earned the company $26 million in investment from companies including Celgene Corporation and GT Healthcare Capital Partners.

The limitations of artificial intelligence

Despite progress in using AI for drug development, it’s not time for humans to step out of the process just yet, says Barzilay. For now, she explains, AI in pharma is analogous to a smart kitchen: “You have a microwave, and you have a coffee machine, and you have this and that, but none of it actually cooks you the dinner. . . . You need to put it all together to make a dinner, even though all these things can help you to do it faster and better.” 

While Barzilay and colleagues are part of a pharma-funded research collaboration with MIT to bring AI into drug discovery, she points out that machine learning has not yet brought a molecule to market. “It just assists humans to do a faster and better job in various sub-tasks, but we still don’t have AI coming up with new drugs at this point,” she says.

Artificial intelligence is also limited by the quality of the inputs—in particular, the quality of the data that it learns from, says Schreyer. “If the data is flawed, then the results are likely to be flawed as well, which can be a problem if you get data from third parties,” he says. Experimental data are not perfect measurements of the real world, and there are always assumptions and fitting involved. If the algorithm and its users do not sufficiently take those biases and weightings into account, then the outputs will also be biased.

One tantalizing question is whether AI could suggest the structure of a new therapeutic molecule from scratch.

Sometimes data can be lacking altogether, particularly for recently discovered molecules. At Cloud Pharmaceuticals, researchers are getting around this problem by using computational chemistry to design a range of molecules that could fit a particular biological target, then using that range as the data set that the machine learning algorithm can learn from to help design new candidate molecules.

Another issue for some researchers is the so-called “black box” nature of the algorithms themselves: data goes in, answers come out, but the internal workings remain a mystery. Swamidass says that this presents a fundamental conflict for scientists. “What does it mean to be applying black-box methods in fields that are all about understanding what’s inside of a black box?” he says. “We don’t want to just have something that works in science, we want to understand the underlying processes there.”

But Barzilay says that the relationships between inputs and outputs might often go beyond human comprehension. “When we’re talking about complex organic reactions with different solvents and different temperatures and other things, then it becomes really difficult even for the experienced human to predict what is the right outcome, so what does it mean [to] give an explanation?”

There is unquestionably a lot of hype around the potential of AI in drug discovery—Swamidass likens this period to the internet boom of the late 1990s. But many, such as Schreyer, are excited about the possibilities that this new technology offers, particularly when it comes to finding novel therapeutics for difficult-to-treat diseases.

“If you can improve . . . how efficiently you do drug discovery fivefold, tenfold, or even more,” he says, “then from an economic perspective . . . you’re able to take on more risky projects because the cost of failure is much lower.”

Bianca Nogrady is a freelance science writer based in Sydney, Australia.

From the May 2019 issue of The Scientist
