Small molecules are the ultimate drug candidates: they are relatively easy to manipulate chemically, and many of them readily enter cells after oral ingestion.
Traditionally, researchers suss out potentially useful drugs of this type by making or ordering a small-molecule library and then painstakingly screening each molecule in its own well of a multiwell plate to spot those that bind to a cellular target of interest. Although such screening is amenable to high-throughput setups, it requires a lot of time, plastics, and robotic equipment. Over the past decade, drug-discovery researchers in both industry and academia have increasingly begun to embrace a new method, in which small molecules are tagged with DNAs that serve as tracking devices.
Such DNA-encoded libraries allow the construction and testing of many drug candidates in one pass, which eases the process of evaluating large collections of small molecules. Small molecules are generated by combining chemical building blocks, including amino acids, amines, and carboxylic acids, each of which is covalently linked to a DNA fragment with a unique sequence. To complete the process, you incubate the resulting library with your target and sequence the DNA tags linked to any molecules that stick.
Depending on the types of building blocks (BBs) you want to work with and how easily they react with each other, however, some approaches for making a DNA-encoded library might be better than others. The technology also comes with some challenges. First, you or your collaborators will need a firm grasp of molecular biology and organic chemistry to manipulate the DNA and building blocks. And analyzing the DNA sequences to pick out possible hits requires bioinformatics know-how. “It’s a lot of skills that don’t always come together,” says Brian Paegel, a chemist at The Scripps Research Institute in Jupiter, Florida, who is setting up DNA-encoded library technology in his lab.
The Scientist spells out the basics of DNA-encoded library synthesis, and highlights the strengths and shortcomings of the different approaches.
Overview: One of the most common approaches for making a DNA-encoded library is called “tagged split-and-pool.” It works particularly well for constructing large libraries of millions to billions of small molecules. Hongfeng Deng and her colleagues at GlaxoSmithKline used tagged split-and-pool to create the largest DNA-encoded library reported to date, which contains 4 billion small molecules (J Med Chem, 55:7061-79, 2012).
How it works: From an initial set of BBs of known chemical composition, each BB type is covalently linked to a different DNA tag through a chemical reaction. Then, the BB-DNA compounds are pooled, and the mixture, which contains some of each BB-DNA compound, gets split into different wells. Next, from a second set of BBs, a different BB type is added to each of the wells, so that all of the BB-DNA compounds from the first round get linked to each of the BB types in the second round. After the second BB is attached, a new, unique DNA tag is added to each well and joins, by either hybridization or ligation, to the DNA already attached to the compounds, so that the combined tag records which second-round BB each compound received.
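To see why the final DNA tag identifies the whole molecule, it can help to walk through the bookkeeping in code. The sketch below is only an illustration: the building-block names and four-letter tag sequences are made up, and real tags are longer and are joined by ligation or hybridization rather than by string concatenation.

```python
def split_and_pool(bb_sets, tag_sets):
    """Simulate tagged split-and-pool: each round appends one BB and its DNA tag."""
    # Start from a single empty compound that carries no BBs and no tag.
    pool = [((), "")]
    for bbs, tags in zip(bb_sets, tag_sets):
        next_pool = []
        # Each well gets a share of the pooled mixture plus one BB type and its tag.
        for bb, tag in zip(bbs, tags):
            for compound, barcode in pool:
                next_pool.append((compound + (bb,), barcode + tag))
        # Collecting every well's products corresponds to pooling again.
        pool = next_pool
    return pool


# Hypothetical building blocks and tag sequences, for illustration only.
bb_sets = [["A1", "A2", "A3"], ["B1", "B2"]]
tag_sets = [["ACGT", "TGCA", "GATC"], ["CCAA", "GGTT"]]

library = split_and_pool(bb_sets, tag_sets)
decoder = {barcode: compound for compound, barcode in library}

print(len(library))          # 3 x 2 = 6 compounds
print(decoder["TGCAGGTT"])   # ('A2', 'B2')
```

Because each round multiplies the library by the number of wells, a handful of rounds with a few hundred BBs each is enough to reach millions of compounds.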
Technology in action: Deng and colleagues were able to generate their large DNA-encoded library by using a relatively large number of BBs (five) for each small molecule generated. The researchers incubated the library with a target protein called ADAMTS-5, a metalloprotease that degrades cartilage and is implicated in osteoarthritis. After sequencing the DNA tags of the molecules that bound to ADAMTS-5, they found several potent inhibitors of the enzyme (J Med Chem, 55:7061-79, 2012).
Advantages: By repeating the steps of pooling, splitting, and adding BBs and DNA tags, you can make large molecules and diverse, large libraries. Also, this approach is straightforward because after you design the DNA tags and choose the BBs, you can order them ready-to-use.
Challenges: Determining which small molecules bind the target requires not only sequencing their DNA tags but also quantifying them. The idea is that the tags that occur most commonly in the mix are probably linked to the molecules that bind the target with the highest affinity.
Deng’s group took extra steps to enrich the concentration of bound molecules to make them easier to quantify: after incubating the library with the target protein, the team eluted the bound molecules and re-exposed these molecules to the target. However, even with the most advanced next-generation sequencing technology and additional rounds of incubation, quantifying the DNA tags in libraries of 10 million or more molecules can be a challenge. You’ll likely encounter some ambiguous, low-frequency hits, which you can either toss out or, as Deng’s group did, remake and test individually.
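The counting step itself is simple to sketch. The example below tallies tag sequences from a list of reads and sets aside the low-frequency ones. The reads are invented, and the cutoff of three reads is an arbitrary assumption for illustration, not a threshold taken from Deng's study.

```python
from collections import Counter


def rank_tags(reads, min_count=3):
    """Count each DNA tag and separate frequent tags from low-frequency ones."""
    counts = Counter(reads)
    ranked = counts.most_common()  # most frequent tag first
    confident = [(tag, n) for tag, n in ranked if n >= min_count]
    ambiguous = [(tag, n) for tag, n in ranked if n < min_count]
    return confident, ambiguous


# Hypothetical sequencing reads; each string is the tag of one recovered molecule.
reads = ["TGCAGGTT"] * 7 + ["ACGTCCAA"] * 4 + ["GATCGGTT"] * 2 + ["ACGTGGTT"]

confident, ambiguous = rank_tags(reads)
print(confident)   # [('TGCAGGTT', 7), ('ACGTCCAA', 4)]
print(ambiguous)   # [('GATCGGTT', 2), ('ACGTGGTT', 1)]
```

In a real screen the counts would also need to be compared against how often each tag appears in the unselected library, since a tag can be common simply because its compound was overrepresented to begin with.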
Company help: BBs can be purchased from Sigma-Aldrich, VWR, Novabiochem, and Bachem and cost $10–$300 each. DNA tags are sold by Sigma-Aldrich and IBA GmbH and cost $20–$100 apiece. Nuevolution in Copenhagen and Hitgen in Chengdu, China, contract with researchers to conduct screens using such libraries. Nuevolution sometimes does the work pro bono for researchers who forgo ownership of the compounds, but charges fees and royalties for partners who want to retain ownership. Hitgen’s prices also vary based on factors such as the amount of follow-up analysis requested, but generally range between $100,000 and $200,000 for each target.
Going with the flow
Overview: A second approach, called “programmed split-and-pool,” is similar to tagged split-and-pool but requires a more complex setup. Instead of adding a DNA tag along with each BB, you design “template” DNA strands containing sequences of barcodes that will each represent a different BB that gets added to the DNA template. The main payoff is that after you identify the molecules that bind your target, you can easily make more of them.
How it works: After designing a DNA library of templates for each small molecule you plan to make, you will need to position those templates into an array-like format. To do this, first pump the template mixture through a fluidic device over an array of oligonucleotides, each of which is complementary to the first set of barcodes that you use in the library templates. Hybridization to their appropriate anti-barcode oligos separates the templates into an array. Then the templates are released from the oligos, using a mild base to denature the template-oligo duplexes, and the template array is transferred to a sheet of filter paper that binds DNA. Finally, the first set of BBs is added one by one to each spot on this template array. After the BBs link to the templates, you can release the template-BB compounds from the paper, pool all of them together, and split them again using an array of oligos that are complementary to the next set of barcodes before the addition of the second set of BBs to the templates.
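The routing logic can be mimicked in a few lines. In this sketch each template is a tuple of barcode names, and a "codebook" per round says which BB goes on the spot for each barcode. All of the names are hypothetical, and the fluidics, hybridization, and filter-paper chemistry are reduced to dictionary lookups.

```python
from collections import defaultdict


def programmed_split_and_pool(templates, codebooks):
    """Route each template by its barcode at every position and attach the matching BB."""
    products = {template: () for template in templates}
    for position, codebook in enumerate(codebooks):
        # "Split": group templates by the barcode they carry at this position,
        # as hybridization to the anti-barcode array would.
        spots = defaultdict(list)
        for template in templates:
            spots[template[position]].append(template)
        # Add one BB per spot; writing back to the shared dict stands in for pooling.
        for barcode, members in spots.items():
            for template in members:
                products[template] += (codebook[barcode],)
    return products


# Hypothetical templates (one barcode per round) and barcode-to-BB assignments.
templates = [("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
codebooks = [{"a1": "X1", "a2": "X2"}, {"b1": "Y1", "b2": "Y2"}]

products = programmed_split_and_pool(templates, codebooks)
print(products[("a1", "b2")])   # ('X1', 'Y2')
```

The point of the design is visible in the data structure: the template alone determines the product, so copying the template is enough to remake the molecule.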
Technology in action: Pehr Harbury and his lab at Stanford University developed this take on the split-and-pool approach. They used it to generate a library of 100 million compounds called peptoids, peptide-like molecules composed of artificial amino acids that are modified to be resistant to degradation, potentially making them better drug candidates (J Am Chem Soc, 129:13137-43, 2007). Harbury’s team incubated the library with the target of interest, a cancer-promoting protein called Crk. After the first round of binding, the researchers amplified the DNA templates of bound molecules by PCR. Then they used the amplified templates to make a second library that included just the binders, again incubating them with the target. They repeated this cycle (incubate with the target, amplify the templates of bound molecules, remake the library) a total of five times. Each round enriched the concentration of true binders in the population and diminished the level of nonspecific binders. In the end, they identified six peptoids with high target-binding affinity.
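A toy calculation shows why repeated rounds help. The model below assumes that each compound is retained on the target with a fixed probability in every round and that amplification simply rescales the survivors. The starting fractions and retention values are invented for illustration and are not measurements from Harbury's experiments.

```python
def enrich(fractions, retention, rounds):
    """Apply repeated select-and-amplify rounds to a population of compounds."""
    history = [dict(fractions)]
    for _ in range(rounds):
        # Selection: each compound survives in proportion to its retention on the target.
        kept = {name: frac * retention[name] for name, frac in fractions.items()}
        # Amplification and resynthesis: rescale so the fractions sum to 1 again.
        total = sum(kept.values())
        fractions = {name: amount / total for name, amount in kept.items()}
        history.append(fractions)
    return history


# Assumed values: a rare true binder among weakly sticking background compounds.
start = {"binder": 0.001, "background": 0.999}
retention = {"binder": 0.5, "background": 0.01}

history = enrich(start, retention, rounds=5)
for round_number, state in enumerate(history):
    print(round_number, f"{state['binder']:.4f}")
```

Under these assumptions the binder's share of the population grows 50-fold relative to background in each round, so a compound that starts at one part in a thousand dominates the pool within a few cycles.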
Advantages: The ability to make and test ever-smaller subsets of the molecules that bind the target, a process called “directed evolution,” sets this approach apart from tagged split-and-pool. Because you can amplify the DNA templates and use them to make more of the promising small molecules, you can in principle retest subsets of molecules for target binding as many times as you like.
Another benefit of programmed split-and-pool is that the BB addition takes place on filter paper after the templates have separated from the complementary oligos. This means that the BBs can react with each other in a range of solvents, not just the water-based solutions that DNA hybridization requires. “There’s a ton of chemistry you can’t do in water,” Harbury says. That includes peptoid chemistry.
Challenges: Before you start making compound libraries, you must first make an array of the anti-barcode oligos. You also need to order the fluidic device, which must be custom-made and requires some assembly. Harbury’s team recently detailed both the array and fluidic device components (PLOS ONE, 7:e32299, 2012).
Company help: DiCE Molecules in Palo Alto, California, is gearing up to commercialize Harbury’s technology, although it has not started working with researchers or established its pricing yet. The fluidic device can be custom ordered from any company that offers machining services, such as Custom Machining Service and CoorsTek, and typically costs about $1,000 a pop.
Overview: Just like programmed split-and-pool, this approach—called DNA-templated synthesis—uses DNA templates to guide synthesis of the molecules in the library. However, it affords you the ease of making the entire library in a single reaction vessel.
How it works: As in the previous approach, you’ll need a DNA template for each molecule that you plan to make, as well as a set of anti-barcode oligos corresponding to each barcode in the templates. However, in this approach, all the templates are mixed together in one pot, and each of the oligos is linked to a different BB through a chemical reaction. To add the first BBs to the templates, mix the templates with the oligos that are complementary to the first set of barcodes they contain. When the oligos hybridize to the templates, the BBs they carry are transferred to the templates through a second chemical reaction. Repeat these steps for the second set of BBs and so on, to generate a chain of BBs linked to the templates.
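The pairing rule that drives this one-pot scheme is ordinary base complementarity, which is easy to sketch. In the example below, each BB-carrying oligo is the reverse complement of one barcode, and a template picks up whichever BB is attached to the oligo matching its barcode in that round. The six-base barcodes and BB names are hypothetical, and hybridization is idealized as an exact match.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence that base-pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def templated_synthesis(template_barcodes, reagent_sets):
    """Build the BB chain for one template from the oligos that match its barcodes."""
    chain = []
    for barcode, reagents in zip(template_barcodes, reagent_sets):
        # Only the oligo complementary to this barcode hybridizes and delivers its BB.
        chain.append(reagents[reverse_complement(barcode)])
    return tuple(chain)


# Hypothetical BB-linked oligos for two rounds, keyed by anti-barcode sequence.
round1 = {reverse_complement("AACCGG"): "BB1a", reverse_complement("GATTAC"): "BB1b"}
round2 = {reverse_complement("TTGACA"): "BB2a", reverse_complement("CAGTCA"): "BB2b"}

print(reverse_complement("AACCGG"))                                  # CCGGTT
print(templated_synthesis(("AACCGG", "CAGTCA"), [round1, round2]))   # ('BB1a', 'BB2b')
print(templated_synthesis(("GATTAC", "TTGACA"), [round1, round2]))   # ('BB1b', 'BB2a')
```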
Technology in action: David Liu at Harvard pioneered this approach, and in 2008 his group created a 13,824-molecule library (J Am Chem Soc, 130:15611-26, 2008). The researchers used four synthetic amino acid BBs and, during the addition of the last BB, performed a chemical reaction to link the first and last BB in the chain. The resulting circular molecules, called macrocycles, are larger and more rigid than their linear counterparts, boosting their ability to interact with multiple sites on the target protein, and thus their specificity. Using this library, Liu and his colleagues identified the first potent inhibitor of an enzyme that destroys insulin without affecting related enzymes (submitted for publication).
Advantages: The use of DNA templates means this approach, like programmed split-and-pool, could enable directed evolution. (However, Liu’s library is small enough that it could be analyzed by sequencing without the need for enrichment.) In addition, the method of adding BBs could make it easier to synthesize some types of molecules, including macrocycles. “The base pairing [between templates and oligos] holds the building blocks together to allow macrocyclization to occur in a very efficient manner,” Liu says.
Challenges: Not only do you have to design templates for all the molecules and complementary oligos, you also have to make sure the template sequences won’t hybridize with each other in the one-pot reaction. Liu’s group uses a program called VisualOMP to design template and oligo sequences (available at www.dnasoftware.com). In addition, this is the only approach that requires the added step of coupling the oligos with BBs before library synthesis.
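As a rough first pass before turning to dedicated design software, you could screen candidate templates for stretches that are complementary to one another. The sketch below is not how VisualOMP works, since that kind of program relies on thermodynamic models. It only flags pairs of sequences that share a run of complementary bases, and both the six-base window and the example sequences are arbitrary assumptions.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence that base-pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cross_hybridizing_pairs(templates, k=6):
    """Return template pairs that contain a stretch of k mutually complementary bases."""
    flagged = []
    for a, b in combinations(templates, 2):
        # A k-mer of a that also appears in the reverse complement of b can pair with b.
        if kmers(a, k) & kmers(reverse_complement(b), k):
            flagged.append((a, b))
    return flagged


# Hypothetical template sequences, for illustration only.
templates = ["AAGGCTTCAGAT", "CCATCTGAAGTT", "GGGTGTGTGGGT"]
print(cross_hybridizing_pairs(templates))
# [('AAGGCTTCAGAT', 'CCATCTGAAGTT')]
```

This check ignores self-hybridization within a single template, mismatches, and temperature, so it can only rule out the most obvious clashes.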
Company help: While this approach can also make linear small molecules, Ensemble Therapeutics in Cambridge, Massachusetts, licensed this technology from Harvard to make macrocycle libraries. The company routinely partners with pharma companies (no academic groups yet), and, like Nuevolution and Hitgen, does all the steps from selection to validation in-house.