BEAUTY AND DIVERSITY: With 150 billion base pairs, Paris japonica boasts the largest known eukaryotic genome, 50 times the size of the human genome. (Image: ALPSDAKE/WIKIMEDIA)

What do cells, genes, mutations, transposons, RNA silencing, and DNA recombination have in common? All were discovered first in plants.

It sounds grandiose, but it’s true, and plant biologists delight in reminding others of these plant-derived breakthroughs. The first cell observed under a microscope, back in the mid-1660s by physicist Robert Hooke, was a plant cell in a slice of cork. Botanist Robert Brown first named the nucleus after observing opaque spots inside orchid cells. The saintly father of genetics, Gregor Mendel, defined the laws of inheritance by studying pea plants. The list goes on.

Today, with the advent of high-throughput sequencing, that legacy of firsts in the plant field is extending to genomics research. In the tens of millions of nucleic acids of familiar...

In the last two years, researchers have stumbled upon some “mind-blowing” phenomena in plant genomics, including genomes so strange that “we didn’t think [they] could be like that,” says R. Keith Slotkin, a geneticist at Ohio State University. Examples include the peaceful coexistence of two different genomes in a single nucleus and the willy-nilly way plants swap genes among species. And just as with Hooke’s, Brown’s, and Mendel’s fundamental discoveries in plant biology, the bizarre behavior of plant genomes often applies to animals as well.

Size matters

GENOMIC DOUBLING AND GENE SHUFFLING: There are many drivers of genomic diversity in plants, including polyploidy, or the existence of multiple copies of a genome within an organism; the movement of transposable elements (TEs), which sometimes insert into promoter regions or even genes to create new variants; and the exchange of DNA between individuals.
In 2010, a young technician picked his way through a sea of plants at the Royal Botanic Gardens, Kew, in London and clipped a leaf from a Japanese canopy plant (Paris japonica), a pretty, umbrella-like perennial with a perky white flower at its apex. Jaume Pellicer was conducting a survey of plant genome sizes, removing small pieces of leaves to stain for the cells’ nuclei and estimate the amount of DNA inside.

There was nothing extraordinary about the Japanese canopy plant—that is, until Pellicer analyzed the stained cells using flow cytometry, a high-throughput technique to detect features of cells suspended in liquid. To Pellicer’s eye, the balls of DNA inside P. japonica’s nuclei looked “really, really big,” he recalls. Soon, he confirmed that P. japonica carries the largest known eukaryotic genome on the planet, with a whopping 150 billion base pairs—50 times the size of the human genome.1 “We were astonished,” says Pellicer. Plants in general are known to have sizable genomes, often as a result of whole-genome duplications, so “we were expecting to find big genomes, but nothing that big.” (See “Jaume and the Giant Genome,” The Scientist, April 2011.)

Two years later, Victor Albert at the University at Buffalo decided to investigate a group of plant cells with nuclei that looked quite different: they appeared to harbor especially tiny genomes. When Albert and colleague Luis Herrera-Estrella sequenced the genome of Utricularia gibba, a bladderwort that forms free-floating mats with hidden underwater bladders that suck in unsuspecting prey, they found that it has one of the smallest genomes ever to be sequenced from a plant: just 82 million base pairs. Even more interesting, the genome is small not because it has fewer genes than other plants. In fact, it has more genes than grapes, papaya, or Arabidopsis. Rather, 97 percent of the plant’s genome is protein-coding genes and gene regulatory regions, with only 3 percent having no known function, the portion scientists often call “junk” DNA.2 That’s the complete opposite of the human genome, which is made up of 98 percent junk and 2 percent protein-coding genes.

“The implications are that you can make a perfectly good complex, multicellular plant with a gigantic genome or a tiny genome,” says Albert. “You probably need approximately the same number of genes, but the ‘junk’ or lack of ‘junk’ probably doesn’t matter much, if it even matters at all.”
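For readers who want to see the scale of these numbers side by side, here is a minimal Python sketch of the arithmetic. It uses only figures quoted in this article; the human genome size is not stated directly but is derived here from the "50 times" comparison, and the variable names are illustrative. Note that the two percentages are not strictly like-for-like: the U. gibba figure covers genes plus regulatory regions, while the human figure covers protein-coding genes only.

```python
# Genome sizes in base pairs, as quoted in the article.
P_JAPONICA_BP = 150_000_000_000  # Paris japonica
U_GIBBA_BP = 82_000_000          # Utricularia gibba

# Assumption: human genome size implied by "50 times the size of the human genome".
human_bp = P_JAPONICA_BP // 50

# U. gibba: 97 percent genes and regulatory regions, 3 percent with no known function.
u_gibba_functional_bp = U_GIBBA_BP * 97 // 100
u_gibba_other_bp = U_GIBBA_BP - u_gibba_functional_bp

# Human: 2 percent protein-coding genes, per the article's figure.
human_coding_bp = human_bp * 2 // 100

# How many U. gibba genomes would fit inside one P. japonica genome.
size_ratio = P_JAPONICA_BP / U_GIBBA_BP

print(f"Implied human genome:            {human_bp:,} bp")
print(f"U. gibba genes + regulatory DNA: {u_gibba_functional_bp:,} bp")
print(f"U. gibba DNA of unknown function: {u_gibba_other_bp:,} bp")
print(f"Human protein-coding DNA:        {human_coding_bp:,} bp")
print(f"P. japonica / U. gibba ratio:    {size_ratio:,.0f}x")
```

On these figures, the amount of gene-related DNA in the tiny bladderwort genome and in the far larger human genome comes out in the same range of tens of millions of base pairs, which is the point of Albert's remark.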

Albert’s findings challenge recent papers from the ENCODE project published in September 2012, which concluded that 80 percent of the human genome contains “functional elements.” Even non-protein-coding sequences should be considered “functional” if they are transcribed into RNA, the ENCODE researchers argued. But Albert and other comparative genomicists don’t buy it. “Biologically active is not the same thing as functional,” says Albert. Just because DNA is transcribed doesn’t mean it is being used.

But much research on noncoding DNA has focused on mammalian genomes, which are all similar in size, around 3 to 4 billion base pairs, says T. Ryan Gregory, who studies animal genomes at the University of Guelph in Ontario, Canada. “Throw a salamander in there, or a pufferfish, and now you’ve got 200- to 300-fold variation in genome size. Now explain [noncoding DNA],” he challenges.

Carbon copies

From the vast range of genome sizes within plants, researchers are getting a better grasp not only of noncoding DNA, but of how and why a genome grows or shrinks in the first place.

An increase in genome size is typically a consequence of one of two mechanisms: the duplication of the entire genome, or the multiplication of transposable elements (TEs) within a genome. The former is common in plants and results in polyploidy, a state in which an organism harbors multiple copies of a genome. But the latter is a far more common cause of size increase in animal genomes. The human genome, for example, is swollen with more than 1 million copies of a single, typically nonfunctional TE called Alu.

But how and why TEs, which are often compared to parasites, multiply in genomes had remained mysterious, despite decades of study in animals. Then Sue Wessler decided to study rice.

To study how TEs might be influencing genome evolution, Wessler, a molecular biologist now at the University of California, Riverside, sought out organisms harboring TEs that were still moving around and increasing their copy number in the genome. They weren’t easy to find. Most plants and animals have elaborate and strict mechanisms for keeping TEs quiet. If they didn’t, these upstart elements could pop themselves into important promoters and genes all over the place, throwing cellular processes awry. (See illustration.)


Then, in the early 1990s, Wessler discovered a new type of TE. These small elements, called MITEs (for miniature inverted repeat transposable element), were peppered throughout noncoding regions of plant genomes, including the DNA of rice (Oryza sativa). And in 2003, her team found that one particular MITE, called mPing, was increasing its copy number by 25 to 40 new insertions per rice plant every generation.3 Wessler immediately wondered how this could happen without disrupting the physiology of the plant harboring the rapidly enlarging genome.

She found her answer in 2009: although mPing tends to insert into or near genes, it avoids exons, meaning it rarely disrupts gene function. Specifically, the element has an insertion preference for AT-rich sequences, and rice exons are GC-rich.4 The same is not true in other plants, however, and when Wayne Parrott’s lab at the University of Georgia inserted mPing into the soybean genome, the TE inserted into gene exons far more frequently. “It points out the intimate association between a successful element and its host,” says Wessler, who collaborated on the study.
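To make the logic of that insertion preference concrete, here is a toy Python sketch. It is not the analysis used in the study and not mPing's actual targeting rule: the two sequences, the six-base window, and the 80 percent threshold are all invented for illustration. It only shows why an element that favors AT-rich stretches would find few landing sites in GC-rich sequence.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in "AT") / len(seq)


def candidate_sites(seq: str, window: int = 6, min_at: float = 0.8) -> list[int]:
    """Start positions of windows whose AT fraction is at least min_at.

    Both window and min_at are hypothetical parameters chosen for this toy model.
    """
    return [
        start
        for start in range(len(seq) - window + 1)
        if at_fraction(seq[start:start + window]) >= min_at
    ]


# Hypothetical sequences, made up for this example (not real rice DNA).
gc_rich_exon = "GCCGCGGCGCCGGCGGCC"
at_rich_flank = "TTAATATTTAAATATTAA"

print("GC-rich exon sites: ", len(candidate_sites(gc_rich_exon)))
print("AT-rich flank sites:", len(candidate_sites(at_rich_flank)))
```

In this toy model the GC-rich sequence offers no candidate windows at all, while every window in the AT-rich sequence qualifies. A host whose exons are less GC-rich would offer more targets inside its genes, which is consistent with the soybean result described above.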

But are TEs just miniparasites, or could they serve a biological purpose in the genome? Some have suggested TEs work as built-in diversity-generating factors in stressed populations. Wessler’s laboratory, for example, has found that the insertion of mPing in the rice genome has, in several instances, made the transcription of an adjacent gene stress-inducible. By inserting themselves into promoter regions and even genes themselves, TEs can create new alleles in populations, Wessler adds. And this phenomenon is not just occurring in plants. There are also clear examples of TEs generating diversity within animal genomes. The human adaptive immune system, for example, only has enough genes in a given cell to produce about 20 different antibodies. Thanks to TEs that reshuffle our small deck of antibody genes, however, human immune cells can produce up to 2 million different antibodies to fight foreign invaders. Animal TEs are also particularly active during early brain development, which may help create genetic diversity within a population, despite every member of that population using essentially the same DNA as raw material.

Double the fun

GENOMES BIG AND SMALL: Utricularia gibba has one of the smallest known plant genomes, with just 82 million base pairs. It still carries as many genes as other plants, however, having streamlined its genome by cutting out the vast majority of so-called “junk” DNA. (Image: ALEX POPOVKIN/WIKIPEDIA)

Though TEs can contribute to genome size, the main cause of behemoth plant genomes is polyploidy, in which an organism contains more than two sets of paired chromosomes. In most cases, polyploidy occurs due to an error in cell division, resulting in a whole genome, rather than half, being retained in a gamete. When two such diploid gametes join to form a zygote, it yields tetraploid offspring. (See illustration.) For reasons that are unclear, this appears particularly common among plant species. All flowering plants, in fact, have had a genome doubling sometime in their history, and most have had more than one. The Japanese canopy plant, for example, is an octaploid, says Pellicer, with four genome duplications in its history. “There are lots of [plant] groups where you find polyploidy,” says Jonathan Wendel, who studies cotton’s tetraploid genome at Iowa State University. “It’s clearly important, but finding the smoking gun of how it’s important is really hard.”

In some cases, genome doubling is a misnomer—it’s not a doubling of a single genome, but a genomic union of two different species in a single organism. When a diploid gamete from one species combines with a diploid gamete from another species and the resulting organism survives, a new species is born—one with two full genomes. (See illustration.) This is speciation at its most dramatic, and it is far more common than one might think.
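The chromosome bookkeeping behind both routes to polyploidy is simple enough to sketch in a few lines of Python. This is an illustration only: the function names are invented here, and the count of 12 chromosomes per parent is a hypothetical figure, not a value reported in this article.

```python
def gamete_chromosomes(somatic_count: int, reduced: bool = True) -> int:
    """Chromosomes in a gamete from a parent with somatic_count chromosomes.

    A normal (reduced) gamete carries half the parent's chromosomes;
    an unreduced gamete, the product of a cell-division error, carries all of them.
    """
    if somatic_count % 2:
        raise ValueError("expected an even somatic chromosome count")
    return somatic_count // 2 if reduced else somatic_count


def zygote_chromosomes(gamete_a: int, gamete_b: int) -> int:
    """A zygote carries the chromosomes of both gametes."""
    return gamete_a + gamete_b


PARENT = 12  # hypothetical diploid chromosome count, shared by both parents

# Ordinary cross: two reduced gametes restore the parental count.
normal = zygote_chromosomes(gamete_chromosomes(PARENT), gamete_chromosomes(PARENT))

# Two unreduced (diploid) gametes, whether from one species or from two
# different species, give offspring with double the parental count.
doubled = zygote_chromosomes(
    gamete_chromosomes(PARENT, reduced=False),
    gamete_chromosomes(PARENT, reduced=False),
)

print("Ordinary offspring:  ", normal)
print("Polyploid offspring: ", doubled)
```

The arithmetic is the same whether the two unreduced gametes come from one species or two. What differs in the cross-species case is that the doubled set contains two distinct genomes rather than two copies of one.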

Cotton, the soft, malleable fiber you may be wearing as you read this, is the result of one such “illicit affair,” as Wendel likes to call it. At some point in history, the genome from an old-world cotton species and the genome from a new-world cotton species joined together, resulting in cotton as we know it today, with double the number of chromosomes of either parent species. Although such cross-species hybridization is much rarer in animal species, says Wendel, it happens there too. For example, the Lonicera fly (genus Rhagoletis), discovered in the U.S. in 2005, is a hybrid of two existing insects, the blueberry maggot and the snowberry maggot.

The effect of such genomic combining on the fitness and phenotype of an organism, however, had been unclear. “It’s really difficult to catch this ecological angle in action, to show differential fitness between a polyploid and a diploid in the same environment under the same conditions,” says Wendel. But just in the last few years, this has been accomplished.

In 1949, while walking around the campus of Washington State University (WSU) in Pullman, botanist Marion Ownbey noticed an oddly colored species of Tragopogon, a weedy flower also known as salsify or goatsbeard. Upon further investigation, he discovered that the species, which he named T. mirus, was a hybrid of two Tragopogon species, and that it had double the chromosomes of either parent. And since the parent species had not arrived in the Pacific Northwest from Europe until the 1920s, Ownbey realized that T. mirus was a brand-new polyploid species.

In the mid-1990s, while working at WSU, husband-and-wife team Douglas and Pamela Soltis began to analyze the new Tragopogon species, realizing they had a chance to watch the combination of two genomes in action. When they moved across the country in 2000 to take jobs at the University of Florida, they brought Tragopogon with them, and remade the polyploid in the lab.5 Within a few generations, the Soltises observed a genomic shakedown in the nucleus of T. mirus, including changes in gene expression, rapid reshuffling of chromosomes, translocations of genes, and changes in methylation to activate or deactivate genes.

In addition to observing molecular changes, the Soltises observed some of the only direct evidence of polyploidy being favored by natural selection. Although T. mirus individuals have been created from its two parent species multiple times in the western U.S., and in numerous locations where both parents once thrived, now only the polyploid lives: it outcompeted its parents. “It’s like the North American success story: the parents get here, eke out an existence, give rise to children of North American origin, and those children are highly successful,” says Douglas Soltis.

This and other evidence suggest that a large genome can be beneficial. In many cases, polyploid plants, especially cultivated crops, appear to be hardier than their diploid relatives. This may be because if a gene is accidentally deleted or mutated in an individual, a spare copy can take over.

Evolution in the fast lane

Imagine borrowing a few genes from a lion to improve your night vision, sneaking a couple from a salmon to breathe underwater, and swiping one or two more from a salamander in case you need to grow back a finger.

Yes, it sounds crazy, because animals don’t normally swap genes. But plants do, even between species as different as humans and salamanders. Plants that intermingle physically can trade DNA, typically but not always mitochondrial DNA. This gene swapping can happen when a parasitic plant latches onto a host, like a vine wrapping around the trunk of an oak tree, or when two plants grow close together and graft onto each other, says Indiana University’s Jeffrey Palmer, who studies horizontal gene transfer in plants.

Recently, Palmer demonstrated that land plants take up foreign mitochondrial DNA from other land plants and green algae, but not from animals or fungi. He hypothesizes that this is because plants and green algae share a mechanism to fuse mitochondria together, while animals and fungi have a completely different, noncomplementary fusing mechanism. “Species barriers go deep between plants and animals,” says Palmer.

The king of gene stealing is Amborella, a nondescript flowering plant that grows only in New Caledonia, an island off the east coast of Australia. Amborella can often be found draped with mosses, lichens, and other organisms, and apparently the crafty plant extracts a genetic tax from each one. Sequencing the Amborella mitochondrial genome “just knocked our socks off,” says Palmer. “For every native gene, it contains six foreign copies of that gene, acquired from a range of land plants and green algae.” (D.W. Rice et al., Science, in press)

In most cases, horizontal gene transfer in plants appears to be neutral—it doesn’t affect an organism’s phenotype. But in a few cases, foreign genes supplant or replace native genes and assume an active role in the genome. In one transfer of a key mitochondrial gene for respiration from the blueberry family to Ternstroemia, a genus of flowering plants from the tropics, the native and foreign copies of the gene began to recombine, leading to the diversification of mosaic genes across a group of Ternstroemia species. And in 2012, researchers found that some species of grasses in the genus Alloteropsis had acquired nuclear genes from other grasses that are essential for C4 photosynthesis, which employs a more efficient form of carbon fixation. Only those Alloteropsis species that possessed these genes were able to perform C4 photosynthesis.6

Plants don’t rely only on horizontal gene transfer for new alleles; they also gain new genes through the traditional route of mutation. But here, once again, plants go to the extreme, boasting some of the fastest and slowest mutation rates on Earth.

Several years ago, while studying the organization of chloroplast genomes in geraniums, Robert Jansen of the University of Texas at Austin noticed that mutation rates appeared to be exceptionally high compared to other plant species.7 He shared his findings with Indiana University’s Palmer and Jeff Mower at the University of Nebraska, both of whom studied the mitochondrial genomes of the same family of plants, Geraniaceae. Lo and behold, the two also identified highly accelerated genetic mutation rates in the mitochondria. The three researchers decided to work together to find out if the geranium family’s three genomes (nuclear, mitochondrial, and chloroplast) were mutating in coordination.

Geraniaceae is “a natural system for looking at mutation, because there’s so much variation,” says Jansen. The plant family could offer insight as to how multiple genomes in one organism coevolve, including human nuclear and mitochondrial genomes. With a grant from the National Science Foundation, the team has so far generated sequence data for the mitochondrial and chloroplast genomes (which are shorter and easier to sequence than nuclear genomes) of more than 100 Geraniaceae species, and will be sequencing the transcriptomes of 30 species from the group. They’ve just begun to analyze the data, but they hypothesize that the geranium’s DNA repair system, which fixes breaks in all three genomes, experienced some type of mutation or alteration, speeding up the mutation rate across the organism’s three genomes.

One way to tell if the geranium DNA repair system is faulty would be to compare it to an especially slow mutator, such as the tulip tree. The tulip tree’s mitochondrial genome—which Palmer sequenced because “it happens to be a tree that I love”—turns out to have one of the slowest mutation rates of any known mitochondrial genome.8 It’s essentially a living fossil, says Palmer, who believes that the tree may have an especially good system for repairing DNA damage, and that studying it could help us learn how to prevent deleterious mutations in our own DNA.

“Plants are a model system for comparative genomics and other processes,” says Palmer. In polyploidy, transposable elements, and rates of mutation, plants lead the way. And there’s plenty more exciting work coming down the pipeline from plants, he adds, but you have to keep your eye out for it. “Furry animals get on the covers of Science and Nature a lot more than plants do,” he says with a laugh. 


  1. J. Pellicer et al., “The largest eukaryotic genome of them all?” Bot J Linn Soc, 164:10-15, 2010.
  2. E. Ibarra-Laclette et al., “Architecture and evolution of a minute plant genome,” Nature, 498:94-98, 2013.
  3. N. Jiang et al., “An active DNA transposon family in rice,” Nature, 421:163-67, 2003.
  4. K. Naito et al., “Unexpected consequences of a sudden and massive transposon amplification on rice gene expression,” Nature, 461:1130-34, 2009.
  5. J.A. Tate et al., “Synthetic polyploids of Tragopogon miscellus and T. mirus (Asteraceae): 60 Years after Ownbey’s discovery,” Am J Bot, 96:979-88, 2009.
  6. P.A. Christin et al., “Adaptive evolution of C(4) photosynthesis through recurrent lateral gene transfer,” Curr Biol, 22:445-49, 2012.
  7. M.M. Guisinger et al., “Genome-wide analyses of Geraniaceae plastid DNA reveal unprecedented patterns of increased nucleotide substitutions,” PNAS, 105:18424-29, 2008.
  8. A.O. Richardson et al., “The ‘fossilized’ mitochondrial genome of Liriodendron tulipifera: ancestral gene content and order, ancestral editing sites, and extraordinarily low mutation rate,” BMC Biol, 11:29, 2013.

Correction (July 4, 2014): This story has been updated from its original version to correctly reflect the size of mammalian genomes as 3 to 4 billion base pairs. The Scientist regrets the error.
