Genes – the bits of DNA that code for proteins – make up about 2 percent of the human genome. The rest consists of a genetic material known as noncoding DNA, and scientists have spent years puzzling over why this material exists in such voluminous quantities.
Now, a new study offers an unexpected insight: The large majority of noncoding DNA, which is abundant in many living things, may not actually be needed for complex life, according to an advance online publication in Nature.
The clues lie in the genome of the carnivorous bladderwort plant, Utricularia gibba.
The U. gibba genome is the smallest ever to be sequenced from a complex, multicellular plant. The researchers who deciphered the DNA say that 97 percent of the genome consists of genes and small pieces of DNA that control those genes.
It appears that the plant has been busy deleting noncoding DNA, sometimes also called "junk" DNA, from its genetic material over many generations, the scientists say. This may explain the difference between bladderworts and species with large amounts of noncoding DNA, like corn and tobacco – and humans.
The study was directed by Luis Herrera-Estrella, who leads the Laboratorio Nacional de Genómica para la Biodiversidad, or LANGEBIO, in Mexico, and Victor Albert of the University at Buffalo, with contributions from scientists in the United States, Mexico, China, Singapore, Spain and Germany.
"The big story is that only 3 percent of the bladderwort's genetic material is so-called 'junk' DNA," Albert said. "Somehow, this plant has purged most of what makes up plant genomes. What that says is that you can have a perfectly good multicellular plant with lots of different cells, organs, tissue types and flowers, and you can do it without the junk. Junk is not needed."
Noncoding DNA is DNA that doesn't code for any proteins. This includes mobile elements called jumping genes that have the ability to copy (or cut) and paste themselves into new locations of the genome, and thus increase its size.
Scientists have spent countless hours puzzling over why noncoding DNA exists – and in such copious amounts. A recent series of papers from ENCODE, a highly publicized international research project, began to offer an explanation, saying that the majority of noncoding DNA (about 80 percent) appeared to play a role in biochemical functions such as regulating and promoting the conversion of DNA into its relative, RNA, which, for genes, feeds into the machinery that makes proteins.
The new U. gibba genome suggests that having a bunch of noncoding DNA is not crucial for complex life. The bladderwort is an eccentric and complicated plant. It lives in aquatic habitats like freshwater wetlands, and it has developed corresponding, highly specialized hunting methods. To capture prey, the plant pumps water from tiny chambers called bladders, turning each into a vacuum that can suck in and trap unsuspecting critters.
The U. gibba genome has about 80 million DNA base pairs – a minuscule number compared to other complex plants – and the deletion of noncoding DNA appears to account for most of that size discrepancy, the researchers say. U. gibba has about 28,500 genes, comparable to relatives like grape and tomato, which have much larger genomes of about 490 and 780 million base pairs, respectively.
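To put those figures side by side, here is a minimal Python sketch that uses only the approximate numbers quoted above. The function names are illustrative and do not come from the study. Gene counts for grape and tomato are not given here, so gene density is computed for U. gibba alone.

```python
# Approximate genome sizes quoted in the article, in millions of base pairs.
GENOME_SIZE_MBP = {
    "U. gibba": 80,
    "grape": 490,
    "tomato": 780,
}

# Approximate U. gibba gene count quoted in the article.
U_GIBBA_GENES = 28_500


def size_ratio(species: str, reference: str = "U. gibba") -> float:
    """Return how many times larger a genome is than the reference genome."""
    return GENOME_SIZE_MBP[species] / GENOME_SIZE_MBP[reference]


def genes_per_mbp(gene_count: int, genome_size_mbp: float) -> float:
    """Return the average number of genes per million base pairs."""
    return gene_count / genome_size_mbp


if __name__ == "__main__":
    for species in ("grape", "tomato"):
        ratio = size_ratio(species)
        print(f"{species}: about {ratio:.1f} times the size of the U. gibba genome")

    density = genes_per_mbp(U_GIBBA_GENES, GENOME_SIZE_MBP["U. gibba"])
    print(f"U. gibba: about {density:.0f} genes per million base pairs")
```

By these numbers, the grape genome is roughly six times the size of the bladderwort's and the tomato genome nearly ten times, even though the article describes the gene counts as comparable.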
In addition to being unusually lean, the plant's genome had another surprise in store for the researchers once they asked UA plant genomics expert Lyons to take a closer look.
"I thought, 'this should be easy given how small this genome is,'" said Lyons. "But as I looked more closely, I couldn't make heads or tails of what had happened inside this genome."
Over the course of its evolutionary history, Lyons soon discovered, the plant had undergone three rounds of duplications of its entire genome. That is, at three distinct times in the course of its evolution, the bladderwort's genome doubled in size, with offspring receiving two full copies of the species' entire genome.
Unlike in animals, where duplication of genetic material is usually detrimental – Down syndrome, for example, is caused by an extra copy of only one chromosome – the process is very common in plants.
The combination of the plant's unusually small genome and its history of genome duplications made for a challenging puzzle.
"What made this so difficult is not just the fact that it duplicated, but it duplicated and then random pieces were removed," Lyons explained. "Most of the genes that were duplicated are lost over evolutionary time. This process happened repeatedly. It duplicates, and then three quarters of the genes are lost, it duplicates again, and then three quarters of that are lost, and so on."
"It turned out to be this phenomenal puzzle, and I would spend hours in front of a computer grabbing pieces of various genomes trying to find which pieces match," he said. "Deciphering a genome is a matter of taking all these puzzle pieces and getting them to line up so we can see that there is a particular pattern of duplication followed by gene loss."
Lyons' research team develops specialized software that allows researchers all over the world to study genomes and manipulate biological data with the same ease as handling organisms.
The computational power for running these analyses is provided by iPlant's cyberinfrastructure, a $100 million project funded by the National Science Foundation and based in the UA's BIO5 Institute.
"If I'm an entomologist, I want to go out to the field, find my favorite insect, and be able to pick them up, look at them up close, and identify things that are different," Lyons said. "When I recognize something interesting, say, a spur on the leg of a beetle, I want to be able to quickly go to all my other beetle specimens and see if they have this spur. We need to be able to do the same thing with genomic data. That is what my group has focused on – bringing these kinds of data to life."
Lyons explained that with ever-faster advances in technology, sequencing a genome has become the easy part.
"The bottleneck is making sense of those data and transforming them from information to knowledge."
Lyons likened the task of making sense of the bladderwort genome to cutting up pages of text into small squares and piecing them back together.
"There is a lot of white space on those pages, and like with a puzzle, you have no way of knowing which square goes where if there are many identical squares," he explained. "That is exactly the problem we're facing when we assemble genomes."
The authors of this study argue that organisms may not accumulate genetic junk because it benefits them.
Instead, they say, some species – such as the bladderwort studied here – may simply have an inherent, mechanistic bias toward deleting a great deal of noncoding DNA, while others have a built-in bias in the opposite direction – toward DNA insertion and duplication.
These biases exist not because one way of behaving is more helpful than the other, but because there are two innate ways to behave and all organisms adhere to them to one degree or another. The place an organism occupies on this sliding scale depends in part on the extent to which Darwinian natural selection is able to counter or enhance these intrinsic biases.
"There is this idea that a genome exists at a balance between how fast it is being chopped up and how fast it is growing," Lyons explained. "In the case of U. gibba, we think what might have happened is there was this mechanism in place that kept chewing away at the DNA, removing through selection almost all active jumping genes in the process, thus taking away the genome's ability to grow. The only way it can escape from shrinking out of existence is to double the entire genome every now and then."
"Why? We don't know," Lyons said. "It might simply be an evolutionary just-so story. Those plants that double their genomes live to tell the tale and those that don't vanish."