Pines, firs, junipers, cedars, redwoods, yews, spruces: These are but a few trees belonging to an enormous and morphologically diverse group of plants known as conifers.
In turn, conifers are the largest group of gymnosperms — plants known for their exposed seeds, which are unprotected by fruit.
For years, people have been interested in sequencing gymnosperm genomes because of their economic importance. Conifers are the world’s primary source of lumber. From an evolutionary perspective, scientists also wanted to understand what gymnosperm genomes look like compared with those of flowering plants, or angiosperms, their sister lineage from which they diverged between 350 million and 380 million years ago.
Two years ago, a group of scientists succeeded in sequencing the Norway spruce genome, a significant feat because the spruce’s genome is seven times the size of the human genome.
Plant genomes can become large through a couple of different mechanisms, Barker says.
For one, plants often speciate through polyploidy, meaning they have more than two complete sets of chromosomes and can pass on multiple complete sets of genetic information to their offspring. "Polyploid speciation, or whole genome duplication, doubles the genome in one instant," Barker says.
The other key way that genome size evolves in plants is through stretches of repetitive DNA, or transposable elements, which copy themselves or take advantage of the cell’s own replication machinery to multiply within the genome. "There’s a whole ecosystem of these in genomes, and their populations can expand within genomes," Barker says.
But what caught Barker’s attention about the Norway spruce, he says, was a puzzle: the genome is massive, yet previous genomic research had shown no polyploidy in the ancestry of contemporary gymnosperms. So he and his colleagues developed an algorithm called the multi-taxon paleopolyploidy search tool, or MAPS, to look for ancient polyploidy events in sequenced genomes.
"MAPS is a new way of inferring these ancient polyploidy events," Barker says. "A polyploidy doubles everything at one time, so you look for this big burst of gene duplication in the history. The bursts show up as peaks (in a graph) when you look at the age distribution of genes. They are sort of like a genetic 'baby boom' that leaves a significant signature of gene birth across millions of years."
"MAPS leverages these data," Barker says. "Instead of seeing polyploidy events in only one species at a time, we can see them in multiple species in a shared framework. It allows us to simultaneously look at the history of shared gene duplications in all their descendant lineages."
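The "baby boom" signature Barker describes can be illustrated with a toy example. The sketch below is not MAPS itself, which compares duplications across multiple species in a shared framework; it only shows the single-species idea of binning gene-duplicate ages and flagging bins that rise well above the background. The function names, the bin width, and the threshold ratio are all hypothetical choices made for illustration.

```python
from collections import Counter


def age_histogram(ages, bin_width):
    """Count gene duplicates per age bin (bin 0 is the youngest)."""
    if not ages:
        return []
    counts = Counter(int(age // bin_width) for age in ages)
    n_bins = max(counts) + 1
    return [counts.get(i, 0) for i in range(n_bins)]


def find_bursts(counts, min_ratio=2.0):
    """Return indices of interior bins that look like duplication bursts.

    A bin is flagged when it is a local maximum and holds at least
    min_ratio times the median bin count. Both criteria are illustrative
    assumptions, not the statistical test used by MAPS.
    """
    if len(counts) < 3:
        return []
    median = sorted(counts)[len(counts) // 2]
    baseline = max(median, 1)  # avoid a zero threshold on sparse data
    bursts = []
    # Skip the first and last bins: a pile of very recent duplicates in
    # the youngest bin is expected background, not a genome duplication.
    for i in range(1, len(counts) - 1):
        is_peak = counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
        if is_peak and counts[i] >= min_ratio * baseline:
            bursts.append(i)
    return bursts


# Made-up counts of duplicates per age bin, youngest first.
example_counts = [5, 4, 3, 9, 3, 2, 2, 8, 1]
example_bursts = find_bursts(example_counts)  # bins 3 and 7 stand out
```

In this made-up distribution, two bins stand out against the background, which is the kind of pattern that would prompt a closer look for an ancient whole-genome duplication.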
By looking at the history of those shared gene duplications, Barker found that within the conifers there are two whole-genome duplications that no one expected to find, because polyploid speciation is so rare among contemporary conifers.
"Polyploid speciation may have been more common among ancient seed plants and conifers hundreds of millions of years ago, as we observed two rounds of polyploid speciation in their ancestry," Barker says. "Although there are some conifers that are recent polyploids, such as the redwoods, the last time most conifer genomes duplicated was around the same time the dinosaurs appeared. It is not clear why there has been so little successful polyploid speciation since these ancient genome duplications."
Now Barker and his colleagues are further exploring the legacy of whole-genome duplications. In other lineages, they have found that some types of genes are more likely to be retained following polyploidy than following other types of duplications, but they’re not sure why. And they’ll be using MAPS to explore paleopolyploidy across the tree of life.
"We hope to gain a better understanding of how polyploidy and these genetic baby booms have contributed to the diversity of life," Barker says.