Sequencing the banana genome has revealed the secrets of its 520 million base pairs, the “letters” of the genetic code. This work is a big step toward understanding banana genetics and improving banana varieties, and was done within the framework of the Global Musa Genomics Consortium.
The results are published in the July 12 issue of the journal Nature. The banana genome was found to contain more than 36,000 genes, slightly more than the human genome.
Eric Lyons, an assistant professor in the University of Arizona School of Plant Sciences and a member of the iPlant Collaborative, which is based at the UA’s BIO5 Institute, contributed to the project by developing a key part of the cyberinfrastructure needed to handle and analyze the huge amounts of data generated by deciphering the sequence. The tool helps in figuring out the meaning behind the genetic alphabet of the banana by comparing it to other plant genomes.
“We are dealing with huge amounts of information,” Lyons said. “Plant genomes are incredibly dynamic, which makes them some of the most fascinating and at the same time most difficult organisms to study.”
Funded through a $50 million grant from the National Science Foundation in 2008, the iPlant Collaborative has since brought together researchers from all biological fields and biomedical sciences from across the nation as well as overseas. Collaborating with high-power computing experts, they use iPlant as a new platform to gather, store and interpret the immense amounts of data generated by projects such as comparisons among entire genomes.
Lyons has developed a system that does just that: CoGe, which is short for Comparative Genomics. He said CoGe provides the tools allowing any scientist in the world to compare and analyze any genome side by side. Originally developed for plant genomes, the software is designed to accommodate any set of genomes from all domains of life.
CoGe currently contains almost 20,000 genomes from 15,000 organisms, including viruses, bacteria, plants, insects, amphibians and mammals – and, as of now, the banana.
“The number of genomes has exploded,” Lyons said. “The whole reason I designed the system was that we needed ways to compare genomes quickly. However, we also needed to easily manage those data, because no matter where we are today, tomorrow we'll have a new version of our favorite genome and 10 more to which to compare it.”
Of the many varieties of banana, whose scientific name is Musa acuminata, one called DH-Pahang is known for its susceptibility to disease, making it a poor crop choice. Shunned by the agriculture industry, DH-Pahang rose to stardom when the sequencing team, led by two French research organizations, CIRAD and CEA-Genoscope, chose the variety for its project.
The DH-Pahang banana differs from its relatives in that it has what geneticists call a homozygous genome.
“It means both copies of each chromosome are identical,” Lyons explained. “Working with a homozygous genome makes it easier to solve the jigsaw puzzle of the genome and correctly assemble all the pieces. You don't get confused by having slightly different puzzle pieces, or sequences, for gene alleles across a genome.”
According to the sequencing consortium, bananas are vitally important for the food and economic security of more than 400 million people in southern countries, but they are under constant pressure from a range of parasites. That pressure is particularly high in plantations producing the “export” bananas we find in our supermarkets. This makes it crucial to develop new, more resistant varieties, although this is a complex operation given the very low fertility of cultivated banana varieties.
The newly available genome sequence provides access to each one of the plant’s genes and to their position on its 11 chromosomes. The consortium said in a statement that this knowledge will make it much easier to identify the genes responsible for characters such as disease resistance and fruit quality. Lastly, it will be a vital tool for improving banana varieties using the many genetic resources available worldwide.
The banana is the first non-grassy plant in its botanical class, the monocotyledons, whose entire genome has been sequenced. Monocots include grasses, palms, lilies and other plants of mostly fleshy stature. Dicots, on the other hand, comprise more evolutionarily recent plants including the majority of flowering plants and all true trees.
“The banana is the first monocot genome we have sequenced that’s not from a cereal,” Lyons said. “That gives us a good opportunity to compare this group to its distant relatives and better understand the evolution of the monocot lineage.”
Already, researchers have been able to establish that banana has seen three episodes of complete genome duplication, at least two of which are independent of those seen in grasses. Unlike in the animal kingdom, duplicating an entire genome is nothing unusual in the plant world.
Said Lyons, “We sometimes joke that as soon as you give a plant a funny look, it doubles its entire genome.”
This phenomenon, called polyploidy, is one of the main reasons why plant genomes present a challenge for scientists like Lyons, who are interested in understanding how genomes have changed over evolutionary time and relating that to the function of the organism.
Many of our domestic cultivars are polyploids, including the mustards, many cereals such as wheat, and fruits such as strawberry and watermelon. While most of the duplicate genes resulting from such events are lost over evolutionary time, some persist and lead to the emergence of new biological traits and functions.
“Consider, for example, one of the master genes that regulate the development of a plant,” Lyons said. “A whole genome duplication event creates two copies. Over evolutionary time, these copies may change such that one copy is only active in leaves and the other only in flowers. This provides the plant with opportunities to develop all sorts of interesting developmental architectures.”
Researchers have already identified certain regulatory genes called transcription factors, which are particularly numerous in banana and contribute to important processes such as fruit ripening.
The work was conducted with financial support from the French National Research Agency. The banana genome sequence is publicly available on the CIRAD website.