Rockville, MD -- Ever since the genomics revolution took off, scientists have been busily deciphering vast numbers of genomes. Cataloging. Analyzing. Comparing. Public databases hold 239 complete bacterial genomes alone.
But scientists at The Institute for Genomic Research (TIGR) have come to a startling conclusion. Armed with the powerful tools of comparative genomics and mathematics, TIGR scientists have concluded that researchers might never fully describe some bacteria and viruses--because their genomes are infinite. Sequence one strain of the species, and scientists will find significant new genes. Sequence another strain, and they will find more. And so on, infinitely.
"Many scientists study multiple strains of an organism," says TIGR President Claire Fraser. "But at TIGR, we're now going a step further, to actually quantify how many genes are associated with a given species. How many genomes do you need to fully describe a bacterial species?"
In pursuit of that question, TIGR scientist Hervé Tettelin and colleagues published a study in this week's (September 19-23) early online edition of the Proceedings of the National Academy of Sciences (PNAS). In the study, TIGR scientists, with collaborators at Chiron Corporation, Harvard Medical School and Seattle Children's Hospital, compared the genomic sequence of eight isolates of the same bacterial species: Streptococcus agalactiae, also known as Group B Strep (GBS), which can cause infection in newborns and immunocompromised individuals.
Analyzing the eight GBS genomes, the researchers discovered a surprisingly continual stream of diversity. Each GBS strain contained an average of 1806 genes present in every strain (thus constituting the GBS core genome) plus 439 genes absent in one or more strains. Moreover, mathematical modeling showed that unique genes will continue to emerge, even after thousands of genomes are sequenced. The GBS pan-genome is expected to grow by an average of 33 new genes every time a new strain is sequenced.
"We were surprised to find that we haven't cornered this species yet," says Tettelin, lead author of the PNAS paper. "We still don't know--and apparently, we'll never know--the extent of its diversity."
To interpret this infinite view of microbial genomes, Tettelin and colleagues propose describing a species by its "pan-genome": the sum of a core genome, containing genes present in all strains, and a dispensable genome, with genes absent from one or more strains and genes unique to each strain.
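As a rough illustration of these definitions, the Python sketch below splits a set of strains into core, dispensable, and strain-unique genes, and applies the reported average of 33 new genes per additional strain. It is not the method used in the PNAS study. The function names, the toy strain names, and the gene labels are all hypothetical, and treating the 33-gene figure as a constant rate is a simplifying assumption.

```python
from collections import Counter


def describe_pan_genome(strains):
    """Split a species' genes into pan, core, dispensable, and unique sets.

    `strains` maps a strain name to an iterable of gene identifiers.
    (Hypothetical helper for illustration only.)
    """
    gene_sets = [set(genes) for genes in strains.values()]
    if not gene_sets:
        raise ValueError("at least one strain is required")

    pan = set().union(*gene_sets)        # every gene seen in any strain
    core = set.intersection(*gene_sets)  # genes present in all strains
    dispensable = pan - core             # genes absent from one or more strains

    # Dispensable genes carried by exactly one strain.
    counts = Counter(gene for genes in gene_sets for gene in genes)
    unique = {gene for gene, n in counts.items() if n == 1} & dispensable

    return {"pan": pan, "core": core, "dispensable": dispensable, "unique": unique}


def expected_new_genes(additional_strains, rate=33):
    """Expected number of new genes after sequencing more strains.

    Assumes a constant average rate; 33 is the figure reported for GBS.
    """
    if additional_strains < 0:
        raise ValueError("additional_strains must be non-negative")
    return additional_strains * rate


# Toy data: three made-up strains with made-up gene labels.
toy_strains = {
    "strain_A": ["g1", "g2", "g3", "g4"],
    "strain_B": ["g1", "g2", "g3", "g5"],
    "strain_C": ["g1", "g2", "g4", "g6"],
}
summary = describe_pan_genome(toy_strains)
```

In this toy example, `g1` and `g2` form the core genome, the other four genes are dispensable, and `g5` and `g6` are each unique to one strain. Under the constant-rate assumption, `expected_new_genes(10)` gives 330 genes for ten further strains.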
The pan-genome is more than mere syntax. The concept has real implications for molecular biology. Many important pathogens--including those responsible for influenza, Chlamydia, and gastrointestinal infections, all under study at TIGR--comprise multiple strains, each with its own specific genome. By bringing a pan-genome perspective to the study of these organisms, scientists may better learn how new pathogens emerge and better target therapies to specific conditions. One approach is to spotlight a species' core genome. On the flip side, scientists may set aside the core genome, hunting instead for fringe genes that explain a specific strain's unique activity.
TIGR researchers say the pan-genome concept also underscores the limits of traditional known genomes. Researchers often refer to a "type" genome to describe a given species. That singular, representative genome is often simply the strain easiest to acquire from nature or grow in the lab. Yet scientists worldwide routinely tap these known genomes in public databases to hunt for drug targets, explain ecological niches, and chart evolution. How well do these microbial genomes reflect reality?
As comparative genomics itself evolves, Fraser expects TIGR to increasingly focus on pan-genomes. Many questions remain. Although some microbial species, such as GBS, have infinite pan-genomes, others are more limited. Comparing eight independent isolates of Bacillus anthracis (the bacterium that causes anthrax), for instance, Tettelin and colleagues found that just four genomes were sufficient to characterize its pan-genome. That raises interesting questions about rates of evolution, notes Fraser. "We're intrigued to learn more about the diversity within a given species, and how it happens," she says.
The Institute for Genomic Research (TIGR) is a not-for-profit center dedicated to deciphering and analyzing genomes. Since 1992, TIGR, based in Rockville, Md., has been a genomics leader, conducting research critical to medicine, agriculture, energy, the environment and biodefense.