Software Program Developed At Georgia Tech Adds New Capabilities To Decipher Genomes Of Higher Organisms

Date:
December 2, 1998
Source:
Georgia Institute Of Technology
Summary:
A software program that has been successfully annotating the genes of common bacteria since 1992 is now capable of finding genes in higher organisms. It is particularly useful for finding human genes in anonymous human DNA sequences. Researchers deciphered the complete bacterial genome of Haemophilus influenzae using GeneMark, a software program developed at Georgia Tech.

A software program that has been successfully annotating the genes of common bacteria since 1992 is now capable of finding genes in higher organisms. It is particularly useful for finding human genes in anonymous human DNA sequences.

Understanding the genomes of key microorganisms may increase understanding of human genetics because lower organisms have some genes that correspond to human genes. In addition, scientists can design new drugs based on knowledge of disease-causing bacteria.

The original software program, called GeneMark, uses probabilistic mathematical models to predict the locations of genes on a strand of DNA. GeneMark was developed by Dr. Mark Borodovsky, a professor of biology at the Georgia Institute of Technology. It has become the world's most-used software program for deciphering bacterial DNA and has proven itself 98 percent accurate.

Borodovsky's latest development uses GeneMark.hmm, a refined version of the original program, as its base to make more sophisticated predictions for the genomes of eukaryotic, or higher, organisms.

"Deciphering bacterial DNA is simpler than deciphering human DNA since its genes run continuously, without gaps," Borodovsky explained. "The genes of human DNA may be divided into pieces, called exons, with non-coding genetic material between the exons. These spacers in the genes, called introns, were hard to detect by a computer algorithm. Also, eukaryotic DNA is much longer, with an average gene size of 10,000 nucleotides."

Therefore, the predictions of where eukaryotic genes lie on a strand of DNA must include predictions of the boundaries between the exons, which contain the genetic information, and introns, which are the non-coding regions.

To create a computer program to achieve this, Borodovsky employed a class of probabilistic mathematical models called hidden Markov models, or HMMs. A recent grant from the National Institutes of Health (NIH) is funding the incorporation of HMMs into GeneMark, making the program responsive to the boundaries between exons and introns.
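The idea of labeling each nucleotide as exon or intron with a hidden Markov model can be illustrated with a minimal sketch. This is not GeneMark.hmm itself: the two-state layout, the probabilities in START, TRANS and EMIT, and the function name viterbi are all assumptions invented for illustration, and real gene finders use many more states. The sketch uses the standard Viterbi algorithm to find the most probable sequence of hidden states for a DNA string.

```python
import math

STATES = ("exon", "intron")

# Hypothetical toy parameters, chosen only for illustration:
# "exon" favors G/C, "intron" favors A/T, and states tend to persist.
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return the most probable state label for each base in seq."""
    if not seq:
        return []
    # Log-probability of the best path ending in each state at position 0.
    best = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for base in seq[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + math.log(TRANS[p][s]))
            new_best[s] = (best[prev] + math.log(TRANS[prev][s])
                           + math.log(EMIT[s][base]))
            pointers[s] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi("GCGCGCATATAT"))
```

With these toy parameters, a G/C-rich stretch followed by an A/T-rich stretch is split into an exon segment and an intron segment, and the position where the label changes is the predicted boundary.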

Borodovsky developed GeneMark.hmm with Dr. Alexander Lukashin, a researcher who works in the lab. A test of the program demonstrated its "state-of-the-art accuracy," said Borodovsky, meaning that when tested against current means of finding eukaryotic genes, GeneMark.hmm performed at least as well as the best current methods.

"GeneMark.hmm is more than a static software program or product," Borodovsky noted. "It is rather an approach for DNA sequence analysis that is under continuous development."

It is already being used to annotate parts of the genomes of five eukaryotic organisms, including humans, nematodes, fruit flies and a plant in the mustard family.

Borodovsky will present his latest results at the International Workshop on Genomic Sequence Analysis on Dec. 1-4 at the Isaac Newton Institute for Mathematical Sciences at the University of Cambridge in England.

GeneMark.hmm will fill a need, as evidenced by early demand from scientists, Borodovsky said. Even before information about GeneMark.hmm had been published in a scientific journal, almost 30 researchers expressed interest to one of Borodovsky's graduate students, John Besemer, who gave a poster presentation on GeneMark.hmm at a recent conference on the eukaryotic organism Chlamydomonas reinhardtii.

Meanwhile, Borodovsky has recorded his research in predicting gene coding regions in a chapter of the new book "Organization of the Prokaryotic Genome," soon to be published by the American Society for Microbiology. The chapter is called "Statistical Predictions in Genuine Coding Regions."

Borodovsky, a Russian emigre, conceived the idea for GeneMark while still living in Russia in the 1980s. He envisioned a software program based on Markov models to manage the vast amounts of genetic information scientists were churning out.

The Russian mathematician Andrey Markov introduced his models early in the 20th century. Borodovsky believed Markov models could characterize genes by how often certain combinations of bases occur in known genes, in contrast to non-coding regions. These probabilistic models could therefore be applied to DNA sequences to predict where genes lie.
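A minimal sketch can show how base-combination frequencies separate coding from non-coding DNA. It assumes the simplest possible setup, a first-order Markov chain in which each base depends only on the one before it, and it is not GeneMark's actual model. The training sequences are invented toy data, and the names train_markov and log_odds are hypothetical.

```python
import math

BASES = "ACGT"


def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | prev)."""
    counts = {a: {b: pseudocount for b in BASES} for a in BASES}
    for s in seqs:
        for prev, nxt in zip(s, s[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for a in BASES:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in BASES}
    return probs


def log_odds(seq, coding, noncoding):
    """Sum of log(P_coding / P_noncoding) over adjacent base pairs.

    A positive score means seq looks more like the coding examples.
    """
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        score += math.log(coding[prev][nxt] / noncoding[prev][nxt])
    return score


# Invented toy training data (assumption): G/C-rich "genes" and
# A/T-rich "non-genes". Real training sets are far larger.
coding_model = train_markov(["ATGGCGGCCGCGGGCGCC", "ATGGGCGCGCCGGCGGCG"])
noncoding_model = train_markov(["TATAATTTAAATATTAAT", "AATTTATATAAATTATTA"])

if __name__ == "__main__":
    for fragment in ("GCGGCG", "ATATTA"):
        print(fragment, round(log_odds(fragment, coding_model, noncoding_model), 2))
```

Sliding such a score along a long sequence gives a rough profile of which regions resemble the known genes. The pseudocount keeps base pairs that never occur in the training data from producing a zero probability.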

When scientists sequence DNA, they are left with strings of nucleotides that need to be separated into genes and non-coding regions and then translated into proteins to make sense.
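The translation step can be sketched in a few lines, assuming the standard genetic code and a reading frame that starts at the first base. The function name translate is hypothetical, and the sketch ignores complications such as the reverse strand and alternative genetic codes.

```python
# Build the standard genetic code: codons enumerated in T, C, A, G order,
# paired with one-letter amino acid codes ("*" marks a stop codon).
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in _BASES for b in _BASES for c in _BASES),
    _AMINO,
))


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCGTGA"))  # ATG -> M, GCG -> A, TGA -> stop
```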

Since 1992, researchers from around the world have sent their sequenced DNA fragments via e-mail to Georgia Tech's GeneMark e-mail server, which predicts locations of genes. After mapping gene locations, the computer program compares each newly predicted protein sequence to known ones in a database in order to infer the protein's function. The protein analysis is done in collaboration with the National Center for Biotechnology Information at the NIH.

GeneMark has proven itself a powerful tool for finding bacterial genes, in particular. Researchers at the Institute for Genomic Research have used GeneMark to annotate the complete genomes of numerous common bacteria.

GeneMark Genesis, the refined version of GeneMark, which Borodovsky developed with graduate student William Hayes, was used to find genes in the genomes of the bacteria Methanococcus jannaschii and Helicobacter pylori. There were no experimentally studied segments of M. jannaschii available to train the Markov models upon which gene prediction in GeneMark is based. So the new program "learned Markov models from anonymous sequences based on the grammar of the genetic code," Borodovsky said.

Borodovsky's work is at the forefront of a new interdisciplinary field called bioinformatics, which uses mathematical methods and computers to answer many important biological questions. Bioinformatics can also help discover genes and design new drugs. Borodovsky is spearheading development of Georgia Tech's new master's degree program in bioinformatics, the first such program in the United States.


Story Source:

The above story is based on materials provided by Georgia Institute Of Technology. Note: Materials may be edited for content and length.


Cite This Page:

Georgia Institute Of Technology. "Software Program Developed At Georgia Tech Adds New Capabilities To Decipher Genomes Of Higher Organisms." ScienceDaily. ScienceDaily, 2 December 1998. <www.sciencedaily.com/releases/1998/12/981202075222.htm>.
