New genetic research technologies, such as DNA chips, enable scientists to simultaneously evaluate the expression of thousands of genes in tissue samples from several patients. However, deciphering the vast amount of resulting information, anywhere from 100,000 to 1,000,000 genetic “figures,” requires highly sophisticated data-processing tools.
Addressing this and similar challenges may soon be easier thanks to Prof. Eytan Domany of the Weizmann Institute's Physics of Complex Systems Department and doctoral students Gad Getz and Erel Levine. The team has designed a unique mathematical system for analyzing genetic data based on a computer algorithm that "clusters" information into relevant categories. The algorithm searches simultaneously for clusters of "similar" genes and patients by evaluating the gene expression of tissue samples. (A gene's "expression" refers to the production level of the proteins it encodes.)
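As an illustration of what searching for similar genes and similar patients in the same data can mean, the sketch below treats gene expression measurements as a table with one row per gene and one column per tissue sample. The values, the `pearson` helper and the `similarity_table` helper are all hypothetical, and Pearson correlation is only one plausible similarity measure; the article does not say which measure the team uses.

```python
from math import sqrt

# Hypothetical toy expression table: rows are genes, columns are tissue samples.
expression = [
    [9.0, 8.8, 1.0, 1.2],  # gene_a: high in samples 0-1, low in samples 2-3
    [8.5, 9.1, 1.3, 0.8],  # gene_b: behaves like gene_a
    [1.0, 0.9, 9.2, 8.9],  # gene_c: the opposite pattern
]

def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def similarity_table(vectors):
    """All pairwise correlations among a list of vectors."""
    n = len(vectors)
    return [[pearson(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

# View 1: compare genes (rows) across all samples.
gene_similarity = similarity_table(expression)

# View 2: transpose the table and compare samples (columns) across all genes.
samples = [list(column) for column in zip(*expression)]
sample_similarity = similarity_table(samples)
```

In this toy table, gene_a and gene_b come out strongly correlated and gene_c anti-correlated with both, while samples 0 and 1 resemble each other and differ from samples 2 and 3. Both views are derived from the same measurements.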
The algorithm is reported in the October 17 issue of the Proceedings of the National Academy of Sciences (PNAS). Its most powerful feature is that it mimics unassisted learning. Unlike most automated "sorting" processes, in which a computer must be informed of the relevant categories in advance, the algorithm is analogous to human intuition (such as the ability to intuitively categorize images of animals and cars into proper classes). When given a clustering task, it analyzes the data, computes the degree of similarity among its components, and determines its own clustering criteria.
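The general idea of grouping data without being told the categories in advance can be sketched in a few lines. This is not the team's algorithm: it is a much simpler stand-in that links any two items closer than a hand-picked distance `threshold` and reports the resulting connected groups. The function name `cluster_by_distance`, the toy profiles and the threshold values are assumptions made for illustration, and the algorithm described in the article is said to determine its own criteria rather than rely on a supplied threshold.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+

def cluster_by_distance(profiles, threshold):
    """Link every pair of profiles closer than `threshold`; return the linked groups."""
    parent = list(range(len(profiles)))

    def find(i):
        # Follow parent links to the group's representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(profiles)), 2):
        if dist(profiles[i], profiles[j]) < threshold:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(profiles)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical expression profiles for four tissue samples (four genes each).
profiles = [
    [9.0, 8.5, 1.0, 1.2],
    [8.8, 9.1, 0.9, 1.1],
    [1.0, 1.3, 9.2, 8.7],
    [1.2, 0.8, 8.9, 9.0],
]

clusters = cluster_by_distance(profiles, threshold=1.0)
```

No category labels are supplied: the grouping of samples 0 and 1 apart from samples 2 and 3 emerges from the distances alone. With a very small threshold every sample stays on its own, and with a very large one all samples merge into a single group, so the choice of scale matters.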
The new method makes use of a previous application by Domany and his colleagues based on a well-known physical phenomenon. When a granular magnet such as a magnetic tape is warm, its grains are highly disorganized. But upon cooling down, the magnet’s grains progressively organize themselves into well-ordered clusters. Using the statistical mechanics of granular magnets, Domany created an algorithm that can look for clusters in any data.
When applied in a cancer study using DNA chips, the new algorithm proved highly effective, evaluating roughly 140,000 figures representing the cellular expression of 2,000 genes from 70 subjects. The algorithm categorized tissue samples into separate clusters according to their gene expression profiles. For example, one cluster consisted of cancerous tissues, while another contained samples from healthy subjects. The new method also distinguished among different forms of cancer and revealed treatment effects, picking up differences in gene expression between leukemia patients who had received treatment and those who had not. The ability to monitor cell response to treatment and understand the origin of disease in each patient may improve future treatment protocols, which would be tailored to individual pathologies.
Finally, one of the algorithm’s most promising features is that it enabled researchers to pinpoint a small group of genes from within the 2,000 examined that can be used to accurately distinguish among cellular cancerous processes.
In a sense, however, applying the new algorithm to DNA chips is only a start. The new algorithm's inherent clustering capacity makes it invaluable for use in data-heavy scientific and industrial applications. It may be used to analyze financial information, to analyze MRI data in brain research, or to perform "data mining," the process by which specific details are culled from the world's huge and ever-growing data banks, such as those generated by the international Human Genome Project. The Institute's technology transfer arm, Yeda Research and Development, has filed a patent application for the algorithm.
The Weizmann Institute of Science, in Rehovot, Israel, is one of the world’s foremost centers of scientific research and graduate study. Its 2,500 scientists, students, technicians and engineers pursue basic research in the quest for knowledge and the enhancement of humanity. New ways of fighting disease and hunger, protecting the environment, and harnessing alternative sources of energy are high priorities at Weizmann.
Materials provided by Weizmann Institute. Note: Content may be edited for style and length.