Featured Research

from universities, journals, and other organizations

'Knowledge Discovery' Could Speed Creation Of New Products

Date:
October 20, 2004
Source:
Purdue University
Summary:
A team at Purdue University is developing a "data-rich" environment for scientific discovery that uses high-performance computing and artificial intelligence software to display information and interact with researchers in the language of their specific disciplines.

Purdue University graduate student Leif Delgass reviews chemical structures associated with points in a 3-D "scatter plot." The interactive graph is part of a system being developed at Purdue University that could dramatically speed up scientific discovery by enabling researchers to test hypotheses in real time using high-performance computing and artificial intelligence software. (Purdue News Service photo/David Umberger)

WEST LAFAYETTE, Ind. – In the recent science-fiction thriller "Minority Report," Tom Cruise plays a detective who solves future crimes by being immersed in a "data cave," where he rapidly accesses all the relevant information about the identity, location and associates of the potential victim.

A team at Purdue University is developing a similar "data-rich" environment for scientific discovery that uses high-performance computing and artificial intelligence software to display information and interact with researchers in the language of their specific disciplines.

"If you were a chemist, you could walk right up to this display and move molecules and atoms around to see how the changes would affect a formulation or a material's properties," said James Caruthers, a professor of chemical engineering at Purdue.

The method represents a fundamental shift from more conventional techniques in computer-aided scientific discovery.

"Most current approaches to computer-aided discovery center on mining data in a process that assumes there is a nugget of gold that needs to be found in a sea of irrelevant information," Caruthers said. "This data-mining approach is appropriate for some scientific discovery problems, but scientific understanding often proceeds through a different method, a 'knowledge discovery' approach.

"Instead of mining for a nugget of gold, knowledge discovery is more like sifting through a warehouse filled with small gears, levers, etc., none of which is particularly valuable by itself. After appropriate assembly, however, a Rolex watch emerges from the disparate parts."

A team of researchers at Purdue led by Caruthers is developing a computer environment that allows experts to talk naturally in their specific scientific language. That way, the researchers don't have to deal with computerese and can take full advantage of the most advanced visualization capabilities to become more engaged in the scientific discovery process, Caruthers said.

Such a system could become crucial for enabling scientists to deal with the explosion of data now available to them. The source of this flood of data is "high-throughput" experimentation, in which hundreds or thousands of experiments are conducted simultaneously in tiny vessels, some only as wide as a few human hairs. So much information presents a challenge: researchers struggle to find what they are looking for in this sea of data.

"You run the risk of drowning in data," said W. Nicholas Delgass, a Purdue professor of chemical engineering. "What you really want is knowledge, not data."

Purdue researchers believe they have a solution to the problem. They are developing a method to extract knowledge from data, promising to speed up the process of discovery in many areas of research, including work aimed at creating new drugs, fuel additives, catalysts and rubber compounds.

The method, called "discovery informatics," enables researchers to test new theories on the fly and literally see how well their concepts might work in real time via a three-dimensional display, said Venkat Venkatasubramanian, another professor of chemical engineering working to develop the new system.

The multidisciplinary effort involves researchers from Purdue's College of Engineering, School of Science and School of Technology, as well as Information Technology at Purdue (ITaP) and the e-Enterprise Center in Purdue's Discovery Park, a collection of six centers formed to speed the development of new technologies.

Discovery informatics depends on a two-part repeating cycle, made up of a "forward model" and an "inverse process," and on two types of artificial intelligence software: hybrid neural networks and genetic algorithms.

The forward model combines fundamental knowledge and rules of thumb with neural networks, software loosely modeled on how the brain processes information, to tell researchers how a particular material will perform.
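A minimal sketch of the "hybrid" idea, assuming a toy formulation described by a single number. The rule of thumb, the measurements and every name here are hypothetical, and a straight-line least-squares fit stands in for the neural network the researchers describe:

```python
# Hypothetical hybrid forward model: a rule of thumb plus a data-driven correction.
# A least-squares line stands in for the neural network mentioned in the article.

def rule_of_thumb(x):
    """Assumed prior knowledge: the property rises linearly with additive fraction x."""
    return 10.0 * x

def fit_correction(xs, residuals):
    """Fit residual ~ a*x + b by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, residuals))
    a = sxr / sxx
    b = mean_r - a * mean_x
    return a, b

def build_forward_model(xs, measured):
    """Learn only what the rule of thumb gets wrong, then add the two together."""
    residuals = [m - rule_of_thumb(x) for x, m in zip(xs, measured)]
    a, b = fit_correction(xs, residuals)

    def predict(x):
        return rule_of_thumb(x) + a * x + b

    return predict

# Made-up measurements for three formulations.
xs = [0.0, 0.5, 1.0]
measured = [1.0, 7.0, 13.0]
predict = build_forward_model(xs, measured)
print(predict(0.25))  # predicted property for an untested formulation
```

The design choice illustrated is that the data-driven part only has to model the gap between the rule of thumb and the measurements, not the whole behavior.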

"In the forward model, a researcher postulates a molecular structure or a product's formulation and then wants to predict what properties that structure or formulation will have," Delgass said.

The inverse process is just the opposite: Researchers enter the properties they are looking for, and the system gives them a molecular structure or formulation that will likely have those properties. The inverse process cannot begin until the forward model is completed because the former depends on information in the model.

"What we are talking about is an advanced method for product design," said Venkatasubramanian. "The product design problem is this: I want some material that would have the following mechanical, chemical, electrical properties and so on.

"I know what properties I want in order to get my job done, but I don't know what material, what molecular combinations, will give me that. It is a bit like 'Jeopardy.' You know the answer, but you are looking for the question."

The inverse process may use genetic algorithms, software programs that mimic Darwinian survival of the fittest to find the best candidates. The algorithms keep the best materials and eliminate the poor performers, then generate "mutations" of the best materials to create even better versions over time. The software also determines the chemical structures of those mutations.
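The selection-and-mutation idea can be sketched in a few lines. This is an illustration under stated assumptions, not the Purdue software: the forward model below is a made-up formula over three ingredient fractions, the target value is arbitrary, and crossover, which many genetic algorithms also use, is left out for brevity.

```python
import random

def forward_model(formulation):
    """Hypothetical stand-in: predicts one property from three ingredient fractions."""
    a, b, c = formulation
    return 5.0 * a + 3.0 * b * b - 2.0 * c

def fitness(formulation, target):
    """Higher is better: zero means the predicted property equals the target."""
    return -abs(forward_model(formulation) - target)

def mutate(formulation, rng, scale=0.1):
    """Perturb each fraction with Gaussian noise, clipped to the range [0, 1]."""
    return [min(1.0, max(0.0, g + rng.gauss(0.0, scale))) for g in formulation]

def inverse_search(target, generations=200, pop_size=30, seed=0):
    """Keep the better half each generation and refill with mutated survivors."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda f: fitness(f, target), reverse=True)
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda f: fitness(f, target))

best = inverse_search(target=4.0)
print(best, forward_model(best))
```

Because the survivors are carried over unchanged, the best candidate found so far is never lost from one generation to the next.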

The resulting formulas are tested and used to improve the forward model, and the cycle starts over again, progressively creating better and better solutions.

"Once we have the forward model, we use it to predict which possibilities are going to be good," Delgass said. "Many of them turn out to be bad, but all of the negative information essentially tells me that the model has a flaw because it initially said these were good possibilities, and they weren't.

"Now that I have an opportunity to fix the model, I have a repeating way of making the model better and better."

The cycle might be called a "forward-inverse loop," a method for creating mathematical models that are critical to the discovery process.
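A toy version of such a loop, assuming a forward model with a single adjustable slope and a simulated "lab" whose true behavior the model does not know at the start. All functions and numbers are hypothetical and chosen only to show how a failed prediction is fed back into the model:

```python
def run_experiment(x):
    """Hypothetical stand-in for a lab test; the model does not know this formula."""
    return 2.5 * x

def fit_slope(data):
    """Least-squares slope through the origin for (x, measured) pairs."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

def forward_inverse_loop(target, slope=1.0, tolerance=1e-6, max_rounds=10):
    """Alternate inverse proposals and lab tests, refitting the model after each miss."""
    history = []
    for _ in range(max_rounds):
        candidate = target / slope            # inverse step: solve slope * x = target
        measured = run_experiment(candidate)  # test the proposed candidate
        history.append((candidate, measured))
        if abs(measured - target) <= tolerance:
            break                             # prediction and experiment now agree
        slope = fit_slope(history)            # the miss is used to fix the forward model
    return slope, history

slope, history = forward_inverse_loop(target=10.0)
print(slope, history)
```

In this example the first proposal misses badly (the lab returns 25.0 instead of 10.0), the refit corrects the slope to 2.5, and the second proposal hits the target.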

"Before you can create one of these models, you typically spend years discovering the fundamental scientific principles behind the problem," Caruthers said. "We want to drastically speed up that discovery process, so that it no longer takes years to create models for important industrial products and processes."

Before high-throughput experimentation, researchers were able to keep up with the amount of available data.

"It's a little bit like horse-and-buggy transportation 100 years ago in this country," said Venkatasubramanian. "The horse and buggy did 10 miles per hour, and your country road supported 10 miles per hour, so everyone was happy. But suddenly now you can produce a month's worth of data in a matter of hours via high-throughput experiments. It's like having a Ferrari on a country road. You can do 200 miles per hour, but you are still stuck driving on the country road.

"Now we need an interstate, a modeling superhighway."

Discovery informatics, which has numerous potential applications, is that modeling superhighway.

"The opportunities are enormous for engineers who work in product design, which is now largely done as an art form by formulation chemists," Caruthers said. "We want to retain the creative aspects that can only come from the human mind, while reducing the amount of guesswork now needed to create new catalysts and other materials.

"Researchers generally discover with an Edisonian, guess-and-test approach. Lots of intuition. Lots of experience. Lots of gray hair. And a little bit of luck. But that cycle is too long, too expensive."

With conventional methods, it might take several years and thousands of tests to hit on the right formulation. Discovery informatics speeds up the process by using a computer to sample potential materials, so that only a fraction of the usual number of laboratory experiments is required.

The method will be tested in a new Center for Catalyst Design headed by Delgass and funded with a three-year, $2.4 million grant from the U.S. Department of Energy and $1.7 million from the Indiana 21st Century Research and Technology Fund, established by the state to promote high-tech research and to help commercialize university innovations.

Catalysts in American industry account for billions of dollars in annual business revenues. That means even small improvements in catalyst performance can result in significant increases in profits, Delgass said.

Discovery informatics uses the scientific method to enable researchers to test new theories and hypotheses.

"In the scientific method, you make a hypothesis, you see whether the hypothesis fits the data – it never does the first time," Caruthers said. "You then revise your hypothesis and test it back against the data. It's a little better, but it isn't right. You do it again, and you do it again, and eventually you get to where your data and your hypothesis match, and you say, 'Now I have knowledge.'"

Researchers in Purdue's e-Enterprise Center helped the chemical engineers create software prototypes needed to manage huge amounts of data and simulations, turning the information into interactive images, said Joseph Pekny, director of the e-Enterprise Center and a professor of chemical engineering.

Information technology experts then use supercomputers to run the complex software for applications such as predicting chemical reactions, and to "visualize" the resulting data on a three-dimensional, 12-foot-wide, 7-foot-high display in the Envision Center, said Gary Bertoline, associate vice president for discovery resources at ITaP and a professor of computer graphics technology in Purdue's School of Technology.

"We are helping them look at large amounts of data all at the same time," said Laura Arns, a visualization and computer graphics application engineer at ITaP. "You can display information in stereo, in which the left and the right eye each get their own pictures, and you get a 3-D depth effect. To see the 3-D visualization, you wear special glasses that are like sun glasses."

A large 3-D high-resolution display, known as a tiled wall, allows researchers to look at an entire problem, including chemical and atomic structures, graphs and charts.

"You are no longer limited to the size of a computer screen," Caruthers said. "You now have a huge field of view."

Caruthers likens the display to the concept moviegoers saw in "Minority Report."

"We're almost there," Caruthers said. "We will soon have a sophisticated tool that shows researchers in real time whether a particular idea is on the right track."

The method allows scientists and engineers to take full advantage of human creativity.

"Discovery requires human beings making intuitive leaps," Caruthers said. "You try one thing. It doesn't work, you try something else. Sometimes you go off in an entirely new direction.

"But this process is very inefficient. What we are doing is enhancing the efficiency of this process, assisting the intuitive human mind by providing massive data and computing power."

The three engineers presented a paper about their method in July during an international conference, Foundations of Computer-Aided Process Design, at Princeton University. Purdue held a workshop on Sept. 13 and 14 focusing on methods of visualizing and manipulating data for the design of new catalysts for chemical reactions. The workshop attracted representatives from national laboratories, industry, academia and the U.S. Department of Energy.

Work to develop the method began in 1988 with funding from the National Science Foundation. Further research has been funded by Lubrizol Co., the Indiana 21st Century Research and Technology Fund and Caterpillar Inc.


Story Source:

The above story is based on materials provided by Purdue University. Note: Materials may be edited for content and length.


Cite This Page:

Purdue University. "'Knowledge Discovery' Could Speed Creation Of New Products." ScienceDaily. ScienceDaily, 20 October 2004. <www.sciencedaily.com/releases/2004/10/041020093016.htm>.
Purdue University. (2004, October 20). 'Knowledge Discovery' Could Speed Creation Of New Products. ScienceDaily. Retrieved April 20, 2014 from www.sciencedaily.com/releases/2004/10/041020093016.htm
Purdue University. "'Knowledge Discovery' Could Speed Creation Of New Products." ScienceDaily. www.sciencedaily.com/releases/2004/10/041020093016.htm (accessed April 20, 2014).
