
Image Search To Analyze Images (Not Surrounding Text)

April 2, 2007
University of California - San Diego
Electrical engineers have made progress on a different kind of image search engine -- one that analyzes the images themselves. This approach may be folded into next-generation image search engines for the Internet and, in the shorter term, could be used to annotate and search commercial and private image collections.

Image Segmentation: the system recognized much of the mountain and sky but had problems recognizing the snow on the mountain. For the swimmer photo, the system – for the most part – separated swimmer from water. The swimming cap and parts of the swimmer’s back in shadow, however, proved more difficult.
Credit: Image courtesy of University of California - San Diego

A Google image search for “tiger” yields many tiger photos – but also returns images of a tiger pear cactus stuck in a tire, a racecar, Tiger Woods, the boxer Dick Tiger, Antarctica, and many others. Why? Today’s large Internet search engines look for images using captions or other text linked to images rather than looking at what is actually in the picture.

Electrical engineers from UC San Diego are making progress on a different kind of image search engine – one that analyzes the images themselves. This approach may be folded into next-generation image search engines for the Internet and, in the shorter term, could be used to annotate and search commercial and private image collections.

“You might finally find all those unlabeled pictures of your kids playing soccer that are on your computer somewhere,” said Nuno Vasconcelos, a professor of electrical engineering at the UCSD Jacobs School of Engineering and senior author of a paper in the March 2007 issue of the IEEE journal TPAMI. The paper was coauthored by Gustavo Carneiro, a UCSD postdoctoral researcher now at Siemens Corporate Research; UCSD doctoral candidate Antoni Chan; and Google researcher Pedro Moreno.

At the core of this Supervised Multiclass Labeling (SML) system is a set of simple yet powerful algorithms developed at UCSD. Once you train the system, you can set it loose on a database of unlabeled images. The system calculates the probability that various objects or “classes” it has been trained to recognize are present – and labels the images accordingly. After labeling, images can be retrieved via keyword searches. Accuracy of the UCSD system has outpaced that of other content-based image labeling and retrieval systems in the literature. The SML system also splits up images based on content – the historically difficult task of image segmentation. For example, the system can separate a landscape photo into mountain, sky and lake regions.

“Right now, Internet image search engines don’t use any image content analysis. They are highly scalable in terms of the number of images they can search but very constrained on the kinds of searches they can perform. Our semantic search system is not fully scalable yet, but if we’re clever, we will be able to work around this limitation. The future is bright,” said Vasconcelos.

The UCSD system uses a clever image indexing technique that allows it to cover larger collections of images at a lower computational cost than was previously possible. While the current version would still choke on the Internet’s vast numbers of public images, there is room for improvement and many potential applications beyond the Internet, including the labeling of images in various private and commercial databases.

A New Era for Image Annotation:

Without a caption or any other text label with the word “tiger,” today’s Web-based image search tools would not “see” a tiger in a photo.

The UCSD Supervised Multiclass Labeling system “…outperforms existing approaches by a significant margin, not only in terms of annotation and retrieval accuracy, but also in terms of efficiency,” the authors write in their TPAMI (IEEE Transactions on Pattern Analysis and Machine Intelligence) paper.

What does “Supervised Multiclass Labeling” mean?

“Supervised” refers to the fact that users train the image labeling system to identify classes of objects, such as “tigers,” “mountains” and “blossoms,” by exposing the system to many different pictures of tigers, mountains and blossoms. The supervised approach allows the system to differentiate between similar visual concepts – such as polar bears and grizzly bears. In contrast, “unsupervised” approaches to the same technical challenge do not permit such fine-grained distinctions. “Multiclass” means that the training process can be repeated for many visual concepts. The same system can be trained to identify lions, tigers, trees, cars, rivers, mountains, sky or any concrete object.

This is in contrast to systems that can answer just one question at a time, such as “Is there a horse in this picture?” (Abstract concepts like “happiness” are currently beyond the reach of the new system, however.) “Labeling” refers to the process of linking specific features within images directly to words that describe those features.

Scientists have previously built image labeling and retrieval systems that can figure out the contents of images that do not have captions, but these systems have a variety of drawbacks. Accuracy has been a problem. Also, some older systems need to be shown a picture and then can only find similar photos. Other systems can only determine whether one particular visual concept is present or absent in an image. Still others are unable to search through large collections of images, which is crucial for use in big photo databases and perhaps one day, the Internet. The new system from the Vasconcelos team begins to address these open problems.

To understand SML, you need to start with the training process, which involves showing the system many different pictures of the same visual concept or “class,” such as a mountain. When training the system to recognize mountains, the location of the mountains within the photos does not need to be specified. This makes it relatively easy to collect the training examples. After exposure to enough different pictures that include mountains, the system can identify images in which there is a high probability that mountains are present.

During training, the system splits each image into 8-by-8 pixel squares and extracts some information from them. The information extracted from each of these squares is called a “localized feature.” The localized features for an image are collectively known as a “bag of features.”
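The patch-splitting step described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the paper's actual feature extractor: the real SML system computes richer descriptors from each 8-by-8 square, while here each square is reduced to its mean intensity just to show the mechanics.

```python
def extract_bag_of_features(image, patch=8):
    """Split a 2-D grid of pixel intensities into patch-by-patch squares
    and return one "localized feature" per square (here, mean intensity)."""
    h, w = len(image), len(image[0])
    features = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            block = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            features.append(sum(block) / len(block))  # toy feature: mean
    return features

# A 16x16 toy image yields four 8-by-8 squares, hence a bag of four features.
toy = [[row * 16 + col for col in range(16)] for row in range(16)]
bag = extract_bag_of_features(toy)
```

The collection returned for one image is the “bag of features” the article refers to; the patch size and the summary statistic are the two knobs in this sketch.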

Next, the researchers pool together each “bag of features” for a particular visual concept. This pooled information summarizes – in a computationally efficient way – the important information about each of the individual mountains. Pooling yields a density estimate that retains the critical details of all the different mountains without having to keep track of every 8-by-8 pixel square from each of the mountain training images.
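The pooling step can be sketched as follows. Note the simplification: the actual SML system estimates a Gaussian mixture density per class, whereas this toy version merges all the bags for a class and fits a single Gaussian, which is enough to show how many training images collapse into one compact density.

```python
import math

def pool_class_density(bags):
    """bags: list of feature lists, one per training image of a class.
    Merge them and fit a single Gaussian (mean, variance) as a toy
    stand-in for the class's pooled density estimate."""
    pooled = [f for bag in bags for f in bag]          # merge all bags
    mean = sum(pooled) / len(pooled)
    var = sum((f - mean) ** 2 for f in pooled) / len(pooled)
    return mean, var

def gaussian_pdf(x, mean, var):
    """Evaluate the fitted density at a feature value x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Two hypothetical "mountain" training images, already reduced to bags.
mountain_bags = [[0.8, 0.9, 0.7], [0.85, 0.75, 0.9]]
mu, var = pool_class_density(mountain_bags)
```

After pooling, only `(mu, var)` must be stored per class, no matter how many training patches went in; that is the computational saving the paragraph describes.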

After the system is trained, it is ready to annotate pictures it has never encountered. The visual concepts that are most likely to be in a photo are labeled as such. In the tiger photo, the SML system processed the image and concluded that “cat, tiger, plants, leaf and grass” were the most likely items in the photograph.

The system, of course, can only label images with visual concepts that it has been trained to recognize.

“At annotation time, all the trained classes directly compete for the image. The image is labeled with the classes that are most likely to actually be in the image,” said Vasconcelos.
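The competition Vasconcelos describes can be sketched as scoring an unseen image's features under every trained class density and keeping the top scorers. The class names and the single-Gaussian densities below are made up for illustration; the paper's system uses mixture densities and a probabilistic derivation of these scores.

```python
import math

def gaussian_logpdf(x, mean, var):
    return -((x - mean) ** 2) / (2 * var) - 0.5 * math.log(2 * math.pi * var)

def annotate(features, class_densities, top_k=2):
    """Score every class by the log-likelihood of the image's localized
    features under its density, then return the top_k most likely labels."""
    scores = {
        label: sum(gaussian_logpdf(f, mu, var) for f in features)
        for label, (mu, var) in class_densities.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy class densities (mean, variance) over a one-dimensional feature.
densities = {"tiger": (0.8, 0.02), "water": (0.2, 0.02), "sky": (0.5, 0.05)}
labels = annotate([0.75, 0.85, 0.8], densities)  # bright, tiger-like patches
```

Every trained class is scored on every image, so the labels that survive are exactly the classes "most likely to actually be in the image."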

One way to test the SML system is to ask it to annotate images in a database and then retrieve images based on text queries.

In the TPAMI paper, the researchers illustrate some of their image annotation and retrieval results. Searching for “blooms” in an image database that was annotated using the new SML system yielded four images of flowers and one of a girl with a flower necklace.
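Once images carry labels, text-query retrieval reduces to a keyword lookup. A minimal sketch, using an inverted index from label to image; the filenames and label sets below are invented for illustration:

```python
def build_index(annotations):
    """annotations: {image_id: [labels]} -> inverted index {label: [image_ids]}"""
    index = {}
    for image_id, labels in annotations.items():
        for label in labels:
            index.setdefault(label, []).append(image_id)
    return index

annotations = {
    "img1.jpg": ["blooms", "garden"],
    "img2.jpg": ["mountain", "sky"],
    "img3.jpg": ["blooms", "girl"],
}
index = build_index(annotations)
hits = index.get("blooms", [])  # all images annotated with "blooms"
```

The retrieval side inherits whatever the annotation side produced, which is why a "blooms" query can surface a girl wearing a flower necklace.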

Asking for “mountain” brought up five images of mountain ranges.

In the paper, the authors also display images that were returned for searches on the keywords “pool,” “smoke” and “woman.”

Finding Multiple Features in the Same Picture

The system can also recognize a series of different features within the same image. Vasconcelos and colleagues document clear similarities between SML’s automated image labeling and labeling done by humans looking at the same pictures.

Image Segmentation:

The SML system can also split up a single image into its different regions – a process known as “image segmentation.” When the system annotates an image, it assigns the most likely label to each group of pixels or localized feature, segmenting the image into its most likely parts as a regular part of the annotation process.
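Segmentation as a byproduct of annotation can be sketched by reusing the per-patch scoring idea: instead of summing scores over the whole image, assign each localized feature the single class whose density scores it highest. The toy single-Gaussian densities below are stand-ins for the paper's class models.

```python
import math

def gaussian_logpdf(x, mean, var):
    return -((x - mean) ** 2) / (2 * var) - 0.5 * math.log(2 * math.pi * var)

def segment(patch_features, class_densities):
    """Label every patch with its most likely class, producing a
    label map over the image grid instead of one label per image."""
    def best(f):
        return max(class_densities,
                   key=lambda c: gaussian_logpdf(f, *class_densities[c]))
    return [best(f) for f in patch_features]

densities = {"sky": (0.9, 0.01), "mountain": (0.4, 0.02)}
# Bright patches at the top of a landscape photo, darker ones below.
labels = segment([0.95, 0.88, 0.45, 0.35], densities)
```

Grouping adjacent patches that share a label yields the segmented regions shown in the mountain and swimmer examples.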

“Automated segmentation is one of the really hard problems in computer vision, but we’re starting to get some interesting results,” said Vasconcelos.

Looking at the mountain and swimmer images and their respective segmented representations above, you can see that the system recognized and split up the images into their major features.

The SML project was started in 2004 by Gustavo Carneiro, who was then a postdoctoral researcher in the Vasconcelos lab. Dr. Carneiro currently works at Siemens Corporate Research in Princeton, New Jersey. Doctoral student Antoni Chan, the second author on the paper, spent a summer at Google testing the system on a cluster of 3,000 state-of-the-art Linux machines. Chan worked under the guidance of Dr. Pedro Moreno, a Google researcher and author on the paper. The results from the Google work indicate that the system can be used on large image collections, Chan explained.

“My students go to Google and do experiments at a scale that they can’t do here. The collaboration with Google allows us to use their resources to do things we couldn’t do otherwise,” said Vasconcelos.

Story Source:

The above story is based on materials provided by University of California - San Diego. Note: Materials may be edited for content and length.

Cite This Page:

University of California - San Diego. "Image Search To Analyze Images (Not Surrounding Text)." ScienceDaily. ScienceDaily, 2 April 2007. <www.sciencedaily.com/releases/2007/03/070330092835.htm>.
