New mathematical model of information processing in the brain accurately predicts some of the peculiarities of human vision

Date:
March 8, 2011
Source:
Massachusetts Institute Of Technology

The human retina -- the part of the eye that converts incoming light into electrochemical signals -- has about 100 million light-sensitive cells. So retinal images contain a huge amount of data. High-level visual-processing tasks -- like object recognition, gauging size and distance, or calculating the trajectory of a moving object -- couldn't possibly preserve all that data: The brain just doesn't have enough neurons. So vision scientists have long assumed that the brain must somehow summarize the content of retinal images, reducing their informational load before passing them on to higher-order processes.

At the Society of Photo-Optical Instrumentation Engineers' Human Vision and Electronic Imaging conference on Jan. 27, Ruth Rosenholtz, a principal research scientist in MIT's Department of Brain and Cognitive Sciences, presented a new mathematical model of how the brain does that summarizing. The model accurately predicts the visual system's failure on certain types of image-processing tasks, a good indication that it captures some aspect of human cognition.

Most models of human object recognition assume that the first thing the brain does with a retinal image is identify edges -- boundaries between regions with different light-reflective properties -- and sort them according to alignment: horizontal, vertical and diagonal. Then, the story goes, the brain starts assembling these features into primitive shapes, registering, for instance, that in some part of the visual field, a horizontal feature appears above a vertical feature, or two diagonals cross each other. From these primitive shapes, it builds up more complex shapes -- four L's with different orientations, for instance, would make a square -- and so on, until it's constructed shapes that it can identify as features of known objects.

While this might be a good model of what happens at the center of the visual field, Rosenholtz argues, it's probably less applicable to the periphery, where human object discrimination is notoriously weak. In a series of papers in the last few years, Rosenholtz has proposed that cognitive scientists instead think of the brain as collecting statistics on the features in different patches of the visual field.

Patchy impressions

On Rosenholtz's model, the patches described by the statistics get larger the farther they are from the center. This corresponds with a loss of information, in the same sense that, say, the average income for a city is less informative than the average income for every household in the city. At the center of the visual field, the patches might be so small that the statistics amount to the same thing as descriptions of individual features: A 100-percent concentration of horizontal features could indicate a single horizontal feature. So Rosenholtz's model would converge with the standard model.
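The information loss can be illustrated with a minimal sketch. It is not Rosenholtz's model: the one-dimensional row of features, the `patch_width` growth rule and the use of a plain average as the only statistic are all assumptions made for illustration.

```python
from statistics import mean

def patch_width(eccentricity):
    """Hypothetical rule: patches widen by one cell for every two cells of distance."""
    return 1 + eccentricity // 2

def summarize(features, center=0):
    """Tile a 1-D row of feature values with patches that widen away from
    `center` (rightward only, for brevity) and keep one average per patch.

    Returns a list of (distance_from_center, cells_in_patch, average) tuples.
    """
    summaries = []
    start = center
    while start < len(features):
        width = patch_width(start - center)
        patch = features[start:start + width]
        summaries.append((start - center, len(patch), mean(patch)))
        start += width
    return summaries

# 1 = horizontal feature, 0 = vertical feature, alternating across the row.
row = [1, 0] * 8
for distance, cells, average in summarize(row):
    print(f"distance {distance:2d}: {cells} cell(s), fraction horizontal = {average:.2f}")
```

Near the center each patch holds a single cell, so its average is simply that cell's value. Farther out, one number stands in for several cells, and the alternating pattern can no longer be recovered from it.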

But at the edges of the visual field, the models come apart. A large patch whose statistics are, say, 50 percent horizontal features and 50 percent vertical could contain an array of a dozen plus signs, or an assortment of vertical and horizontal lines, or a grid of boxes.
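The ambiguity is easy to reproduce in a toy setting. The sketch below assumes each patch is a list of line segments tagged with a position and an orientation, and that the only statistic kept is the fraction of segments at each orientation; the helper names are hypothetical, and the full model described in the article tracks far more than this.

```python
from collections import Counter

def orientation_stats(segments):
    """Fraction of segments at each orientation; positions are discarded."""
    counts = Counter(orientation for _, _, orientation in segments)
    total = sum(counts.values())
    return {orientation: n / total for orientation, n in counts.items()}

def plus_signs(n):
    # Each plus sign: one horizontal and one vertical stroke at the same point.
    return [(i, 0, o) for i in range(n) for o in ("h", "v")]

def scattered_lines(n):
    # n horizontal strokes in one row, n vertical strokes in another.
    return [(i, 0, "h") for i in range(n)] + [(i, 5, "v") for i in range(n)]

def boxes(n):
    # Each box: top and bottom edges (horizontal), left and right edges (vertical).
    segments = []
    for i in range(n):
        x = 2 * i
        segments += [(x, 0, "h"), (x, 1, "h"), (x, 0, "v"), (x + 1, 0, "v")]
    return segments

for name, patch in [("plus signs", plus_signs(12)),
                    ("scattered lines", scattered_lines(12)),
                    ("boxes", boxes(6))]:
    print(name, orientation_stats(patch))
```

All three patches come out as 50 percent horizontal and 50 percent vertical, so a system that kept only this statistic could not tell them apart.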

In fact, Rosenholtz's model includes statistics on much more than just orientation of features: There are also measures of things like feature size, brightness and color, and averages of other features -- about 1,000 numbers in all. But in computer simulations, storing even 1,000 statistics for every patch of the visual field requires only one-ninetieth as many virtual neurons as storing the visual features themselves, suggesting that statistical summary could be the type of space-saving technique the brain would want to exploit.

Rosenholtz's model grew out of her investigation of a phenomenon called visual crowding. If you were to concentrate your gaze on a point at the center of a mostly blank sheet of paper, you might be able to identify a solitary A at the left edge of the page. But you would fail to identify an identical A at the right edge, the same distance from the center, if instead of standing on its own it were in the center of the word "BOARD."

Rosenholtz's approach explains this disparity: The statistics of the lone A are specific enough to A's that the brain can infer the letter's shape; but the statistics of the corresponding patch on the other side of the visual field also factor in the features of the B, O, R and D, resulting in aggregate values that don't identify any of the letters clearly.
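A rough sketch of the crowding effect, under heavy assumptions: the stroke counts in `STROKES` are invented for illustration, and the statistic is just the fraction of each stroke type in a patch, which is far cruder than the roughly 1,000 statistics in the actual model.

```python
from math import dist

# Hypothetical stroke counts per letter: (horizontal, vertical, diagonal, curved).
STROKES = {
    "A": (1, 0, 2, 0),
    "B": (0, 1, 0, 2),
    "O": (0, 0, 0, 1),
    "R": (0, 1, 1, 1),
    "D": (0, 1, 0, 1),
}

def patch_stats(letters):
    """Pool all strokes in the patch and return the fraction of each stroke type."""
    totals = [sum(STROKES[ch][i] for ch in letters) for i in range(4)]
    n = sum(totals)
    return tuple(t / n for t in totals)

def mismatch(letters, target):
    """Distance between a patch's pooled statistics and those of `target` alone."""
    return dist(patch_stats(letters), patch_stats(target))

print("lone A vs. A:", mismatch("A", "A"))
print("BOARD  vs. A:", round(mismatch("BOARD", "A"), 3))
```

With the A on its own, the patch statistics are exactly those of an A. Once the strokes of B, O, R and D are pooled into the same patch, the statistics drift away from the A and, in this toy version, match none of the five letters exactly.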

Road test

Rosenholtz's group has also conducted a series of experiments with human subjects designed to test the validity of the model. Subjects might, for instance, be asked to search for a target object -- like the letter O -- amid a sea of "distractors" -- say, a jumble of other letters. A patch of the visual field that contains 11 Q's and one O would have very similar statistics to one that contains a dozen Q's. But it would have very different statistics from a patch that contained a dozen plus signs. In experiments, the degree of difference between the statistics of different patches is an extremely good predictor of how quickly subjects can find a target object: It's much easier to find an O among plus signs than it is to find it amid Q's.
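One way to read this comparison is as a distance between the statistics of a patch that contains the target and a patch that contains only distractors. The sketch below takes that reading; the stroke counts and the function `statistical_difference` are hypothetical, and only the ordering of the two results is meaningful, not their values.

```python
from math import dist

# Hypothetical stroke counts per symbol: (horizontal, vertical, diagonal, curved).
STROKES = {
    "O": (0, 0, 0, 1),
    "Q": (0, 0, 1, 1),  # treated here as an O plus a diagonal tail
    "+": (1, 1, 0, 0),
}

def patch_stats(symbols):
    """Average stroke counts per symbol in the patch."""
    n = len(symbols)
    return tuple(sum(STROKES[s][i] for s in symbols) / n for i in range(4))

def statistical_difference(target, distractor, items=12):
    """Distance between a target-present patch and an all-distractor patch."""
    with_target = [distractor] * (items - 1) + [target]
    without_target = [distractor] * items
    return dist(patch_stats(with_target), patch_stats(without_target))

easy = statistical_difference("O", "+")
hard = statistical_difference("O", "Q")
print(f"O among plus signs: {easy:.3f}")
print(f"O among Q's:        {hard:.3f}")
```

In this toy version, swapping one plus sign for an O shifts the patch statistics more than swapping one Q for an O, which matches the direction of the experimental result described above.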

Rosenholtz, who has a joint appointment to the Computer Science and Artificial Intelligence Laboratory, is also interested in the implications of her work for data visualization, an active research area in its own right. For instance, designing subway maps with an eye to maximizing the differences between the summary statistics of different regions could make them easier for rushing commuters to take in at a glance.

In vision science, "there's long been this notion that somehow what the periphery is for is texture," says Denis Pelli, a professor of psychology and neural science at New York University. Rosenholtz's work, he says, "is turning it into real calculations rather than just a side comment." Pelli points out that the brain probably doesn't track exactly the 1,000-odd statistics that Rosenholtz has used, and indeed, Rosenholtz says that she simply adopted a group of statistics commonly used to describe visual data in computer vision research. But Pelli also adds that visual experiments like the ones that Rosenholtz is performing are the right way to narrow down the list to "the ones that really matter."


Story Source:

The above story is based on materials provided by Massachusetts Institute Of Technology. The original article was written by Larry Hardesty. Note: Materials may be edited for content and length.


Cite This Page:

Massachusetts Institute Of Technology. "New mathematical model of information processing in the brain accurately predicts some of the peculiarities of human vision." ScienceDaily. ScienceDaily, 8 March 2011. <www.sciencedaily.com/releases/2011/02/110202215339.htm>.
Massachusetts Institute Of Technology. (2011, March 8). New mathematical model of information processing in the brain accurately predicts some of the peculiarities of human vision. ScienceDaily. Retrieved April 23, 2014 from www.sciencedaily.com/releases/2011/02/110202215339.htm
Massachusetts Institute Of Technology. "New mathematical model of information processing in the brain accurately predicts some of the peculiarities of human vision." ScienceDaily. www.sciencedaily.com/releases/2011/02/110202215339.htm (accessed April 23, 2014).
