Natural spoken dialogue technology has long been a dream for many. Advances by European researchers are making this a reality.
The results of their work could soon be used to allow us to verbally interact with technology in our everyday lives, from the music systems in our cars to functions in the homes of wheelchair users.
Interactions between humans and computers are currently inefficient, particularly when we try talking. Until now, users have had to rely on specific commands, making natural interaction in everyday language impossible.
Motivated by the idea of allowing people to say what they want to say, in the way they want to say it, the EU-funded TALK (Talk and Look, Tools for Ambient Linguistic Knowledge) project set about developing technology that would also allow for the systems to learn from the process.
“We developed methods for designing better, more natural, flexible, and adaptive spoken dialogue systems that learn from their interactions with users,” says Oliver Lemon of Edinburgh University, the project coordinator. “We showed for the first time that machine learning techniques in Information State Update systems can lead to better human-computer interaction.”
And show it they did, at the IST 2006 conference in Helsinki, where project partners BMW, Bosch, and DFKI showcased some of the fruits of the project with SAMMIE, an in-car dialogue system for an MP3 player.
SAMMIE was installed in a BMW car. The system operates in German and English. This multilingual system is a first in human-computer interaction.
“This in-car system was extensively tested by BMW and Bosch, with real drivers, and was assessed to be less distracting and more comfortable than two competing systems,” says Lemon.
Information states versus graphs
First-generation human-computer dialogue systems modelled speech interactions as a series of graphs. While this yields a functioning system, the result is inefficient and inflexible. Key to the advances is a mathematical structure known as the Information State Update (ISU) approach, which was developed by TALK’s predecessor projects.
The approach, already used in many projects and applications around the world, draws on information recorded in the course of the human-computer dialogue and saved in the ‘information state’ of the system.
This can include, for example, a formal meaning representation of information that has been uttered by a user, and a complete history of the dialogue up to that point. The computer or device then calculates the appropriate reaction for the situation, as well as updating the information state with new data on the user and context.
“This approach allows a level of flexibility, adaptivity, robustness, and naturalness of interaction which is superior to the previous techniques, which rely on simple finite state machine representations which essentially model conversations as enormous graphs,” says Lemon.
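As a rough illustration of the idea, the sketch below shows a toy information state for an in-car music player. It is not TALK or SAMMIE code: the names InformationState, interpret and update, the keyword matching, and the action labels are all hypothetical, chosen only to show how a system can record each utterance, merge its meaning into the state, and pick its next move from the whole state rather than from a fixed position in a graph.

```python
from dataclasses import dataclass, field


@dataclass
class InformationState:
    """Hypothetical information state: what the system knows so far."""
    history: list = field(default_factory=list)  # (speaker, content) pairs
    slots: dict = field(default_factory=dict)    # meaning gathered from the user
    last_action: str = ""


def interpret(utterance: str) -> dict:
    """Toy meaning representation (assumption: simple keyword spotting)."""
    text = utterance.lower()
    meaning = {}
    if "play" in text:
        meaning["intent"] = "play"
    if " by " in text:
        meaning["artist"] = text.split(" by ", 1)[1].strip()
    return meaning


def update(state: InformationState, utterance: str) -> str:
    """Record the utterance, update the state, then choose a reaction."""
    state.history.append(("user", utterance))
    state.slots.update(interpret(utterance))

    # The reaction depends on everything stored so far, not on turn order.
    if state.slots.get("intent") != "play":
        action = "ask_intent"
    elif "artist" not in state.slots:
        action = "ask_artist"
    else:
        action = "play:" + state.slots["artist"]

    state.last_action = action
    state.history.append(("system", action))
    return action


if __name__ == "__main__":
    state = InformationState()
    print(update(state, "I want to play some music"))
    print(update(state, "Something by Miles Davis"))
```

Because the state accumulates across turns, the user in this sketch can name the artist before or after asking for playback and still reach the same outcome. A graph-based design would need a separate path for each ordering.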
The TALK technologies can also be applied to different languages, graphical interfaces and operating systems, which means developing other applications that use the technology will be more time- and cost-effective.
“Given the continuous improvements in embedded computing devices, such as better processor speed and miniaturisation, we can expect that speech interfaces similar to SAMMIE can now be installed in vehicles at relatively low cost,” says Lemon.
“In fact, note that many new vehicles now do have voice control, like the BMW iDrive that is similar in many respects to the TALK project's SAMMIE system, although some of its features have been simplified.”
Out of the car and into the home
In addition to SAMMIE, which allows drivers to control the onboard music system by talking to it, the impressive list of developments using the ISU and machine learning technologies includes TownInfo, a service that gives tourists a talking guide to the place they are visiting. Another, AgendaTalk, functions as a voice-controlled calendar and diary.
Additionally, the technology has been applied to making the lives of housebound or mobility-restricted people easier. The European researchers developed MIMUS – a spoken dialogue system for smart homes for wheelchair users.
“The smart-home system is an excellent example of the potential of natural, robust, and adaptive spoken dialogue technology to transform the ways in which we will interact with IT in the future, for all members of society,” says Lemon.
The future today
The BMW iDrive, developed alongside the SAMMIE system, is proof that the TALK technology is the science of today and not tomorrow. But what about the other possibilities? When could these start filtering through to everyday life?
“Some other results, such as machine learning techniques, will take longer to influence spoken dialogue systems in everyday life, but the levels of adaptivity and robustness that the various TALK systems offer will be vital in future speech interfaces,” says Lemon.
And the researchers are continuing their studies, albeit under a different guise. A new EU-funded project, ‘CLASSIC’ (Computational Learning in Adaptive Systems for Spoken Conversation), will ensure the EU researchers keep pushing forward the boundaries of human-computer interaction.
TALK received funding from the EU's Sixth Framework Programme for research.