Featured Research

from universities, journals, and other organizations

Fast-learning Computer Translates From Four Languages

Date:
February 22, 2008
Source:
ICT Results
Summary:
Modern approaches to machine translation between languages require the use of a large ‘corpus’ of literature in each language. Now a European project has demonstrated a cheaper solution which compares favourably with the market leaders in translating from Dutch, German, Greek or Spanish into English.

The European Union now has 23 official languages. That means documents written in one language may need to be translated into any of 22 others, a total of 253 possible language pairs. Small wonder that the institutions of the European Union, and organisations dealing with international commerce, among others, have a keen interest in automating the process where they can.
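The figure of 253 counts each pair of languages once, regardless of direction. A short sketch of the arithmetic follows; the function name is illustrative only, and the second number it returns (translation directions, counting Greek-to-English separately from English-to-Greek) is not quoted in the article.

```python
from math import comb

def language_pairs(n_languages):
    """Return (unordered language pairs, directed translation directions)."""
    unordered = comb(n_languages, 2)              # n * (n - 1) / 2
    directed = n_languages * (n_languages - 1)    # each pair in both directions
    return unordered, directed

pairs, directions = language_pairs(23)
print(pairs, directions)  # 253 506
```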

Efforts to use computers to translate languages, known as machine translation, date from the 1950s, yet computers still cannot compete with human translators for the quality of the results. Machine translation works best for formal texts in specialised areas where vocabulary is unambiguous and sentence patterns are limited. Aircraft manufacturers, for example, have devised their own systems for quickly translating technical manuals into many languages.

The EU has been active in promoting research in this field since the large Eurotra project of the 1980s. In common with other projects of the time, Eurotra used a ‘rules-based’ approach where the computer is taught the rules of syntax and applies them to translate a text from one language to another. This is also the basis of most commercial translation software.

But since the early 1990s the new concept of ‘statistical’ translation has gained ground in the machine translation community, arising out of research into speech recognition. This dispenses with rules in favour of using statistical methods based on a text ‘corpus’.

A corpus is a large body of written material, amounting to tens of millions of words, intended to be representative of a language. Parallel corpora contain the same material in two or more languages and the computer compares the corpora to learn how words and expressions in one language correspond to those in another. An important example is a parallel corpus of 11 languages based on the proceedings of the European Parliament.
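As a rough illustration of how word correspondences can be learned from a parallel corpus, the sketch below counts how often words co-occur in aligned sentence pairs and ranks candidates with the Dice coefficient. The four sentence pairs are invented toy data, and this simple association measure is a stand-in for illustration, not the statistical models used by real translation systems.

```python
from collections import Counter
from itertools import product

# Invented toy data: sentence-aligned English/German pairs.
PARALLEL = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
    ("a book", "ein buch"),
]

def dice_table(pairs):
    """Score every co-occurring (source word, target word) pair with Dice."""
    src_count, tgt_count, both = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)   # sentence pairs containing each source word
        tgt_count.update(tgt_words)   # sentence pairs containing each target word
        both.update(product(src_words, tgt_words))  # pairs seen together
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in both.items()
    }

def best_match(word, table):
    """Return the target word with the highest Dice score for a source word."""
    candidates = {t: score for (s, t), score in table.items() if s == word}
    return max(candidates, key=candidates.get)

table = dice_table(PARALLEL)
print(best_match("house", table))  # haus
```

With only four sentence pairs the scores are trivially clean; on real data the counts are noisy, which is one reason large parallel corpora are needed.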

Pattern matching

“Parallel corpora are expensive and rare,” says Dr Stella Markantonatou, of the Institute for Language and Speech Processing in Athens, which coordinates the EU’s METIS II project. “They exist only for a very few languages and in small amounts and in specialised texts. So our idea was to try to do statistically based machine translation without this resource, using just monolingual corpora of the target language. For instance, to translate from Greek into English we use a large English corpus.”

To use a single corpus you need a dictionary for the vocabulary and a way to understand the syntax. In the original METIS project, completed in 2003, the corpus was processed to analyse sentence patterns, and the text to be translated was then matched against those patterns.

In Greek, for example, the verb can precede the subject of a sentence. “So if you come in with a Greek sentence, ‘Eats Mary a cake’, you would like the machine to be able to translate it into English and rearrange the words to make ‘Mary eats a cake’,” explains Dr Markantonatou. “Pattern matching is a good way of doing that because it is able to take patterns from the source language and make them like the target language.”

METIS II takes the principle further by matching patterns at the ‘chunk’ level, a phrase or fragment of a sentence rather than a sentence as a whole, as this makes the pattern matching more efficient.
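The reordering in the 'Eats Mary a cake' example can be sketched at the chunk level as follows. The role tags (VERB, SUBJ, OBJ) and the single hand-written pattern are assumptions made for illustration; the article does not describe how METIS II represents its patterns, and a real system would derive them from the target-language corpus rather than from a hard-coded table.

```python
# Hypothetical mapping from a source-language chunk order to the English order.
REORDER_PATTERNS = {
    ("VERB", "SUBJ", "OBJ"): ("SUBJ", "VERB", "OBJ"),
}

def reorder(tagged_chunks):
    """Reorder (text, role) chunks into target order if a pattern matches."""
    roles = tuple(role for _, role in tagged_chunks)
    target_roles = REORDER_PATTERNS.get(roles)
    if target_roles is None:
        # No known pattern: keep the source order unchanged.
        return [text for text, _ in tagged_chunks]
    by_role = {role: text for text, role in tagged_chunks}
    return [by_role[role] for role in target_roles]

# Chunks as they might arrive after dictionary lookup of the Greek sentence.
sentence = [("eats", "VERB"), ("Mary", "SUBJ"), ("a cake", "OBJ")]
print(" ".join(reorder(sentence)))  # Mary eats a cake
```

Treating 'a cake' as one chunk means the pattern table only needs three slots, whatever the length of the noun phrase.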

It can also use grammar rules to generate alternative possibilities for the translation and then use the corpus to identify which is the most probable. For example, where English would say ‘I like cakes’, some European languages might use the form ‘cakes please me’. So in translating into English, METIS II can test alternative renderings against the English-language corpus. In this example, ‘cakes please me’ would get a very low score, while the closest match, ‘I like cakes’, would score highly.
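A minimal sketch of this kind of scoring follows. It rates each candidate by the share of its adjacent word pairs (bigrams) that appear in a monolingual English corpus. The four-sentence corpus is invented, and bigram matching is an assumed stand-in: the article does not say which scoring method METIS II actually uses.

```python
from collections import Counter

# Invented toy stand-in for a large monolingual English corpus.
TOY_CORPUS = [
    "I like cakes",
    "I like tea",
    "They like cakes",
    "Please help me",
]

def bigrams(sentence):
    """Return the adjacent word pairs of a sentence, lowercased."""
    words = sentence.lower().split()
    return list(zip(words, words[1:]))

def build_counts(corpus):
    """Count every bigram that occurs in the corpus."""
    counts = Counter()
    for sentence in corpus:
        counts.update(bigrams(sentence))
    return counts

def score(candidate, counts):
    """Fraction of the candidate's bigrams that are attested in the corpus."""
    grams = bigrams(candidate)
    if not grams:
        return 0.0
    return sum(1 for gram in grams if counts[gram] > 0) / len(grams)

def pick_best(candidates, counts):
    """Return the candidate translation with the highest corpus score."""
    return max(candidates, key=lambda c: score(c, counts))

counts = build_counts(TOY_CORPUS)
candidates = ["cakes please me", "I like cakes"]
print(pick_best(candidates, counts))  # I like cakes
```

Here ‘I like cakes’ scores 1.0 because both of its bigrams occur in the toy corpus, while ‘cakes please me’ scores 0.0 because neither of its bigrams does.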

Four languages

The partners have now built a system that translates from Greek, Spanish, German or Dutch into English. Trials so far show that it performs well in comparison with SYSTRAN, the rules-based market leader in machine translation. Considering that SYSTRAN is based on half a century of development while METIS II has only run for three years, that is quite an achievement. A prototype is already available on the internet.

The problem now is what to do next. Results from METIS II are being followed up in national research programmes in Spain and Belgium, but there are no plans as yet to further develop the whole system. Some of the components created in the project, such as dictionaries and associated language tools, could be marketable in their own right, but would need an industrial partner to provide the investment needed to turn the prototype into a commercial product.

“For Greek, it would be an excellent opportunity because there is nothing really good for [translating it] at present,” Dr Markantonatou tells ICT Results. “With a better lexicon, fixing bugs and making algorithms more efficient, this kind of thing could work. In another two or three years, METIS could be a very serious competitor to SYSTRAN. It’s a matter of funding.”


Story Source:

The above story is based on materials provided by ICT Results. Note: Materials may be edited for content and length.


Cite This Page:

ICT Results. "Fast-learning Computer Translates From Four Languages." ScienceDaily. ScienceDaily, 22 February 2008. <www.sciencedaily.com/releases/2008/02/080221101659.htm>.
ICT Results. (2008, February 22). Fast-learning Computer Translates From Four Languages. ScienceDaily. Retrieved November 28, 2014 from www.sciencedaily.com/releases/2008/02/080221101659.htm
ICT Results. "Fast-learning Computer Translates From Four Languages." ScienceDaily. www.sciencedaily.com/releases/2008/02/080221101659.htm (accessed November 28, 2014).

