[personal profile] khaosworks
How to build a Babel fish
Jun 8th 2006
From The Economist print edition
Translation software: The science-fiction dream of a machine that understands any language is getting slowly closer

IT IS arguably the most useful gadget in the space-farer's toolkit. In “The Hitchhiker's Guide to the Galaxy”, Douglas Adams depicted it as a “small, yellow and leech-like” fish, called a Babel fish, that you stick in your ear. In “Star Trek”, meanwhile, it is known simply as the Universal Language Translator. But whatever you call it, there is no doubting the practical value of a device that is capable of translating any language into another.

Remarkably, however, such devices are now on the verge of becoming a reality, thanks to new “statistical machine translation” software. Unlike previous approaches to machine translation, which relied upon rules identified by linguists which then had to be tediously hand-coded into software, this new method requires absolutely no linguistic knowledge or expert understanding of a language in order to translate it. And last month researchers at Carnegie Mellon University (CMU) in Pittsburgh began work on a machine that they hope will be able to learn a new language simply by getting foreign speakers to talk into it and perhaps, eventually, by watching television.

Within the next few years there will be an explosion in translation technologies, says Alex Waibel, director of the International Centre for Advanced Communication Technology, which is based jointly at the University of Karlsruhe in Germany and at CMU. He predicts there will be real-time automatic dubbing, which will let people watch foreign films or television programmes in their native languages, and search engines that will enable users to trawl through multilingual archives of documents, videos and audio files. And, eventually, there may even be electronic devices that work like Babel fish, whispering translations in your ear as someone speaks to you in a foreign tongue.

This may sound fanciful, but already a system has been developed that can translate speeches or lectures from one language into another, in real time and regardless of the subject matter. The system required no programming of grammatical rules or syntax. Instead it was given a vast number of speeches, and their accurate translations (performed by humans) into a second language, for statistical analysis. One of the reasons it works so well is that these speeches came from the United Nations and the European Parliament, where a broad range of topics are discussed. “The linguistic knowledge is automatically extracted from these huge data resources,” says Dr Waibel.
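
As a rough illustration of what "statistical analysis" of human-translated text can mean, here is a minimal Python sketch of one classic technique, expectation-maximisation over word co-occurrences (in the style of IBM Model 1, without a NULL word). It is not the system Dr Waibel describes; the three-sentence German/English corpus, the function names and the word-by-word `translate` helper are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus: (source sentence, human translation) pairs.
PAIRS = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]


def train_word_model(pairs, iterations=20):
    """Estimate t[(s, w)], roughly P(target word w | source word s), by EM."""
    # Start with equal weight for every word pair seen in the same sentence pair.
    t = {(s, w): 1.0 for src, tgt in pairs for s in src for w in tgt}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            for w in tgt:
                # E-step: share each target word among the source words
                # in proportion to the current estimates.
                norm = sum(t[(s, w)] for s in src)
                for s in src:
                    frac = t[(s, w)] / norm
                    count[(s, w)] += frac
                    total[s] += frac
        # M-step: renormalise the expected counts per source word.
        t = {(s, w): c / total[s] for (s, w), c in count.items()}
    return t


def best_translation(t, source_word):
    """Most probable target word; raises ValueError for an unseen source word."""
    candidates = {w: p for (s, w), p in t.items() if s == source_word}
    return max(candidates, key=candidates.get)


def translate(t, sentence):
    """Naive word-by-word translation; ignores word order and context."""
    return [best_translation(t, s) for s in sentence]


MODEL = train_word_model(PAIRS)
print(translate(MODEL, "ein haus".split()))
```

No dictionary or grammar is supplied: the word pairings are inferred only from which words appear together across sentence pairs. The sentence "ein haus" never occurs in the toy corpus, yet the learned table can still be applied to it. Real systems add phrase tables, reordering and language models on top of far larger corpora.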


Date: 2006-06-19 06:07 am (UTC)
From: [personal profile] mdlbear
It's well known in the AI field that the main factor in the quality of any pattern-recognition software is the size of the database you train it on.

Date: 2006-06-19 06:44 am (UTC)
From: [identity profile] nelladarren.livejournal.com
If it has "no programming of grammatical rules or syntax" then it's really "just" an enormous database. Like - it "heard" the sentence "The training made her better" and knows the translation into another language, but as soon as it's confronted with a NEW sentence like "He made her a cup of tea" it's either "TILT, database error!" or it will translate it as "he transmogrified her into a cup of tea".

Only rules and syntax enable us to CREATE language.

Goddamn, people, read your Chomsky... :o)

Date: 2006-06-19 06:53 am (UTC)
From: [identity profile] zanda-myrande.livejournal.com
Oh, let them play. It's another of those ideas that goes back to would-be scientists watching Star Trek as kids (the Babelfish is a, um, red herring) and we know it won't work, but at least it's something harmless...

Date: 2006-06-19 07:40 am (UTC)
From: [identity profile] nelladarren.livejournal.com
Yes, you're right.
Actually what I like in articles like these is that they are over-enthusiastic and give something to look forward to; I think science should be optimistic. Because why not, so many things have been developed against all odds and so many bad things have been developed - so why not dream...

It's just that with this particular subject I know how many pains computer linguists go through to create better translation programs and saying "lo, stupids, we can do better without you!" kinda set me off - even though that impression was probably only created by the writer of this article. :o)

Date: 2006-06-19 01:44 pm (UTC)
From: [personal profile] mneme
This isn't actually true, and I can only assume you didn't understand the context of the article.

The thing about learning AIs is that the system attempts to develop a predictor for its data -- and then refines that predictor as it encounters more and more data. So given a large database and a good learning algorithm, it -will- be able to translate arbitrary documents -- because the AI has developed rules (and refined/replaced them as it encountered data that contradicted them...just like we do) for how it treats language, and can now apply those rules. The key here isn't that there aren't rules, but that the programmers do not -code- pre-defined rules; instead, they code feedback mechanisms and let the machine make up its own rules...because it will often do a better job (and certainly a -different- job, coming up with different and possibly more accurate rules than a human analyst would).
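
To make the "feedback mechanism" idea concrete, here's a minimal sketch in Python. It uses a plain perceptron as a stand-in (not whatever the CMU system actually does) to learn which sense of "made" a sentence uses. The four training sentences, the "cause"/"prepare" labels and the function names are made up for illustration. Nobody codes a rule like "a cup of ..." means "prepare"; the weights only change when a prediction turns out wrong.

```python
# Hypothetical labelled examples: which sense of "made" does the sentence use?
EXAMPLES = [
    ("the training made her better", "cause"),
    ("the medicine made him well", "cause"),
    ("he made her a cup of tea", "prepare"),
    ("she made him a bowl of soup", "prepare"),
]


def predict(weights, sentence):
    """Sum the learned weight of each distinct word; positive means 'prepare'."""
    score = sum(weights.get(w, 0.0) for w in set(sentence.split()))
    return "prepare" if score > 0 else "cause"


def train(examples, max_epochs=50):
    """Perceptron training: adjust weights only after a wrong prediction."""
    weights = {}
    for _ in range(max_epochs):
        mistakes = 0
        for sentence, label in examples:
            if predict(weights, sentence) != label:
                mistakes += 1
                # Feedback step: push the words of this sentence toward the right label.
                step = 1.0 if label == "prepare" else -1.0
                for w in set(sentence.split()):
                    weights[w] = weights.get(w, 0.0) + step
        if mistakes == 0:
            break  # every training example is now predicted correctly
    return weights


WEIGHTS = train(EXAMPLES)
print(predict(WEIGHTS, "she made her a cup of coffee"))
print(predict(WEIGHTS, "the exercise made him stronger"))
```

Neither of the two printed sentences is in the training data, so whatever the model says about them comes from the "rules" (weights) it built for itself. With only four examples those rules are crude, which is why the size of the training set matters so much.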

Date: 2006-06-19 02:25 pm (UTC)
From: [personal profile] mdlbear
Well put. Basically, the system learns language the way people do -- by observing it in use and developing its own rules.
