Stick it in your ear
Jun. 19th, 2006 01:05 pm
How to build a Babel fish
Jun 8th 2006
From The Economist print edition
Translation software: The science-fiction dream of a machine that understands any language is getting slowly closer
IT IS arguably the most useful gadget in the space-farer's toolkit. In “The Hitchhiker's Guide to the Galaxy”, Douglas Adams depicted it as a “small, yellow and leech-like” fish, called a Babel fish, that you stick in your ear. In “Star Trek”, meanwhile, it is known simply as the Universal Language Translator. But whatever you call it, there is no doubting the practical value of a device that is capable of translating any language into another.
Remarkably, however, such devices are now on the verge of becoming a reality, thanks to new “statistical machine translation” software. Unlike previous approaches to machine translation, which relied upon rules identified by linguists which then had to be tediously hand-coded into software, this new method requires absolutely no linguistic knowledge or expert understanding of a language in order to translate it. And last month researchers at Carnegie Mellon University (CMU) in Pittsburgh began work on a machine that they hope will be able to learn a new language simply by getting foreign speakers to talk into it and perhaps, eventually, by watching television.
Within the next few years there will be an explosion in translation technologies, says Alex Waibel, director of the International Centre for Advanced Communication Technology, which is based jointly at the University of Karlsruhe in Germany and at CMU. He predicts there will be real-time automatic dubbing, which will let people watch foreign films or television programmes in their native languages, and search engines that will enable users to trawl through multilingual archives of documents, videos and audio files. And, eventually, there may even be electronic devices that work like Babel fish, whispering translations in your ear as someone speaks to you in a foreign tongue.
This may sound fanciful, but already a system has been developed that can translate speeches or lectures from one language into another, in real time and regardless of the subject matter. The system required no programming of grammatical rules or syntax. Instead it was given a vast number of speeches, and their accurate translations (performed by humans) into a second language, for statistical analysis. One of the reasons it works so well is that these speeches came from the United Nations and the European Parliament, where a broad range of topics are discussed. “The linguistic knowledge is automatically extracted from these huge data resources,” says Dr Waibel.
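To make "statistical analysis" of translated speeches a little more concrete, here is a minimal sketch of one classic way to pull word translations out of sentence pairs with no grammar rules at all. The article does not say which algorithm the researchers used; IBM Model 1 is swapped in here purely as an illustration, and the three-sentence corpus and the names `PAIRS`, `train_ibm_model1` and `translate_word` are made up for the example.

```python
# Toy parallel corpus (hypothetical): English sentences with German translations.
PAIRS = [
    ("the house", "das Haus"),
    ("the book", "das Buch"),
    ("a book", "ein Buch"),
]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(f | e), the probability that source word e translates
    to target word f, using expectation-maximisation (IBM Model 1)."""
    corpus = [(src.split(), tgt.split()) for src, tgt in pairs]
    tgt_vocab = {f for _, tgt in corpus for f in tgt}

    # Start with no knowledge: every co-occurring word pair is equally likely.
    t = {}
    for src, tgt in corpus:
        for f in tgt:
            for e in src:
                t[(f, e)] = 1.0 / len(tgt_vocab)

    for _ in range(iterations):
        count = {}  # expected number of times e was translated as f
        total = {}  # expected number of times e was translated at all
        for src, tgt in corpus:
            for f in tgt:
                # Share the credit for f among the source words in this sentence.
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] = count.get((f, e), 0.0) + frac
                    total[e] = total.get(e, 0.0) + frac
        # Re-estimate the probabilities from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def translate_word(t, e):
    """Return the target word with the highest probability for source word e."""
    candidates = {f: p for (f, src_word), p in t.items() if src_word == e}
    return max(candidates, key=candidates.get)


table = train_ibm_model1(PAIRS)
for word in ("the", "house", "book", "a"):
    print(word, "->", translate_word(table, word))
```

Nobody tells the program that "house" means "Haus". It starts out unable to tell "das" from "Haus", and works it out only because "das" also turns up alongside "book". Real systems work on millions of sentences and whole phrases rather than single words, but the idea of learning from aligned examples is the same.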
no subject
Date: 2006-06-19 06:07 am (UTC)
no subject
Date: 2006-06-19 06:44 am (UTC)
Only rules and syntax enable us to CREATE language.
Goddamn, people, read your Chomsky... :o)
no subject
Date: 2006-06-19 06:53 am (UTC)
no subject
Date: 2006-06-19 07:40 am (UTC)
Actually what I like in articles like these is that they are over-enthusiastic and give something to look forward to; I think science should be optimistic. Because why not, so many things have been developed against all odds and so many bad things have been developed - so why not dream...
It's just that with this particular subject I know how many pains computer linguists go through to create better translation programs and saying "lo, stupids, we can do better without you!" kinda set me off - even though that impression was probably only created by the writer of this article. :o)
no subject
Date: 2006-06-19 01:44 pm (UTC)
The thing about learning AIs is that the system attempts to develop a predictor for its data -- and then refines that predictor as it encounters more and more data. So given a large database and a good learning algorithm, it -will- be able to translate arbitrary documents -- because the AI has developed rules (and refined/replaced them as it encountered data that contradicted them... just like we do) for how it treats language, and can now apply those rules. The key here isn't that there aren't rules, but that the programmers do not -code- pre-defined rules; instead, they code feedback mechanisms and let the machine make up its own rules... because it will often do a better job (and certainly a -different- job, coming up with different and possibly more accurate rules than a human analyst will).
no subject
Date: 2006-06-19 02:25 pm (UTC)