Google Translator Lost in Translation

Have you ever used one of those on-line translators (like Google Translator) to translate something from one language to another? If you have, I hope you didn't rely on them to translate any critical information. They can be useful at times, and they are certainly free, but they are still far from giving you great and accurate results.

Theoretically, if there really were a translator that worked from language A to language B and from language B to language A, it should be possible to submit a text in language A, translate it into B, and then translate that translation back into A. The result should preserve the original meaning.

For example, if I translate the phrase "I am a writer" into Italian, the correct translation is "Io sono uno scrittore". If I translate "Io sono uno scrittore" from Italian back into English, the correct translation is "I am a writer". The result is exactly the same. It doesn't matter how many times I repeat the process: the text keeps the same meaning.

On the other hand, if every time that you translate a text the meaning changes a little bit, the sense of a text that gets translated back and forth a number of times drifts further and further from the original. If you do enough translations back and forth you end up with something completely different from the original.
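One rough way to put a number on that drift is to compare the words of the original with the words of the round-tripped text. The sketch below is only an illustration: the class name `Drift` and the choice of Jaccard word overlap are my assumptions, and word overlap measures shared vocabulary, not meaning, so a perfect paraphrase can still score low.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: a crude vocabulary-overlap score between two texts. */
final class Drift {

    /** Lowercased set of whitespace-separated words, with surrounding punctuation stripped. */
    static Set<String> words(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            String word = token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    /**
     * Jaccard similarity: shared distinct words divided by all distinct words.
     * 1.0 means the same vocabulary, 0.0 means no word in common.
     */
    static double similarity(String a, String b) {
        Set<String> wordsA = words(a);
        Set<String> wordsB = words(b);
        if (wordsA.isEmpty() && wordsB.isEmpty()) {
            return 1.0; // two empty texts are trivially identical
        }
        Set<String> shared = new HashSet<>(wordsA);
        shared.retainAll(wordsB);
        Set<String> all = new HashSet<>(wordsA);
        all.addAll(wordsB);
        return (double) shared.size() / all.size();
    }

    public static void main(String[] args) {
        // A perfect round trip scores 1.0.
        System.out.println(similarity("I am a writer", "I am a writer"));
        // A made-up drifted version shares only some of the words.
        System.out.println(similarity("I am a writer", "I am an author"));
    }
}
```

If the score keeps falling as the number of round trips grows, the text is moving away from the original wording, which is a hint (not a proof) that the meaning is moving too.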

This is exactly what happens with current major on-line translators.

To demonstrate what I mean, I wrote a small Java program that submits a text in English to Google Translator for translation into a different language; the result is captured and submitted to Google for translation back into English. The new English version is submitted again, and the cycle is repeated a number of times. What you get in the end is sometimes really different from the original.
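The loop itself is simple. What follows is not the original program but a minimal sketch of the same idea, under stated assumptions: the two translators are passed in as plain functions, and the demo uses a tiny hypothetical phrase table in place of a real translation service, so it makes no network calls and assumes nothing about Google's interface. The names `RoundTrip`, `roundTrip`, `phraseTable` and `demo` are mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Round-trip translation harness. The translators are stand-ins, not a real service. */
final class RoundTrip {

    /**
     * Translates text from language A to B and back to A, `passes` times.
     * Returns the original text followed by the language-A text after each pass.
     */
    static List<String> roundTrip(String text,
                                  UnaryOperator<String> aToB,
                                  UnaryOperator<String> bToA,
                                  int passes) {
        List<String> history = new ArrayList<>();
        history.add(text);
        String current = text;
        for (int i = 0; i < passes; i++) {
            current = bToA.apply(aToB.apply(current));
            history.add(current);
        }
        return history;
    }

    /**
     * Toy translator backed by a phrase table (an assumption for illustration).
     * Unknown phrases pass through unchanged, which a real service would not do.
     */
    static UnaryOperator<String> phraseTable(Map<String, String> table) {
        return phrase -> table.getOrDefault(phrase, phrase);
    }

    /** Runs the harness on one phrase with a two-entry toy table. */
    static List<String> demo(int passes) {
        UnaryOperator<String> enToIt =
                phraseTable(Map.of("I am a writer", "Sono un produttore"));
        UnaryOperator<String> itToEn =
                phraseTable(Map.of("Sono un produttore", "They are a producer"));
        return roundTrip("I am a writer", enToIt, itToEn, passes);
    }

    public static void main(String[] args) {
        List<String> history = demo(3);
        for (int i = 0; i < history.size(); i++) {
            System.out.println("pass " + i + ": " + history.get(i));
        }
    }
}
```

To run this against a real service, you would replace the two `phraseTable(...)` functions with functions that call that service; the loop stays the same. Keeping every intermediate version in `history` makes it easy to see at which pass the text starts to go wrong.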

I'll give you some real examples.

First of all, I tried the simple phrase “I am a writer”, as mentioned above. The translation into Italian given by Google Translator is “Sono un produttore”. This is wrong already: it means “I am a producer”. If I give this back to Google for translation into English, I get “They are a producer”. It went from “I” to “they” and from “writer” to “producer”. The sense of the original is completely distorted.

Let’s try something more complex. The following is an extract from the speech that President George W. Bush gave in his first State of the Union address in 2001:

I want to thank so many of you who have accepted my invitation to come to the White House to discuss important issues. We're off to a good start. I will continue to meet with you and ask for your input. You have been kind and candid, and I thank you for making a new President feel welcome.

Using my program, I submitted this to Google and translated it back and forth from English to Italian and from Italian to English 10 times. At the end of this process the text had been distorted into:

I wish that ringraziar many that they have accepted the mines they have invited in order to come to lodge the woman white woman of the woman of the woman of the woman of the woman of the woman of the woman of the woman of the woman in order to discuss important editions. We are isinseriti to a good beginning. I will continue to come it in order to put itself in contact with and asking yours the immesso. You have been kind and candid and ringrazio in order to make a new welcome of tact of the president.

Huh? Among all the various errors the funniest is that the “white house” for some reason became “white woman” (repeated a number of times), and “invitation” became “mines”.

Let’s see more examples. The next is an extract from a news article published on the INDOLink website about Dick Cheney's hunting stunts.

The facts first, that have all the ingredients of a thriller because of the secrecy surrounding them. The hero of the thriller is Dick Cheney, second in command to George W. Bush, the man who wanted to get Osama bin Laden but could not.

After a few passes from English to Italian and back, this is what I ended up with:

The facts in the first place, that have all the ingredients of a moving history because of the segretezza that encircles them. The hero of the moving history is to the inner part of second the place Dick Cheney, to the inner part of the place that directs George W. Bush, the man who has wished to obtain the bucket of loaded Osama but he could not.

I love how “Bin Laden” ends up reading “bucket of loaded Osama”. Jokes come to mind, but to keep this blog PG-10 I’ll leave them for the reader to imagine.

I’ll give a third example. This is from a speech by British Prime Minister Tony Blair addressing the war with Iraq in March 2003.

Tonight, British servicemen and women are engaged from air, land and sea. Their mission: to remove Saddam Hussein from power, and disarm Iraq of its weapons of mass destruction.

And after a few passes back and forth from English to Italian and from Italian to English, this is what it became:

This evening, the British mechanics and the women are support to you from air, earth and the sea. Their mission: for for of removal of Saddam di Hussein that is fed and disarming the relative Iraq of the destruction square you which they collect in a heap.

I have no idea how it went from "disarm Iraq of its weapons of mass destruction" to "and disarming the relative Iraq of the destruction square you which they collect in a heap".

You can try this yourself, by hand, using Google Translator.

Rich said...

Back in the day, when Jack Paar hosted the Tonight Show, he had this game he played when he had guests who spoke a variety of languages. He'd whisper a story in English to one guest, who would whisper it to the next in another language, and eventually it would get back to English completely garbled. Even with the best algorithms, I suspect translation is always a lossy process.