Tuesday, June 21, 2011

Lost in Translation

Have you seen Bad Translator? It takes the core idea of sites like Hanzi Smatter/Engrish Funny, which is that it's hilarious to watch people arrogantly using languages they're not fluent in, to a whole new level. It repeatedly runs user-supplied text through machine translation (Google Translate, of course), into and out of various languages in alphabetical order, always coming back to English so we can track the changes, and for maximum hilarity. I haven't managed to create any knee-slappers, but it's easy to see how tiny errors that creep in can reverberate through the generations in completely unexpected ways, in a sort of online version of the children's game called Telephone in North America and Chinese Whispers in the UK.
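For the curious, the out-and-back loop can be sketched in a few lines of Python. This is only my guess at the shape of the process, not Bad Translator's actual code: `round_trip_chain` and `toy_translate` are names I made up, the "fi" entries are invented placeholder tokens rather than real Finnish, and a real version would call a machine translation service where the toy lookup table sits.

```python
from typing import Callable, List

# Hypothetical signature for any translator: (text, source, target) -> text.
Translate = Callable[[str, str, str], str]


def round_trip_chain(text: str, languages: List[str], translate: Translate,
                     pivot: str = "en") -> List[str]:
    """Return the pivot-language text after each out-and-back hop."""
    history = [text]
    for lang in sorted(languages):  # alphabetical by whatever labels are passed in
        foreign = translate(history[-1], pivot, lang)    # English -> other language
        history.append(translate(foreign, lang, pivot))  # and straight back again
    return history


def toy_translate(text: str, source: str, target: str) -> str:
    """Stand-in for a real service: a tiny, deliberately lossy lookup table."""
    table = {
        ("en", "fi"): {"thriving": "FI_THRIVING"},     # placeholder, not real Finnish
        ("fi", "en"): {"FI_THRIVING": "flourishing"},  # near-synonym on the way back
    }
    mapping = table.get((source, target), {})
    return " ".join(mapping.get(word, word) for word in text.split())


history = round_trip_chain("a thriving industry", ["fi"], toy_translate)
print(history[-1])  # a flourishing industry
```

Keeping every intermediate English version in `history` is the whole point: it is what lets you go back and see exactly where a word went wrong.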

One important thing you need to know about Google Translate, if you haven't encountered it already, is that when it runs across a word it doesn't know, it neither attempts a translation nor flags the word as untranslated; it simply pastes the word untouched into the output.

Here is a sentence I happened to be reading at more or less the same time I discovered Bad Translator (what? I read more than one page at once, I can multi-task), from the latest edition of Fred Clark's minute, epic deconstruction of the Left Behind series:

There’s a vast and thriving cottage industry of this sort of thing in the evangelical subculture, one that has existed for decades on the traveling-speaker and seminar circuit and in recent years has proliferated online.

And here is the result of fifty translations from and to English:

Last year, travel, language laboratories, important industry in Saint Florian, the Bible, the Internet is everywhere.

The big question is, "Where did Saint Florian come from?" The fifth word, "thriving", was translated as "flourishing" on the trip back from Finnish, and survived in that form until Haitian Creole turned it into "florissante", from which there was no return: the translation into English couldn't make any sense of it, so "florissante" it remained in and out of tongue after tongue, until the translation from Italian, apparently assuming it was a name, capitalized it, and then all was lost. Japanese phonetically turned it into "Florian Santo", and there it remained for a few iterations until, mysteriously, it came out of Malay as "Saint Florian" (the mystery being why it didn't happen sooner), and there it remained until the end of the process.
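The way "florissante" got stuck is easy to mimic. Here is a toy sketch, assuming a word-by-word translator backed by a small dictionary; the function names and the two-entry dictionary are my own inventions for illustration, and real machine translation is nothing like this simple. The only behaviour it is meant to show is the silent pass-through of unknown words.

```python
from typing import Dict, List


def translate_word_by_word(text: str, dictionary: Dict[str, str]) -> str:
    """Toy translator: unknown words are copied through unchanged and unflagged."""
    return " ".join(dictionary.get(word, word) for word in text.split())


def unknown_words(text: str, dictionary: Dict[str, str]) -> List[str]:
    """What a translator that flagged its failures could report instead."""
    return [word for word in text.split() if word not in dictionary]


# Hypothetical into-English dictionary that happens not to know "florissante".
to_english = {"une": "a", "industrie": "industry"}

print(translate_word_by_word("une industrie florissante", to_english))
# a industry florissante
print(unknown_words("une industrie florissante", to_english))
# ['florissante']
```

Once a word falls outside every dictionary along the way, each hop hands it on untouched, which is why it takes a lucky guess (a capital letter, a phonetic rendering) to turn it into something else.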

Many errors pop up in languages that use writing systems other than the Roman alphabet. "I want an apple" turns into "I want an apology" on its way through Korean, though "apple pie" survives unscathed, presumably because the phrase is a more or less universally known unit. (The equally unitary, but idiomatic, "Dutch door" almost instantly becomes "Netherlands".)

"Proliferated online" turning into "the Internet is everywhere" is charming, and the way stations are revealing: from "proliferated online" to "outbreaks online", then "the spread of the Internet" (since, I assume, an outbreak is the spread of a disease), and then "Internet penetration" for quite a few iterations. That rapidly becomes "the Internet is widespread", which stays in circulation for a while until "the Internet is everywhere" replaces it and never goes away. This suggests that such a basic construction (article, noun, verb, and a one-word complement, all common words) can be very durable across languages.

It depends on the languages, though. "The sky is blue" quickly becomes "Blue sky" (in Arabic); "The dog is hungry" is sturdier, but even that eventually turns into "Hungry dog" (in Indonesian). I played around for a while, but I couldn't find a sentence that kept its identity to the last translation. ("Apple pie" might survive, but "Give me an apple pie" becomes, delightfully, "Apple of my pie.")

Still, I imagine it would be fairly easy to formulate a paragraph that comes out of the whole Bad Translator process more or less unscathed: lots of straightforward assertions and basic constructions. I know it would be simple to destroy highly idiomatic phrases, because I tried a bunch of them. One example: "It's raining cats and dogs" survives a few iterations, until it's replaced by another idiom, "it's pouring down in buckets", and then it's a series of missteps: "pour into buckets", "into the barrel", and then, disastrously, "in the tube", which predictably turns into "in the metro", and then "in the subway", "the subway", back to "Metro", and, charmingly, "England", where it remains in one form or another to the bitter end.


