What is the future of Machine Translation?
So… lemme get this straight. A guy who has worked for Google Translate is A2A’ing someone who did a couple of graduate courses on Machine Translation 20 years ago?
Once again, Adam, you flatter me.
We agree, and I defer to your superior expertise; I’ll just, eh, restate what you said.
Machine Translation is AI-hard: you need real understanding of the world to do accurate cross-context machine translation. AI has a long history of hype, and a long history of shifting goalposts. The current results of Machine Learning in Language are all around us here in Quoraland: they’re the bots, and they’re people popping over to Google Translate. How impressed are you? And yet, if you thought about this 10 years ago, you would be astonished at what they do. If you allow for their limitations, you still can be.
(I was astonished at my accidental discovery the other day, that Google Translate does so well with Augustine of Hippo. For the fairly obvious reason that the bilingual corpus of Latin it was trained on must have included Augustine of Hippo.)
The foreseeable future of machine translation, as I see it, is more of the same: more machine learning, more statistical methods, with maybe a bit more sensitivity to context, through better AI in the backends. What you’ll end up with is what people have been saying for a while you’ll end up with in AI: something that does very well in a specific domain it has been trained for, and not so well if it’s confronted with novel domains. So you will be able to use it as a tourist; arguably, you already can. You will be able to translate documents in a particular field. But the translator will still get confused very easily, once confronted with anything unfamiliar. People will be aware of that, and will work around it.
Minority languages will be supported a little by MT, but I dispute the overoptimistic belief that no one will need to learn English anymore, because machines will do the translating for you. Anyone who needs to make sure they are understood accurately for their job is not going to take the chance that the computer misconstrues them: they’re still going to learn English (or Mandarin or Spanish or Uzbek, or whatever the future lingua franca is), and make their own decisions about how to deal with ambiguity. And minority languages will still be restricted to the home and away from the public sphere, which is how languages die.
Or even worse, minority languages will devolve into translationese: the only Irish around will be whatever word-to-word translation from English the Google–Apple–McDonalds Translatatrix 4000 spits out.
The catch in all of this is that we have an event horizon of, I don’t know, 10–20 years for any prediction of the future in IT—past which, who knows. The Singularity may yet happen.
But AI has had a lot of hype, and a lot of shifting goalposts.