Google TPUs mean language not lost in translation


Imagine a language learning tool with the computing power of 180 trillion machine calculations a second. That’s the power of one of Google’s new second-generation tensor processing units. TPUs are specialist circuit boards built for machine learning. Combine a bunch of them and the calculation speed is mind-boggling.

I wish I had that machine learning capability at my disposal years ago when I learned French, German and Latin at school.

Machine learning is repetition learning. Feed in a million photos of a cat, with each one labelled as a cat, and machine learning can work out the pixel combinations that constitute a cat and identify cats in future photos. The same technique can be applied to medicine. Feed in, say, a million labelled photos of known skin cancer lesions and this artificial intelligence technique can predict the probability of a new photo showing a cancer.
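As a rough illustration of learning from labelled examples, here is a minimal sketch in Python. It is an assumption-laden toy, not Google’s method: each “photo” is reduced to two made-up numbers, and the learner is a simple nearest-centroid classifier rather than the neural networks that real image recognition relies on. The function names and the data are hypothetical.

```python
def train_centroids(examples):
    """Average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose average example is closest to the new one."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


# Hypothetical labelled data: two invented features per "photo".
LABELLED = [
    ([0.9, 0.8], "cat"),
    ([0.8, 0.9], "cat"),
    ([0.1, 0.2], "not cat"),
    ([0.2, 0.1], "not cat"),
]
CENTROIDS = train_centroids(LABELLED)
```

Calling `predict(CENTROIDS, [0.7, 0.7])` picks the label whose examples the new item most resembles. A real system would learn from millions of examples and report a probability rather than a bare label.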

Repetition learning is a complex task when applied to language, but it is revolutionising the operation of Google Translate, which attempts to translate between 103 languages.

How we translate languages varies. There’s the tedious way I used at school. I learned the French equivalent for English language nouns, verbs and their tenses, and put them into a correct sentence structure. Then there’s the way a two-year-old learns. Toddlers aren’t aware of sentence structure, verbs and tenses. When they ask for an ice cream, they are not concerned about whether ice cream is a noun, just the taste.

With this in mind I tackled the man from Google’s Brain team who is heading its move to bring artificial intelligence and machine learning to language translation, during a visit to the company’s Mountain View, California, campus last week.

The work of research scientist Mike Schuster and the Google Translate team is seeing language translation morph from translating individual phrases 10 years ago to translating entire sentences in one gulp with machine learning.

The original Google Translate built tables of phrases from translations on the web.

“You’d find small segments like ‘My dog is red’, you’d find the same segment in a Japanese sentence and you’d make a big table of that,” Schuster explains. “Then you’d search your tables and put together your sentences.”
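The table approach Schuster describes can be sketched in a few lines of Python. This is a toy under stated assumptions, not the original Google Translate code: the table is hand-written (English to French, for readability) rather than mined from the web, and the greedy longest-match lookup is one simple way to “search your tables and put together your sentences”. All names here are hypothetical.

```python
# Toy phrase table: source-language phrase -> target-language phrase.
# The entries are hand-written for illustration, not mined from the web.
PHRASE_TABLE = {
    ("my", "dog"): "mon chien",
    ("is", "red"): "est rouge",
    ("my",): "mon",
    ("dog",): "chien",
    ("is",): "est",
    ("red",): "rouge",
}

MAX_PHRASE_LEN = max(len(key) for key in PHRASE_TABLE)


def translate_with_phrase_table(sentence):
    """Cover the sentence left to right, preferring the longest known phrase."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for length in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + length])
            if phrase in PHRASE_TABLE:
                output.append(PHRASE_TABLE[phrase])
                i += length
                break
        else:
            # Unknown word: pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)
```

With this table, `translate_with_phrase_table("My dog is red")` stitches together two stored segments. Anything outside the table is passed through untranslated, which hints at why phrase tables struggle with sentences they have never seen in pieces.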

The project to use machine learning for language translation began in August 2015. Google soon began to write an artificial intelligence system for Google Translate making use of its new TPU hardware. “About five months later we had the first results,” Schuster says. “In September, basically a year later, we had everything ready to launch our first language pair which was Chinese to English.”

Instead of maintaining tables, the new machine learning system combs the net, taking in millions, even billions, of sentences and their translated equivalents in a target language. It doesn’t bother with parsing, only with how new translations differ from previous ones and how likely each is to be accurate. It then extrapolates and “predicts” a translation when it encounters similar sentences.

“It’s taking all sentences currently; in the future we will work on taking whole paragraphs. Because there is knowledge in between sentences,” Schuster says. “But the extrapolation is complicated. It’s not just linear interpolation, it’s way more complicated than that.”

He says the first AI-based translations were too slow. “It took 10 seconds for a 10-word sentence to translate. Ten seconds is way too long. People don’t want to wait. And if you want to use it for a billion people, you won’t have enough machines.

“So we did all kinds of things. We made better algorithms, we used our TPUs to make everything faster. Eventually it was 200 milliseconds for a 10-word sentence, and that was done over a period of two months; 200ms was fast enough to launch the system.” The system went live in September with Chinese to English. Google added eight more languages in November, another seven in March, and last month 26 more. So far 41 languages use machine learning for translation.
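The latency figures Schuster quotes are easy to put in proportion. The short Python calculation below uses only the numbers given in the article (10 seconds and 200 milliseconds for a 10-word sentence); the variable names are illustrative.

```python
# Figures quoted in the article, expressed in milliseconds.
INITIAL_LATENCY_MS = 10_000   # 10 seconds for a 10-word sentence
LAUNCH_LATENCY_MS = 200       # 200 milliseconds for the same sentence
WORDS_PER_SENTENCE = 10

# How many times faster the launch system was.
speedup = INITIAL_LATENCY_MS / LAUNCH_LATENCY_MS

# Average time spent per word, before and after.
initial_ms_per_word = INITIAL_LATENCY_MS / WORDS_PER_SENTENCE
launch_ms_per_word = LAUNCH_LATENCY_MS / WORDS_PER_SENTENCE

print(f"Speed-up: {speedup:.0f}x")
print(f"Per word: {initial_ms_per_word:.0f} ms -> {launch_ms_per_word:.0f} ms")
```

That is a 50-fold improvement, from about a second per word to about 20 milliseconds per word.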

The system doesn’t always translate directly. For example, translation from Korean to Japanese involves going from Korean to English and then from English to Japanese. But “this works”, says Schuster. “It doesn’t work perfectly but it works.”
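Translating through an intermediate language in this way can be sketched as two lookups chained together. The Python below is a toy under stated assumptions: it works on single words with tiny hand-written dictionaries, whereas the real system translates whole sentences with learned models. The dictionary and function names are hypothetical.

```python
# Hand-written toy dictionaries, for illustration only.
KO_TO_EN = {"개": "dog", "고양이": "cat", "물": "water"}
EN_TO_JA = {"dog": "犬", "cat": "猫", "water": "水"}


def pivot_translate(word, first_leg, second_leg):
    """Translate via an intermediate language: source -> pivot -> target."""
    intermediate = first_leg.get(word)
    if intermediate is None:
        # The first leg failed, so there is nothing to pass on.
        return None
    # The second leg may also fail, in which case this returns None.
    return second_leg.get(intermediate)
```

Here `pivot_translate("개", KO_TO_EN, EN_TO_JA)` goes from Korean to Japanese by way of English. Any gap or error in either leg carries through to the result, which is one reason a two-step route may work without working perfectly.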

Chris Griffith attended Google’s developer conference in the US courtesy of Google.
