Dialectics of Neural Machine Translation
or Does quantity turn into quality?
An article based on a speech at the RIF + CIB 2017 conference.
They have been talking about neural networks for a long time, and it would seem that one of the classic tasks of artificial intelligence - machine translation - just begs to be solved on the basis of this technology.
Nevertheless, here is the dynamics of popularity in the search for queries about neural networks in general and about neural machine translation in particular:
It is clearly seen that until recently there was nothing about neural machine translation on radars - and here at the end of 2016 there are new technologies and systems Machine translation, built on the basis of neural networks, was demonstrated by several companies at once, including Google, Microsoft and SYSTRAN. They appeared almost simultaneously, with a difference of several weeks or even days. Why is that?
In order to answer this question, it is necessary to understand what machine translation based on neural networks is and what is its key difference from the classical statistical systems or analytical systems that are used today for machine translation.
The neural translator is based on the Bidirectional Recurrent Neural Networks (Bidirectional Recurrent Neural Networks) mechanism, built on matrix calculations, which allows you to build significantly more complex probabilistic models than statistical machine translators.
Like statistical translation, neural translation requires parallel cases for learning, which allow you to compare automatic translation with the reference “human” one, only in the learning process it operates not with separate phrases and phrases, but with whole sentences. The main problem is that for training such a system requires significantly more computing power.
To speed up the process, developers use NVIDIA GPUs, and Google also uses the Tensor Processing Unit (TPU), a proprietary chip adapted specifically for machine learning technologies. Graphics chips are initially optimized for matrix computing algorithms, and therefore the performance gain is 7-15 times compared to the CPU.
Even with all this, training a single neural model requires from 1 to 3 weeks, while a statistical model of approximately the same size is configured in 1-3 days, and with increasing size, this difference increases.
However, not only technological problems were a brake on the development of neural networks in the context of the problem of machine translation. In the end, it was possible to teach language models earlier, albeit more slowly, but there were no fundamental obstacles.
The mod on neural networks also played a role. There were many developments within themselves, but they were not in a hurry to announce this, fearing that they might not get the quality gain that society expects from the Neural Networks phrase. This can explain the fact that several neural translators were announced one after another.
Let’s try to understand whether the increase in the quality of translation corresponds to the accumulated expectations and the increase in costs that accompany the development and support of neural networks for translation.
Google in its research demonstrates that neural machine translation provides a Relative Improvement from 58% to 87%, depending on the language pair, compared to the classical statistical approach (or Phrase Based Machine Translation, PBMT, as it is also called).
SYSTRAN conducts a study in which the quality of a translation is evaluated by choosing from several presented options made by various systems, as well as a “human” translation. And he claims that his neural translation is preferred in 46% of cases to the translation made by man.
Despite the fact that Google claims an improvement of 60% or even higher, there is a slight catch in this indicator. Representatives of the company talk about “Relative Improvement”, that is, how much they succeeded with the neural approach to approach the quality of Human Translation in relation to what was in the classical statistical translator.
Industry experts analyzing the results presented by Google in the article “Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation” are rather skeptical of the presented results and say that in fact the BLEU score was only 10% better, and significant progress It is noticeable on fairly simple tests from Wikipedia, which, most likely, were used in the process of training the network.
Inside PROMT, we regularly compare translations on various texts of our systems with competitors, and therefore there are always examples at hand where we can check whether neural translation really surpasses the previous generation, as the manufacturers claim.
Source text: Worrying never did anyone any good.
Google PBMT Translation: Without worrying, did no good to anyone.
Google NMT Translation: Anxiety has never helped anyone.
By the way, the translation of the same phrase on Translate.Ru: “Excitement never brought anyone any benefit”, you can see that it was and remains the same without the use of neural networks.
Microsoft Translator is not far behind in this matter either. Unlike colleagues from Google, they even made a website on which you can translate and compare two results: neural and donor, to make sure that claims about growth in quality are not unfounded.
In this example, we see that there is progress, and it is really noticeable. At first glance, it seems that the statement of the developers that machine translation almost caught up with the “human” one is true. But is this really so, and what does it mean in terms of the practical application of technology for business?
In general, translation using neural networks is superior to statistical translation, and this technology has enormous potential for development. But if you carefully approach the issue, then we can make sure that progress is not in everything, and not for all tasks, you can use neural networks without looking at the task itself.
From an automatic translator the whole history of its existence - and this is more than 60 years old! - they were waiting for some magic, imagining it as a machine from science fiction films, which instantly translates any speech into an alien whistle and back.
In fact, tasks are of different levels, one of which implies a “universal” or, so to speak, “everyday” translation for everyday tasks and to facilitate understanding. The tasks of this level are perfectly handled by online translation services and many mobile products.
These tasks include:
• quick translation of words and short texts for various purposes;
• automatic translation in the process of communication on forums, in social networks, instant messengers;
• automatic translation when reading news, Wikipedia articles;
• travel translator (mobile).
All those examples of the increase in the quality of translation using neural networks, which we examined above, just relate to these tasks.
However, with the goals and objectives of the business in relation to machine translation, everything is somewhat different. For example, here are some requirements that apply to corporate machine translation systems:
• translation of business correspondence with customers, partners, investors, foreign employees;
• localization of sites, online stores, product descriptions, instructions;
• translation of user content (reviews, forums, blogs);
• the ability to integrate translation into business processes and software products and services;
• accuracy of translation in compliance with terminology, confidentiality and security.
Let’s try to understand with examples whether any tasks of a translation business using neural networks can be solved and how.
Amadeus is one of the world's largest global airline distribution systems. On the one hand, air carriers are connected to it, on the other, agencies that must receive all information about changes in real time and communicate to their customers.
The task is to localize the conditions for applying tariffs (Fare Rules), which are formed automatically in the booking system from various sources. These rules are always formed in English. Manual translation is almost impossible here, due to the fact that there is a lot of information and it often changes. An air ticket agent would like to read the Fare Rules in Russian in order to promptly and expertly advise its customers.
An understandable translation is required that conveys the meaning of tariff rules, taking into account typical terms and abbreviations. And it is required that the automatic transfer be integrated directly into the Amadeus reservation system.
→ The task and implementation of the project are described in detail in the document .
Let's try to compare the translation made through the PROMT Cloud API, integrated into the Amadeus Fare Rules Translator, and the “neural” translation from Google.
Original: ROUND TRIP INSTANT PURCHASE FARES
PROMT (Analytical Approach): INSTANT PURCHASE RATES THERE AND BACK
GNMT: ROUND PURCHASES
It is obvious that the neural translator can not cope here, and a little further it will become clear why.
TripAdvisor is one of the largest travel services in the world that needs no introduction. According to an article published by The Telegraph, 165,600 new reviews about various tourist sites in different languages appear on the website daily.
The task is to translate the reviews of tourists from English into Russian with a translation quality sufficient to understand the meaning of this review. The main difficulty: typical features of user generated content (texts with errors, typos, omissions).
Also part of the task was to automatically evaluate the quality of the translation before posting on TripAdvisor. Since manual evaluation of all translated content is not possible, a machine translation solution should provide an automatic mechanism for assessing the quality of translated texts - a confidence score, to enable TripAdvisor to publish translated reviews of only high quality.
→ Read more about the project on the company's website .
The PROMT DeepHybrid technology was used for the solution, which allows to obtain a translation that is more qualitative and understandable to the end reader, including through statistical post-editing of the translation results.
Let's look at some examples:
Original: We ate there last night on a whim and it was a lovely meal. The service was attentive without being over bearing.
PROMT (Hybrid Translation): We ate there last night by chance, and it was great food. The staff was attentive, but not overbearing.
GNMT: We ate there last night on a whim, and it was great food. The service was attentive without being over bearings.
Here, everything is not so depressing in terms of quality, as in the previous example. In general, in terms of its parameters, this problem can potentially be solved using neural networks, and this can still improve the quality of translation.
As mentioned earlier, a “universal” translator does not always provide acceptable quality and cannot support specific terminology. To integrate into your processes and use neural networks for translation, you need to fulfill the basic requirements:
• The availability of sufficient volumes of parallel texts in order to be able to train a neural network. Often, the customer simply has few or no texts on this topic in nature. They may be classified or in a condition not very suitable for automatic processing.
To create a model, you need a base that contains at least 100 million tokens (word usage), and to get a translation of a more or less acceptable quality - 500 million tokens. Not every company has such a volume of materials.
• The presence of a mechanism or algorithms for automatically evaluating the quality of the result.
• Sufficient computing power.
A “universal” neural translator is most often not suitable for quality, and to deploy your own private neural network that can provide acceptable quality and speed of work, you need a “small cloud”.
• It is not clear what to do with privacy.
Not every customer is ready to submit their content for transfer to the cloud for security reasons, and NMT is a cloud-first story.
• In general, neural automatic translation produces a higher quality result than a “purely" statistical approach;
• Automatic translation through a neural network - better suited for solving the problem of “universal translation”;
• None of the approaches to MP in itself is an ideal universal tool for solving any translation task;
• To solve translation tasks in business, only specialized solutions can guarantee compliance with all requirements.
We come to the absolutely obvious and logical decision that for your translation tasks you need to use the translator that is most suitable for this. It doesn’t matter if there is a neural network inside or not. Understanding the task itself is more important.
An article based on a speech at the RIF + CIB 2017 conference.
Neural Machine Translation: why only now?
They have been talking about neural networks for a long time, and it would seem that one of the classic tasks of artificial intelligence - machine translation - just begs to be solved on the basis of this technology.
Nevertheless, here is the dynamics of popularity in the search for queries about neural networks in general and about neural machine translation in particular:
It is clearly seen that until recently there was nothing about neural machine translation on radars - and here at the end of 2016 there are new technologies and systems Machine translation, built on the basis of neural networks, was demonstrated by several companies at once, including Google, Microsoft and SYSTRAN. They appeared almost simultaneously, with a difference of several weeks or even days. Why is that?
In order to answer this question, it is necessary to understand what machine translation based on neural networks is and what is its key difference from the classical statistical systems or analytical systems that are used today for machine translation.
The neural translator is based on the Bidirectional Recurrent Neural Networks (Bidirectional Recurrent Neural Networks) mechanism, built on matrix calculations, which allows you to build significantly more complex probabilistic models than statistical machine translators.
Like statistical translation, neural translation requires parallel cases for learning, which allow you to compare automatic translation with the reference “human” one, only in the learning process it operates not with separate phrases and phrases, but with whole sentences. The main problem is that for training such a system requires significantly more computing power.
To speed up the process, developers use NVIDIA GPUs, and Google also uses the Tensor Processing Unit (TPU), a proprietary chip adapted specifically for machine learning technologies. Graphics chips are initially optimized for matrix computing algorithms, and therefore the performance gain is 7-15 times compared to the CPU.
Even with all this, training a single neural model requires from 1 to 3 weeks, while a statistical model of approximately the same size is configured in 1-3 days, and with increasing size, this difference increases.
However, not only technological problems were a brake on the development of neural networks in the context of the problem of machine translation. In the end, it was possible to teach language models earlier, albeit more slowly, but there were no fundamental obstacles.
The mod on neural networks also played a role. There were many developments within themselves, but they were not in a hurry to announce this, fearing that they might not get the quality gain that society expects from the Neural Networks phrase. This can explain the fact that several neural translators were announced one after another.
Translation quality: Whose BLEU score is thicker?
Let’s try to understand whether the increase in the quality of translation corresponds to the accumulated expectations and the increase in costs that accompany the development and support of neural networks for translation.
Google in its research demonstrates that neural machine translation provides a Relative Improvement from 58% to 87%, depending on the language pair, compared to the classical statistical approach (or Phrase Based Machine Translation, PBMT, as it is also called).
SYSTRAN conducts a study in which the quality of a translation is evaluated by choosing from several presented options made by various systems, as well as a “human” translation. And he claims that his neural translation is preferred in 46% of cases to the translation made by man.
Translation quality: is there a breakthrough?
Despite the fact that Google claims an improvement of 60% or even higher, there is a slight catch in this indicator. Representatives of the company talk about “Relative Improvement”, that is, how much they succeeded with the neural approach to approach the quality of Human Translation in relation to what was in the classical statistical translator.
Industry experts analyzing the results presented by Google in the article “Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation” are rather skeptical of the presented results and say that in fact the BLEU score was only 10% better, and significant progress It is noticeable on fairly simple tests from Wikipedia, which, most likely, were used in the process of training the network.
Inside PROMT, we regularly compare translations on various texts of our systems with competitors, and therefore there are always examples at hand where we can check whether neural translation really surpasses the previous generation, as the manufacturers claim.
Source text: Worrying never did anyone any good.
Google PBMT Translation: Without worrying, did no good to anyone.
Google NMT Translation: Anxiety has never helped anyone.
By the way, the translation of the same phrase on Translate.Ru: “Excitement never brought anyone any benefit”, you can see that it was and remains the same without the use of neural networks.
Microsoft Translator is not far behind in this matter either. Unlike colleagues from Google, they even made a website on which you can translate and compare two results: neural and donor, to make sure that claims about growth in quality are not unfounded.
In this example, we see that there is progress, and it is really noticeable. At first glance, it seems that the statement of the developers that machine translation almost caught up with the “human” one is true. But is this really so, and what does it mean in terms of the practical application of technology for business?
In general, translation using neural networks is superior to statistical translation, and this technology has enormous potential for development. But if you carefully approach the issue, then we can make sure that progress is not in everything, and not for all tasks, you can use neural networks without looking at the task itself.
Machine translation: what are the tasks
From an automatic translator the whole history of its existence - and this is more than 60 years old! - they were waiting for some magic, imagining it as a machine from science fiction films, which instantly translates any speech into an alien whistle and back.
In fact, tasks are of different levels, one of which implies a “universal” or, so to speak, “everyday” translation for everyday tasks and to facilitate understanding. The tasks of this level are perfectly handled by online translation services and many mobile products.
These tasks include:
• quick translation of words and short texts for various purposes;
• automatic translation in the process of communication on forums, in social networks, instant messengers;
• automatic translation when reading news, Wikipedia articles;
• travel translator (mobile).
All those examples of the increase in the quality of translation using neural networks, which we examined above, just relate to these tasks.
However, with the goals and objectives of the business in relation to machine translation, everything is somewhat different. For example, here are some requirements that apply to corporate machine translation systems:
• translation of business correspondence with customers, partners, investors, foreign employees;
• localization of sites, online stores, product descriptions, instructions;
• translation of user content (reviews, forums, blogs);
• the ability to integrate translation into business processes and software products and services;
• accuracy of translation in compliance with terminology, confidentiality and security.
Let’s try to understand with examples whether any tasks of a translation business using neural networks can be solved and how.
Case: Amadeus
Amadeus is one of the world's largest global airline distribution systems. On the one hand, air carriers are connected to it, on the other, agencies that must receive all information about changes in real time and communicate to their customers.
The task is to localize the conditions for applying tariffs (Fare Rules), which are formed automatically in the booking system from various sources. These rules are always formed in English. Manual translation is almost impossible here, due to the fact that there is a lot of information and it often changes. An air ticket agent would like to read the Fare Rules in Russian in order to promptly and expertly advise its customers.
An understandable translation is required that conveys the meaning of tariff rules, taking into account typical terms and abbreviations. And it is required that the automatic transfer be integrated directly into the Amadeus reservation system.
→ The task and implementation of the project are described in detail in the document .
Let's try to compare the translation made through the PROMT Cloud API, integrated into the Amadeus Fare Rules Translator, and the “neural” translation from Google.
Original: ROUND TRIP INSTANT PURCHASE FARES
PROMT (Analytical Approach): INSTANT PURCHASE RATES THERE AND BACK
GNMT: ROUND PURCHASES
It is obvious that the neural translator can not cope here, and a little further it will become clear why.
Case: TripAdvisor
TripAdvisor is one of the largest travel services in the world that needs no introduction. According to an article published by The Telegraph, 165,600 new reviews about various tourist sites in different languages appear on the website daily.
The task is to translate the reviews of tourists from English into Russian with a translation quality sufficient to understand the meaning of this review. The main difficulty: typical features of user generated content (texts with errors, typos, omissions).
Also part of the task was to automatically evaluate the quality of the translation before posting on TripAdvisor. Since manual evaluation of all translated content is not possible, a machine translation solution should provide an automatic mechanism for assessing the quality of translated texts - a confidence score, to enable TripAdvisor to publish translated reviews of only high quality.
→ Read more about the project on the company's website .
The PROMT DeepHybrid technology was used for the solution, which allows to obtain a translation that is more qualitative and understandable to the end reader, including through statistical post-editing of the translation results.
Let's look at some examples:
Original: We ate there last night on a whim and it was a lovely meal. The service was attentive without being over bearing.
PROMT (Hybrid Translation): We ate there last night by chance, and it was great food. The staff was attentive, but not overbearing.
GNMT: We ate there last night on a whim, and it was great food. The service was attentive without being over bearings.
Here, everything is not so depressing in terms of quality, as in the previous example. In general, in terms of its parameters, this problem can potentially be solved using neural networks, and this can still improve the quality of translation.
Problems Using NMT for Business
As mentioned earlier, a “universal” translator does not always provide acceptable quality and cannot support specific terminology. To integrate into your processes and use neural networks for translation, you need to fulfill the basic requirements:
• The availability of sufficient volumes of parallel texts in order to be able to train a neural network. Often, the customer simply has few or no texts on this topic in nature. They may be classified or in a condition not very suitable for automatic processing.
To create a model, you need a base that contains at least 100 million tokens (word usage), and to get a translation of a more or less acceptable quality - 500 million tokens. Not every company has such a volume of materials.
• The presence of a mechanism or algorithms for automatically evaluating the quality of the result.
• Sufficient computing power.
A “universal” neural translator is most often not suitable for quality, and to deploy your own private neural network that can provide acceptable quality and speed of work, you need a “small cloud”.
• It is not clear what to do with privacy.
Not every customer is ready to submit their content for transfer to the cloud for security reasons, and NMT is a cloud-first story.
conclusions
• In general, neural automatic translation produces a higher quality result than a “purely" statistical approach;
• Automatic translation through a neural network - better suited for solving the problem of “universal translation”;
• None of the approaches to MP in itself is an ideal universal tool for solving any translation task;
• To solve translation tasks in business, only specialized solutions can guarantee compliance with all requirements.
We come to the absolutely obvious and logical decision that for your translation tasks you need to use the translator that is most suitable for this. It doesn’t matter if there is a neural network inside or not. Understanding the task itself is more important.
Related Links
https://research.google.com/pubs/pub45610.html
https://arxiv.org/abs/1609.08144
https://research.googleblog.com/2016/09/a-neural-network-for-machine. html
https://slator.com/technology/nearly-indistinguishable-from-human-translation-google-claims-breakthrough/
https://slator.com/technology/hyperbolic-experts-weigh-in-on-google-neural -translate /
https://translator.microsoft.com/neural/
http://blog.systransoft.com/how-does-neural-machine-translation-work/
http://kv-emptypages.blogspot.ru/2016 /09/a-deep-dive-into-systrans-neural.html
https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-with-gpus/
https: //kv-emptypages.blogspot. com / 2010/03 / need-for-automated-quality-measurement.html
http://kv-emptypages.blogspot.ru/2017/04/the-problem-with-bleu-and-neural.html
https://slator.com/technology/alibaba-launches-language-services-unit/
https://arxiv.org/abs/1609.08144
https://research.googleblog.com/2016/09/a-neural-network-for-machine. html
https://slator.com/technology/nearly-indistinguishable-from-human-translation-google-claims-breakthrough/
https://slator.com/technology/hyperbolic-experts-weigh-in-on-google-neural -translate /
https://translator.microsoft.com/neural/
http://blog.systransoft.com/how-does-neural-machine-translation-work/
http://kv-emptypages.blogspot.ru/2016 /09/a-deep-dive-into-systrans-neural.html
https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-with-gpus/
https: //kv-emptypages.blogspot. com / 2010/03 / need-for-automated-quality-measurement.html
http://kv-emptypages.blogspot.ru/2017/04/the-problem-with-bleu-and-neural.html
https://slator.com/technology/alibaba-launches-language-services-unit/