How much does Habr cost?

    Hello, Habr! The team at Smartcat and I decided to go a little crazy and try translating into English every post published here before July 19, 2017, and then estimate what that would cost on average with a human translator versus a machine. Read on to find out what came of it.



    Without further ado, I'll hand over to scalywhale from Smartcat.

    8,729,613 words


    Or 62,397,253 characters: that is how much text there is on the habrahabr.ru website.

    The most common practice among our clients (mainly translation companies) is this: first the text is translated, then an editor checks it, and then a proofreader corrects it. Let's keep only the translation stage, since the value of content depends directly on how fast it is delivered, and it is unlikely that every text on Habr goes through that many iterations.



    2,500 words per day is the average speed of a translator, which means that, working without weekends or holidays, one person would need 9 years and 6 months to translate all of Habr. By then the translated texts would have lost their relevance, new ones would have piled up, and the translator would most likely have gone crazy.

    Human translation


    An experienced native English translator charges on average $0.08 (₽4.80*) per word, which comes to $698,369 (₽41,874,973.45), not counting process management costs.
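    The total above is the word count multiplied by the per-word rate. A minimal sketch of that arithmetic in Python follows. The exchange rate of 59.9611 rubles per dollar is an assumption: it is the value that reproduces the paired dollar and ruble figures in the text, not a number quoted in the article. The function name `translation_cost` is hypothetical.

```python
TOTAL_WORDS = 8_729_613   # word count quoted in the article
USD_PER_WORD = 0.08       # average translator rate quoted in the article
RUB_PER_USD = 59.9611     # assumption: inferred from the article's own $/RUB pairs


def translation_cost(words: int, usd_per_word: float) -> float:
    """Return the cost in US dollars of translating `words` words."""
    return words * usd_per_word


cost_usd = translation_cost(TOTAL_WORDS, USD_PER_WORD)
# The ruble figure in the text matches a conversion of the rounded dollar amount.
cost_rub = round(cost_usd) * RUB_PER_USD

print(f"USD {cost_usd:,.0f}")
print(f"RUB {cost_rub:,.2f}")
```

    Process management costs are left out here, as they are in the text.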



    Let's try to go faster and have several translators take on the project at once: in Smartcat a whole team can work on one project, and even on one document, at the same time. We assemble a team of 50 people with a combined productivity of 125 thousand words per day. The translation itself will then take 70 days, and the cost will stay the same. Add two weeks for finding and testing suitable candidates, and that is the bare minimum.
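    The schedule figures for one translator and for a team of 50 follow from simple division. A minimal sketch, assuming a constant 2,500 words per translator per day with no days off and perfectly even splitting of the work (the function name `days_needed` is hypothetical):

```python
import math

TOTAL_WORDS = 8_729_613   # word count quoted in the article
WORDS_PER_DAY = 2_500     # average daily output of one translator
RECRUITING_DAYS = 14      # two weeks to find and test candidates


def days_needed(words: int, translators: int = 1,
                words_per_day: int = WORDS_PER_DAY) -> int:
    """Whole working days needed, assuming the work splits evenly."""
    return math.ceil(words / (translators * words_per_day))


solo_days = days_needed(TOTAL_WORDS)
team_days = days_needed(TOTAL_WORDS, translators=50)

# Rough calendar breakdown for the solo case (365-day years, 30-day months).
solo_years = solo_days // 365
solo_months = (solo_days % 365) // 30

print(f"One translator: {solo_days} days, about {solo_years} years {solo_months} months")
print(f"Team of 50: {team_days} days, {team_days + RECRUITING_DAYS} with recruiting")
```

    In practice the split is never perfectly even, so the team figure is a lower bound.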



    SMT vs NMT


    So, let's try to finish the task even faster by using machine translation. The technology, which most Internet users consider good only for quick, literal translation, has recently started to translate so well that the translation industry has begun to take it seriously. Just recall the news at the end of 2016: first came the announcement that Microsoft's translator now runs on a neural network, and then The New York Times published an article saying that Google Translate had learned to translate texts almost like a human.

    Previously, machine translation engines relied on rule-based algorithms and on statistical models derived from large volumes of bilingual text; the latter approach is called Statistical Machine Translation (SMT). The new technology, Neural Machine Translation (NMT), uses an artificial neural network that learns deep relationships in languages at the level of whole sentences rather than individual phrases, and on that basis produces more accurate and more readable translations.

    Machine translation


    So machine translation enters the picture. For illustration, we take this 842-word article from Habr. A human translator will handle it in about three hours and charge $67.4 (₽4,041.38). Let's make the job easier and save money at the same time: we entrust the translation to the machine and the editing to a person.

    This translation method is called post-editing and requires special skills. A post-editor must not only know the language but also understand how machine translation works.

    So, we enable machine translation and look for a post-editor through our Smartcat website. We upload the Word document with the text of the article and tick the option for pre-translation via Microsoft Translator. On the site you can not only translate but also find freelance translators from around the world, including post-editors who are native English speakers.



    Post-editing is cheaper than translation. We found a freelancer who charges $0.022 (₽1.32) per word, so the whole text cost $18.5 (₽1,109.28). According to the post-editor herself, she finished the job in 2 hours, faster than if she had translated it from scratch. As the machine translation engine we used the paid version of Microsoft Translator, which is supposed to translate better. We recalculated the figures at the S1 pricing tier.



    As a result, translating this way is about 75% cheaper and one third faster. It turns out that with machine translation and 50 post-editors, all of Habr can be translated in 48 days for $192,675 (₽11,553,004.94).
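    The total for the whole site can be rebuilt from the per-word post-editing rate plus a per-character machine translation fee. The sketch below is only an illustration. The fee of $10 per million characters is an assumption, chosen because it reproduces the $192,675 total; the article does not state the S1 price. The function name `post_editing_cost` is hypothetical.

```python
TOTAL_WORDS = 8_729_613          # word count quoted in the article
TOTAL_CHARS = 62_397_253         # character count quoted in the article
HUMAN_USD_PER_WORD = 0.08        # human translation rate from the article
POST_EDIT_USD_PER_WORD = 0.022   # post-editing rate from the article
# Assumption: machine translation billed at $10 per million characters.
MT_USD_PER_MILLION_CHARS = 10.0


def post_editing_cost(words: int, chars: int) -> float:
    """Cost in USD of machine translation followed by human post-editing."""
    editing = words * POST_EDIT_USD_PER_WORD
    machine = chars / 1_000_000 * MT_USD_PER_MILLION_CHARS
    return editing + machine


human_cost = TOTAL_WORDS * HUMAN_USD_PER_WORD
hybrid_cost = post_editing_cost(TOTAL_WORDS, TOTAL_CHARS)
saving = 1 - hybrid_cost / human_cost

print(f"Human only:        USD {human_cost:,.0f}")
print(f"MT + post-editing: USD {hybrid_cost:,.0f}")
print(f"Saving:            {saving:.0%}")
```

    Under these assumptions the saving comes out nearer 72%, in the same ballpark as the rounded 75% quoted in the text.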

    Opinions


    Is machine translation unambiguously effective in professional use? We have collected several opinions from our customers.

    Alexei Dyagterev, head of the B2B-Center electronic trading platform, says that they are trying to attract foreign companies to the platform. Previously, only the texts of the most significant procedures, about 10% of all lots, were translated into English manually. Now, thanks to machine translation, the titles and descriptions of all 5,000 lots published on the platform every day are available to an international audience. The quality of the translation is acceptable: it is enough to get the gist and then clarify the details.

    “Thanks to machine translation and integration with the Smartcat system, routine operations are now automated, and qualified employees are used much more efficiently.”

    Fedor Bezrukov, a department head at Logrus IT, one of the largest Russian translation companies, says the new technology has its merits, but it's not that simple.

    “Recently we received an urgent order to translate a 900-word technical text from Russian into English. We connected three machine translation engines at once: statistical (SMT) and neural (NMT) from Microsoft, and statistical from Google. To check style and grammar we also used the Grammarly plugin. Microsoft NMT and Google SMT produced the best translations. The translation was ready in 1 hour 40 minutes, with a translator overseeing the process. That works out to a productivity of ≈500 words per hour.”
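    For comparison, the throughput figures mentioned in this article can be put on one scale. This is a rough sketch that uses only the word counts and times quoted in the text; the helper name `words_per_hour` is hypothetical.

```python
def words_per_hour(words: int, hours: int, minutes: int = 0) -> float:
    """Average throughput in words per hour."""
    return words / (hours + minutes / 60)


# 842-word article translated by a human in about three hours.
human_rate = words_per_hour(842, 3)
# The same article post-edited after machine translation in two hours.
post_edit_rate = words_per_hour(842, 2)
# Logrus IT order: 900 words in 1 hour 40 minutes with several engines.
multi_engine_rate = words_per_hour(900, 1, 40)

for label, rate in [("Human translation", human_rate),
                    ("MT + post-editing", post_edit_rate),
                    ("Several engines + translator", multi_engine_rate)]:
    print(f"{label}: {rate:.0f} words/hour")
```

    These are single data points from different texts and people, so they indicate a trend rather than a benchmark.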

    According to Fedor, the difference between statistical and neural machine translation is that neural engines produce much more coherent text, but there is a danger: the result may be fluent nonsense.

    “At this stage, we prefer to use the output of several engines in order to combine the advantages of each and offset the disadvantages. When NMT engines can be trained and taught terminology on the fly, the process will reach a whole new level.”

    Recently we were contacted by colleagues from Weebly, who had decided to localize their product into 13 languages. It was immediately clear to us that the project was not only large but also complex: the site's text content is spread across the system and stored in different formats, and it is constantly changing and being updated. An elegant solution was found: by integrating with the Weebly website through the API, texts were processed, translated and sent back to the website without any extra effort. To speed up the work, a team of 5-10 people worked on each language, which adds up to a good hundred translators. We used machine translation actively, both to translate faster and to check how texts in different languages fit the layout.

    “The Smartcat team supported us at every stage. Whenever questions arose or new tasks suddenly appeared, we could count on the team to help or share their experience. Thanks to Smartcat, we were able to localize the Weebly website into 13 languages efficiently and quickly, managing the process effectively at each stage: from finding translators and distributing tasks to data management and integrating automated solutions into the project.” Nicholas Olucha Sanchez, Weebly Localization Manager.

    “The Weebly project was not easy, and that made it interesting. At Smartcat, we are developing the translation community and creating smart technologies that let companies scale their businesses easily. With us you can easily find a contractor or assemble an entire team, combine machine translation engines, and connect glossaries and translation memory, and if there is a lot of work, we will do everything for you. We love difficult tasks, so if you have one, write to us!” Sergey Andreev, Product Manager at Smartcat.

    * Throughout the article, $ is converted to ₽ at the Central Bank of the Russian Federation rate for August 10, 2017. Data from the site.

    About the author


    Pavel Doronin loves localization, translation and everything related to them, and works on building the best tools for the job. He also loves electronic music and synthesizers (after work). #i18n #l10n #xl8n
