
Innovative technologies in localization


In the run-up to the Loc Kit localization conference, we continue our conversations with the speakers. This time we put several questions to two representatives of ABBYY Language Services: Nadezhda Batyukova, head of the multilingual localization department, and Anton Voronov, director of innovation.
- Tell us about your work. What does it consist of?
Nadezhda Batyukova: We localize software, websites, computer games, and marketing materials into various languages, compile multilingual glossaries, and conduct linguistic testing. The department's managers work with customers, select the linguists for each project, manage the multi-stage localization process, and handle quality control.
- How is the localization process technologically built: what technologies and software do you work with?
Nadezhda: Localization is an interesting and complex task. To streamline the work, we use various software products developed in-house. Lingvo.Pro helps us manage terminology and store all linguistic resources for our projects centrally. Many of our clients also use this solution to organize terminology work within their own companies.
Since we work with different file formats, we constantly need to convert materials into a form that CAT tools can handle. Previously, our engineers wrote special utilities for each project. These utilities extracted the text for translation and word counts, preserved code and tags, and automated the reassembly of the file in its original format. In other words, they spared managers from inserting translations into many language versions by hand, line by line. We also used in-house tools for automatic checks of terminology consistency and translation formatting (extra spaces, mismatched punctuation or letter case between the source and target texts, etc.). Now the basis of our automation is SmartCAT, which is gradually replacing many of the separate utilities.
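As an illustration of the kind of per-project utility described above, here is a minimal Python sketch of tag protection: markup is masked before translation, words are counted on the masked text, and the tags are restored afterwards. This is not ABBYY's actual code; the tag pattern, the `[[n]]` token format, and all function names are assumptions made for the example.

```python
import re

# Assumed tag pattern: HTML-like tags and {name}-style placeholders.
TAG_RE = re.compile(r"<[^<>]+>|\{\w+\}")
# Assumed token format that stands in for a tag during translation.
TOKEN_RE = re.compile(r"\[\[(\d+)\]\]")


def protect_tags(text):
    """Replace each tag with a numbered token; return (masked_text, tags)."""
    tags = []

    def _mask(match):
        tags.append(match.group(0))
        return f"[[{len(tags) - 1}]]"

    return TAG_RE.sub(_mask, text), tags


def count_words(masked):
    """Count translatable words, ignoring the numbered tokens."""
    return len(TOKEN_RE.sub(" ", masked).split())


def restore_tags(translated, tags):
    """Put the original tags back in place of the numbered tokens."""
    return TOKEN_RE.sub(lambda m: tags[int(m.group(1))], translated)


source = "Click <b>Save</b> to store {count} files."
masked, tags = protect_tags(source)
translated = "Cliquez sur [[0]]Enregistrer[[1]] pour stocker [[2]] fichiers."
result = restore_tags(translated, tags)
```

Because the translator only ever sees the numbered tokens, the markup cannot be damaged during translation, and the same token list can be reused to rebuild every target-language file.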
Anton Voronov: Localization is a multi-stage process, and the link between all the stages is especially important. SmartCAT now handles most of the translation processes in the company. It is gaining additional connectors and learning to work with different software. Connectors, in turn, make it possible to integrate the localization process with code repositories. As we know, when developers create software, they write code and put it into a repository, for example on GitHub or in SVN. Thanks to the connectors, translation can begin as soon as the code lands in a given repository and is ready for localization. There is no need to send anything by email: we take the source files from the repository, translate them, and put the finished text back into the same repository. The developers can then build a localized version of the program right away.
- What languages do you most often work with?
Nadezhda: Our portfolio includes more than 80 language pairs. We work with translators around the world, and localization is often done without Russian being involved at all (i.e. from one foreign language into another). Localization from English into the main European languages, the so-called FIGS (French, Italian, German, Spanish), is a basic set that almost all of our customers request. On average, a project involves eight to ten language pairs: FIGS, Polish, Czech, Turkish, and the popular Asian languages: Japanese, Chinese, and Korean. And this is not the limit: for Hamstersoft, for example, we regularly localize software and website materials into 40 languages, and a project for another client holds the absolute record for the number of languages: we localized it into 100 different languages.
- What interesting examples and features of localization into separate languages have you met?
Nadezhda: Curious cases usually arise when the source and localized texts differ in writing direction (right to left instead of left to right), in punctuation rules, or in the connotations of words (including software product names) that the client is not aware of. For example, we once had to persuade a customer not to delete the unusual punctuation marks from a Spanish file, where, according to the rules of the language, inverted question and exclamation marks are placed at the beginning of the sentence.
Also, during localization it is sometimes necessary to remove certain culture-specific references from the text in order to adapt it to the target audience. For example, when working on a version of a catalog intended for Muslim countries, we removed everything related to alcohol and underwear, in agreement with the customer. In another case, when adapting an American quiz for the Russian market, a number of topics were rewritten. In particular, baseball, which is not very popular in Russia, was replaced by figure skating.
- How is the localization quality control process built?
Nadezhda: Before starting a project, we draw up a multilingual glossary: we extract the terminology from the texts, translate it into the required languages, and have the customer approve it. If the client has their own glossary, we take it as the basis. Lingvo.Pro helps us keep the terminology up to date, especially since the customer can also work directly in the solution's interface.
We have a three-step process: after the translator, the text goes to an editor and then to a proofreader. All of our linguists are native speakers of the target language who live in the respective countries. For each language we have a dedicated team of professionals, with a separate team for each subject area. From experience I can say that no single person works equally well across different languages and subject areas, for example handling Turkish, Japanese, and Serbian in technical, legal, and medical texts.
The company also has a quality control department, whose task is to train linguists, regularly monitor the quality of completed orders, and carry out expert assessments in any language.
And finally, after all stages of work on the text, we run final automated checks using our own tools: we check that everything is correct, that no stray comma has slipped in, and that the terms match the glossary.
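The final automated checks mentioned above (extra spaces, stray commas, punctuation and letter-case mismatches, glossary compliance) could be sketched roughly as follows. This is only an illustrative Python example, not the company's own tooling: the function name, the issue labels, and the glossary data are all hypothetical, and real checks would need per-language rules (for instance, for languages that use different sentence-final punctuation).

```python
import re

END_MARKS = ".!?:"


def first_letter(text):
    """Return the first alphabetic character of text, or '' if there is none."""
    return next((c for c in text if c.isalpha()), "")


def check_segment(source, target, glossary):
    """Return a list of issue labels for one source/target pair.

    glossary maps a source term to its approved target term
    (hypothetical data; matching is a simple case-insensitive substring test).
    """
    issues = []
    if target != target.strip() or "  " in target:
        issues.append("extra spaces")
    if re.search(r",\s*,", target):
        issues.append("extra comma")
    src_end = source.rstrip()[-1:]
    tgt_end = target.rstrip()[-1:]
    if src_end and src_end in END_MARKS and src_end != tgt_end:
        issues.append("end punctuation differs")
    s, t = first_letter(source), first_letter(target)
    # Only flag a case mismatch when both letters actually have case.
    if (s.isupper() and t.islower()) or (s.islower() and t.isupper()):
        issues.append("initial case differs")
    for term, approved in glossary.items():
        if term.lower() in source.lower() and approved.lower() not in target.lower():
            issues.append(f"glossary: '{term}' should be '{approved}'")
    return issues


glossary = {"file": "fichier"}
bad = check_segment("Save the file.", "Enregistrez le  document,, s'il vous plaît", glossary)
good = check_segment("Save the file.", "Enregistrez le fichier.", glossary)
```

Here `bad` lists four issues (extra spaces, an extra comma, mismatched end punctuation, and a glossary violation), while `good` is an empty list.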
Anton: Since quality control is painstaking but necessary work, it is very important to automate it as much as possible. That is why we pay special attention to this area in SmartCAT: automatic checks are now in great demand. We are also working to let users view text in context and test how the translation will look in the interface. To that end, we are developing special visualizers for various types of content.
- You mentioned linguistic testing. What kind of service is this and is it related to quality control?
Nadezhda: Linguistic testing is the final stage of localization and an important part of the quality control process. The localized file is built into the actual product, and an expert evaluates it in context, going through it screen by screen according to an approved plan. During testing, the specialist records all defects in a special form (a bug report) and proposes corrections, which are agreed with the customer and then entered into the working files. Linguistic testing is billed by the hour. Rates vary by language and depend on the cost of the linguists' work.
Unfortunately, some localization projects are needed "yesterday", so customers insist on skipping the testing phase. But practice shows that testing saves a great deal of time that would otherwise be spent on additional edits and approvals when, at final build, it turns out that the text does not fit the available space, and so on.
- How do you see the localization process in 5 years?
Anton: Today there are several important trends in the industry. I think that in five years they will have become a familiar part of the localization process.
First of all, there is integration: the multi-stage localization process will be built on unified solutions, the role of manual labor will shrink, the success of projects will no longer depend on communication between the participants, and content will be transferred directly through connectors.
The role of professional machine translation will grow. Many customers are currently wary of MT with post-editing because they think this technology cannot deliver a high-quality result. Nevertheless, the future lies with professional machine translation: with access to linguistic resources such as translation memories and glossaries, with engine tuning, post-editing, and other processing stages.
The number of localization languages will also increase. Previously, 5-10 languages were enough to be understood by 90% of the online audience. Now 18-20 languages are needed for that: more and more countries are going online, mobile technologies are developing, and globalization continues.
The role of collective work on localization projects will definitely increase.
- And what, in your opinion, will become of the project manager in 5 years in such a world of automation?
Nadezhda: A good project manager in 5 years should already be the head of a department, at least :)
There are cars with automatic transmission, but that does not mean they can drive without a driver. It is the same with localization: we have SmartCAT, which makes it more convenient for the project manager to control the process. I am sure that the introduction of new technologies will not lead to massive staff cuts in localization departments. Rather, the work will become more efficient, and there will be more time to improve the process.
Anton: I think localization project managers will become even more involved in the process and more technologically competent. In general, they already understand the technological aspects better than managers in other translation departments, since they have engineering skills as well as linguistic ones. Knowledge of professional translation software will lead to a different level of competition in the labor market. Working in a technology company is one thing; doing everything the old-fashioned way, with makeshift means, is quite another.
- How do you feel about the use of crowdsourcing?
Anton: We view crowdsourcing positively. In localization, for instance, with the right organization you can bring into the translation work people who are active users of the product, its fans, who know it almost better than the developers do. What is more, they genuinely want to benefit the product, which is a different level of motivation. One of the obvious advantages is that such translators are already immersed in the context and familiar with the subject.
In professional scenarios, we actively use teamwork and build this capability into our products, thereby laying the groundwork for crowdsourcing. For example, SmartCAT is the foundation of "Translate Coursera", a project in which volunteers translate lectures on the Coursera educational platform. The results are impressive: these people translated the first million words in just 3 months. At the same time, we understand that crowdsourcing raises many serious questions: how to motivate a large number of contributors, how to assess their competence, how to manage the community, and how to develop the technology. We recently discussed this in detail in a special column on Slon.ru.
- Why did you decide to develop your own technologies for translation automation, and what results have you achieved so far?
Anton: A variety of specialized technological solutions for automating translation are useful at individual stages of the process. But a real breakthrough begins when the entire localization process is automated end to end, from writing and extracting the content to the work of the translator. Moreover, the available solutions are often heavyweight, difficult to learn and adapt, and do not always keep up with rapidly evolving processes and tasks. Our company has always been distinguished by a high degree of automation in all its processes, so we made the difficult choice to develop a single solution for automating all aspects of the business. As a result, today we not only use these technologies within the company but also offer them to the market as products: ABBYY SmartCAT, Lingvo.Pro, Perevedem.ru, and others. It is too early to draw final conclusions, but the first results we have seen confirm that we chose the right path.