20 projects, 20 languages, deadline yesterday. Part 3
This is the final article on the Serge + Smartcat integration. In it, I'll explain how we scaled Serge to the whole company, walk through 4 non-standard integrations and, as a bonus, describe 2 features that can make your life easier.
Previous articles:
20 projects, 20 languages, deadline yesterday
20 projects, 20 languages, deadline yesterday. Part 2
Scalability
In the previous article, I described how to configure Serge for a single repository. Our company has several dozen repositories that need translation, so we allocated a dedicated localization server. Its file structure and environment are identical to those described in the previous article. Each repository uses its own Serge instance. To avoid running commands manually, each instance has a cron job that runs the Serge commands in sequence: pulling new strings from the repository, pulling new translations, parsing, pushing new strings to Smartcat, and pushing new translations to GitLab.
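As a rough sketch of such a cron job (not our exact script), the wrapper below runs the five Serge steps in order and stops at the first failure. The run_cycle function, the SERGE_BIN variable, and the paths in the crontab comment are names invented for this illustration; Serge also has a sync command that performs a full cycle in one call.

```shell
#!/bin/bash
# Sketch of a per-repository cron wrapper. SERGE_BIN and run_cycle are
# illustrative names, not part of Serge itself.
SERGE_BIN="${SERGE_BIN:-serge}"

# Runs the localization steps in order for one config file and
# stops at the first step that fails:
#   pull     - get new strings from the repository
#   pull-ts  - get new translations from the translation service
#   localize - parse resource files and update translation files
#   push-ts  - send new strings to the translation service
#   push     - send new translations to the repository
run_cycle() {
    local config="$1" step
    for step in pull pull-ts localize push-ts push; do
        if ! "$SERGE_BIN" "$step" "$config"; then
            echo "Serge step '$step' failed for $config" >&2
            return 1
        fi
    done
}

# Hypothetical crontab entry (the path is a placeholder):
# */10 * * * * /opt/localization/myproject/run.sh
# where run.sh sources this file and calls: run_cycle myproject.serge
```

Setting SERGE_BIN=echo turns the wrapper into a dry run that only prints the commands it would execute.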
Integration options
Two sets of languages in one repository
Let's start with the simplest case. Imagine that your repository has several sets of resource files. For example, client strings and application APIs are stored in the same repository, but in different directories. The client is translated into 20 languages, the API into 6.
Task: set up independent delivery of translations to each directory.
Solution:
- Set up 2 projects in Smartcat: one with 6 languages and one with 20.
- Set up 2 projects on the localization server.
- In the first project, add the following line to the project1.cfg file: our $unmerged_branch_mask = '^(translateAPI-)'; # process unmerged branches matching this mask. Here "translateAPI-" is the branch name prefix. It tells Serge that the branch needs translations in the API directory.
- In the project1.serge.tmpl file, set the source_dir parameter to the path to the resource files in the API directory.
- Similarly, in the second project, add the following line to the project2.cfg file: our $unmerged_branch_mask = '^(translateCLIENT-)'; # process unmerged branches matching this mask. Here "translateCLIENT-" is the prefix for this project's branches. It tells Serge that the branch needs translations in the CLIENT directory.
- In the project2.serge.tmpl file, set the source_dir parameter to the path to the resource files in the CLIENT directory.
Note that prefixes must be unique among all projects configured for one repository.
In total, we have 2 projects in Smartcat and 2 corresponding projects on the localization server. Both projects watch the same GitLab repository, but different directories in it. Serge uses the branch prefix to determine which strings to send for translation. Both projects use the same base-translate branch to compute the diff.
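To make the routing concrete, here is a small illustration of how the two prefixes split branches between the projects. It only mimics the effect of the masks and is not how Serge implements the check; the function name and the sample branch names are made up.

```shell
#!/bin/bash
# Illustration only: mimics how the two branch masks split branches
# between the projects named project1 and project2 above.
project_for_branch() {
    case "$1" in
        translateAPI-*)    echo "project1" ;;  # API directory, 6 languages
        translateCLIENT-*) echo "project2" ;;  # CLIENT directory, 20 languages
        *)                 echo "none" ;;      # not picked up by either project
    esac
}

project_for_branch "translateAPI-new-endpoint"
project_for_branch "translateCLIENT-login-form"
project_for_branch "feature/refactoring"
```

Because each mask is anchored at the start of the branch name, a branch can belong to at most one project, which is why the prefixes have to be unique.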
Swagger localization
In our company, all products are localized, including documentation. We are now introducing documentation auto-generated from Swagger, so we need to localize it as well.
Task: localize Swagger with minimal effort.
Solution: in the myproject.tmpl.serge file, add a data object to the parser object and list in it the fields whose values must be extracted and sent for translation:
parser {
    plugin parse_json
    data
    {
        path_matches \/(summary|description)$
    }
}
A similar task: translate only the legal texts from a file, not all of them. The other texts are provided by the marketing team. To avoid complicating the structure with a separate file for legal texts, the keys of all legal strings were given the prefix "legal":
parser {
    plugin parse_json
    data
    {
        path_matches ^\/legal\..*
    }
}
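Before committing such a pattern, it can help to try it on a few sample key paths. The sketch below does this with grep; it assumes that paths are slash-separated key chains, as the two patterns above suggest, and the sample paths are invented. Serge evaluates its own regular expressions, so grep is only an approximation that happens to agree for these simple patterns.

```shell
#!/bin/bash
# Prints only the paths (one per line on stdin) that match the given
# extended regular expression, similar in spirit to path_matches.
matching_paths() {
    grep -E -- "$1"
}

# Sample paths are made up for the demonstration.
printf '%s\n' \
    '/paths/users/get/summary' \
    '/paths/users/get/operationId' \
    '/info/description' |
    matching_paths '/(summary|description)$'
```

With these samples, only the paths ending in /summary and /description are printed; the operationId path is skipped.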
Subtleties of legal translations
Another interesting case. We have a legal document whose terms vary from country to country. Nevertheless, it is a single application, and the resource files live in the same directory.
Task: within a single project, translate several documents, each into one specific language.
What was done:
- A directory was created for each country, containing the English source file relevant to that country.
- The source_dir variable points to the shared directory with the resource files.
- The search for resource files in all subdirectories is enabled: source_process_subdirs YES
- A new plugin is added to the list of callback plugins. It sends each resource file to the required language, using the name of the directory that contains the file as a guide:
callback_plugins {
    :feature_branch {
        plugin feature_branch
        data {
            master_job job.base-translate
        }
    }
    :limit_languages
    {
        plugin limit_languages
        data
        {
            # all rules are processed top to bottom; each rule can add or remove languages
            # so the most priority rules are placed at the bottom
            if
            {
                # by default, don't localize
                file_matches .
                then
                {
                    exclude_all_languages YES
                }
            }
            if
            {
                file_matches de-au\/
                then
                {
                    include_languages de-AT
                }
            }
            if
            {
                file_matches li-LI\/
                then
                {
                    include_languages li
                }
            }
            if
            {
                file_matches pt\/
                then
                {
                    include_languages pt-BR
                }
            }
            if
            {
                file_matches zh-Hans\/
                then
                {
                    include_languages zh-Hans
                }
            }
            # and so on..
        }
    }
}
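The effect of the rule order can be illustrated with a few lines of shell. This is only a model of the configuration above, not the plugin's code: the first rule clears the language list for every file, and each later rule that matches the path adds its language. The function name and the sample paths are hypothetical.

```shell
#!/bin/bash
# Model of the rule list above: start with no languages (the catch-all
# "exclude_all_languages" rule), then let each matching rule add one.
# Prints the resulting language list, or an empty line if the file
# would not be localized at all.
languages_for_file() {
    local file="$1" langs=""
    case "$file" in *de-au/*)   langs="$langs de-AT" ;; esac
    case "$file" in *li-LI/*)   langs="$langs li" ;; esac
    case "$file" in *pt/*)      langs="$langs pt-BR" ;; esac
    case "$file" in *zh-Hans/*) langs="$langs zh-Hans" ;; esac
    echo "${langs# }"   # strip the leading space
}

languages_for_file "legal/de-au/terms.json"
languages_for_file "legal/en/terms.json"
```

A file in a directory that no rule mentions keeps the empty list from the first rule, so it is never sent for translation.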
Localization when strings are stored in a database
Part of our system stores translations in a database and, for a number of reasons, cannot be switched to resource files in the repository. However, we still need to deliver translations quickly and automatically.
Task: set up continuous localization when the strings are stored in a database rather than in the repository.
Solution:
- Create a repository, then collect all strings from the database in it and group them in whatever way is convenient (by number of target languages or by product).
- Create a project in Smartcat.
- Start the standard continuous localization cycle.
- Merge translation branches into the base-translate branch.
- From cron, check the hash of the last commit in base-translate. If the hash has changed, meaning new translations have arrived, parse the diff between the old and the current hash and send the new or changed strings to the database.
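The last step can be sketched roughly as follows. This is a simplified outline rather than our production script: the state file, the function names, and the way changed files are listed are assumptions, and the part that writes strings to the database is left out because it depends entirely on the schema.

```shell
#!/bin/bash
# Hypothetical state file that remembers the last processed commit hash.
STATE_FILE="${STATE_FILE:-./last_processed_hash}"

# Succeeds (exit code 0) when the given hash differs from the stored one.
hash_changed() {
    local current="$1" previous=""
    [ -f "$STATE_FILE" ] && previous="$(cat "$STATE_FILE")"
    [ "$current" != "$previous" ]
}

# Meant to be called from cron inside a clone of the repository.
# Lists the files changed since the stored hash, then stores the new hash.
sync_new_translations() {
    local current previous
    current="$(git rev-parse base-translate)" || return 1
    hash_changed "$current" || return 0   # nothing new, nothing to do
    if [ -f "$STATE_FILE" ]; then
        previous="$(cat "$STATE_FILE")"
        git diff --name-only "$previous" "$current"
    fi
    # Assumption: the changed files would be parsed here and the new or
    # changed strings written to the database.
    echo "$current" > "$STATE_FILE"
}
```

Storing the hash only after processing means that a failed run is retried on the next cron tick instead of silently skipping translations.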
Bonus Features
Alerts
Smartcat's built-in alerts did not suit us: each team wants to be notified only about its own branches, and only once the translations in all of the product's resource files are completely ready.
We decided to rely on the presence of all translations in the repository and, once they are complete, send a notification to the corporate messenger, in our case Google Chat.
Task: set up alerts for a repository that 8 teams commit to, and duplicate every alert in the technical documentation department's channel.
Solution:
- Agree with each team that branch names must contain the team name. Keep using the translate- prefix to mark branches that need translation.
- Create a pipeline that runs only for branches prefixed with translate-.
- In the pipeline, determine which team the branch belongs to, check for strings with an empty value and, if there are none, send a readiness notification to the appropriate channel. Since the code is rather long, I moved it into a script.
CI
check-translations:
  stage: check-translations
  image: node:8.14.0
  tags:
    - devops
  script:
    - chmod +x ./notification.sh
    - ./notification.sh
  only:
    - base-translate
    - /^translate.*$/
  when: always
Alert Script
#!/bin/bash
# Sends a "localization is ready" card to a Google Chat webhook.
# $1 - branch name, $2 - commit hash, $3 - webhook URL
hangouts(){
    curl -X POST --max-time 180 -H "Content-Type: application/json; charset=UTF-8" --data "{
\"cards\": [{\"header\": {\"title\": \"LOCALIZATION IS READY\",\"subtitle\": \"REPOSITORY NAME\",\"imageUrl\": \"https://avatanplus.com/files/resources/mid/5775880ee27f8155a31b7a50.png\"},\"sections\": [{\"widgets\": [{\"keyValue\": {\"topLabel\": \"Translation is finished in the branch\",\"content\": \"$1\"}}]},{\"widgets\": [{\"buttons\": [{\"textButton\": {\"text\": \"SEE COMMIT\",\"onClick\": {\"openLink\": {\"url\": \"https://gitlab.loc/common/publisher-client/commit/$2\"}}}}]}]}]}]}" "$3" || true
}
cd app/translations
# Team 1: collect files that still contain empty values; notify only if there are none
if echo "$CI_COMMIT_REF_NAME" | grep "commandname1";
then
    grep -rl '\:\s\"\"' *.json >> result.file
    if [ -s result.file ];
    then
        echo "Translations are not ready";
        cat result.file
    else
        hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_COMMAND_NAME_1"
        hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_DOC"
    fi
fi
# Team 2
if echo "$CI_COMMIT_REF_NAME" | grep "commandname2";
then
    grep -rl '\:\s\"\"' *.json >> result.file
    if [ -s result.file ];
    then
        echo "Translations are not ready";
        cat result.file
    else
        hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_COMMAND_NAME_2"
        hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_DOC"
    fi
fi
# ... (the same block is repeated for the remaining teams)
# Team 8
if echo "$CI_COMMIT_REF_NAME" | grep "commandname8";
then
    grep -rl '\:\s\"\"' *.json >> result.file
    if [ -s result.file ];
    then
        echo "Translations are not ready";
        cat result.file
    else
        hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_COMMAND_NAME_8"
        hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_DOC"
    fi
fi
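Since the per-team blocks differ only in the team name and the webhook variable, the two checks they share could be factored into helpers. The sketch below is one possible refactoring, not the script we run: the function names and the TEAMS list are hypothetical, and the grep pattern is slightly looser than the original because it allows any amount of whitespace after the colon.

```shell
#!/bin/bash
# Hypothetical list of team names; each is expected to appear in branch names.
TEAMS="commandname1 commandname2 commandname8"

# Prints the first team whose name occurs in the branch name.
# Returns a non-zero code if no team matches.
team_for_branch() {
    local branch="$1" team
    for team in $TEAMS; do
        case "$branch" in
            *"$team"*) echo "$team"; return 0 ;;
        esac
    done
    return 1
}

# Lists the JSON files under the given directory that still contain
# an empty string value, i.e. an untranslated string.
untranslated_files() {
    grep -rlE --include='*.json' ':[[:space:]]*""' "$1"
}
```

A pipeline script could then call team_for_branch "$CI_COMMIT_REF_NAME" once, pick the matching webhook variable, and send the notification only when untranslated_files prints nothing.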
Translator Assignments via Smartcat API
This is what our localization manager looks like when it comes time to assign all branches for translation.
On average, we have more than 10 branches in progress every day. In Smartcat, each language pair is a separate document, and translators must be assigned to each such document. Manually. Imagine: 40-60 assignments every day. To simplify this process, we implemented assignment through the API and put it in the pipeline as well. This job is started with a button. A reasonable question: why not assign translators automatically when strings are sent for translation, by calling the method from the Smartcat plugin rather than from the pipeline?
There are several reasons for this decision:
- Human factor. Although we build processes and try to follow them, unreviewed strings or strings without context regularly end up in Smartcat. Automatic assignment would then mean extra costs for us, since some strings would be sent for translation twice: before and after editing.
- Distribution of roles. Projects are configured and managed at the localization server level by the project's localization engineer or technical writer. Assignments and communication with translators are handled by the localization manager. Assignments must therefore be manageable, transparent, and accessible through a GUI.
Solution: when the localization manager decides that the strings in a branch are ready for translation, she presses a button in GitLab. The entire team of translators is assigned to this branch. The task goes to the translator who responds first.
CI
assignee:
  stage: assignee
  image: node:8.14.0
  tags:
    - devops
  script:
    - chmod +x ./assignee.sh
    - ./assignee.sh
  only:
    - base-translate
    - /^translate.*$/
    - assignee
  when: manual
Assignment script
#!/bin/bash
if echo "$CI_COMMIT_REF_NAME" | grep "translate-";
then
    # Collect the IDs of the Smartcat documents whose names contain the branch name.
    # Note: node -p also prints the value of the expression ("undefined") as the last line.
    node -pe "JSON.parse(process.argv[1]).documents.forEach(function(elem){ if(elem.name.indexOf(\"$CI_COMMIT_REF_NAME\") !== -1) { console.log(elem.id) } });" "$(curl -XGET -H "Authorization: Basic $SMARTCAT_API_KEY" -H "Content-type: application/json" "https://smartcat.com/api/integration/v1/project/$SMARTCAT_PROJECT_ID")" >> documents
fi
# Drop the trailing "undefined" line
sed '$d' documents > documents.list
# Assign the whole team to each document at stage 1
while read -r LINE; do
    curl -XPOST -H "Authorization: Basic $SMARTCAT_API_KEY" -H "Content-type: application/json" -d "{\"documentIds\":[\"$LINE\"],\"stageNumber\": 1}" "https://smartcat.com/api/integration/v1/document/assignFromMyTeam"
done < documents.list
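Because each request triggers real assignments, it can be useful to preview them first. The variation below is a sketch under the same assumptions as the script above (same endpoint and request body); the CURL_BIN variable and both function names are invented here. Pointing CURL_BIN at echo prints the requests instead of sending them.

```shell
#!/bin/bash
# CURL_BIN is an illustrative override that allows a dry run;
# by default the real curl is used.
CURL_BIN="${CURL_BIN:-curl}"

# Builds the JSON request body for a single document ID.
assign_payload() {
    printf '{"documentIds":["%s"],"stageNumber": 1}' "$1"
}

# Reads document IDs from stdin, one per line, and sends one
# assignment request per ID. Empty lines are skipped.
assign_documents() {
    local id
    while read -r id; do
        [ -n "$id" ] || continue
        "$CURL_BIN" -XPOST \
            -H "Authorization: Basic $SMARTCAT_API_KEY" \
            -H "Content-type: application/json" \
            -d "$(assign_payload "$id")" \
            "https://smartcat.com/api/integration/v1/document/assignFromMyTeam"
    done
}

# Dry run:  CURL_BIN=echo assign_documents < documents.list
```

Building the body with printf keeps the JSON quoting in one place, which avoids nesting quotes inside a bash -c string.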
This concludes my series of articles on setting up and integrating continuous localization. I will be glad to answer any questions.