Localizing Node.js Applications. Part 2: Tools and Process

Original author: Austin King
Translator's note: This is the tenth article in the Node.js series from the Mozilla Identity team, which works on the Persona project.





In the previous article on localizing Node.js applications, we learned how to use the i18n-abide module in our code. Our work as programmers essentially ended once we wrapped the strings in our templates and application code in gettext() calls. But the work of localizing and translating the application is only beginning.

Tools


The localization toolkit of the Mozilla Persona team is compatible with the tools used by the rest of the Mozilla community, while retaining the friendliness and flexibility of Node.

The Mozilla project is almost 15 years old, and our team of localizers and translators is one of the largest (and coolest) in the open source world. That is why we rely on tools that have been in wide use for a long time; one might even call them old-fashioned.

Gettext


GNU Gettext is a toolkit for localizing desktop and web applications. When you write code and templates for Node, you use English phrases everywhere, but you wrap each one in a call to gettext().

gettext does two things:

  • at build time, it compiles a catalog of all the strings found in the application;
  • at run time, it replaces them with their localized versions.
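The run-time half can be pictured as a dictionary lookup with a fallback to the English string. The following is only a sketch under assumptions, not how i18n-abide is implemented: `catalogs` and `makeGettext` are hypothetical names, and the catalog is hard-coded where a real application would load translations from files.

```javascript
// Hypothetical in-memory catalogs keyed by locale; a real application
// would load these from translation files instead of hard-coding them.
const catalogs = new Map([
  ['zh_TW', new Map([
    ['Persona preserves your privacy', 'Persona 保護您的隱私'],
  ])],
]);

// Returns a gettext-style function bound to one locale.
function makeGettext(locale) {
  const catalog = catalogs.get(locale) || new Map();
  return function gettext(msgid) {
    const translated = catalog.get(msgid);
    // Missing or empty translations fall back to the English source string.
    return translated ? translated : msgid;
  };
}

const gettext = makeGettext('zh_TW');
console.log(gettext('Persona preserves your privacy')); // translated string
console.log(gettext('Persona for developers'));         // untranslated, English is returned
```

Falling back to the msgid means an untranslated string degrades to English rather than to an empty label.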

All extracted strings are stored in text files with the .po extension. From here on we will call them po-files.

Po files


Po files are text files of a certain format that gettext can read, write and merge.

Here is an example of the contents of the po-file zh_TW/LC_MESSAGES/messages.po:

#: resources/views/about.ejs:46
msgid "Persona preserves your privacy"
msgstr "Persona 保護您的隱私"

We will look at the details later; for now, the important thing is that msgid is the English string and msgstr is its translation into Chinese. Anything that starts with # is a comment. The comment in this example records where the string appears in the code.
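To make the format concrete, here is a minimal reader for entries like the one above. It is a sketch, not a replacement for the gettext tools: `parsePo` is a hypothetical helper, it ignores plural forms and message contexts, and it approximates po escaping with JSON string rules.

```javascript
// Minimal po reader: handles "#" comments, msgid, msgstr and
// continuation lines. Plural forms and msgctxt are not supported.
function parsePo(text) {
  const entries = [];
  let entry = null; // entry currently being built
  let field = null; // 'msgid' or 'msgstr': where continuation lines go

  // Quoted po strings use C-style escapes; JSON.parse covers the common
  // ones (\n, \t, \", \\). This is an approximation.
  const unquote = (quoted) => JSON.parse(quoted);

  for (const raw of text.split('\n')) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue;

    let m;
    if ((m = line.match(/^msgid\s+(".*")$/))) {
      entry = { msgid: unquote(m[1]), msgstr: '' };
      entries.push(entry);
      field = 'msgid';
    } else if ((m = line.match(/^msgstr\s+(".*")$/)) && entry) {
      entry.msgstr = unquote(m[1]);
      field = 'msgstr';
    } else if (/^".*"$/.test(line) && entry && field) {
      // A bare quoted line continues the previous msgid or msgstr.
      entry[field] += unquote(line);
    }
  }
  return entries;
}

const sample = [
  '#: resources/views/about.ejs:46',
  'msgid "Persona preserves your privacy"',
  'msgstr "Persona 保護您的隱私"',
].join('\n');

console.log(parsePo(sample));
```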

Gettext provides many other tools for working with strings and po-files. We will touch on them as well.

Why exactly this toolkit?


Before we dive deeper into the Node.js modules for working with gettext, it is worth asking why we chose this particular set of tools.

A year ago, I examined the existing Node.js modules for internationalization and localization in detail. Most of them reinvented the wheel, with their own JSON-based formats for storing strings.

Mozilla, on the other hand, has long and successfully used tools like POEdit, Verbatim, Translate Toolkit, and Pootle. Instead of forcing people to relearn, we decided to build tools for them that are compatible with familiar standards and processes.

Po-files are the common format our translators use to exchange work and collaborate. This is the format in which they should receive strings from us for translation and hand the finished text back.

Having done a lot of development at Mozilla in PHP and Python, I find Gettext very convenient. As a web application grows and accumulates text, more and more edge cases appear that call for well-tested tools and the Gettext API.

We create po-files for translators


So, we have marked up our code with gettext calls. What's next? This is where the person we call the "string wrangler" comes in. It could be you, a translator, or an administrator. What does the string wrangler do?

  • Extracts the strings when the application is localized for the first time.
  • In later releases, finds new and changed strings and marks deleted ones.
  • Prepares po-files for each team of translators.
  • Resolves conflicts and marks translations of changed or deleted strings.

This may sound a little confusing, but fortunately most of these tasks are well automated. The string wrangler only has to step in when problems arise.

msginit, xgettext, msgfmt and the other GNU Gettext tools form a powerful suite for working with string catalogs. Only the string wrangler works with these tools. Most developers can remain blissfully unaware of them.

Creating a file tree for a locale:

$ mkdir -p locale/templates/LC_MESSAGES

This directory holds the po template files, which have the .pot extension. Gettext will use them later on.

Extracting strings


In the last article, we installed i18n-abide:

$ npm install i18n-abide

Among other command-line tools, abide provides extract-pot. This command extracts strings into the locale directory:

$ mkdir -p locale/templates/LC_MESSAGES
$ ./node_modules/.bin/extract-pot --locale locale

The script walks through the application's entire source code, finds the strings, and writes them to the po template file.

You could use traditional gettext utilities to create pot files, but we wrote a special jsxgettext module that is convenient and cross-platform. Under the hood, extract-pot uses it.

Jsxgettext looks for gettext() calls in the code and extracts their string arguments, then writes the strings out in a format compatible with the gettext toolkit. Here is an excerpt from such a pot file:

#: resources/views/about.ejs:46
msgid "Persona preserves your privacy"
msgstr ""
#: resources/views/about.ejs:47
msgid ""
"Persona does not track your activity around the Web. It creates a wall "
"between signing you in and what you do once you're there. The history of "
"what sites you visit is stored only on your own computer."
msgstr ""
""
#: resources/views/about.ejs:51
msgid "Persona for developers"
msgstr ""
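As a rough picture of how a template like this comes about, the sketch below scans source text for gettext() calls and prints pot entries. It is deliberately naive and is not how jsxgettext works internally: `extractPot` is a hypothetical function, it uses a regular expression rather than a real parser, and the sample template content is made up for illustration.

```javascript
// Naive extractor: finds gettext("...") and gettext('...') calls with a
// regular expression. Escaped quotes, string concatenation and template
// literals are deliberately not handled.
function extractPot(sources) {
  const CALL = /\bgettext\(\s*(["'])((?:(?!\1)[^\\\n])*)\1\s*\)/g;
  const seen = new Map(); // msgid -> list of "file:line" references

  for (const [file, code] of Object.entries(sources)) {
    code.split('\n').forEach((text, index) => {
      for (const match of text.matchAll(CALL)) {
        const msgid = match[2];
        if (!seen.has(msgid)) seen.set(msgid, []);
        seen.get(msgid).push(`${file}:${index + 1}`);
      }
    });
  }

  // One pot entry per unique string, each with an empty msgstr.
  return [...seen.entries()]
    .map(([msgid, refs]) =>
      `#: ${refs.join(' ')}\nmsgid ${JSON.stringify(msgid)}\nmsgstr ""\n`)
    .join('\n');
}

// Hypothetical two-line template, keyed by file name.
const pot = extractPot({
  'resources/views/about.ejs':
    '<h1><%= gettext("Persona preserves your privacy") %></h1>\n' +
    "<h2><%= gettext('Persona for developers') %></h2>",
});
console.log(pot);
```

Collecting references per msgid means a string that is used in several places still produces a single entry for translators.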

Later, po-files with translation will be created based on this template. They will look like this:

#: resources/views/about.ejs:46
msgid "Persona preserves your privacy"
msgstr "Persona 保護您的隱私"
#: resources/views/about.ejs:47
msgid ""
"Persona does not track your activity around the Web. It creates a wall "
"between signing you in and what you do once you're there. The history of "
"what sites you visit is stored only on your own computer."
msgstr ""
"Persona 只是連結您登入過程的一座橋樑,不會追蹤您在網路上的行為。您的網頁瀏覽"
"紀錄只會留在您自己的電腦當中。"
#: resources/views/about.ejs:51
msgid "Persona for developers"
msgstr "Persona 的開發人員資訊"

To get a better feel for the topic, you can take a look at the full version of the po-file for Chinese.

Locale creation


The msginit command from the Gettext set is used to create a po file for a specific locale based on a template file:

$ for l in en_US de es; do
    mkdir -p locale/${l}/LC_MESSAGES/
    msginit --input=./locale/templates/LC_MESSAGES/messages.pot \
            --output-file=./locale/${l}/LC_MESSAGES/messages.po \
            -l ${l}
  done

We just created po-files for American English, German and Spanish.

Po files


So, we have extracted the strings and created the locale folders. This is what our file tree looks like:

locale/
  de/
    LC_MESSAGES/
      messages.po
  en_US/
    LC_MESSAGES/
      messages.po
  es/
    LC_MESSAGES/
      messages.po
  templates/
    LC_MESSAGES/
      messages.pot

You can give translators access to these parts of your application. For example, the Spanish team will have access to locale/es/LC_MESSAGES/messages.po. In a very large project, there may even be separate locales for the Spain and Argentina variants of Spanish: es-ES and es-AR.

Over time, new locales may be added.

Merging string changes


From release to release you will add new strings and change or delete old ones. All po-files need to be updated to reflect these changes. Gettext has powerful tools for this. For our own use, we wrote a wrapper script, merge-po.sh, that uses the msgmerge command from the GNU Gettext package.

Add the i18n-abide tools to your PATH:

$ export PATH=$PATH:node_modules/i18n-abide/bin

and start the process of merging strings:

$ ./node_modules/.bin/extract-pot --locale locale .
$ merge-po.sh ./locale

As before, extract-pot collects all the strings and creates a template. Then merge-po.sh updates all the po-files, bringing them in line with the current version of the application. After that, the translation teams can get back to work.
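The core idea of the merge step can be sketched in a few lines. This is a conceptual illustration only, with a hypothetical `mergeCatalog` function and made-up sample data; msgmerge itself does considerably more, for example fuzzy-matching changed strings, which is left out here.

```javascript
// Conceptual merge of a fresh template into an existing translation.
// templateIds: msgids currently in the template (array of strings).
// translated:  Map of msgid -> msgstr taken from the existing po-file.
function mergeCatalog(templateIds, translated) {
  const current = new Map();
  for (const msgid of templateIds) {
    // Keep an existing translation; new strings start out untranslated.
    current.set(msgid, translated.get(msgid) ?? '');
  }
  // Strings that left the application are set aside, not discarded.
  const obsolete = new Map(
    [...translated].filter(([msgid]) => !current.has(msgid))
  );
  return { current, obsolete };
}

// Hypothetical existing translations; the second entry is a placeholder.
const existing = new Map([
  ['Persona preserves your privacy', 'Persona 保護您的隱私'],
  ['Removed string', '(old translation)'],
]);

const result = mergeCatalog(
  ['Persona preserves your privacy', 'Persona for developers'],
  existing
);
console.log(result.current);  // kept translation plus one new, empty entry
console.log(result.obsolete); // the string no longer present in the template
```

Keeping removed strings in a separate map mirrors the way msgmerge retains obsolete entries in the po-file instead of deleting them, so a translation is not lost if the string comes back.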

Gettext versus Not Invented Here syndrome


It is not hard to reinvent the wheel with your own JSON-based format instead of using gettext. Most authors of Node modules went this way. But as applications grow and new languages are added, minor troubles will snowball. For example, without merge-po.sh, sooner or later you will have to write and debug your own merge tools. Manually updating 30 files for 30 locales without losing or mixing up anything is a real chore.

Gettext already has everything you need, and it saves us a lot of time and nerves.

Conclusion


Now that we have finally figured out how to create and update po-files, we can hand them over to the translators. In general, it is always better to talk to them in advance and agree on when translation can start, how much text to expect, and when it should be finished. It is also worth studying the gettext documentation.

So, the strings are translated, and in the next article we will learn how localization works while the application is running.



