How I implemented multilingual support for the site and the project

    Since I created and maintain an open source project, I wanted to solve all the likely problems of multilingual support for both the project and the site at once. I have dealt with multilingual support in various projects for a long time, starting with desktop programs. So, having an idea of the possible needs, I began looking at the available solutions. Yes, almost all SaaS services offer free plans for open source projects, but they are mostly focused on translating string resources. What about the site and the documentation? Unfortunately, I did not find anything suitable and set about implementing my own. I should say right away that I am satisfied with the result and have been using the system for almost half a year. I should also warn you that this is not a finished general-purpose solution, but rather a concrete implementation for my own needs.

    To begin with, I will list the requirements that I set for the future system.

    1. Both the project resources, stored as JSON in .js files, and all the texts and documentation on the site need to be localized.
    2. A resource may lack translations into other languages. For example, I can accumulate texts in Russian and hand them to a translator later, while the Russian version of the site already shows these texts.
    3. The site should have a convenient system that lets a user translate resources not yet translated into their language, create a new resource (text), or check and edit existing texts in their native language. It should work roughly like this: the user selects the action (translation or verification), the native language (and, for translation, the source language), and the desired volume. Based on these parameters, a resource is found and offered to the user for translation or editing. Naturally, a log of user actions should be kept and statistics on the work done should be accumulated.
    4. The site should have a language selector, but each page should show only those languages for which a translation of that page already exists.
    5. The same string can be used in several places, for example in .js and in the documentation. The resource must therefore exist as a single instance, and when it changes, it must change both in JSON and in the documentation.
    6. Ideally there should be some kind of automatic moderation, but for now I can get by with deciding on publication personally.
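
    The search described in requirement 3 could be sketched roughly as follows. This is only an illustration on in-memory data: the function name findUntranslated and the sample rows are hypothetical, and it assumes that every stored text records its resource id, its language and its size.

```php
<?php
// Hypothetical in-memory rows; the keys mirror columns a text table might have.
$rows = [
    ['langid' => 1, 'lang' => 1, 'size' => 120],
    ['langid' => 1, 'lang' => 2, 'size' => 130],
    ['langid' => 2, 'lang' => 1, 'size' => 40],
    ['langid' => 3, 'lang' => 1, 'size' => 900],
];

// Return the ids of resources that exist in language $from, are missing in
// language $to, and whose source text is no larger than $maxSize.
function findUntranslated(array $rows, int $from, int $to, int $maxSize): array
{
    // First pass: remember which resources already have a text in $to.
    $translated = [];
    foreach ($rows as $r) {
        if ($r['lang'] === $to) {
            $translated[$r['langid']] = true;
        }
    }
    // Second pass: collect source texts that fit the size limit.
    $found = [];
    foreach ($rows as $r) {
        if ($r['lang'] === $from
            && $r['size'] <= $maxSize
            && !isset($translated[$r['langid']])) {
            $found[] = $r['langid'];
        }
    }
    return $found;
}

print_r(findUntranslated($rows, 1, 2, 500)); // only resource 2 qualifies
```

    In a real system the same selection would more likely be a single SQL query; the array version is used here only to keep the example self-contained.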

    Displaying changes in real time was not important to me, so I decided to keep all the internal workings in several intermediate tables and then, on command, build the JSON files and generate the site pages themselves. In fact, four tables are enough.
    Table structure
    CREATE TABLE IF NOT EXISTS `languages` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `name` varchar(32) NOT NULL,
      `native` varchar(32) NOT NULL,
      `iso639` varchar(2) NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 ;
    CREATE TABLE IF NOT EXISTS `langid` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `name` varchar(96) NOT NULL,
      `comment` text NOT NULL,
      `restype` tinyint(3) unsigned NOT NULL,
      `attrib` tinyint(3) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`),
      KEY `name` (`name`),
      KEY `restype` (`restype`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 ;
    CREATE TABLE IF NOT EXISTS `langlog` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `iduser` int(10) unsigned NOT NULL,
      `idlangres` int(10) unsigned NOT NULL,
      `action` tinyint(3) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`),
      KEY `iduser` (`iduser`,`idlangres`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 ;
    CREATE TABLE IF NOT EXISTS `langres` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `langid` smallint(5) unsigned NOT NULL,
      `lang` tinyint(3) unsigned NOT NULL,
      `text` text NOT NULL,
      `prev` mediumint(9) unsigned NOT NULL,
      `verified` tinyint(3) NOT NULL,
      `size` mediumint(9) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`),
      KEY `langid` (`langid`,`lang`),
      KEY `size` (`size`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 ;
    

    The languages table has three meaningful fields: name, native, iso639. Example entry: Russian, Русский, ru

    The langid table holds the textual identifiers of resources, along with a comment and a type. I divided all resources into several types for myself: JSON string, site page, plain text, text in Markdown format. You can of course use your own types.
    Example: cancelbtn, Text for Cancel button, JSON

    The langres table of text resources (langid, lang, text, prev) stores a reference to the identifier, the language and the text itself.
    The prev field provides versioning of the text during edits: it points to the previous version of the resource.
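
    To make the role of prev more concrete, here is a minimal sketch on an in-memory array instead of the database. The helper names addVersion and history are hypothetical, and the convention that prev = 0 means "no previous version" is my assumption (the column is unsigned and NOT NULL, so zero is a natural choice).

```php
<?php
// Hypothetical in-memory stand-in for the langres table: id => row.
$langres = [];

// Store a new version of a text. $prev is the id of the version being
// replaced, or 0 (assumed convention) when this is the first version.
function addVersion(array &$table, int $langid, int $lang, string $text, int $prev = 0): int
{
    $id = count($table) + 1;
    $table[$id] = ['langid' => $langid, 'lang' => $lang, 'text' => $text, 'prev' => $prev];
    return $id;
}

// Walk the prev links from a given version back to the first one.
function history(array $table, int $id): array
{
    $texts = [];
    while ($id !== 0 && isset($table[$id])) {
        $texts[] = $table[$id]['text'];
        $id = $table[$id]['prev'];
    }
    return $texts;
}

$v1 = addVersion($langres, 7, 1, 'Cancle');      // first version, with a typo
$v2 = addVersion($langres, 7, 1, 'Cancel', $v1); // corrected version
print_r(history($langres, $v2));                 // newest first: 'Cancel', then 'Cancle'
```

    An edit never overwrites a row: it adds a new one, so any earlier text can be restored by following the chain.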

    All changes are recorded in the langlog log table (iduser, idlangres, action). The action field indicates the action performed: creation, editing or verification.

    I will not dwell on user management; I will only say that a user is registered automatically when submitting a translation or correction. Since an email is not required, the user is immediately told the login and password. All changes they make are tied to their account. Later they can add an email and other data, or simply forget about the registration.

    I drew a diagram to make the relationships between the tables clearer.
    image

    Since I need to be able to insert resources into other resources, I added macros of the form #identifier#. In the simplest case, if we have a resource name = "Name", we can use it in the resource entername = "Specify your #name#", which becomes "Specify your Name" during generation.
    Now, to generate the site pages, it is enough to go through all the languages and all the resources of the appropriate type, process each text with a special replacement function and write the result to a separate table of finished pages. If #identifier# is not found in the current language, it is looked up in other languages. Here is a sketch of the recursive function (with protection against infinite loops) that performs this processing.
    PHP lookup function example
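        // Expands #identifier# macros in $input; $recurse is true for nested calls.
        public function proceed( $input, $recurse = false )
        {
            global $db, $syslang;
            // A top-level call starts with an empty chain of resources being expanded.
            if ( !$recurse )
                $this->chain = array();
            $result = '';
            $off = 0;      // current search position
            $start = 0;    // start of the text not yet copied to $result
            $len = strlen( $input );
            while ( ($off = strpos( $input, '#', $off )) !== false && $off < $len - 2 )
            {
                $end = strpos( $input, '#', $off + 2 );
                if ( $end === false )
                    break;
                // Too long to be an identifier: continue the search near the closing #.
                if ( $end - $off > $this->lenlimit )
                {
                    $off = $end - 1;
                    continue;
                }
                $name = substr( $input, $off + 1, $end - $off - 1 );
                $langid = $db->getone("select id from langid where name=?s", $name );
                // Skip unknown identifiers and those already being expanded (loop protection).
                if ( $langid && !in_array( $langid, $this->chain ))
                {
                    // Prefer the current language, otherwise take another verified text.
                    $langres = $db->getrow("select _uptime, id,text from langres where langid=?s && verified>0
                                                                order by if( lang=?s, 0, 1 ),lang",  $langid, $this->lang );
                    if ( $langres )
                    {
                        // Remember the latest modification time among the resources used.
                        if ( $langres['_uptime'] > $this->time )
                            $this->time = $langres['_uptime'];
                        $result .= substr( $input, $start, $off - $start );
                        $off = $end + 1;
                        $start = $off;
                        // Expand the inserted text recursively, guarding against cycles.
                        array_push( $this->chain, $langid );
                        $result .= $this->proceed( $langres['text'], true );
                        array_pop( $this->chain );
                        if ( $off >= $len - 2 )
                            break;
                        continue;
                    }
                }
                $off = $end - 1;
            }
            // Copy the remaining tail of the input.
            if ( $start < $len )
                $result .= substr( $input, $start );
            return $result;
        }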
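
    For readers who want to try the idea without a database, below is a simplified, self-contained variant. It is not the function above: it uses preg_replace_callback instead of manual scanning, works on a hypothetical in-memory array, and assumes that identifiers consist only of word characters. The fallback to another language and the loop protection follow the same idea.

```php
<?php
// Hypothetical data: resource name => [language code => text].
$resources = [
    'name'      => ['en' => 'Name', 'ru' => 'Имя'],
    'entername' => ['en' => 'Specify your #name#'],
    'loop'      => ['en' => 'a #loop# b'],
];

// Expand #name# macros in $text for language $lang. If a resource has no
// text in $lang, its first available language is used instead. $chain holds
// the names currently being expanded, so a self-reference is left as is.
function expand(string $text, string $lang, array $resources, array $chain = []): string
{
    return preg_replace_callback(
        '/#(\w+)#/u',
        function (array $m) use ($lang, $resources, $chain): string {
            $name = $m[1];
            if (!isset($resources[$name]) || in_array($name, $chain, true)) {
                return $m[0]; // unknown or recursive macro: keep the raw text
            }
            $texts = $resources[$name];
            $found = $texts[$lang] ?? reset($texts);
            return expand($found, $lang, $resources, array_merge($chain, [$name]));
        },
        $text
    );
}

echo expand('#entername#', 'ru', $resources), "\n"; // English frame, Russian name
echo expand('#loop#', 'en', $resources), "\n";      // inner #loop# stays unexpanded
```

    One difference from the scanning approach: after an unknown macro such as #foo#, the regular expression continues after the closing #, so in a string like #foo#name# the second candidate is not tried.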
    


    In addition to replacing macros of the form #name#, I also convert Markdown markup to HTML right away and process my own directives. For example, I have a table of images where screenshots for different languages can be attached to one record, and if I put the tag [img "/file/#*indexes#"] in the text, I get the image named indexes in the language I need. Most importantly, I can generate exports for various purposes in any format. As an example, here is the code for generating the JSON files, although the macro substitution function is not used there because it is not needed.
    JSON file generation for RU and EN
    function jsonerror( $message )
    {
        print $message;
        exit();
    }
    function save_json( $filename )
    {
        global $db, $original;
        preg_match("/^\w*_(?<lang>\w*)\.js$/", $filename, $matches );
        if ( empty( $matches['lang'] ))
            jsonerror( 'No locale' );
        $lang = $db->getrow("select * from languages where iso639=?s", $matches['lang'] );
        if ( !$lang )
            jsonerror( 'Unknown locale '.$matches['lang'] );
        $list = $db->getall("select lng.name, r.text from langid as lng
            left join langres as r on r.langid = lng.id
            where  lng.restype=5 && verified>0 && r.lang=?s
            order by lng.name", $lang['id'] );
        $out = array();
        foreach ( $list as $il )
            $out[ $il['name']] = $il['text'];
        if ( $lang['id'] == 1 )
            $original = $out;
        else
            foreach ( $original as $ik => $io )
                if ( !isset( $out[ $ik ] ))
                    $out[ $ik ] = $io;
        $output = "/* This file is automatically generated on eonza.org.
       Use http://www.eonza.org/translate.html to edit or translate these text resources.
    */
    var lng = {
    \tcode: '$lang[iso639]',
    \tnative: '$lang[native]',
    ";
        foreach ( $out as $ok => $ov )
        {
            if ( strpos( $ov, "'" ) === false )
                $text = "'$ov'";
            elseif (strpos( $ov, '"' ) === false )
                $text = "\"$ov\"";
            else
                jsonerror( 'Wrong text:'.$ov );
            $output .= "\t$ok: $text,\r\n";
        }
        $output .= "\r\n};\r\n";
        $jsfile = dirname(__FILE__)."/i18n/$lang[iso639].js";
        if ( file_exists( $jsfile ))
            $output .= file_get_contents( $jsfile );
        if (file_put_contents( HOME."tmp/$filename", $output ))
            print "Save: ".HOME."tmp/$filename\n";
        else
            jsonerror( 'Save error:'.HOME."tmp/$filename" );
    }

    $original = array();
    $files = array( 'en', 'ru');
    foreach ( $files as $if )
        save_json( "locale_$if.js" );

    $zip = new ZipArchive();
    print $zip->open( HOME."tmp/locale.zip", ZipArchive::CREATE );
    foreach ( $files as $f )
        print $zip->addFile( HOME."tmp/locale_$f.js", "locale_$f.js" );
    print $zip->close();
    print "Finish\nZIP file";
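
    The listing above rejects any text that contains both kinds of quotes. As a possible alternative (not what the site uses), the fallback to the original language and the quoting can both be delegated to built-in PHP features: array union and json_encode. The function name buildLocale and the sample strings below are hypothetical.

```php
<?php
// Hypothetical string tables: key => text, English being the original language.
$original = ['cancelbtn' => 'Cancel', 'okbtn' => 'OK', 'hint' => "Don't \"panic\""];
$russian  = ['cancelbtn' => 'Отмена'];

// Build the body of a locale file. Keys missing from $strings are taken
// from $fallback; json_encode escapes quotes, so any text can be emitted.
function buildLocale(string $code, array $strings, array $fallback): string
{
    // Array union keeps existing translations and adds only the missing keys.
    $merged = $strings + $fallback;
    ksort($merged);
    $flags = JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT;
    return 'var lng = ' . json_encode(['code' => $code] + $merged, $flags) . ";\n";
}

echo buildLocale('ru', $russian, $original);
```

    The result is a valid JavaScript object literal with quoted keys, so it can be loaded by a script tag in the same way as the generated files.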


    Thus, without too much effort, I implemented almost everything I wanted. Only the things that are not relevant right now, because of the low activity on the site, remain unimplemented. On the other hand, I added extra features that turned out to be needed along the way, for example exporting a text file with the resources that need translation and importing the translated text back.
    Those who wish can take a look at the working page where users can translate, edit and create new resources for my project.

    image
