Detecting the user's language correctly

I am currently working on a site that aims to be global, so naturally its multilingual support has to be done properly.

This article is not about how to display information in different languages. It is about how to determine the user's language and choose the most suitable language version available on the site.

If you are too lazy to read, there is a screencast, but it did not turn out very well, so I am not posting it here.

So, we have:
  • PHP
  • The CodeIgniter framework (the class was written for this framework, but it can be used anywhere with minor changes)

Task:
Determine the user's language, and if the user is Russian-speaking (Russian, Belarusian, Ukrainian; full list here) show the information in Russian. Otherwise, show it in English.
All of this should be packaged as a class or function that makes it easy to configure a mapping from the user's language to the site language the user will understand best.

Solution:
To determine the user's language we use the $_SERVER superglobal array, or more precisely its element $_SERVER['HTTP_ACCEPT_LANGUAGE'], which describes the client's language preferences. The value comes from the Accept-Language HTTP header that the client sends to the server.
In my case it was the string
ru-ru,ru;q=0.8,en-us;q=0.6,en;q=0.4

This string lists the languages the user prefers, with their priorities expressed through q. If q is not specified for a language, it is assumed to be 1. Written out in a more readable form, it looks like this:
Array
(
    [ru-ru] => 1
    [ru] => 0.8
    [en-us] => 0.6
    [en] => 0.4
)

This shows that I prefer Russian, with English in second place.
Languages are written in two formats. The first is the primary language code alone, “ru” and “en” in my case, which comes from the ISO 639 language standards.
The second is the primary code followed by an extended code, separated by a hyphen: “ru-ru” and “en-us” in my case. Here the extended code indicates the region where the language is used; for “en-us” that is the United States.
There is sometimes confusion about how to label languages, because the ISO code lists contain both two-letter and three-letter codes (sometimes several three-letter codes for one language). All valid codes are now listed in a single IANA registry, which holds only one value per language from the ISO lists. If a two-letter ISO code exists, that is the one in the registry; otherwise the registry contains a single three-letter code. This simplifies things.
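To make the two formats concrete, here is a minimal sketch that splits a tag into its primary code and whatever follows the hyphen. The function name splitLanguageTag and the 'subtag' key are my own illustration and are not part of the class described in this article; the part after the hyphen is usually a region, but I do not assume that it always is.

```php
<?php
// Hypothetical helper (not part of the original class): split a tag such as
// "en-US" into the primary language code and the optional part after the hyphen.
function splitLanguageTag($tag)
{
    // Limit of 2 keeps everything after the first hyphen in one piece
    $parts = explode('-', strtolower($tag), 2);

    return array(
        'primary' => $parts[0],
        'subtag'  => isset($parts[1]) ? $parts[1] : null,
    );
}

print_r(splitLanguageTag('en-US')); // primary "en", subtag "us"
print_r(splitLanguageTag('ru'));    // primary "ru", subtag null
```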

With the theory sorted out, let's move on to practice.
We start with the class constructor:
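public function __construct()
    {
        // Stays empty if the header is missing or cannot be parsed
        $this->language = array();
        $header = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? strtolower($_SERVER['HTTP_ACCEPT_LANGUAGE']) : '';
        // Each match is a language tag with an optional ";q=" weight
        if ($header !== '' && preg_match_all('/([a-z]{1,8}(?:-[a-z]{1,8})?)(?:;q=([0-9.]+))?/', $header, $matches)) {
            foreach ($matches[1] as $i => $tag) {
                // A missing q means 1; an explicit q=0 must stay 0
                $this->language[$tag] = $matches[2][$i] === '' ? 1.0 : (float) $matches[2][$i];
            }
            // Highest priority first
            arsort($this->language, SORT_NUMERIC);
        }
    }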

Here we process the string stored in $_SERVER['HTTP_ACCEPT_LANGUAGE'] so that we end up with an array of the form
Array
(
    [ru-ru] => 1
    [ru] => 0.8
    [en-us] => 0.6
    [en] => 0.4
)

It is sorted in descending order of language priority (the q value).
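To try the parsing step without a real request, the same logic can be sketched as a standalone function that takes the header string as an argument. The name parseAcceptLanguage is hypothetical. In this sketch a missing q is treated as 1 and an explicit q=0 is kept as 0, which is my assumption about the intended behaviour.

```php
<?php
// Hypothetical standalone version of the constructor logic, so it can be
// run from the command line without $_SERVER.
function parseAcceptLanguage($header)
{
    $result = array();
    $pattern = '/([a-z]{1,8}(?:-[a-z]{1,8})?)(?:;q=([0-9.]+))?/';

    if (preg_match_all($pattern, strtolower($header), $m)) {
        foreach ($m[1] as $i => $tag) {
            // Group 2 is an empty string when the tag has no ";q=" part
            $result[$tag] = $m[2][$i] === '' ? 1.0 : (float) $m[2][$i];
        }
        // Highest priority first; keys (the tags) are preserved
        arsort($result, SORT_NUMERIC);
    }

    return $result;
}

print_r(parseAcceptLanguage('ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4'));
```

For the header from my browser this gives the same four entries as above, in the same order.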

Next, we create a method that finds the most suitable language.
Its first parameter is the default language. The second is an array whose keys are the languages available on the site and whose values are the user languages that should map to each of them (either a single code or an array of codes). It looks like this:
$langs=array(
            'ru'=>array('ru','be','uk','ky','ab','mo','et','lv'),
            'de'=>'de'
        );
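Before matching, it is convenient to flatten this array into a simple "user language => site language" lookup. The sketch below shows that step in isolation; flattenLanguageMap is a hypothetical name used only for illustration.

```php
<?php
// Hypothetical helper: turn the site's language map into a flat
// "user language => site language" lookup table.
function flattenLanguageMap(array $langs)
{
    $lookup = array();
    foreach ($langs as $siteLang => $aliases) {
        // The (array) cast lets a single code like 'de' be treated as array('de')
        foreach ((array) $aliases as $alias) {
            $lookup[strtolower($alias)] = strtolower($siteLang);
        }
    }

    return $lookup;
}

$langs = array(
    'ru' => array('ru', 'be', 'uk', 'ky', 'ab', 'mo', 'et', 'lv'),
    'de' => 'de',
);
print_r(flattenLanguageMap($langs));
```

With this table, checking whether a user language is supported becomes a single isset() call.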

Method code:
 public function getBestMatch($default, $langs)
    {
        $languages=array();
        foreach ($langs as $lang => $alias) {
            if (is_array($alias)) {
                foreach ($alias as $alias_lang) {
                    $languages[strtolower($alias_lang)] = strtolower($lang);
                }
            }else $languages[strtolower($alias)]=strtolower($lang);
        }
        foreach ($this->language as $l => $v) {
            if ($v <= 0) continue; // q=0 means the client explicitly refuses this language
            $s = strtok($l, '-'); // drop what follows the hyphen in tags like "en-us", "ru-ru"
            if (isset($languages[$s]))
                return $languages[$s];
        }
        return $default;
    }

Inside the method, tags of the form "primary code-extended code" are cut down to the primary code alone, since separate British and American versions of English are unlikely to be needed. You can always add that distinction if you wish.
The result is the most suitable site language for the user, as an ISO 639 code. I passed English as the default language, so en is returned for every user language that is not covered by the $langs array.
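Putting it all together, here is a minimal runnable sketch of the whole flow. The class name LanguageDetector and the $header constructor argument are my assumptions, made so that the example runs outside a web request; the class in this article reads $_SERVER['HTTP_ACCEPT_LANGUAGE'] instead.

```php
<?php
// Sketch only: "LanguageDetector" and the $header argument are assumptions
// made so the example can run from the command line.
class LanguageDetector
{
    private $language = array();

    public function __construct($header)
    {
        $pattern = '/([a-z]{1,8}(?:-[a-z]{1,8})?)(?:;q=([0-9.]+))?/';
        if (preg_match_all($pattern, strtolower($header), $m)) {
            foreach ($m[1] as $i => $tag) {
                // Missing q means 1; an explicit q=0 stays 0
                $this->language[$tag] = $m[2][$i] === '' ? 1.0 : (float) $m[2][$i];
            }
            arsort($this->language, SORT_NUMERIC);
        }
    }

    public function getBestMatch($default, array $langs)
    {
        // Flatten the map into "user language => site language"
        $lookup = array();
        foreach ($langs as $lang => $aliases) {
            foreach ((array) $aliases as $alias) {
                $lookup[strtolower($alias)] = strtolower($lang);
            }
        }
        // Walk the user's languages from highest to lowest priority
        foreach ($this->language as $tag => $q) {
            if ($q <= 0) {
                continue; // explicitly refused by the client
            }
            $primary = strtok($tag, '-');
            if (isset($lookup[$primary])) {
                return $lookup[$primary];
            }
        }
        return $default;
    }
}

$langs = array(
    'ru' => array('ru', 'be', 'uk', 'ky', 'ab', 'mo', 'et', 'lv'),
    'de' => 'de',
);

$detector = new LanguageDetector('uk-UA,uk;q=0.9,en;q=0.5');
echo $detector->getBestMatch('en', $langs), "\n"; // prints "ru"

$detector = new LanguageDetector('fr-FR,fr;q=0.9');
echo $detector->getBestMatch('en', $langs), "\n"; // prints "en"
```

A Ukrainian-speaking visitor gets the Russian version, while a French-speaking visitor falls back to the default.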

Download the library here
