SEO for a multilingual site without geo-targeting on Google
This article covers the key points of promoting a site that is translated into many languages and has no specific geographic target; that is, we want organic traffic in any language from anywhere in the world.
My experience comes from promoting a site for expats and travelers from all over the world. The site has millions of pages, all generated from a content database.
Multilingualism and URL formation
In my case the site had to be translated into 15 main languages, so I had to choose a URL scheme for the language versions.
There are four options for how this can be done:
1- Each language on a separate subdomain
Example: de.example.com, ru.example.com
Advantages - not identified.
Disadvantages - inconvenient administration.
2- Each language on a separate domain
Example: example.de, example.ru
Advantages - higher CTR in search results thanks to a domain zone that is "native" to the user (for example, a German user is more likely to click on a .de domain than on a .com one).
Disadvantages - external links are harder to manage, and administration is extremely inconvenient.
3- A URL parameter for each language
Example: example.com?leng=de, example.com?leng=en
Advantages - not identified.
Disadvantages - URLs with parameters are indexed less well and rank worse in search.
4- Each language in a separate subdirectory
Example: example.com/de/, example.com/en/
Advantages - not identified.
Disadvantages - not identified.
Even though the second option is the only one with an identified advantage, I chose the fourth: it is easy to implement and has no disadvantages.
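As an illustration of the subdirectory scheme, here is a minimal Python sketch that builds the URL of a page for each language. The base URL, the language subset and the page path drugs/aspirin are hypothetical; the article does not show the site's actual routing.

```python
from urllib.parse import urljoin

BASE_URL = "https://example.com/"   # hypothetical base URL
LANGUAGES = ["en", "de", "ru"]      # hypothetical subset of the 15 languages


def localized_url(lang: str, path: str = "") -> str:
    """Return the URL of `path` inside the subdirectory for `lang`."""
    return urljoin(BASE_URL, f"{lang}/{path.lstrip('/')}")


for lang in LANGUAGES:
    # "drugs/aspirin" is an invented path used only for illustration
    print(localized_url(lang, "drugs/aspirin"))
```

With this layout every language version lives on one domain, so adding a language only means adding one more prefix.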
Hreflang attribute - show Google in which language the page should be displayed
While working on the site, I ran into a problem. Google and Yandex use the hreflang attribute to determine the language and country for which a given version of a page should be shown. For example:
users searching in Russian from Australia will be shown example.com/ru-au
users searching in Russian from the UK will be shown example.com/ru-gb
users searching in German from Australia will be shown example.com/de-au
users searching in German from the UK will be shown example.com/de-gb
The problem with my site was that any language version of the page could be requested from any geographic location. For example, a Russian-speaking person may request a page from Russia, Australia, Germany and anywhere else in the world.
Therefore, I would have to write out hreflang lists covering every combination of language and country for each page.
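To get a feel for the scale, the count can be sketched in a few lines of Python. The figure of 15 languages comes from the article; the number of countries is an assumption (the 249 officially assigned ISO 3166-1 country codes), so treat the result as an order-of-magnitude estimate.

```python
NUM_LANGUAGES = 15    # from the article
NUM_COUNTRIES = 249   # assumption: officially assigned ISO 3166-1 codes


def annotations_per_page(languages: int, countries: int = 1) -> int:
    """Number of hreflang entries one page needs to list all its alternates."""
    return languages * countries


# Every language/country combination versus language-only values
print(annotations_per_page(NUM_LANGUAGES, NUM_COUNTRIES))
print(annotations_per_page(NUM_LANGUAGES))
```

Under these assumptions each page would carry several thousand annotations instead of fifteen.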
The solution is to set the hreflang values without a region, that is, to show Google only the language versions of the page, with no reference to geography:
users searching in Russian from anywhere in the world will be shown example.com/ru
users searching in German from anywhere in the world will be shown example.com/de
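A language-only set of annotations could be generated roughly as follows. This is a sketch under assumptions: the base URL, the language subset and the page path are hypothetical, and the optional x-default entry (a value Google documents for a fallback version) is an addition that the article does not mention.

```python
from html import escape
from typing import List, Optional

BASE_URL = "https://example.com"   # hypothetical base URL
LANGUAGES = ["en", "de", "ru"]     # hypothetical subset of the 15 languages


def page_url(lang: str, path: str) -> str:
    """URL of `path` in the subdirectory for `lang`."""
    path = path.strip("/")
    return f"{BASE_URL}/{lang}/{path}" if path else f"{BASE_URL}/{lang}/"


def hreflang_links(path: str, x_default: Optional[str] = None) -> List[str]:
    """Build <link> tags with language-only hreflang values (no region)."""
    links = [
        f'<link rel="alternate" hreflang="{lang}" href="{escape(page_url(lang, path))}" />'
        for lang in LANGUAGES
    ]
    if x_default is not None:
        # Optional fallback entry; not part of the article's setup
        href = escape(page_url(x_default, path))
        links.append(f'<link rel="alternate" hreflang="x-default" href="{href}" />')
    return links


# "drugs/aspirin" is an invented path used only for illustration
print("\n".join(hreflang_links("drugs/aspirin", x_default="en")))
```

The same block goes into the head of every language version of the page, including the version being annotated.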
GeoShape - structured data markup for the geographic position of objects
As it turned out, hreflang solved the problem only partially. The site builds its pages from a database whose contents cannot be translated into different languages, so I ended up with millions of pages that duplicate each other's main content: they differ only in the name of the requested country in the title, while the rest of the content is often identical or nearly so. For example, the analogues of Russian medicines in France, in Italy and in the EU as a whole are practically the same. So the pages for Italy and France, which are different for users, are duplicates in Google's eyes.
The challenge now was to show Google that pages with the same or similar content are intended for different countries. As noted above, hreflang cannot be used for this: it would require either hundreds of attribute lines per page, or language versions that are not available from all countries.
Structured data markup came to my aid. Faced with the problem, I began to analyze sites similar to mine, and on booking.com I came across the GeoShape structured data type. They use it to indicate the location of hotels.
After checking that the code validates when GeoShape is used in drug markup (the site discussed in this article is about drug analogues), I rolled it out across the site.
Now Google sees the difference in geography between pages intended for different countries, but with similar content, and users can get their language version of the page from anywhere in the world.
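The article does not show the actual markup, so the following is only a sketch of what a per-country GeoShape block might look like. It assumes JSON-LD syntax and the schema.org addressCountry property of GeoShape; how the author attached GeoShape to the drug markup, and whether microdata or JSON-LD was used, is not stated.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"


def geoshape_script(country_code: str) -> str:
    """Return a JSON-LD <script> block with a GeoShape for one country."""
    node = {
        "@context": "https://schema.org",
        "@type": "GeoShape",
        # Assumption: the target country is expressed as a two-letter code
        "addressCountry": country_code,
    }
    return SCRIPT_OPEN + json.dumps(node, indent=2) + SCRIPT_CLOSE


# Pages with near-identical content for France and Italy get different blocks
for code in ("FR", "IT"):
    print(geoshape_script(code))
```

Whatever the exact structure, it is worth running the result through a structured data validator before deploying it, as the author did.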
Sitemap.xml - Improving Indexing
Everything is quite simple here, but there are also nuances.
We list every URL of the site together with its language versions in sitemap.xml, using hreflang annotations as Google recommends.
As a result, hreflang is declared both in the pages and in sitemap.xml.
- Translating the site into many languages multiplies the number of pages, so keep in mind that one sitemap.xml file may contain at most 50,000 URLs and must not exceed 50 MB uncompressed (earlier guidelines set this limit at 10 MB).
- If there are a large number of pages, it is recommended to split sitemap files by language and site structure.
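The points above can be sketched in Python with the standard library. This is a minimal illustration, not the author's generator: the page URLs are hypothetical, and the split is done purely by the 50,000-URL limit rather than by language or site structure.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
MAX_URLS = 50_000  # maximum number of <url> entries per sitemap file

ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_sitemaps(pages, max_urls=MAX_URLS):
    """Build sitemap files with hreflang alternates.

    `pages` is a list of dicts mapping a language code to the URL of that
    language version of one logical page. Returns a list of XML documents
    (bytes), each holding at most `max_urls` <url> entries.
    """
    # One <url> entry per language version, each listing all alternates
    entries = [(loc, alts) for alts in pages for loc in alts.values()]
    files = []
    for start in range(0, len(entries), max_urls):
        urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
        for loc, alts in entries[start:start + max_urls]:
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
            for lang, href in alts.items():
                ET.SubElement(url, f"{{{XHTML_NS}}}link",
                              rel="alternate", hreflang=lang, href=href)
        files.append(ET.tostring(urlset, encoding="utf-8", xml_declaration=True))
    return files


# Hypothetical pages: three logical pages in two languages
pages = [{"en": f"https://example.com/en/p{i}",
          "de": f"https://example.com/de/p{i}"} for i in range(3)]
for number, xml_bytes in enumerate(build_sitemaps(pages), start=1):
    print(f"sitemap-{number}.xml: {len(xml_bytes)} bytes")
```

On a real site the resulting files would also be listed in a sitemap index file, and the size of each file would need checking in addition to the URL count.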