How does Google Search work?
How Google Search works: the main algorithm updates

Nowadays search engines, and Google in particular, serve as the "window" to the Internet and are among the most important channels for disseminating information in digital marketing. With a global market share of over 65% as of January 2016, Google clearly dominates the search industry. Although the company does not officially disclose exact figures, by 2012 it was confirmed that its infrastructure handles about 3 billion search queries per day.
Google.com is also ranked No. 1 globally in the Alexa Top 500 Global Sites. Given these figures, it is especially important for website owners that their pages are clearly visible to the search engine.
But despite Google's universal popularity, do you know how it really works, and what Pandas, Penguins and Hummingbirds have to do with it?
The more indispensable Google becomes for modern marketing, the more important it is to understand how search works and which algorithm updates directly affect the ranking of results. Moz estimates that Google changes its algorithm around 600 times a year. Most of these changes, and the ranking factors behind them, are kept secret; only major updates are announced publicly.
In this article, we will cover the basics of how a search engine works and walk through the main updates to the Google algorithm since 2011. We will also outline strategies for keeping up with changes in the search engine. So read on...
How does Google work?
Search engines have completely changed the way we gather information. Whether you want to check the latest stock market data, find the best restaurant in the area, or write an academic report on Ernest Hemingway, the search engine will answer your request. In the 1980s, answering such questions would have required a visit to the local library; now it takes milliseconds, thanks to the algorithmic power of the search engine.
Accordingly, the main goal of a search engine is to return relevant, useful information as quickly as possible in response to the entered search terms, also called keywords. The central concern for any search engine that wants to deliver genuinely useful results is therefore search intent: how exactly people search.
The result of Google’s work can be compared to an online catalog ordered by an algorithm-based rating system. More formally, the search problem can be described as “finding an element with given properties in a list of elements”.
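To make that textbook definition concrete, here is a minimal, purely illustrative Python sketch; the documents and the matching rule are invented for this example and have nothing to do with Google's actual implementation.

```python
# Toy illustration of "finding elements with given properties in a list".
# The documents and the matching rule are invented for this example;
# a real search engine evaluates billions of pages against hundreds of signals.
documents = [
    {"url": "https://example.com/weather",   "words": {"weather", "forecast", "rain"}},
    {"url": "https://example.com/doors",     "words": {"buy", "doors", "wooden"}},
    {"url": "https://example.com/hemingway", "words": {"ernest", "hemingway", "novels"}},
]

def find_matching(query_terms, docs):
    """Return every document that contains all of the query terms."""
    query = {term.lower() for term in query_terms}
    return [doc for doc in docs if query <= doc["words"]]

print(find_matching(["buy", "doors"], documents))
# -> the single document whose word set contains both "buy" and "doors"
```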
Let's now take a closer look at the processes involved: crawling, indexing, and processing.
Crawling
Crawling can be described as the automated process of systematically exploring publicly accessible pages on the Internet. Simply put, during this process Google discovers new or updated pages and adds them to its database. To do this, it uses a special program: Googlebots (you may also see them called “bots” or “robots”) visit a list of URLs gathered during previous crawls, supplemented with sitemap data provided by webmasters, and analyze their contents. If links to other pages are found while visiting a site, the bots add them to their list as well and record the relationships between pages. Crawling happens on a regular basis in order to detect changes, remove “dead” links and establish new relationships. And all this despite the fact that, as of September 2014, there were already around a billion websites. Can you imagine the complexity of such a task? Even so, the bots do not visit absolutely every site: to make it onto the crawl list, a web resource must be considered sufficiently important.
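As a rough sketch of the crawl loop just described, here is a minimal Python example; it assumes the third-party requests and beautifulsoup4 packages are installed, and the seed URL is only a placeholder. Real crawlers additionally honor robots.txt, throttle their requests, and operate at an entirely different scale.

```python
# Minimal breadth-first crawl sketch: visit known URLs, collect newly
# discovered links, skip dead ones. Purely illustrative, not Googlebot.
from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def crawl(seed_url, max_pages=10):
    queue = deque([seed_url])     # URLs waiting to be visited
    seen = {seed_url}             # every URL discovered so far
    visited = []                  # URLs actually fetched
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=5).text
        except requests.RequestException:
            continue              # "dead" link: drop it and move on
        visited.append(url)
        for anchor in BeautifulSoup(html, "html.parser").find_all("a", href=True):
            link = urljoin(url, anchor["href"])
            if link not in seen:  # a page we have not discovered before
                seen.add(link)
                queue.append(link)
    return visited, seen

# visited, discovered = crawl("https://example.com")  # placeholder seed URL
```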
Indexing
Indexing is the process of storing the collected information in a database, organized by various factors, for later retrieval. Keywords on the page, their position, meta tags and links are of particular interest for Google's indexing.
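The data structure usually behind this step is an inverted index: a map from each keyword to the pages (and positions) where it occurs. Below is a minimal sketch with invented sample pages; a production index stores far richer information.

```python
# Minimal inverted-index sketch: map each word to (page, position) entries.
# The sample pages are invented; a real index also records meta tags,
# links and many other signals mentioned above.
from collections import defaultdict

pages = {
    "https://example.com/doors":   "buy solid wooden doors online",
    "https://example.com/weather": "local weather forecast and rain radar",
}

index = defaultdict(list)
for url, text in pages.items():
    for position, word in enumerate(text.lower().split()):
        index[word].append((url, position))

print(index["doors"])    # -> [('https://example.com/doors', 3)]
print(index["weather"])  # -> [('https://example.com/weather', 1)]
```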
To store information about billions of pages efficiently, Google operates large data centers in Europe, Asia, and North and South America. Based on Google's energy consumption in 2010, these centers have been estimated to run approximately 900,000 servers.
The main goal of indexing is to be able to respond quickly to a user's search query, which brings us to the next stage.
Processing
When a user enters a query, Google searches its database for matching pages and algorithmically determines how relevant their content is, which produces a specific ranking among the sites found. Logically, results considered more relevant to the searcher are deliberately ranked higher than results that are less likely to provide an adequate answer.
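A drastically simplified picture of that matching-and-scoring step, using a toy relevance score based only on keyword overlap; the pages are invented, and as noted below, the real ranking weighs over 200 factors.

```python
# Toy ranking sketch: score pages by keyword overlap with the query and
# sort by that score. Invented data; real ranking uses 200+ signals.
def rank(query, pages):
    terms = set(query.lower().split())
    scored = []
    for url, text in pages.items():
        overlap = len(terms & set(text.lower().split()))  # naive relevance
        if overlap:
            scored.append((overlap, url))
    return [url for _, url in sorted(scored, reverse=True)]

pages = {
    "https://example.com/doors":   "buy solid wooden doors online",
    "https://example.com/weather": "local weather forecast and rain radar",
}
print(rank("buy wooden doors", pages))  # the doors page comes first
```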

Although Google has not released official details, the company confirms that it uses more than 200 factors to determine the relevance and importance of a particular page.
Naturally, every web developer wants to know which ranking factors affect a page's position in the search results. Sometimes Google gives hints by announcing important changes in its algorithm updates.
All of the above processes can be summarized as a simple pipeline: crawl the web, index what was found, and rank the indexed pages for each query.

Now that you have a basic understanding of how Google works, let's look at the main updates to its search algorithms since 2011.
Algorithm updates since 2011
As you might expect, Google will never publicly disclose its search algorithms and ranking factors; that would be like Coca-Cola posting its famous soda recipe on the Internet. However, Google does want to improve the user experience and deliver the best search results, so to reduce low-quality content in the results, the company informs webmasters about when and how its main quality standards change. Therefore, a major algorithm update is usually preceded by an announcement on the Google Webmaster Central Blog.
So, let's look at the main updates that have been implemented since 2011:
Panda
The Panda update was first introduced in late February 2011. Many follow-up releases have appeared since then; the current version at the time of writing is 4.2. The update can be considered a significant improvement to the search algorithm, because it is aimed at raising the quality of website content. The main idea is that original sites with their own content should rank higher than low-quality pages that merely repeat what is already known or copy other sites. The Panda update set a new baseline for quality standards:
- the content on a page should be substantial: longer texts statistically rank higher than those with fewer than 1,500 words (a rough checklist sketch follows this list);
- the information presented on the site must be original: if you simply copy the contents of other web resources, Google will penalize you for it;
- the site's content should bring something new to the topic: few people want to re-read the same thing for the hundredth time, so for successful promotion the content must offer something not found on other sites;
- the site's text should be correct in spelling and grammar and based on verified facts;
- if you generate content automatically from a database, that content must meet the same standards.
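As promised above, here is a rough pre-publication checklist sketch. The 1,500-word figure comes from the list; the originality check is a naive, made-up heuristic and has nothing to do with Google's actual evaluation.

```python
# Rough pre-publication checklist sketch based on the thresholds above.
# The word-count threshold comes from the article; the originality check is
# a naive, made-up heuristic, not Google's actual evaluation.
def quality_report(text, known_texts):
    words = text.split()
    return {
        "substantial": len(words) >= 1500,                            # length threshold
        "original": all(text not in other for other in known_texts),  # naive copy check
    }

draft = "word " * 1600                       # stand-in for a 1,600-word article
print(quality_report(draft, known_texts=[]))
# -> {'substantial': True, 'original': True}
```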
Page Layout (Top Heavy)
This update, first released in January 2012, penalizes sites that place too much advertising at the top of the page or make it so aggressive that it distracts from the main content. It was prompted by a large number of complaints from users who found it hard to reach the information they needed and had to scroll far down the page. With this update, Google calls on webmasters to put the site's content in the spotlight: a large amount of advertising simply gets in the way of taking in the information.
Penguin
Released in April 2012, this new algorithm is aimed at combating search spam. Sites that used spam techniques were significantly downgraded in the results or removed from them altogether.
Another feature of Penguin is its ability to analyze a site's backlink profile.
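As a hypothetical illustration of the kind of backlink-profile signal SEO practitioners associate with Penguin, the sketch below measures how dominant a single exact-match anchor text is. The backlink data is invented, and Google's real analysis is not public.

```python
# Hypothetical sketch: how dominant is a single exact-match anchor text in a
# site's backlink profile? Invented data; Google's real analysis is not public.
from collections import Counter

backlinks = [
    {"source": "https://blog-a.example", "anchor": "buy cheap doors"},
    {"source": "https://blog-b.example", "anchor": "buy cheap doors"},
    {"source": "https://news.example",   "anchor": "Example Door Shop"},
]

anchors = Counter(link["anchor"].lower() for link in backlinks)
top_anchor, count = anchors.most_common(1)[0]
share = count / len(backlinks)
print(f"{top_anchor!r} accounts for {share:.0%} of backlinks")
# an unnaturally uniform anchor-text profile is a classic link-spam signal
```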
Pirate
With the Pirate update, introduced in August 2012, Google downgraded sites that infringe copyright and intellectual property rights. To measure such violations, Google uses a system of copyright-infringement requests based on the Digital Millennium Copyright Act. Rights holders can use this tool to report plagiarized content and have it removed from Google's index.
Exact Match Domain (EMD)
Released in September 2012, this update is aimed at combating MFA-style domains.
MFA (made-for-AdSense) refers to a domain created specifically for the Google Display Network. Typically, such a domain targets a single query (or a family of queries) and has Google AdSense installed on it. A user who lands on such a domain sees nothing but advertising and, as a result, either closes the site or clicks through a contextual ad. After the release of the EMD algorithm, sites whose domain names simply contained the target query were removed from the results or very significantly downgraded.
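As a hypothetical illustration of the signal EMD targets, the sketch below checks whether a domain name is essentially a commercial query glued together; the domains and query are invented, and Google's real criteria are not public.

```python
# Hypothetical sketch: flag domains whose name is just the query glued together.
# Domains and query are invented; Google's actual EMD criteria are not public.
import re

def is_exact_match_domain(domain, query):
    name = domain.split(".")[0]                       # drop the TLD
    normalized = re.sub(r"[^a-z0-9]", "", name.lower())
    return normalized == re.sub(r"\s+", "", query.lower())

print(is_exact_match_domain("buy-cheap-doors.com", "buy cheap doors"))  # True
print(is_exact_match_domain("example.com", "buy cheap doors"))          # False
```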
Payday Loan
Released in June 2013, this update aims to demote pages stuffed with spammy queries, which webmasters often use to promote pages on certain topics.
It was launched in response to numerous complaints that, even after the introduction of Panda and Penguin, the quality of the search results still left much to be desired.
Consider a typical example. Let's say you need to buy a door. If you enter the query, Google will show door photos along with two or three pages where you can buy doors directly, three or four sites of door manufacturers, and two or three sites about how to choose and install a door. Without the Payday Loan update, you would instead see 15-20 results on a single topic (for example, where to buy a door).
Google does not disclose the criteria by which such sites are identified, but this algorithm has clearly made life easier for search engine users.
Hummingbird
In September 2013, Google rolled out a replacement core search algorithm called Hummingbird. Major updates such as Panda and Penguin were integrated into this new algorithm. The name Hummingbird was chosen to convey the flexibility, precision and speed of the new update.
Instead of returning answers that exactly match the keywords the user entered (as before), Google now interprets the intent and context of the search. The goal is to understand the meaning behind the user's query and return relevant results, which means exact keyword matches are becoming less important than matching intent. For example, if you enter the query “weather”, you hardly expect a full explanation of the term itself; rather, you want the current weather conditions.
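A toy way to picture the difference between exact keyword matching and intent matching, using a hand-made intent table; this is purely illustrative, and Hummingbird's real semantic models are vastly more sophisticated.

```python
# Toy contrast between exact keyword matching and intent-based matching.
# The intent table is hand-made for illustration only.
INTENTS = {
    "weather": {"forecast", "temperature", "rain", "conditions"},
}

def exact_match(query, page_words):
    return set(query.lower().split()) <= page_words

def intent_match(query, page_words):
    related = INTENTS.get(query.lower(), set()) | set(query.lower().split())
    return len(related & page_words) >= 2

page = {"local", "forecast", "rain", "temperature"}   # a weather-report page
print(exact_match("weather", page))   # False: the literal word is missing
print(intent_match("weather", page))  # True: the page satisfies the intent
```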

Pigeon
The Pigeon update was first released in July 2014. It focuses on geo-dependent search results: the user's location and their distance from a result are key ranking parameters for ensuring accurate results. This update is closely tied to Google Maps.
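A minimal sketch of using the searcher's location as a ranking signal, with the standard haversine formula for distance; the coordinates and listings are invented, and Pigeon's real weighting is unknown.

```python
# Minimal geo-ranking sketch: sort local results by haversine distance from
# the user. Coordinates and listings are invented; Pigeon's weighting is unknown.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

user = (52.52, 13.405)                    # the searcher's location
restaurants = [
    ("Cafe A", 52.50, 13.40),
    ("Cafe B", 52.60, 13.50),
]
ranked = sorted(restaurants, key=lambda r: haversine_km(*user, r[1], r[2]))
print([name for name, *_ in ranked])      # nearest listing first
```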

Mobilegeddon
Released in April 2015, this update affects only mobile search and gives preference to pages that are mobile-friendly.
In its current state, the update does not affect search results on desktop computers or tablets. Unlike Panda or Penguin, the algorithm works in real time.
There is a dedicated mobile-friendly test with which webmasters can check how well their site works on mobile devices. You can also use the mobile usability reports in Google Webmaster Tools, although they may lag behind.
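One very small signal you could check yourself is whether a page declares a responsive viewport. The sketch below (again assuming the requests and beautifulsoup4 packages, with a placeholder URL) covers only that single heuristic and is not Google's mobile-friendly test.

```python
# Naive sketch of a single mobile-friendliness heuristic: does the page
# declare a responsive viewport meta tag? Not Google's actual test.
import requests
from bs4 import BeautifulSoup

def has_viewport_meta(url):
    html = requests.get(url, timeout=5).text
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("meta", attrs={"name": "viewport"}) is not None

# print(has_viewport_meta("https://example.com"))  # placeholder URL
```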
How to keep up with changes in algorithms?
This discussion of the major algorithm updates of recent years probably raises a question: how do you keep up with these changes? Google's main goal is to move steadily toward the highest possible quality and reliability of answers to user queries. While the technical details may change, this broad strategy is unlikely to.
Since human behavior is constantly changing, Google's task is to keep adapting its algorithms to those changes. For example, Mobilegeddon was introduced in response to the growing share of searches from mobile devices.
The main thing is to understand who your customers are. Focusing on their real needs is fundamental to keeping up with the changes.
So, if you are a web developer, it is especially important to stay abreast of changes in Google's search algorithms. Here is a selection of useful resources that can help keep you up to date:
Google Webmaster Central Blog is your main source of official news and updates; algorithm changes are often announced there first.
Moz Google Algorithm Change History - in this database, Moz has documented every notable algorithm change and update since 2000.
Search Engine Land is one of the most important online magazines for SEO and SEM. It has a whole section on Google algorithm updates with related articles.
Search Engine Roundtable also includes an interesting section on algorithm updates.
Mozcast is a visual representation of algorithm changes, presented in the form of a weather report.
Algoroo is a tool that tracks fluctuations in search results for around 17,000 keywords to detect algorithm changes. A very useful site for spotting updates as they happen.
Following tradition, you can find the original source here.