Google wants to measure the importance of sites by facts, not links

    The Google research team published a paper on arXiv.org entitled Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources, which addresses computing a Knowledge-Based Trust (KBT) reputation score for a given web source. KBT is intended to become the basis of a future Google search algorithm that ranks sites according to their "reliability".

    The well-known link-ranking algorithm PageRank defines the importance of a web page by the number of links leading to it. The real Google search takes many other factors into account, such as the presence of certain words on a site's pages, the relevance of the information, the user's location, and adaptability to mobile devices; in total there are about 200 such factors. The search-algorithm update of September 2013, known as Hummingbird, is believed to have taught Google to respond not only to keywords but also to the context and meaning that accompany them. Last year's update of the Pigeon algorithm produced more relevant results for geographically dependent queries.
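The link-based notion of importance mentioned above can be sketched in a few lines. This is a minimal illustrative implementation of the classic PageRank power iteration, not Google's production algorithm; the tiny link graph is an invented example.

```python
# Minimal sketch of link-based ranking in the spirit of PageRank.
# The link graph below is a made-up example, not real data.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping a page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:
                # dangling page: spread its rank evenly over all pages
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                for target in outgoing:
                    new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# "c" collects the most incoming links here, so it ends up ranked highest
```

The key property is visible even in this toy graph: a page's score depends only on who links to it, not on whether anything it says is true, which is exactly the limitation the KBT approach targets.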

    The new approach to ranking treats the importance of a web page as a numerical measure of the reliability of its facts. As before, a crawler visits the site and extracts "statements" from it, whose reliability is then checked against the Knowledge Vault knowledge base. This Google-owned knowledge base currently contains approximately 1.6 billion facts automatically collected from the Internet. Its main difference from the better-known Knowledge Graph is its "omnivorousness": while the Knowledge Graph draws on reliable sources such as Wikipedia and Freebase, Vault "does not disdain" anything and collects information from absolutely any site from which something can be extracted. The reliability of a resource is then determined by how many of the extracted facts match those stored in the Knowledge Vault.
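The fact-matching idea can be sketched as follows. This is a deliberately simplified illustration: the knowledge base entries and extracted triples are invented examples, and the paper's actual method is a joint probabilistic model over sources and extractors, not the raw match ratio computed here.

```python
# Hedged sketch of the KBT fact-matching idea: a source's score is driven
# by how many of the (subject, predicate, object) triples extracted from
# it agree with a trusted knowledge base. All data below is invented.

knowledge_base = {
    ("Paris", "capital_of"): "France",
    ("Mount Everest", "height_m"): "8848",
    ("Barack Obama", "born_in"): "Honolulu",
}

def kbt_score(extracted_triples):
    """Fraction of a source's checkable triples that match the knowledge base."""
    checkable = [
        (s, p, o) for s, p, o in extracted_triples
        if (s, p) in knowledge_base
    ]
    if not checkable:
        return None  # nothing to verify this source against
    correct = sum(1 for s, p, o in checkable if knowledge_base[(s, p)] == o)
    return correct / len(checkable)

site_triples = [
    ("Paris", "capital_of", "France"),        # matches the knowledge base
    ("Mount Everest", "height_m", "8848"),    # matches
    ("Barack Obama", "born_in", "Kenya"),     # contradicts the knowledge base
]
score = kbt_score(site_triples)  # 2 of 3 checkable facts match
```

The real model additionally has to separate extraction errors from genuine misinformation on the page, which is why a plain ratio like this would be too crude in practice.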

    On test data, the probabilistic model proposed by the authors showed satisfactory results. KBT scores were then automatically computed for 119 million real web pages, and subsequent manual verification showed that real data fits the new ranking scheme reasonably well. How soon the results of the study will affect Google's live search algorithm is still unknown.
