Only 4.13% of the Web is standards-compliant

Original author: Ryan Paul
  • Translation
Browser maker Opera has published the first results of a study of the structure of Web content. To conduct it, the company created an application called MAMA (Metadata Analysis and Mining Application): working as a spider, it indexes markup and some other data from more than 3.5 million pages.

A statistical analysis of the data collected by MAMA allowed Opera engineers to draw conclusions about emerging trends in Web development and about how standards-based Web technologies are applied in practice. Opera plans to take the project further by developing a search engine based on the data already indexed. Web designers, browser developers and Web engineers would then be able to easily get information about the real-world use of Web technologies on the Internet.

The preliminary data published by the company provides interesting information on the use of specific HTML elements. Among the pages MAMA analyzed, the most popular elements are head, title, html, body, a, meta, img and table. Less commonly used are elements such as var, del and bdo.
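
To illustrate how element popularity of this kind can be measured, here is a minimal Python sketch. It is not MAMA's actual code, which the article does not describe; the names ElementCounter, element_popularity and sample_pages are hypothetical, and the sketch assumes that "popularity" means the number of pages on which an element appears at least once.

```python
from collections import Counter
from html.parser import HTMLParser


class ElementCounter(HTMLParser):
    """Collects the set of element names that occur in one page."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names already lowercased.
        self.seen.add(tag)


def element_popularity(pages):
    """Return a Counter mapping element name -> number of pages using it.

    Assumption: each element is counted once per page, however many
    times it occurs there.
    """
    popularity = Counter()
    for markup in pages:
        parser = ElementCounter()
        parser.feed(markup)
        parser.close()
        popularity.update(parser.seen)
    return popularity


# Hypothetical sample input standing in for crawled pages.
sample_pages = [
    "<html><head><title>A</title></head><body><a href='#'>x</a></body></html>",
    "<html><head><title>B</title></head><body><table><tr><td>1</td></tr></table></body></html>",
]
counts = element_popularity(sample_pages)
for name, pages_using in counts.most_common():
    print(name, pages_using)
```

Counting once per page keeps a single table-heavy document from skewing the ranking; counting every occurrence instead would be an equally defensible choice.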



The company also studied the prevalence of rich Web applications, which are mainly associated with AJAX. The study found that Adobe Flash was used by approximately 35% of all sites analyzed. It is most widespread in China (67% of sites) and least widespread in Denmark (25% of sites). XMLHttpRequest, the core mechanism of AJAX, is used on 3.2% of all sites. Norway set a record of sorts here: the mechanism was found on 10% of its sites.

The study also showed that CSS is used quite widely: it was found in one form or another on almost 80% of resources. The most popular CSS properties are related to color and fonts. JavaScript is not far behind and is used on 75% of Web resources.

Compliance?


Opera also ran the indexed pages through the W3C validation tools to determine how many meet the standards. The results showed that only 4.13% of all pages are valid. Another striking finding is that about 50% of the pages that display the W3C compliance badge are invalid. Presumably, the initial markup of such pages conformed to the standards but later lost this property (for example, when new content was added to the page).

The company's engineers also tried to find out whether there is any connection between the development tool and the validity of the pages. To do this, they analyzed the pages' meta tags. It turned out that pages created with Apple iWeb are valid in 81% of cases. In comparison, only 3.4% of pages created in Adobe Dreamweaver are standards-compliant.
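
The following Python sketch shows one way such a per-tool comparison could be made. It assumes that the authoring tool is read from the conventional meta name="generator" tag (the article only says that meta tags were analyzed) and that a validity flag for each page is already available from a separate validation step. The names GeneratorFinder, find_generator and validity_by_tool, as well as the tool names in the sample data, are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class GeneratorFinder(HTMLParser):
    """Records the content of the first <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generator = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.generator is not None:
            return
        attributes = dict(attrs)
        # Attribute values can be None, so guard before lowercasing.
        if (attributes.get("name") or "").lower() == "generator":
            self.generator = attributes.get("content")


def find_generator(markup):
    """Return the generator string of a page, or None if it has none."""
    finder = GeneratorFinder()
    finder.feed(markup)
    finder.close()
    return finder.generator


def validity_by_tool(pages):
    """Map each tool to the share of its pages that are valid.

    `pages` is an iterable of (markup, is_valid) pairs; is_valid is
    assumed to come from an external validator run beforehand.
    """
    totals = Counter()
    valid = Counter()
    for markup, is_valid in pages:
        tool = find_generator(markup) or "unknown"
        totals[tool] += 1
        if is_valid:
            valid[tool] += 1
    return {tool: valid[tool] / totals[tool] for tool in totals}


# Hypothetical sample data; "ToolA" is not a real product name.
sample = [
    ("<html><head><meta name='generator' content='ToolA'></head></html>", True),
    ("<html><head><meta name='Generator' content='ToolA'></head></html>", False),
    ("<html><head><title>x</title></head></html>", False),
]
shares = validity_by_tool(sample)
print(shares)
```

A real analysis would probably also normalize generator strings, since they often include version numbers that would otherwise split one tool into many groups.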

The results of the study are very interesting, but the system's full potential has yet to be revealed. Opera's effort to build a search engine on top of MAMA data opens up even more analysis opportunities that other projects can use in their own research and development.

“The Internet is fragmented, complex and constantly growing. MAMA provides us with information on how intensively particular Web technologies are used,” says Opera Vice President Snorre M. Grimsby. “We can use this information to test and ensure the high compatibility, reliability and performance of our products. We want to share this technology with our colleagues so that they can also benefit from it.”
