The difference between markup and presentation

    After reading the comments on the post "Firefox 3: * {display: block} bug", I realized that a significant share of Habr's readers, including people seriously involved in web development, do not have an entirely accurate picture of what HTML is and why tags are rendered the way they are and not otherwise.



    As always, let's start with the history



    All this has already been written thousands of times, but nonetheless. HTML traces its lineage to GML, a language developed by IBM in the late 1960s for a universal document exchange system. In the mid-1980s its successor was standardized by ISO under the name SGML, and finally, in the early 1990s, Berners-Lee developed HTML based on it.

    Flies separately, cutlets separately: is that possible?

    It is. It is even necessary. This is exactly what the IBM engineers developing GML decided: they separated the semantic meaning of document elements (the logical structure) from their appearance (the presentation). The logic of a document (the meaning of its constituent parts) is an integral part of the document itself, while its appearance was to be controlled through external style files.

    The rules for composing a document (the list of tags and the rules for using them) were also moved into external files: DTDs.

    As a result, we get a system of three components:
    1. DTD - defines the rules for using markup elements.
    2. Documents - data marked up in accordance with the rules of the DTD.
    3. Style files - define the appearance of markup elements.

      HTML versions 1 and 2 included only the first two items in their scheme and gave document authors no way to attach external style files, leaving the appearance of the document to the browser's discretion. The HTML 2.0 specification only gave recommendations on how to display certain elements of a document.

      Money wins ...

      With the boom in Internet commerce in the mid-1990s, the capabilities of HTML 2.0 were no longer enough. The enterprising Netscape, then the leader of the browser market, and after it Microsoft, which also wanted a piece of the pie, began building various "extensions" of HTML into their browsers, in effect turning it from a markup language into a design language. Some of these innovations made it into HTML 3.2 as attributes such as background and color and tags such as font.

      CSS to the rescue 1

      In 1996, the W3C approved the CSS specification, which was meant to stop the mess and return HTML to its original purpose: defining the meaning of elements in a document, not their appearance. In other words, to give HTML the very third component that GML had.

      However, nothing passes without a trace. The era of the browser wars and HTML 3.2 left a serious mark both on browser engines, which are forced to support the HTML "extensions", and on the minds of webmasters, who got used to mixing logic and presentation.

      How do browsers determine the appearance of elements?



      The main "instruction" a browser has for parsing an HTML document is the DTD, which contains the list of tags and attributes, the rules for their placement, and other rules for structuring a document.

      For example, the definition of the style tag says that the tag can contain data of type CDATA, and that the internationalization attributes (lang, dir) as well as the type, media and title attributes are allowed.

      The p and hr tags are described in the same way, except that hr cannot have content, while p can contain inline content.
      Note that the DTD says nothing about hr being displayed as a horizontal line or p adding vertical margins!

      So how does the browser know that p should get margins and strong should be displayed in bold? Well, what is responsible for appearance in general? Right, the third component of the GML model: styles. In the case of HTML, this is CSS.

      If you open the URI resource://gre/res/html.css in Firefox, you can see the very styles that it applies to HTML documents by default.

      Here is the style for the paragraph:
      p, dl, multicol {
        display: block;
        margin: 1em 0;
      }
      

      And here is the one for the horizontal rule:
      hr {
        display: block;
        height: 2px;
        border: 1px inset;
        margin: 0.5em auto 0.5em auto;
        color: gray;
        -moz-float-edge: margin-box;
        -moz-box-sizing: border-box;
      }
      hr[size="1"] {
        border-style: solid none none none;
      }
      

      Simple and logical, isn't it?
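
      The division of labour between the DTD and the default style sheet can be sketched in a few lines of Python. This is only a toy model under stated assumptions: CONTENT_MODEL is a hypothetical, heavily simplified stand-in for the DTD, UA_STYLES copies a few declarations from the html.css excerpts quoted above, and default_style is a name made up for illustration, not anything a browser exposes.

```python
# Toy model: structure (the DTD) and presentation (the default style sheet)
# live in two unrelated tables. All names here are illustrative assumptions.

# Simplified stand-in for the DTD: what each element may contain.
CONTENT_MODEL = {
    "style": "CDATA",
    "p": "inline",
    "hr": "EMPTY",
}

# A few declarations copied from the html.css excerpts quoted in the article.
UA_STYLES = {
    "p": {"display": "block", "margin": "1em 0"},
    "hr": {"display": "block", "height": "2px", "border": "1px inset"},
}


def default_style(tag):
    """Return a copy of the default declarations for a tag (empty if none)."""
    return dict(UA_STYLES.get(tag, {}))


if __name__ == "__main__":
    for tag in ("p", "hr"):
        # The content model only says what the element may contain;
        # everything about its appearance comes from the style table.
        print(tag, "may contain", CONTENT_MODEL[tag], "and is styled as", default_style(tag))
```

      Nothing in CONTENT_MODEL mentions margins or borders, and nothing in UA_STYLES mentions what an element may contain, which is the separation the article describes.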

      Let's get back to the behavior described in "Firefox 3: * {display: block} bug" and see what rules Firefox follows when displaying the main blocks of a document:
      html, div, map, dt, isindex, form {
        display: block;
      }
      body {
        display: block;
        margin: 8px;
      }
      ...
      /* hidden elements */
      area, base, basefont, head, meta, script, style, title, noembed, param {
         display: none;
      }
      

      What do we end up with? This sequence of rules:
      area, base, basefont, head, meta, script, style, title, noembed, param {
         display: none;
      }
      ...
      * {
         display: block;
      }
      

      According to the cascading rules defined by CSS 2.1, author rules take precedence over user agent rules, so the second rule overrides the first: every element of the document, including head and style, becomes a block, and the contents of the style tag, which are of type CDATA (see above), are displayed in the browser.
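
      The outcome can be reproduced with a small Python sketch of one slice of the cascade. It is an assumption-laden toy, not how any browser engine is implemented: ORIGIN_PRIORITY, RULES and computed_display are hypothetical names, the user agent selector list is abbreviated, and specificity and !important are left out.

```python
# Toy model of one slice of the CSS 2.1 cascade: for normal declarations,
# author rules beat user agent rules, and within one origin the later rule
# wins. Specificity and !important are deliberately ignored.

ORIGIN_PRIORITY = {"user-agent": 0, "user": 1, "author": 2}

# (origin, selectors, declarations), listed in source order.
# The user agent selector list is an abbreviated version of the one above.
RULES = [
    ("user-agent", {"head", "script", "style", "title"}, {"display": "none"}),
    ("author", {"*"}, {"display": "block"}),
]


def computed_display(tag, rules=RULES):
    """Return the winning `display` value for a tag, or None if nothing matches."""
    matching = [
        (ORIGIN_PRIORITY[origin], order, declarations["display"])
        for order, (origin, selectors, declarations) in enumerate(rules)
        if ("*" in selectors or tag in selectors) and "display" in declarations
    ]
    if not matching:
        return None
    # Highest origin priority wins; source order breaks ties within an origin.
    return max(matching)[2]


if __name__ == "__main__":
    print(computed_display("style"))             # author rule applies
    print(computed_display("style", RULES[:1]))  # user agent rule only
```

      Leaving specificity out is safe for this particular case, because CSS 2.1 compares the origin of a declaration before it compares specificity: the author's universal selector wins even though it is the least specific selector possible.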

      Apparently, Opera, Safari and Konqueror do the same.

      Ahead of the whole planet ...


      ... as always, is Internet Explorer. It is hard to say how Microsoft's browser processes documents, but judging by a number of signs, it relies not on DTDs and styles but on some idea of HTML of its own. Apparently the legacy of the HTML 3.2 era is showing.

      Conclusion


      In addition to de jure standards, the web, like other fields, has many de facto standards with historical roots. Many people take such standards for granted, which breeds a certain inertia of thinking. Faced with something natural and predictable that nevertheless violates the usual order, people with this inertia are surprised and declare the unfamiliar behavior a bug.

      But I would like to recall a proverb: trust, but verify! It often turns out on verification that the bug is not a bug at all, but a consequence of ignorance, misunderstanding, or that very inertia of thinking.


      References








      1 Title of the chapter in Eric Meyer's book “CSS - Cascading Style Sheets”
