vovkos December 27, 2016 at 15:13

When you create chthonic monsters, document them

Anyone with even a distant connection to programming would sign their name under this meme of a saying, taken from a wonderful picture by Vladimir Filonov. The real question is how. How exactly should something be documented?

The text below has several objectives:

Give a brief overview (read: take a short stroll through the topic) of the unsatisfactory state of the tooling available for the chthonic monsters of the C/C++ world;
Offer my own alternative solution (free of charge, no SMS and no registration: the project is non-commercial and hosted on GitHub under the MIT license);
Encourage the community to discuss the topic and collect ideas;
Invite readers to join the development of the project on GitHub.

I should say right away that although the project was created primarily as an alternative to Doxygen, or rather an add-on to it, for C and C++ APIs, architecturally it is equally suitable for other languages. This makes it possible to build documentation portals for diverse libraries: the libraries themselves can be written in different languages, while the documentation keeps a uniform look and behavior.

Motivation

By and large, there are exactly two approaches to documenting APIs and libraries: a right one and a wrong one.

The first is to write everything by hand.

It doesn't matter where: in Help & Manual, RoboHelp, Word, or any other editor. Although this traditional method is clear to everyone and still widely used, I am deeply convinced that it is fundamentally wrong. The problem is that it produces documentation that is always out of date in places and lags behind the thing it documents. Keeping separately written documentation (often written by different people, too) consistent with a constantly evolving library API (only dead or frozen products do not evolve!) is a colossal task, only slightly easier than writing the initial documentation.

The second, “right” approach is to generate documentation automatically from the source code.

A specially trained parser walks the sources, extracts specially formatted documentation comments, and builds a tree of the public API declarations. The documentation is then generated in the required format; like most people, I believe, I am primarily interested in HTML and PDF. The main advantage of this approach is guaranteed consistency between the API declarations in the source code and those in the final documentation. Even if the source code contains no meaningful comments with actual “documentation”, we still end up with an excellent snapshot of the library's API, with the ability to “jump” between declarations, type descriptions, and so on.

So, with your permission, I will concentrate on the “right” approach with auto-generation. What options do we have here? Alas, for documenting C/C++ there are sadly few tools that exist and are actually used today: Doxygen and QDoc. And even with these two, not everything goes smoothly.

Doxygen is the first truly successful project to extract comments from C++ code and turn them into HTML documentation with hyperlinks, inheritance graphs, call graphs, and so on. Unlike its direct ancestor, the pioneering DOC++, which never became widespread, Doxygen is now the de facto standard for documenting C/C++ code. And all this would be wonderful if not for two “buts”:

The standard HTML generated by Doxygen is, to put it mildly, not burdened with elegance.

Of course, there is room for subjectivity here. I fully admit that there are less picky people in the world who are completely satisfied with Doxygen's output (I would venture to guess, however, that there are no professional designers among them). But even if the default Doxygen HTML suits someone visually (and seriously, is there anyone who actually likes it aesthetically? Write in the comments!), very often you want to change and customize something that goes beyond tweaking CSS, for example, to lay out declarations with indentation and spacing that follow the coding style adopted in that particular library. This brings us to Doxygen's second, more fundamental problem:

In its long life, Doxygen has never grown real, modular customizability.

Yes, there is a Doxyfile with a bunch of variables, and you can change the HTML headers and CSS, but architecturally everything is hardcoded into a monolithic C++ core! Moreover, both the front end (the source parsers) and the back end (the HTML, PDF, RTF, and other generators, among which, thank heaven, there is XML) are hardcoded.

QDoc by default produces much, much prettier HTML than Doxygen. Unfortunately, if you want something other than the default, QDoc suffers from the same innate rigidity as Doxygen (which clearly grows from the same root: both the parser and the generator are hardcoded into a monolithic C++ core). On top of this rigidity, QDoc, unlike Doxygen, has only a single input parser, for the Qt dialect of C++ (with Q_OBJECT, Q_PROPERTY, foreach, and the like treated strictly as keywords). And at the same time it cannot generate PDF at all!

Alternative

The proposal is to replace a single tool with a pipeline. Instead of

Doxygen -> (HTML, PDF, ...)

... we will use the following pipeline:

Doxygen -> (XML) ->
    -> Some-Bridge -> (reStructuredText) ->
        -> Sphinx -> (HTML, PDF, ...)

What do we keep from the old setup?

Developers already know how to document C/C++ code with Doxygen comments and are used to doing so:

/*!
    \brief This is a brief documentation for class Foo.

    This is a detailed documentation for class Foo.
    ...
 */
class Foo
{
    // ...
};

Why invent a new syntax? We will write the documentation in the same way as before!

Doxygen can extract the documentation from the sources and store it, together with the declaration tree, in an XML database. Perfect! This will be our front end.

The question of what to use as the back end is even easier to answer: Sphinx, of course. Sphinx has deservedly become hugely popular as a tool for writing technical documentation. It produces very attractive HTML with support for full-fledged themes (not just CSS!), it can glue everything into a single HTML page, and it can generate documentation in PDF, EPUB, and many other formats, all out of the box! Most importantly, it is fully customizable with Python scripts, which can be used both to tune the appearance and to extend the input language (for Sphinx, that is reStructuredText), namely to add your own directives and then use them in the documentation.

All that remains is to make Doxygen and Sphinx friends.

Building a bridge

I should note that I am not the first to try to build a bridge between Doxygen and Sphinx. The breathe project, written in Python as a Sphinx extension, has gained some fame. At the moment the project is not being worked on very actively and, alas, is not suitable for serious tasks out of the box. Architecturally, it works as follows: it parses Doxygen's XML output and creates the nodes of the reStructuredText tree directly in memory.

I decided to go a slightly different way. Doxyrest (this is the name of our bridge) parses Doxygen's .xml files and then passes the parsed XML and a set of template files to a template engine (also known as a string template engine or template processor). The template engine generates reStructuredText files, and these .rst files are then handed to the Sphinx back end to produce the final documentation in the desired format.
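To make the data flow concrete, here is a minimal Python sketch of the same idea: read a Doxygen-style XML compound, then push the extracted fields through a text template to get reStructuredText. This is not Doxyrest's code. The XML is a heavily simplified stand-in for real Doxygen output, and `CLASS_TEMPLATE` and `render_compounds` are hypothetical names invented for the illustration.

```python
import xml.etree.ElementTree as ET
from string import Template

# Simplified stand-in for Doxygen XML; real output has many more elements.
DOXYGEN_XML = """\
<doxygen>
  <compounddef id="classFoo" kind="class">
    <compoundname>Foo</compoundname>
    <briefdescription><para>This is a brief documentation for class Foo.</para></briefdescription>
  </compounddef>
</doxygen>
"""

# Hypothetical template, far simpler than the Lua-driven templates Doxyrest uses.
CLASS_TEMPLATE = Template("""\
$title
$underline

$brief
""")

def render_compounds(xml_text):
    """Return a dict mapping an output .rst file name to its generated text."""
    root = ET.fromstring(xml_text)
    pages = {}
    for compound in root.iter("compounddef"):
        name = compound.findtext("compoundname")
        # Collect all text nested inside <briefdescription>.
        brief = "".join(compound.find("briefdescription").itertext()).strip()
        title = f"{compound.get('kind')} {name}"
        pages[name + ".rst"] = CLASS_TEMPLATE.substitute(
            title=title, underline="=" * len(title), brief=brief)
    return pages

pages = render_compounds(DOXYGEN_XML)
print(pages["Foo.rst"])
```

Because the layout lives entirely in the template, changing how a class page looks means editing the template text, not the parsing code.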

The key feature is, of course, the use of a template engine. It lets you fully customize the structure of the documentation: change the order and grouping of documented items (classes, functions, properties, etc.), customize the style of declarations (where and how to use indentation, spaces, line breaks, etc.), apply logic of arbitrary complexity to decide whether a particular item is included in the documentation, and so on, all without recompilation, just by editing the input templates!

But most importantly, the template-engine approach makes it possible to apply Doxyrest to almost any other language, and in particular to all kinds of DSLs, for which nobody will ever build a dedicated documentation system. Doxygen can't parse your language? Take the language's compiler, add generation of Doxygen-like XML from the existing AST, then adjust the templates of the output .rst files so that declarations in the documentation use the correct syntax, and that's it! You can now document your language with Doxygen comments and get beautiful Sphinx documentation as output.

At the moment, Lua is used as the template language (simply because I already had a ready-made and debugged Lua string template library), but in principle nothing prevents adding support for other template languages.
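As a rough illustration of the "generate Doxygen-like XML from the AST" step, the Python sketch below serializes a toy AST into XML. `FunctionDecl` and `ast_to_doxygen_xml` are hypothetical names for an imaginary DSL compiler, and the element names follow only a small subset of what Doxygen emits, so treat the exact shape as an assumption to check against real Doxygen output.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class FunctionDecl:
    """Hypothetical AST node of an imaginary DSL compiler."""
    name: str
    return_type: str
    doc: str

def ast_to_doxygen_xml(namespace, functions):
    """Serialize function declarations as a Doxygen-like XML string."""
    root = ET.Element("doxygen")
    compound = ET.SubElement(
        root, "compounddef", id=f"namespace_{namespace}", kind="namespace")
    ET.SubElement(compound, "compoundname").text = namespace
    section = ET.SubElement(compound, "sectiondef", kind="func")
    for fn in functions:
        member = ET.SubElement(
            section, "memberdef", kind="function", id=f"{namespace}_{fn.name}")
        ET.SubElement(member, "type").text = fn.return_type
        ET.SubElement(member, "name").text = fn.name
        brief = ET.SubElement(member, "briefdescription")
        ET.SubElement(brief, "para").text = fn.doc
    return ET.tostring(root, encoding="unicode")

xml_text = ast_to_doxygen_xml(
    "io", [FunctionDecl("open", "File", "Opens a file.")])
print(xml_text)
```

The compiler already knows every declaration and its attached comment, so the exporter is mostly a tree walk; the language-specific syntax is then handled in the templates.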

The templates look and work like this:

    Title
    =====
    %{
    if false then
    }
    This text will be excluded..
    %{
    end -- if
    for i = 1, 3 do
    }
    * List item $i
    %{
    end -- for
    }

The output will be:

    Title
    =====
    * List item 1
    * List item 2
    * List item 3
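For readers who do not know Lua, the control flow of the template above can be written out as plain Python. This is only an equivalent of the example's logic, not a description of how the template engine works internally; `render_template` is a name made up for the sketch.

```python
def render_template():
    """Produce the same text as the template example, line by line."""
    lines = ["Title", "====="]
    if False:  # mirrors "if false then": the guarded text is never emitted
        lines.append("This text will be excluded..")
    for i in range(1, 4):  # Lua's "for i = 1, 3" includes both 1 and 3
        lines.append(f"* List item {i}")  # "$i" is replaced by the loop value
    return "\n".join(lines)

print(render_template())
```

Text outside the `%{ ... }` blocks plays the role of the `append` calls here, while the code inside the blocks decides whether and how many times that text is emitted.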

Usage examples

It is better to see once than to hear a hundred times. So instead of a conclusion, I decided simply to give links to the results of Doxyrest's work as applied to different languages:

Jancy Standard Library Reference (Jancy language)
Jancy C API Reference (C language)
IO Ninja API Reference (Jancy language)
AXL Library Reference (C++)

Although the documentation at the links above is incomplete in content (the actual descriptions of classes, functions, etc.), it should be enough to demonstrate that the approach works.

GitHub project page: http://github.com/vovkos/doxyrest

The project is released under one of the most permissive licenses in the world, the MIT License. Take a look, try it out, join the development. I will be happy to answer any questions in the comments.
