Program Documentation
At a certain stage in the development of a software system, the need to write user documentation inevitably arises, and with it the technical question of which formats and documentation tools to choose.
1.1. Output formats
Choosing the final format is usually not a problem, because the target operating system dictates it. For Windows programs it is compiled help (CHM); for Linux and BSD systems it is man. The common format for online help is HTML, and for printing it is PDF.
The situation becomes more complicated when documentation is needed in several formats: for distribution with the program (CHM or man), for posting on a website (HTML), and for printing (PDF). The content may also vary slightly between formats. For example, it makes sense to include video clips in online documentation, whereas in the print version they must be replaced with a static image, possibly supplemented by a QR code linking to the video. In addition, the content may differ for different categories of users, versions, product bundles, and other factors.
1.2. Source Formats
Although it seems obvious that purpose-built programs should be used, things are not that simple.
The approaches differ depending on the target operating system.
1.2.1. Proprietary Source Formats
To create compiled help for Windows in CHM format, Microsoft suggests its free HTML Help Workshop compiler. The source texts must be prepared in HTML (no editor is included), and the table of contents files must follow a specific format. No means of producing printed manuals are provided.
Of course, specialized help authoring programs (RoboHelp, Help & Manual, HelpScribble and the like) offer a high level of convenience, can generate output in various formats, and can even profile the content to some extent.
However, they have the following disadvantages:
- First, all of these systems are commercial and are licensed per workstation.
- Secondly, the internal format they use is proprietary and is not supported by any software other than the product being sold. You can, of course, import files into a project, but you cannot export the project in any open format suitable for further processing. Even where XML is used as the internal format (as in Help & Manual), its schema remains closed and undocumented.
- Thirdly, the options for changing the appearance of the output are insufficient to produce, for example, documentation that complies with GOST requirements.
- Fourthly, teamwork with these programs is extremely difficult to organize, if it is possible at all.
1.2.2. Simple markup formats
A rational alternative is to use simple markup formats, which are therefore quick to learn.
There are several such formats today:
- AsciiDoc, used de facto on Linux and BSD systems;
- Wiki markup, used in various online encyclopedias, to which it even gave its common name;
- Markdown, a general-purpose documentation format.
All these markups rely on a set of symbolic, tag-free conventions for headings, illustrations, and links, and are meant to be edited in plain text editors. Rendering into a form suitable for reading is done programmatically, usually on the server side.
For example, Wikipedia converts wiki markup to HTML on the fly. The web portal of the Git system, http://github.org, can likewise display Markdown documents in readable form in a browser.
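As a rough illustration of such symbolic markup, the following shell snippet writes a small Markdown file. The file name sample.md, the link target, the image path, and the text are all made up for the example; only the markup conventions themselves (headings, emphasis, link, illustration) are the point.

```shell
# Write a tiny Markdown document to sample.md (hypothetical file name).
cat > sample.md <<'EOF'
# First-level heading

## Second-level heading

A paragraph with *emphasis* and a [link](http://example.com/).

![Alternate text for the illustration](images/screenshot.png)
EOF

# Count the heading lines (those that start with '#').
grep -c '^#' sample.md
```

The file stays readable as plain text, which is what makes editing in any simple editor practical.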
1.3. Editors
Although a plain text editor such as Notepad is enough to create and edit the source texts, some convenience features, such as spell checking and markup highlighting, are very useful for the writer.
The article http://www.ixbt.com/soft/markdown-online-2.shtml reviews online editors that support Markdown syntax, and http://www.ixbt.com/soft/markdown-online-3.shtml reviews five desktop editors that support Markdown out of the box.
One such editor is MarkdownPad.
1.3.1. MarkdownPad

Figure 1. MarkdownPad 2 editor
As the screenshot shows, MarkdownPad 2 offers a live preview of the file being edited, with synchronized scrolling of the source text and the rendered result.
After installation on Windows 8, the preview may turn out to be unavailable.

Figure 2. Report about the crash of the preview system
According to the developers (http://markdownpad.com/faq.html#livepreview-directx), this happens because the preview requires the Awesomium 1.6.6 SDK, a web rendering component that in turn uses DirectX.
The editor supports syntax highlighting, spell checking for one language at a time (including Russian), and export to HTML and PDF (PDF only in the paid version). In other words, MarkdownPad 2, like other dedicated editors, is a good choice for a technical writer. If you have to edit files in various formats, you can instead adapt your usual editor for Markdown texts.
1.3.2. Notepad++
Notepad++ can be considered an editor that adequately meets these requirements. Spell checking for many languages is available through a special plugin, and text can be checked in several languages at the same time.

Figure 3. Notepad++ editor
Simple as the markup rules are, authors will find it more convenient to work with syntax highlighting. For Notepad++, the Markdown Syntax Highlighting for Notepad++ project helps here; in essence, it is a user-defined language configuration file for Markdown. After it is installed, the text in the editor looks as follows.

Figure 4. Notepad++ editor with highlighted Markdown markup elements
1.4. Quoda
It is noteworthy that editors with Markdown support exist even for mobile platforms. The image below is a screenshot of a smartphone running the Quoda Code Editor.

Figure 5. Quoda Code Editor - a universal editor for Android with markdown markup support.
It is worth mentioning that most of this article was typed in this editor and only then transferred to a computer for revision.
Based on the analysis of the capabilities of the Markdown markup language and special editors, we can recommend their use for documenting systems of medium complexity.
1.4.1. Open Tag Formats
At the same time, for large systems an open, well-documented format should be used as the source format for program documentation, and the processing tool should offer broad capabilities for customizing appearance, profiling, and generating documents in various formats.
Systems such as DITA and DocBook fully meet these requirements.
Despite some differences, both systems have much in common:
- both use documented, schema-based XML as the source format, so any XML editor with a validation function can be used for editing;
- almost any XSLT processor (xsltproc, xalan, saxon, etc.) can be used to convert the source to one of the output formats;
- to obtain a PDF document, the intermediate XSL-FO format is used, from which the PDF is produced by any FO processor (for example, Apache FOP or XEP);
- appearance and profiling are customized through numerous transformation parameters and, if necessary, by adding custom XSL templates.
It should be emphasized that these systems use semantic markup in the source documents. The appearance of the output document is determined by the rules and parameters of the transformations. This approach allows the author not to think about typography and design at the stage of writing the source texts, but to focus solely on the semantic content.
At the same time, practical experience with DocBook in particular, confirmed by a number of publications, shows that even such a well-thought-out technology brings some difficulties:
- creating sources in XML that conforms to a particular schema requires the technical writer to work with special editors;
- good XML editors with DocBook support are expensive commercial products (for example, oXygen XML Editor, Altova XMLSpy XML Editor);
- the rich markup capabilities of XML make the format more complex. For example, inserting a captioned illustration in DocBook markup requires four nested tags.
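To make the last point concrete, here is a minimal sketch of one common form of a captioned DocBook figure, written to a file from the shell. The file name figure.xml and the image path are made up for the example, and the well-formedness check runs only if xmllint happens to be installed.

```shell
# Write a DocBook figure to figure.xml (hypothetical file and image names).
cat > figure.xml <<'EOF'
<figure>
  <title>MarkdownPad 2 editor</title>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/markdownpad.png"/>
    </imageobject>
  </mediaobject>
</figure>
EOF

# Optionally check that the fragment is well-formed XML.
if command -v xmllint >/dev/null 2>&1; then
    xmllint --noout figure.xml
fi
```

The nesting figure, mediaobject, imageobject, imagedata is what a single Markdown image line stands for.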
Naturally, the above disadvantages hinder the widespread use of XML-based single source technologies.
If tag-free formats are used to prepare offline or printed documentation, conversion utilities are required. Among the many converters, the pandoc program deserves special attention.
1.5. Pandoc conversion utility
Pandoc is a cross-platform command-line program that can convert texts in a wide variety of markup languages to numerous output formats.
For example, pandoc can convert source documents in AsciiDoc, wiki, or Markdown markup to HTML. If LaTeX is installed, PDF output also becomes possible.
For instance, the source text of this article can be converted to HTML with the following command:
pandoc -f markdown pandoc.md -t html -o pandoc.html -H h.html
The result will be a ready-made html file:

Figure 6. HTML document generated from Markdown by the pandoc utility.
For its versatility, the author of the program figuratively calls it a "Swiss Army knife".
Indeed, pandoc handles this conversion without noticeable loss of information. When converting from Markdown, it reads three metadata parameters: the title, author, and date of the document. Command-line options can be used to set certain properties, such as the document language. You can also specify your own output document template and thus modify the result to some extent.
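The metadata and language options mentioned above might be used as in the following sketch. It assumes pandoc's Markdown title block (three lines starting with %) and the lang variable set with -V; the file names, the author name, and the date are placeholders, and the conversion is skipped if pandoc is not installed.

```shell
# Markdown source with a pandoc title block: title, author, date.
# The author name and the date are placeholders.
cat > meta-demo.md <<'EOF'
% Program Documentation
% A. N. Author
% 2014-01-01

Body text of the document.
EOF

# -s produces a standalone document, so the metadata reaches the template;
# -V lang=en sets the document language variable (assumed value).
if command -v pandoc >/dev/null 2>&1; then
    pandoc -f markdown -t html -s -V lang=en -o meta-demo.html meta-demo.md
else
    echo "pandoc is not installed; skipping the conversion" >&2
fi
```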
In the example above, it is assumed that the current folder contains a file h.html that plays the role of a header. If we add to this file a link to a style file and a definition that makes links open in a new tab, we get the following result:
Figure 7. HTML document generated by pandoc using a header file with style links
As the example shows, the headings have acquired their own style, and external links now open in a new browser tab.
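The exact content of h.html is not reproduced in the article, so the following is only a guess at what such a header file could look like. The stylesheet name style.css is hypothetical, and the base element is one assumed way to get links to open in a new tab (note that it affects every link on the page, not only external ones).

```shell
# Create the header fragment that pandoc inserts into <head> via -H.
# style.css is a hypothetical stylesheet name; <base target="_blank"> is an
# assumed way to make links open in a new browser tab.
cat > h.html <<'EOF'
<link rel="stylesheet" type="text/css" href="style.css">
<base target="_blank">
EOF

# Re-run the conversion if pandoc and the source file are available.
if command -v pandoc >/dev/null 2>&1 && [ -f pandoc.md ]; then
    pandoc -f markdown pandoc.md -t html -o pandoc.html -H h.html
fi
```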
These features justify using Markdown to document relatively small software systems whose documentation is not subject to GOST formatting requirements, as its widespread use in the Git system confirms.
For large systems with extensive and complex documentation, the DocBook single-source system appears to be the appropriate choice. There may also be transitional cases in which the scale of the project does not become apparent immediately.
1.6. DocBook
The complexity of creating XML source files can be overcome by writing the source in Markdown and then converting it to DocBook. This conversion is supported by the pandoc utility. Thus, the command
pandoc -f markdown pandoc.md -t docbook -o pandoc4.xml -H h.xml
will create the resulting file pandoc4.xml.
The header file h.xml (it may simply be empty) is necessary for the metadata to be processed correctly and for the article to be generated.
Figure 8. The generated article in the XML editor
Several additional requirements should be noted for Markdown sources that will be converted to DocBook:
First, avoid angle brackets (< and >) in the text, since XML uses them to delimit tags and the converter leaves them as they are. If the meaning requires angle brackets, the entities &lt; and &gt; should be used.
Secondly, when you insert a picture, be sure to enter alternate text, since pandoc uses it to create the required title tag for the figure tag.
However, the output is generated in the outdated DocBook version 4 format, while the modern version 5 provides significantly richer possibilities for semantic markup.
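Before moving on to the version upgrade, the two markup requirements listed above can be checked mechanically. The sketch below writes a sample source that follows both rules and then searches it for violations; the file and image names are made up, and the checks are deliberately crude (for instance, the first one would also flag autolinks written in angle brackets).

```shell
# Sample source that follows both requirements (hypothetical names).
cat > docbook-ready.md <<'EOF'
Use the &lt;figure&gt; element for illustrations.

![Figure 1. MarkdownPad 2 editor](images/markdownpad.png)
EOF

# Report lines that still contain raw angle brackets.
if grep -n '[<>]' docbook-ready.md; then
    echo "raw angle brackets found" >&2
fi

# Report images whose alternate text is empty: ![](...)
if grep -nF '![](' docbook-ready.md; then
    echo "image without alternate text found" >&2
fi
```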
To convert the text from version 4 to version 5, you can use the db4-upgrade.xsl stylesheet, which is included with DocBook.
xsltproc -o pandoc5.xml db4-upgrade.xsl pandoc4.xml
The resulting DocBook 5 XML file can be used to build a single source.

Figure 9. The DocBook 5 article in an XML editor in author mode
The described chain of transformations may seem at first glance to be long and unreasonably complex. However, having mastered the necessary tools once and developed batch files (scripts) for frequently performed tasks, you can save a considerable amount of time in the future.
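A script for the chain described above could look roughly like the sketch below. It only strings together the commands already shown in this article; the function names, the DRY_RUN switch, and the DOCBOOK_XSL variable are introduced here for illustration, and the stylesheet path is a placeholder that must be replaced with the real DocBook XSL location.

```shell
#!/bin/sh
# Sketch of a build script: Markdown -> DocBook 4 -> DocBook 5 -> XSL-FO.
# DOCBOOK_XSL and DRY_RUN are hypothetical variables introduced for this example;
# db4-upgrade.xsl and h.xml are expected in the current directory.

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run the three conversion steps; stop at the first failure.
build_docs() {
    name=$1   # base name of the source, e.g. "pandoc" for pandoc.md
    run pandoc -f markdown "$name.md" -t docbook -o "${name}4.xml" -H h.xml &&
    run xsltproc -o "${name}5.xml" db4-upgrade.xsl "${name}4.xml" &&
    run xsltproc -o "${name}5.fo" "$DOCBOOK_XSL/fo/docbook.xsl" "${name}5.xml"
}

# Preview the commands without running them (placeholder path).
DRY_RUN=1
DOCBOOK_XSL=/path/to/docbook-xsl
build_docs pandoc
```

With DRY_RUN set to 0 and a real stylesheet path, the same call performs the conversions.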
It should be emphasized that single-source technology has a pronounced cumulative effect. The initial time spent on developing typical reusable text fragments pays off when they are used in subsequent projects. It is this quality that makes single-source technology especially attractive for documenting software systems produced in series.
The DocBook set of transformations supports generating styled HTML and print-ready PDF out of the box, so to speak.
xsltproc -o pandoc5.fo c:\<Path to DocbookXSL>\fo\docbook.xsl pandoc5.xml
The appearance of the output documents can be adjusted to a certain extent using parameters. The resulting XSL-FO file, pandoc5.fo, is an intermediate format needed to build the final PDF.
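As a sketch of both points, the snippet below passes one stylesheet parameter and then renders the FO file to PDF with Apache FOP, one of the FO processors named earlier. The DOCBOOK_XSL variable and its placeholder path are assumptions of this example, paper.type is used as a sample DocBook XSL parameter, and each step is skipped when the tool or its input is missing.

```shell
# DOCBOOK_XSL is a hypothetical variable holding the DocBook XSL directory;
# the default below is a placeholder, not a real path.
DOCBOOK_XSL=${DOCBOOK_XSL:-/path/to/docbook-xsl}

# Regenerate the FO file with the page size set through a stylesheet parameter.
if command -v xsltproc >/dev/null 2>&1 && [ -f pandoc5.xml ] &&
   [ -f "$DOCBOOK_XSL/fo/docbook.xsl" ]; then
    xsltproc --stringparam paper.type A4 \
        -o pandoc5.fo "$DOCBOOK_XSL/fo/docbook.xsl" pandoc5.xml
fi

# Apache FOP turns the intermediate FO file into the final PDF.
if command -v fop >/dev/null 2>&1 && [ -f pandoc5.fo ]; then
    fop -fo pandoc5.fo -pdf pandoc5.pdf
fi
```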
Equally important is the ability to automatically generate a table of contents, a list of illustrations, listings, tables, an index, a glossary of terms and a list of references.
With a large number of documents in a set, it is also possible to create a separate list of them and to generate correctly formatted links to them automatically. When a typographic layout of the manual must meet special requirements, for example GOST, additional XSL templates have to be developed for regular pages, the cover, and the final pages.
This may be the subject of the next article.