Document generation - personal experience

In my previous articles I tried to show individual fragments of the Document Generator. As the discussions made clear, such fragments already exist in various implementations, and discussing them in isolation is not interesting. Indeed, why discuss individual bricks when you cannot see the whole building? So in this article I will try to show the whole building. I will describe my vision of how a document generator should be implemented, based on personal experience gained at one of the largest banks in Russia. I started from practice and implemented the generator on top of MS Word and Excel; what follows is what emerged from that process.

1. The generator should be able to generate documents based on templates. The resulting documents should be as ready to use as possible and contain a minimum of places that require manual editing after generation. Templates should be developed and maintained by the business itself; IT's role should be limited to making it possible to obtain from the databases the information required to fill in the templates. As will become clear later, this information consists of field values and answers to questions.
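As a minimal sketch of template-based substitution, the snippet below fills field markers with values and reports the fields that had no value, so they can be entered manually. The `{{field}}` marker syntax and the function name `fill_template` are assumptions made for illustration; the article does not describe the actual markup.

```python
import re

# Hypothetical marker syntax: {{field_name}}. The real markup is not specified.
FIELD = re.compile(r"\{\{(\w+)\}\}")

def fill_template(template: str, values: dict) -> tuple:
    """Substitute field values; return the text and the names of unfilled fields."""
    missing = []

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in values:
            return values[name]
        if name not in missing:
            missing.append(name)
        return match.group(0)  # leave the marker in place for manual input

    return FIELD.sub(substitute, template), missing
```

For example, `fill_template("Dear {{client}}, contract {{number}}.", {"client": "Ivanov"})` fills the client name and reports `number` as missing.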

2. The generator should automatically build UI forms from the templates so that information missing from the database can be entered manually. A value entered manually for such a field is then substituted in every place of the document(s) where the field occurs, agrees grammatically with the surrounding words, takes the correct case, and so on. Manually entered values must be stored in the database and, if necessary, copied to the main database, following the principle of single input. Some fields should simply be shown on the form without the possibility of editing, purely for visual control.

Some fields should be hidden: these are calculated fields and fields that are filled with dashes at a certain stage of preparing the document or contract. The values of dash-filled fields are determined later, while the terms (of a contract, for example) are being agreed. Fields are placed on the UI form in a specific order on specific tabs, and each has an appropriate control type and input mask. When the Generate command is issued, the field values are validated, and in case of discrepancies the user receives an error message or a warning.
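The field properties described above could be modelled roughly as follows. This is only a sketch: the `Field` attributes, the use of regular expressions as input masks, and the `validate` function are assumptions made for illustration, not the generator's actual data model.

```python
import re
from dataclasses import dataclass

@dataclass
class Field:
    name: str
    tab: str                 # UI tab the field is placed on
    mask: str = r".+"        # input mask, expressed here as a regular expression
    editable: bool = True    # False: shown for visual control only
    dash_fill: bool = False  # True: printed as dashes until the terms are agreed

def validate(fields: list, values: dict) -> list:
    """Return error messages for editable fields whose values violate the mask."""
    errors = []
    for f in fields:
        if f.dash_fill or not f.editable:
            continue  # not entered by the user at this stage, nothing to check
        value = values.get(f.name, "")
        if not re.fullmatch(f.mask, value):
            errors.append(f"{f.name}: value {value!r} does not match the input mask")
    return errors
```

An empty result from `validate` would let generation proceed; otherwise the messages can be shown to the user as errors or warnings.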

Using a special development environment, template developers set the order of the fields and their location on the tabs of the UI form, select input and output formats, mark the fields to be filled with dashes, and so on. The role of IT in these processes can be purely advisory. Practice shows that good documentation, training examples, an already trained employee, and a friendly development environment can almost completely free IT from preparing templates, input forms, and questionnaires. This means that the business supplies itself with the documents it needs, frees IT from routine, eliminates intermediaries between requirements and implementation, and works on a schedule that suits the business itself. Well and of course

The development of templates, in addition to an understanding of the requirements for the final document, is painstaking work that requires attention and diligence. Marking up templates for a future document is not a job for young, ambitious, but impatient IT workers; here, in my experience, a woman's touch and diligence, and a knack for coping with monotonous operations, are most welcome. Template support and modification is likewise, to my mind, more women's bread than men's.

I once tried to mark up a document for the business myself, and to be honest, it brought me no joy, although it did give me some good ideas for improving the template preparation process. Since then I have had great respect for the women who, without those improvements, patiently and without complaint carried that difficult load.

3. The number of templates should be kept to a minimum, but the templates themselves should not be huge. How many templates there should be and how large they should be is for the template developers, i.e. the business, to decide. There are no exact rules that uniquely determine template size, but there can be recommendations. One of them: the larger the template, the more computing resources the Generator needs to generate documents from it. So you should not pack the whole world into one template.

In practice, the size of a template is determined by how many different documents can be generated from it, how similar they are to each other, whether they relate to the same topic, and so on. Templates of 100 pages or more are not uncommon.

When I first saw such huge templates, I doubted, for all my confidence in the competence of the business, that the choice was right. But to point out the business's mistakes, you need to work out what was done wrong and offer a better option, and that requires immersion in business practice, which in turn requires the IT employee to acquire competencies unrelated to his current skills. So, as the proverb goes, you do not bring your own rules to someone else's monastery. You have to trust people: 100 pages means 100 pages.

It is clear that template developers do not think in terms of performance and do not count the cost of computing resources. They are interested in the result. Nevertheless, they do want documents to be generated as quickly as possible.

Therefore, template markup should let template developers control the speed of generation, and template developers do, of course, take advantage of this. The idea behind reducing template size is that the template can be large but consist of pieces that are loaded into memory as needed. That is, when the template is compiled, the compiler splits it according to the markup into pieces, and at generation time only the pieces needed for the document being assembled are loaded into memory. From the template developer's point of view the template can be large, but from the Generator's point of view it is a small base part plus specific pieces that are substituted back into their original places when needed. Working with templates this way reduces generation time and, accordingly, the computing resources consumed. With this approach, responsibility for generation speed lies with the template developer (who is a business employee).
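The "small base part plus pieces loaded on demand" idea can be sketched as follows. The `[[piece]]` placeholder syntax, the in-memory `PIECES` store (standing in for compiled pieces on disk), and the function names are all hypothetical.

```python
from functools import lru_cache

# Hypothetical store of compiled template pieces; in practice these might be files.
PIECES = {
    "base": "Agreement\n[[guarantor]]\n[[pledge]]\nSignatures",
    "guarantor": "Section on the guarantor",
    "pledge": "Section on the pledge",
}
LOADED = []  # records which pieces were actually loaded

@lru_cache(maxsize=None)
def load_piece(name: str) -> str:
    """Load one piece; the cache ensures each piece is loaded at most once."""
    LOADED.append(name)
    return PIECES[name]

def assemble(needed: set) -> str:
    """Build a document from the base part plus only the pieces that are needed."""
    lines = []
    for line in load_piece("base").splitlines():
        if line.startswith("[[") and line.endswith("]]"):
            name = line[2:-2]
            if name in needed:
                lines.append(load_piece(name))  # substituted in its original place
        else:
            lines.append(line)
    return "\n".join(lines)
```

Calling `assemble({"guarantor"})` never touches the `pledge` piece, which is the point of the approach: the cost of generation follows the document being assembled, not the full size of the template.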

4. The generator should be able to include articles, paragraphs, and parts of other documents that are not present in the template itself. The template only marks the place where such material can be inserted. That is, the generator should be able to synthesize a document from separate paragraphs and fragments of other documents. The main difficulty here is merging heterogeneous external text with the main document. For example, the library may store individual contract clauses that must be inserted into the main document depending on the terms of the contract. On insertion, you need to align the font and indentation, determine the clause numbers, and resolve references to other clauses of the current document as well as to clauses of another document (for example, the general contract), and so on.

All this requires, on the one hand, a certain standardization in how individual library items are maintained, and on the other hand, the template creator needs a set of markup elements for fine-tuning the handling of non-standard text fragments (you never know from what sources the documents, texts, tables, or charts to be inserted into the generated document will come). The clause library is maintained and supported by the business. Changing the text of a clause changes the text of all documents that include this clause. Practice also reveals which parts of one template are always identical to parts of others. In such cases, these pieces can be moved to the template library (or a file folder holding the parts common to various documents), so that when changes occur (and they are not rare these days) the same text is edited in one place rather than in many templates. To make the desired fragment quick to find, it must be given a search descriptor.
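The numbering and cross-reference part of clause insertion can be illustrated with a small sketch. It assumes clauses are stored as `(clause_id, text)` pairs and that references are written as `{ref:clause_id}`; both conventions, and the names used, are invented for this example. Font and indentation alignment are outside its scope.

```python
import re

# Hypothetical clause library maintained by the business.
CLAUSE_LIBRARY = {
    "penalty": "Late payment under clause {ref:repayment} incurs a penalty.",
}

def assemble_clauses(clauses: list) -> list:
    """Number (clause_id, text) pairs and resolve {ref:clause_id} references."""
    numbers = {cid: n for n, (cid, _) in enumerate(clauses, start=1)}

    def resolve(text: str) -> str:
        # Replace each reference with the current number of the target clause.
        return re.sub(r"\{ref:(\w+)\}", lambda m: str(numbers[m.group(1)]), text)

    return [f"{numbers[cid]}. {resolve(text)}" for cid, text in clauses]
```

Because numbers are computed only after all insertions, adding a library clause in the middle of the contract shifts the following clauses and the references to them consistently.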

5. The text formatting attributes available in the finished document should be sufficient for the needs of the business. These are bold, italic, underline, and strikethrough, and highlighting with a specific set of colors (each highlight color can carry its own meaning). Support for displaying revisions is very important, so that both the original and the corrected text can be seen in the document. The user should be able to switch between the final view of the document and the view showing edits (as in MS Word). The generator should be able to create the document's table of contents automatically, as well as footnotes.

Also, the template designer should have at hand a sufficient set of special characters that the document's user will interpret unambiguously, such as check marks of any kind, boxes with and without check marks, a pointing index finger, and so on. The generator must be able to insert pictures from other documents or from files (for example, a QR code), create links to websites, and place the individual characters of a word (for example, a code word) into the cells of a table, usually a single-row one. And, of course, there should be a set of markup tags for creating tables of any complexity, with a variety of header and row layouts. In addition, there are markup tags for deleting (collapsing) empty rows or columns of a table if they were not filled with data after generation.
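Collapsing empty rows and columns could look roughly like this. The sketch assumes a simple rectangular table with a single header row, represented as lists of strings; real Word tables with merged cells would need more care, and `compress_table` is a hypothetical name.

```python
def compress_table(header: list, data: list) -> tuple:
    """Drop data rows that are entirely empty, then columns with no data at all."""
    data = [row for row in data if any(cell.strip() for cell in row)]
    # A column is kept only if at least one remaining data row has a value in it.
    keep = [i for i in range(len(header)) if any(row[i].strip() for row in data)]
    return [header[i] for i in keep], [[row[i] for i in keep] for row in data]
```

The header row is deliberately excluded from the emptiness check, since a column with a caption but no data is exactly the case the markup is meant to remove.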

6. The generator must be able to return the result to the calling program or save the final document to a file folder. The generator is implemented as several web service methods, one of which generates a document or a batch of documents from a template. The web service can be called from any program or DBMS. If the result is saved to a file, it is very important to have a flexible, configurable subsystem that determines the folder and the document's file name, which can be meaningful and include the date, client name, and so on. The usual practice of working with a client's documents leads to a folder that holds all documents related to the transaction, including those not prepared by the generator, and users find it convenient to work with documents in a tree-like file structure.
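A configurable naming subsystem might be driven by patterns such as the ones below. This is a sketch under assumptions: the pattern syntax (Python `str.format` placeholders), the attribute names, and the function `build_path` are illustrative only.

```python
import re
from datetime import date
from pathlib import PurePosixPath

def build_path(folder_pattern: str, name_pattern: str, **attrs) -> PurePosixPath:
    """Build the target folder and file name from configurable patterns."""
    def clean(value) -> str:
        # Replace characters that are not allowed in file names on common systems.
        return re.sub(r'[\\/:*?"<>|]', "_", str(value))

    safe = {key: clean(value) for key, value in attrs.items()}
    return PurePosixPath(folder_pattern.format(**safe)) / name_pattern.format(**safe)
```

For instance, with the folder pattern `"deals/{client}/{deal}"` and the name pattern `"{date}_{doc_type}_{client}.docx"`, every document of one transaction lands in the same client and deal folder, next to files that were not produced by the generator.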

When generating a batch of documents from one template, if the generated documents differ only in field values, generation time can be reduced significantly: all preparatory operations on the template are performed only once, and the batch is then generated from this intermediate form of the template. For example, a set of guarantee documents can have the same contract terms for all guarantors, so the documents differ only in the guarantors' own data: name, address, position, and so on. In this case it pays to generate the batch from one template and gain time through the intermediate form of the template. The generator should support this functionality as well.
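The gain from an intermediate form can be shown with a small sketch: the template is parsed once into segments, and each document in the batch is then produced by a cheap join. The `{{field}}` syntax and all function names are assumptions for illustration.

```python
import re

FIELD = re.compile(r"\{\{(\w+)\}\}")  # hypothetical field marker syntax

def compile_template(template: str) -> list:
    """Parse once into ('text', s) and ('field', name) segments."""
    # With one capturing group, re.split alternates literal text and field names.
    parts = FIELD.split(template)
    return [("field" if i % 2 else "text", part) for i, part in enumerate(parts)]

def render(compiled: list, values: dict) -> str:
    """Produce one document from the intermediate form."""
    return "".join(values[part] if kind == "field" else part for kind, part in compiled)

def generate_batch(template: str, records: list) -> list:
    compiled = compile_template(template)  # preparatory work happens only once
    return [render(compiled, record) for record in records]
```

With a list of guarantor records, `generate_batch` returns one document per guarantor while the template itself is parsed a single time.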

7. Imagine a document, for example a questionnaire, with a section for information about parents. Suppose the client has no parents and never had. In one place of the questionnaire he will tick a box saying he has no parents, and in another place the information about parents will remain blank. Printing such a questionnaire wastes paper. Indeed, if the client has no parents, the document should contain no section about parents. Such moments of cognitive dissonance are a dime a dozen in documents. The energy and labor spent on producing paper are wasted, and so is the time of the person who reads a form full of empty fields. This should not happen. It may have been excusable in the pre-computer era, but now, with a document generator, this waste should stop. All that IT needs to do is obtain in advance the answer to the question of whether the client has parents and, depending on the answer, either request information about them or exclude the section about them from the document entirely. That is, before the document fields are actually filled in, a questionnaire should be completed, and the answers serve as arguments both to the function that returns the list of fields needed to fill in the document and to the document generation function for a specific template.

Answers to questions “carve out” of the polymorphic template only those parts that correspond to the answers, just as Michelangelo carved his beautiful creations from marble. Compiling the questionnaire is a separate topic that I will not cover in detail here, so as not to add noise to the discussion of the document generator. I will only say that by marking up a template you can obtain a questionnaire for it automatically, provided the markup tags uniquely identify the answer to a question and therefore the question itself. That is, if an answer tag (a unique answer code) belongs to only one question, the questionnaire is compiled automatically, and the order of questions can be borrowed from existing questionnaires, which is also not a complicated operation. Questions and their answers can be either ordinary, composed at design time, or dynamic, obtained from data in the DBMS. Questions can be single-choice or multiple-choice. Answers can affect not only the set of fields shown on the UI form and the final form of the document, but also subsequent questions. For example, if you had parents, you can be asked whether they are residents or not. But if you never had any, and you are from the Andromeda nebula, the question of residency is meaningless. So the person doing the markup must master at least two concepts: placing fields and placing answer tags in the template. Since ordinary people comfortably hold 5-6 concepts on one topic in their head, the markup task does not, so far, seem too complicated to me. Having mastered these two basic concepts, an employee can do roughly what a programmer can after writing “Hello world” in a new programming language. That is not much, of course, but it is already something.
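The relationship between answer codes, dependent questions, and carved-out sections can be sketched as below. The classes, the answer codes such as `parents_yes`, and the rule that a section is kept when all its required codes were given are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Question:
    text: str
    answers: dict                     # answer code -> answer text
    depends_on: Optional[str] = None  # answer code that must be given first

@dataclass
class Section:
    text: str
    requires: frozenset = frozenset()  # answer codes that keep this section

def owner_of(questions: list) -> dict:
    """Map each answer code to its question; a code may belong to one question only."""
    owners = {}
    for question in questions:
        for code in question.answers:
            if code in owners:
                raise ValueError(f"answer code {code!r} belongs to two questions")
            owners[code] = question
    return owners

def open_questions(questions: list, given: set) -> list:
    """Questions that make sense to ask, given the answers so far."""
    return [q for q in questions if q.depends_on is None or q.depends_on in given]

def carve(sections: list, given: set) -> str:
    """Keep only the sections whose required answer codes were all given."""
    return "\n".join(s.text for s in sections if s.requires <= given)
```

With a residency question that depends on `parents_yes`, a client who answers `parents_no` is never asked about residency, and the section on parents does not appear in the document.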

8. Let's be honest. “Why so many words, so much noise?” A back-office worker, if he is not too lazy, can supply himself with templates. And why would he need a generator when he has his own templates for all typical occasions? If something changes, he will create a new template for himself and, with the help of the Replace command, quickly produce the needed document, for example a standard contract. But here is a situation in which this approach does not work. That is, it works, of course, but only after a fashion. Suppose the initial version of a guarantee agreement is drawn up and sent to the client and then to the lawyers for approval. As a result of their corrections, after a while a new document emerges from this initial one, in which many clauses of the original version have been corrected, supplemented, and so on.

Here the back-office employee has no template, because you cannot stock up on templates for every occasion or foresee every variant of a guarantee. This is where the generator can really show its advantage over the usual manual way of preparing documents. To do so, the generator must produce an initial guarantee agreement that, after the client and the lawyers have made their changes, can itself serve as a template. Then preparing 10 or more guarantee documents for different guarantors takes a few seconds of computer time and is free of the errors and typos that any person can make at any moment. That is, not only is preparation time saved, but quality improves, because the computer does not make mistakes. To provide this functionality, the generator must, while generating the initial document, create hidden markup on the fly, in the form of bookmarks that are invisible to the people working with the text. Each bookmark corresponds uniquely to a field name, so any value can be substituted at the place marked by the bookmark, although the substitution method differs from that of the initial generation. Unfortunately, in MS Word bookmarks do not always survive edits; it depends on how one piece of text was replaced with another. To cope with disappearing bookmarks, you have to create additional bookmarks from which the position of a lost one can be worked out. This mechanism is fully workable; it is a pity that Microsoft does nothing to provide a standard solution to this problem. Well, maybe someday, when something is blown away or the climate changes, this problem will be solved.
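The idea of a generated document that can itself serve as a template can be modelled without Word at all. In the sketch below a document is a list of runs, and each substituted value is kept in a run tagged with its field name, which plays the role of a hidden bookmark. This is a simplified model under assumed names and an assumed `{{field}}` syntax, not the MS Word bookmark API, and it does not model bookmarks being lost during editing.

```python
import re

FIELD = re.compile(r"\{\{(\w+)\}\}")  # hypothetical field marker syntax

def generate(template: str, values: dict) -> list:
    """Fill the template and remember, per value, which field produced it."""
    runs = []
    for i, part in enumerate(FIELD.split(template)):
        if i % 2:                      # odd positions are field names
            runs.append(("bookmark", part, values[part]))
        elif part:                     # even positions are literal text
            runs.append(("text", None, part))
    return runs

def to_text(runs: list) -> str:
    return "".join(content for _, _, content in runs)

def reuse_as_template(runs: list, values: dict) -> list:
    """Substitute new values at the bookmarks; edited text runs stay as they are."""
    return [(kind, name, values[name]) if kind == "bookmark" else (kind, name, content)
            for kind, name, content in runs]
```

After the lawyers change a text run, the bookmark runs are still in place, so the agreed wording can be reused for the next guarantor by calling `reuse_as_template` with that guarantor's data.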

9. Templates containing hidden markup are rigid compared with the original templates. Fortunately, the degree of rigidity can be controlled. A rigid template can be turned into a hybrid one (this is how puns are born) by substituting parts of the original template into the rigid template. A document usually has (hypervariable) places where the markup that was in the original template before generation can be restored. Naturally, such places must be marked in the template. As a result, even after edits by the client and the lawyers, a template that otherwise contains only bookmarks remains flexible enough. For the parts restored from the original (or any other) template, a questionnaire can be produced on the fly if markup with answer tags is found in those parts.

So, I have looked at some of the individual “bricks” needed to build the “building” called the “Document Generator”. I tried to focus your attention on the idea that templates are created by the business itself and IT participation is minimal. I hope the discussion of hybrid templates was useful as well.

It is clear that in many cases a simpler tool for preparing documents is enough. In organizations where there are no questions about parents and the documents are uniform and static, you can use a slingshot instead of a gun. But in organizations that take an individual approach to customers and where contract preparation is not yet sufficiently automated, this article will help define the requirements for the developers of a software product that today can and must match the description given here.
