How to write notes if you are a programmer

When an instructor at a technical university makes students write lecture notes by hand, the result looks something like this:


Photo of two sheets of lecture notes


This is the output of a program that generates handwritten text in a custom handwriting. It can change the pen thickness and the ink color, write letters joined or separately, supports many languages, and can potentially hyphenate words by syllable in many of them. It is written in C++/Qt, with builds for Windows and Linux. Below is a short analysis of handwriting, a description of different ways to imitate it, a look at the most interesting parts of the program, and a link to the repository.


Why is it hard to generate individual handwriting programmatically? Handwriting is highly variable; nothing in it is as rigidly fixed as in print. A plausible imitation has to account for at least these features:


  • The shape of a letter depends on the letters next to it.
  • Every character is subject to random and not-so-random distortions, so two absolutely identical characters rarely occur in a text.
  • Letters within a word can be joined or written separately, and the two styles can be mixed.
  • The width, height, and slant of both characters and lines vary.

All of this is influenced by the writer's style, the writing instrument, the surface under the paper, the writing speed, the dominant hand, the writer's mood and well-being, and much more. So the program must not only account for a huge number of factors but also let the user configure them freely.


How is this usually done?


In practice, all of this is taken into account rarely and only as far as needed. For a short handwritten inscription on a postcard or invitation, for example, an ordinary OpenType handwriting font is used. Under certain conditions this can work for lecture notes as well. Online you can find instructions for creating your own font and printing with it from MS Word, and for English-speaking users there are entire handwriting font generators.


But plausibility suffers with this method: identical characters look exactly the same, and the lines are perfectly straight. There is none of the randomness, eyesore messiness, and chaos that in my experience characterize the vast majority of lecture notes, and even for neat notes such flawlessness is unnatural.


How to do better?


The main idea is this: we take the typed text, pick a handwritten glyph for each character, and put it in the right place on a virtual sheet of paper. There should be several glyphs for each character, with a specific one chosen at random; otherwise we get the same problems as with OpenType fonts. Various distortions can then be applied to the whole sheet, to a single line, or to a single character. It is probably also possible to distort glyphs so that a whole set of variants is produced from one drawn glyph.
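A minimal sketch of this idea in Python, not the program's actual C++/Qt code. The function name `layout_text`, the fixed `advance` step, and the uniform `jitter` are assumptions made for illustration; glyphs are represented only by opaque variant names.

```python
import random

def layout_text(text, glyph_variants, advance=10.0, jitter=0.5, seed=None):
    """Return (char, variant_index, x, y) placements for one line of text.

    glyph_variants maps a character to a list of its drawn variants.
    advance and jitter are hypothetical parameters for this sketch.
    """
    rng = random.Random(seed)
    placements = []
    x = 0.0
    for ch in text:
        variants = glyph_variants.get(ch)
        if not variants:
            # No glyph for this character (e.g. a space): just leave a gap.
            x += advance
            continue
        index = rng.randrange(len(variants))   # random variant of the glyph
        dx = rng.uniform(-jitter, jitter)      # small per-character distortion
        dy = rng.uniform(-jitter, jitter)
        placements.append((ch, index, x + dx, dy))
        x += advance
    return placements
```

With several variants per character, two occurrences of the same letter usually end up with different shapes and slightly different positions, which is exactly what a single OpenType glyph cannot provide.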


The same can be done with ligatures, that is, combinations of characters, to account for the influence of adjacent letters on each other. Spoiler: unfortunately, my program does not support ligatures and cannot distort glyphs.


I am not the first to decide that software generation is a good idea and to write a program for producing handwritten text. I know of the programs Sinyak and Handwriter, as well as the Scribe service. However, each has drawbacks that pushed me to write my own: Sinyak, for example, has only one glyph per character and some usability problems, Handwriter is paid, and Scribe does not let you create your own font.


Vector or bitmap font?


Since we are talking about individual handwriting, the user must be able to enter their own glyphs into the program. A fairly obvious solution is to let the user print a template, fill it in by hand, and then extract the glyphs from a scan of the completed template. However, this requires solving the following tasks:


  • Remove noise from the scanned template
  • Recognize special marks on the template to determine which glyph belongs to which character
  • Cut each glyph out of the background

The difficulty is not just doing this, but doing it well for countless scanners with different characteristics, scan settings, and paper types. I decided not to take on such a large task.


So my program uses vector graphics. Creating a font currently works as follows: the user opens their favorite vector editor, draws the desired glyphs (preferably with a graphics tablet), saves them, and then loads them into the program. This solution has the following disadvantages:


  • You can forget about imitating specific writing instruments, including ballpoint pens. The text now looks as if it were written with a fineliner, and I see no way to change that.
  • Creating a complete set of glyphs takes several times longer than filling in a paper template, and not everyone can manage it.
  • Graphics tablets are less common among ordinary users than scanners, which further shrinks the potential audience.

But there are also advantages:


  • You can change the ink color, the pen thickness, and the glyph size easily and without side effects.
  • You can round and smooth the corners and ends of lines.
  • Connecting lines that imitate joined writing look exactly like the strokes of the letters.
  • Print quality now depends only on the printer.
  • You can partially automate the creation of your own font.

Placing characters on a sheet


Characters differ: some sit strictly within the line, some extend beyond its edges, and some lie on its upper or lower boundary. The protruding parts of a character may end up above or below neighboring characters. So to place a character correctly, you need to know which part of it lies within the line and how other characters should be positioned relative to it.


Font Editor Screenshot
The yellow rectangle marks the part of the character that lies within the line


In the font editor you can set the position of a character relative to the line and to other characters by marking its boundaries with the yellow frame. This is also where glyphs are assigned to characters.
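Here is one way such a frame could drive placement, sketched in Python under stated assumptions: the `Glyph` class, its field names, and the `place_line` function are hypothetical, coordinates are glyph-local with y growing downwards, and only the frame (not the full drawing) advances the pen.

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    # Both boxes are (left, top, right, bottom) in glyph-local coordinates.
    bounds: tuple   # full extent of the drawing, tails included
    inline: tuple   # the "yellow frame": the part that lies within the line

def place_line(glyphs, line_top, spacing=0.0):
    """Return an (x, y) offset per glyph so that the inline frames line up."""
    offsets = []
    pen_x = 0.0
    for glyph in glyphs:
        left, top, right, bottom = glyph.inline
        # Shift the glyph so its inline frame starts at pen_x on the line top.
        offsets.append((pen_x - left, line_top - top))
        # Only the frame width moves the pen; protruding parts may overlap
        # the neighboring characters.
        pen_x += (right - left) + spacing
    return offsets
```

Because the pen advances by the frame width rather than by the full bounding box, a long tail or ascender can hang over or under the adjacent letters, as described above.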


Data for joined writing


Besides the yellow frame, the font editor also shows two round markers. These are the points where connecting lines from adjacent letters arrive. Yes, to write a word in joined style, neighboring letters are simply connected with straight lines. In theory splines would be better, but if you draw the letters without long tails, knowing that the program will draw the tails for you, this is not noticeable.
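The straight-line connection can be sketched as follows. This is an illustrative Python fragment, not the program's code: the dictionary keys `offset`, `entry`, and `exit` are hypothetical names for a letter's position on the sheet and its two round markers in glyph-local coordinates.

```python
def connecting_segments(letters):
    """Return straight segments joining each letter's exit to the next entry.

    letters: list of dicts with 'offset', 'entry' and 'exit' as (x, y) pairs.
    """
    def on_sheet(letter, key):
        # Convert a glyph-local marker to sheet coordinates.
        ox, oy = letter["offset"]
        px, py = letter[key]
        return (ox + px, oy + py)

    return [
        (on_sheet(current, "exit"), on_sheet(following, "entry"))
        for current, following in zip(letters, letters[1:])
    ]
```

Each returned pair is the start and end of one connecting line; a word of n letters yields n - 1 segments.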


Partial font creation automation


Specifying all of this by hand for every character would be torture. The process of creating your own font has to be simplified somehow.


First, glyphs are loaded automatically. When you save a glyph from your vector editor, it is enough to name the file according to a certain pattern, and the program will assign it to the right character. The pattern is: the character itself, then a number if needed; the parts may be separated by underscores. Since Windows does not distinguish between uppercase and lowercase letters in file names, uppercase letters need the prefix "UP_". This trick does not work for every character, because not every character is allowed in a file name, so the name of a forbidden character can be written in its place.
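A rough sketch of how such file names could be parsed, in Python. Everything beyond the pattern described above is an assumption: the `.svg` extension, the function name `char_from_filename`, and especially the entries of `NAMED_CHARS`, since the article does not list the names the program actually accepts for forbidden characters.

```python
import re
from pathlib import Path

# Hypothetical names for characters that cannot appear in file names;
# the real program's list is not given in the article.
NAMED_CHARS = {"slash": "/", "colon": ":", "question": "?", "asterisk": "*"}

# Optional "UP_" prefix, the character (or its name), optional "_", optional number.
_PATTERN = re.compile(r"^(UP_)?(.+?)_?(\d+)?$")

def char_from_filename(filename):
    """Return (character, variant_number_or_None) for a glyph file name."""
    match = _PATTERN.match(Path(filename).stem)
    if match is None:
        return None
    prefix, name, number = match.groups()
    char = NAMED_CHARS.get(name, name)
    if prefix:
        char = char.upper()   # "UP_" marks an uppercase letter
    return char, (int(number) if number else None)
```

For example, under these assumptions `UP_a_2.svg` would map to the second variant of "A", and `slash_3.svg` to the third variant of "/".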


Second, in joined writing the connecting line usually enters at the start of a letter's first stroke and leaves from the end of its last stroke. Vector editors usually store strokes in the order they were drawn, so we know which stroke came first and which last. That means the program can do most of the work of placing the round markers. And it does.


Third, the program can also set the yellow rectangle itself, simply by placing it along the glyph's bounding box. This is often wrong, but even a fraction of correctly guessed data makes life easier for the user.
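The second and third heuristics fit in a few lines. The Python sketch below assumes a glyph has already been read into a list of strokes in drawing order, each stroke being a list of (x, y) points; the function name `guess_glyph_data` is hypothetical, and parsing the vector file itself is left out.

```python
def guess_glyph_data(strokes):
    """Guess the entry marker, the exit marker and the frame of a glyph.

    strokes: non-empty list of strokes in drawing order,
             each a non-empty list of (x, y) points.
    """
    entry = strokes[0][0]      # start of the first stroke drawn
    exit_ = strokes[-1][-1]    # end of the last stroke drawn
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    frame = (min(xs), min(ys), max(xs), max(ys))   # plain bounding box
    return entry, exit_, frame
```

The frame guessed this way includes tails and ascenders, which is why it is often wrong and still needs manual correction for such letters.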


Hyphenating words by syllable


To break words into syllables correctly, the program uses P. Hristov's algorithm as modified by Dymchenko and Varsanofiev. In short, each rule uses regular expressions to describe two groups of letters between which a break is allowed. Applying these rules one after another splits a word into syllables; the break point closest to the edge of the line is then chosen, and everything to the right of it moves to the next line.


Such rules already exist for Russian. They are not perfectly accurate but reportedly cover 99% of all hyphenations. I think they could be developed for some other languages too, but not for all. In English, for example, hyphenation requires a much more complex algorithm, because words are broken according to pronunciation rather than spelling.


Here are the rules for the Russian language:


  • X-LL
  • G-GL
  • GS-SG
  • SG-SG
  • GS-SSG
  • GSS-SSG

Here L is any letter, G is a vowel, S is a consonant, and X is a letter from the set "йьъ". To let users change the rules without modifying the source code, I put them in a separate file:


Hyphenation rules file
I took the liberty of modifying the original rules slightly so that a single letter is never split off from a word.
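To make the mechanism concrete, here is a small Python sketch of rule-based hyphenation with the six rules listed above. It is not the program's implementation: the rules are kept in a list rather than read from the rules file (whose format is not shown here), the concrete letter sets (the vowels, the consonants, and "йьъ" for X) are assumptions, and `hyphenate` and `split_for_line` are hypothetical names.

```python
import re

CLASSES = {
    "L": "[а-яё]",                    # any letter
    "G": "[аеёиоуыэюя]",              # vowel
    "S": "[бвгджзклмнпрстфхцчшщ]",    # consonant
    "X": "[йьъ]",                     # special set (assumed)
}
RULES = ["X-LL", "G-GL", "GS-SG", "SG-SG", "GS-SSG", "GSS-SSG"]

def compile_rule(rule):
    """Turn a rule such as "GS-SG" into a regex with two groups."""
    left, right = rule.split("-")
    as_regex = lambda part: "".join(CLASSES[c] for c in part)
    return re.compile(f"({as_regex(left)})({as_regex(right)})", re.IGNORECASE)

PATTERNS = [compile_rule(rule) for rule in RULES]

def hyphenate(word):
    """Mark every allowed break in the word with "-", rule by rule."""
    for pattern in PATTERNS:
        while True:
            # Repeat each rule until it finds nothing new, because
            # overlapping matches are skipped within a single pass.
            marked = pattern.sub(r"\1-\2", word)
            if marked == word:
                break
            word = marked
    return word

def split_for_line(word, room):
    """Split word so that the head plus a hyphen fits into `room` characters.

    Picks the allowed break closest to the edge of the line. Returns
    ("", word) when no break fits, so the whole word moves to the next line.
    """
    parts = hyphenate(word).split("-")
    best = None
    for i in range(1, len(parts)):
        if len("".join(parts[:i])) + 1 <= room:
            best = i
    if best is None:
        return "", word
    return "".join(parts[:best]) + "-", "".join(parts[best:])
```

For instance, `hyphenate("молоко")` gives `"мо-ло-ко"`, and `split_for_line("программа", 5)` gives `("прог-", "рамма")`. Note that every rule leaves at least two letters on each side of a break, which matches the remark above about never splitting off a single letter.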


Other features


Vector graphics make it possible to change line parameters, in particular to round line ends and smooth corners, so why not use that? Besides, I see no reason to limit the user's ability to configure the sheet and the font. The screenshot gives a rough idea of the main settings:


Settings screenshot


A few more photos of the notes




Photo 1


Photo 2


Photo 3


Source code


As promised, the link to the repository: https://github.com/aizenbit/Scribbler

