About rasterizing source code

    I periodically see people on blogs turning code into pictures so as not to deal with the buggy rendering engine of a particular platform. In most cases the authors just take a screenshot, but I took a more direct route: I built the ability to “rasterize” code into my own editor. This post is about how I did it. The post also illustrates what it describes, because the code in it really is rasterized. All sources are here: http://butbucket.org/nesteruk/typografix .



    Where to start?
    Solutions for syntax highlighting already exist; one is used, for example, in HabraEditor. That solution is itself based on this project, which I took and slightly reworked in order to get highlighting for languages such as Boo on Habr. But how do you rasterize the resulting HTML into a picture? Clearly you could use a real browser (IE, say), display the HTML in it, take a screenshot, crop it, and you're done. But I did not like this approach for two reasons:

    • Firstly, such a solution is too fragile. Having extensive experience with browser automation, I dare say that such operations will crash regularly regardless of the browser (and you would want to use IE anyway because of its better text rendering).
    • Secondly, by that time I already had my own “rasterizer”, and I didn’t want the effort invested in it to go to waste.

    Let's first talk about the text rasterizer: what kind of beast it is and why it was needed.

    Rasterizer
    See the headings in this article? They are graphics, jpg files. Naturally, I did not create them one by one in Photoshop to insert them into the post later. For this I simply use a rasterizer, that is, a component that takes some markup as input (à la HTML, but more interesting) and produces a graphic file as output.

    Here is a small example: if I write in the markup Hello, World, I’ll get

    Hello world


    Accordingly, the markup I use lets me rasterize both the usual bold and italic and OpenType (OT) features such as small capitals, ligatures, and so on.

    Ordinary, Bold and Italic


    All this is done with tags like [b], [i], [sw], etc. Each individual segment of text is described by the following structure:
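    As a rough illustration (in Python rather than the editor's C#, and with hypothetical field names), such a per-segment structure could look something like this. The only things taken from the post are that a segment carries text plus bold, italic, typeface, size, color, and OT feature settings; everything else is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TextSegment:
    """One run of text plus the formatting applied to it.

    All field names are hypothetical; the post only says that such a
    structure exists for every segment of text.
    """
    text: str
    bold: bool = False
    italic: bool = False
    # None means "not overridden here, take the value from the prototype".
    font_family: Optional[str] = None
    font_size: Optional[float] = None
    color: Optional[str] = None
    # OpenType feature tags enabled for this run, e.g. ("swsh",) for swashes.
    opentype_features: Tuple[str, ...] = ()


# The "Ordinary, Bold and Italic" sample as a sequence of segments.
segments = [
    TextSegment("Ordinary, "),
    TextSegment("Bold", bold=True),
    TextSegment(" and "),
    TextSegment("Italic", italic=True),
]
```

    The whole text is then just a list of such segments, which is what the rasterizer walks over.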





    All text (or code) to be rasterized is thus just a sequence of such elements. A class, MarkupParser, parses the markup using plain text comparison and applies the formatting to each element.





    All this uses a recursive parser that separates the markup from the text.
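    A minimal sketch of such a recursive parser, in Python rather than the editor's C#. It is not the MarkupParser class itself: the closing-tag syntax [/b] and the function names are assumptions, and the result is simply a list of (text, active tags) runs.

```python
from typing import FrozenSet, List, Optional, Tuple

Run = Tuple[str, FrozenSet[str]]


def parse_markup(source: str) -> List[Run]:
    """Split markup into (text, active_tags) runs using recursive descent."""
    runs: List[Run] = []
    _parse(source, 0, frozenset(), None, runs)
    return runs


def _parse(src: str, pos: int, active: FrozenSet[str],
           open_tag: Optional[str], runs: List[Run]) -> int:
    """Parse until the closing tag of open_tag (or end of input).

    Returns the offset just past the consumed input.
    """
    buf: List[str] = []

    def flush() -> None:
        # Emit the text collected so far with the currently active tags.
        if buf:
            runs.append(("".join(buf), active))
            buf.clear()

    while pos < len(src):
        ch = src[pos]
        if ch == "\\" and pos + 1 < len(src) and src[pos + 1] == "[":
            buf.append("[")          # an escaped bracket is literal text
            pos += 2
        elif ch == "[":
            end = src.find("]", pos)
            if end == -1:
                raise ValueError("unterminated tag")
            tag = src[pos + 1:end]
            flush()
            if tag.startswith("/"):
                if tag[1:] != open_tag:
                    raise ValueError(f"mismatched closing tag [{tag}]")
                return end + 1
            # Opening tag: recurse with the tag added to the active set.
            pos = _parse(src, end + 1, active | {tag}, tag, runs)
        else:
            buf.append(ch)
            pos += 1
    flush()
    if open_tag is not None:
        raise ValueError(f"missing [/{open_tag}]")
    return pos


runs = parse_markup("Ordinary, [b]Bold[/b] and [i]Italic[/i]")
```

    Each nested call handles exactly one opened tag, so nesting such as [b]a[i]b[/i][/b] falls out of the recursion without any explicit stack.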



    The entry point is the following method, which is what the editor (itself written in WPF) calls.



    As you can see, the method is passed a “prototype” of the text formatting, which is applied to all elements that do not override the typeface, font size, color, etc.
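    The prototype idea can be sketched as follows (Python, with a hypothetical Format type and resolve function; the real method signature is not reproduced here). Any property an element leaves unset is filled in from the prototype, and anything it sets wins.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class Format:
    """Formatting properties; None means "not overridden" (hypothetical type)."""
    font_family: Optional[str] = None
    font_size: Optional[float] = None
    color: Optional[str] = None


def resolve(own: Format, prototype: Format) -> Format:
    """Fill every field the element leaves unset (None) from the prototype."""
    inherited = {
        f.name: getattr(prototype, f.name)
        for f in fields(own)
        if getattr(own, f.name) is None
    }
    return replace(own, **inherited)


prototype = Format(font_family="Segoe UI", font_size=12.0, color="#000000")
heading = Format(font_size=24.0)          # overrides only the size
resolved = resolve(heading, prototype)
```

    Here the heading keeps its own size and takes the typeface and color from the prototype.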

    Now about how it is rasterized. There are a few steps. Firstly, since I use DirectWrite (a reminder: the DirectWrite “driver” for .NET is ready, but it doesn’t work in a 64-bit environment), I have a sea of infrastructure that also uses COM.



    All this “goodness” appears, one way or another, when creating bitmaps or formatting text. The most interesting part is IDWriteTypography, which is where the set of OT features for a particular piece of text is defined.

    Then the prototype is parsed, and the markup is parsed:



    After that, each element is visited in turn and the graphic properties it needs are set. Here is a small example of how simple it is:



    The contents of the generated render target are subsequently copied into the bytes of the bitmap (meaning System.Drawing.Bitmap), which we “locked” before passing it through P/Invoke.

    Code rasterization
    So, we have HTML and a rasterizer; now we need to get the correct markup. This is done simply:
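    As a rough illustration of this conversion (a Python sketch, not the editor's C# code), the following assumes that the highlighter emits <span style="color:..."> elements and that the markup has a hypothetical [color=...] tag. Neither detail comes from the post. It also escapes literal square brackets in the source text so that they are not mistaken for markup.

```python
import re
from html.parser import HTMLParser

# Matches a CSS "color" property but not "background-color".
_COLOR = re.compile(r"(?<![-\w])color\s*:\s*(#[0-9a-fA-F]{3,6})")


class HtmlToMarkup(HTMLParser):
    """Turn highlighted HTML into bracket markup (tag names are hypothetical)."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.closers = []   # one closing markup tag (or "") per open <span>

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        style = dict(attrs).get("style") or ""
        match = _COLOR.search(style)
        if match:
            self.parts.append(f"[color={match.group(1)}]")
            self.closers.append("[/color]")
        else:
            self.closers.append("")

    def handle_endtag(self, tag):
        if tag == "span" and self.closers:
            self.parts.append(self.closers.pop())

    def handle_data(self, data):
        # Square brackets are markup delimiters, so escape literal ones.
        self.parts.append(data.replace("[", "\\["))


def html_to_markup(html: str) -> str:
    parser = HtmlToMarkup()
    parser.feed(html)
    parser.close()          # flush any text still buffered by the parser
    return "".join(parser.parts)


markup = html_to_markup('<span style="color:#0000ff">int</span>[] a = x[0];')
```

    With these assumptions, the array brackets in the sample come out as \[ while the keyword is wrapped in a color tag.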



    There is a small problem: we have to change [ to \[ because square brackets are used for markup. Nothing terrible. Now the last step is to prepare an empty Bitmap and draw on it using our DirectWrite rasterizer:




    Conclusion
    It was that easy to adapt the existing infrastructure to the rasterizer. I hope you can already see the results of its work. Yes, of course, you cannot cut & paste such text, but apart from that everything is very beautiful (IMHO), and most importantly, you can start using rasterization for all kinds of annotations, right in the code. I think it's worth a try.
