Development of a cross-platform editor with syntax highlighting based on wxStyledTextCtrl


Hello!

I want to share my experience of developing an editor with syntax highlighting, that is, a code editor, for a custom language. This post is not a detailed description of the development process: I will focus only on the most interesting points, those that are poorly covered or omitted entirely in the documentation.

Scintilla is one of the best-known free components for building code editors, and wxStyledTextCtrl is its wrapper in the wxWidgets library.
Scintilla supports most programming languages in common use, so a new lexer is often added by analogy with the closest existing language. This process is described in more detail here. The method is very simple to implement but has one significant drawback: you have to include all of the Scintilla code in your project. For small and relatively stable projects this does not matter much, but for large, actively developed projects it causes a lot of problems. Maintenance is even harder if you use not the Scintilla component itself but a wrapper around it, such as wxStyledTextCtrl. The alternative is to implement the lexer in the container.

Lexer integration in scintilla

First, a few words about the simple method. In general, to create your own lexer you need to write only one function, which is responsible for coloring the modified part of the code:
static void ColouriseDoc(unsigned int startPos, int length, int initStyle, WordList *keywordlists[], Accessor &styler)
To colorize, or style, a given range means to assign a style to each character in it. The component itself optimizes the text rendering and also guarantees that the range [startPos, startPos + length - 1] covers a whole number of lines, which makes the parser easier to implement.

Implement a lexer in a container

I started by creating a class derived from wxStyledTextCtrl, as shown in the example. As in the previous method, the lexer needs only one function, which in this case is called on the EVT_STC_STYLENEEDED event.
Experience has shown that if the edited files do not exceed 50-60 lines, you can safely ignore the range passed with the event and restyle the entire contents of the file, which greatly simplifies the lexer. If there are more lines, it is better to restyle not from the line that contains GetEndStyled() but from the line before it. As in the previous method, the event is raised for a whole number of lines.
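The "start from the previous line" rule can be sketched without the control itself. The helper below is hypothetical: restyleStart and the lineStarts table are names introduced here for illustration. In a real handler the same result would presumably come from LineFromPosition(GetEndStyled()) followed by PositionFromLine() on the preceding line.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper. lineStarts[i] is the position of the first character
// of line i, in ascending order, with lineStarts[0] == 0.
// Returns the position of the start of the line that precedes the line
// containing endStyled (or 0 if endStyled is on the first line).
std::size_t restyleStart(const std::vector<std::size_t>& lineStarts,
                         std::size_t endStyled) {
    if (lineStarts.empty()) return 0;
    // First line start strictly greater than endStyled; the line that
    // contains endStyled is the one immediately before it.
    auto it = std::upper_bound(lineStarts.begin(), lineStarts.end(), endStyled);
    std::size_t line = static_cast<std::size_t>(it - lineStarts.begin()) - 1;
    std::size_t previous = (line == 0) ? 0 : line - 1;
    return lineStarts[previous];
}
```

Backing up one line costs little and gives a line-by-line parser a clean starting state.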

What they forgot to mention in the manual

One of the difficulties I ran into at this stage was that the event fired far too often. A short investigation of the component's behavior in the debugger showed that after styling the range [n, m], the last styled character was not m at all. Most often it was the character before it, and sometimes a character 2-3 positions away from it. There were several reasons:

all custom styles must have indices greater than 0;
absolutely everything must be styled, including spaces and line breaks;
styles must be applied sequentially, i.e. if you first apply a style to the range [0, 100] and then to [0, 10], the last styled character is the 10th, not the 100th.

I implemented a line-by-line parser, and to avoid unnecessary event calls I had to store the index of the last styled character.
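The third point can be reproduced with a toy model. StyledBuffer below is not the real control, only an assumed simplification of how the styling cursor moves: starting a styling run repositions the cursor, and each styling call advances it, so the "end styled" position is wherever the most recent call stopped.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model (an assumption, not wxStyledTextCtrl itself) of the styling cursor.
class StyledBuffer {
public:
    explicit StyledBuffer(std::size_t length) : styles_(length, 0), pos_(0) {}

    // Plays the role of StartStyling: move the styling cursor to `start`.
    void startStyling(std::size_t start) {
        pos_ = std::min(start, styles_.size());
    }

    // Plays the role of SetStyling: style `length` characters from the
    // cursor and leave the cursor just past them.
    void setStyling(std::size_t length, char style) {
        std::size_t end = std::min(pos_ + length, styles_.size());
        std::fill(styles_.begin() + static_cast<std::ptrdiff_t>(pos_),
                  styles_.begin() + static_cast<std::ptrdiff_t>(end), style);
        pos_ = end;
    }

    // Plays the role of GetEndStyled: where the most recent call stopped.
    std::size_t endStyled() const { return pos_; }

private:
    std::vector<char> styles_;
    std::size_t pos_;
};
```

In this model, styling 100 characters from position 0 and then 10 characters from position 0 leaves endStyled() at 10, which is why a lexer that restyles an earlier range may want to remember the furthest position it has already covered.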

SetStyling vs. SetStyleBytes

In almost all examples, applying a style is shown as this sequence of calls:

StartStyling(...);
SetStyling(...);

In this case the style is applied character by character, which is very inefficient. It is better to use this sequence instead:

StartStyling(...);
SetStyleBytes(...);

It lets you pass the component a pre-prepared vector of style indices. To be honest, I have not found a single case in which SetStyling would be the better choice.
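As a minimal sketch of what such a pre-prepared vector might look like, the function below builds one style byte per character of a line. The style indices, the keyword set, and the name buildStyleBytes are all hypothetical, and the tokenizer (split on whitespace) is far simpler than a real lexer would be.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical style indices; all greater than 0, as recommended earlier.
enum Style : char { STYLE_TEXT = 1, STYLE_KEYWORD = 2, STYLE_SPACE = 3 };

// Builds one style byte per character of `line`, whitespace included,
// so that the whole line can be handed over in a single call.
std::vector<char> buildStyleBytes(const std::string& line,
                                  const std::set<std::string>& keywords) {
    std::vector<char> styles(line.size(), STYLE_TEXT);
    std::size_t i = 0;
    while (i < line.size()) {
        if (std::isspace(static_cast<unsigned char>(line[i]))) {
            styles[i++] = STYLE_SPACE;
            continue;
        }
        // Collect one whitespace-delimited word.
        std::size_t start = i;
        while (i < line.size() &&
               !std::isspace(static_cast<unsigned char>(line[i]))) {
            ++i;
        }
        const char style = keywords.count(line.substr(start, i - start))
                               ? STYLE_KEYWORD
                               : STYLE_TEXT;
        for (std::size_t k = start; k < i; ++k) styles[k] = style;
    }
    return styles;
}
```

The resulting vector would then be passed to the control after StartStyling, giving SetStyleBytes its length and a pointer to its data; check the exact signatures against the wxWidgets version in use.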

Russian letters

For styles to be applied correctly to words containing Russian letters, you need to compute the index of the last character correctly by calling pos = PositionAfter(pos) repeatedly. The same kind of position calculation, pos = PositionBefore(pos), must be used to extract the part of the word already typed when invoking autocompletion.
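The reason stepping is needed can be shown with a small model. It assumes the document is stored as UTF-8 and that positions are byte offsets, so a Russian letter occupies two positions. The functions below are stand-ins written for illustration, not the control's own PositionAfter and PositionBefore.

```cpp
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool isContinuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Stand-in for PositionAfter: byte offset of the next character boundary.
std::size_t positionAfter(const std::string& utf8, std::size_t pos) {
    if (pos >= utf8.size()) return utf8.size();
    ++pos;
    while (pos < utf8.size() &&
           isContinuation(static_cast<unsigned char>(utf8[pos]))) {
        ++pos;
    }
    return pos;
}

// Stand-in for PositionBefore: byte offset of the previous character boundary.
std::size_t positionBefore(const std::string& utf8, std::size_t pos) {
    if (pos == 0) return 0;
    --pos;
    while (pos > 0 && isContinuation(static_cast<unsigned char>(utf8[pos]))) {
        --pos;
    }
    return pos;
}

// Byte offset just past a word of `chars` characters that begins at `start`.
std::size_t wordEnd(const std::string& utf8, std::size_t start,
                    std::size_t chars) {
    std::size_t pos = start;
    for (std::size_t i = 0; i < chars; ++i) pos = positionAfter(utf8, pos);
    return pos;
}
```

Under these assumptions, a three-character word made of one Latin and two Cyrillic letters ends five positions after its start, not three, which is the mismatch that breaks naive start + length arithmetic.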

Hiding a piece of code (folding)

Implementing this feature caused no particular problems, and I found no gaps in the documentation. I'll just say that folding is based on line levels, set with SetFoldLevel(...), which I, for example, assigned directly in the lexer (one more reason a line-by-line parser is more convenient).
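A rough sketch of computing line levels in a line-by-line pass follows. It assumes, purely for illustration, that blocks in the custom language are delimited by braces; the post does not say how its language marks blocks. The two constants carry the values of Scintilla's SC_FOLDLEVELBASE and SC_FOLDLEVELHEADERFLAG.

```cpp
#include <string>
#include <vector>

// Values of Scintilla's SC_FOLDLEVELBASE and SC_FOLDLEVELHEADERFLAG.
const int FOLD_BASE = 0x400;
const int FOLD_HEADER = 0x2000;

// Computes one fold level per line, assuming (hypothetically) that
// '{' opens a block and '}' closes it.
std::vector<int> computeFoldLevels(const std::vector<std::string>& lines) {
    std::vector<int> levels;
    int depth = 0;
    for (const std::string& line : lines) {
        const int depthAtStart = depth;
        int level = FOLD_BASE + depthAtStart;  // nesting depth where the line begins
        for (char c : line) {
            if (c == '{') {
                ++depth;
            } else if (c == '}' && depth > 0) {
                --depth;
            }
        }
        // A line that ends deeper than it began starts a foldable block.
        if (depth > depthAtStart) level |= FOLD_HEADER;
        levels.push_back(level);
    }
    return levels;
}
```

Each value would then be passed to SetFoldLevel together with its line number as the lexer walks the lines.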

Autocomplete function

Although this feature is documented in relative detail, some things are missing. First, it turned out that the component's standard function AutoCompShow(...) does not filter the list by the part of the word already entered; it only positions the selection. Second, the function AutoCompSelect(), which should substitute the word chosen by the user, turned out to be useless, because it often threw an exception for no apparent reason. Instead, it is better to set the flag with AutoCompSetChooseSingle().
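Since the list is not filtered by the control, the filtering can be done before the list is shown. The helper below is a hypothetical sketch: it keeps the words that start with the typed prefix and joins them with a space, which is assumed here to be the list separator in use.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the words from `sortedWords` that begin with
// `prefix`, joined by single spaces, preserving the input order.
std::string filterCompletions(const std::vector<std::string>& sortedWords,
                              const std::string& prefix) {
    std::string items;
    for (const std::string& word : sortedWords) {
        // compare() on the first prefix.size() characters; a word shorter
        // than the prefix never matches.
        if (word.compare(0, prefix.size(), prefix) == 0) {
            if (!items.empty()) items += ' ';
            items += word;
        }
    }
    return items;
}
```

The resulting string, together with the length of the typed prefix, is what would be handed to AutoCompShow(...); an empty result is a natural signal not to show the list at all.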

Call Tips

Another equally convenient feature of wxStyledTextCtrl is displaying call tips with CallTipShow(). Working with it is similar to working with the autocompletion function AutoCompShow(). It should be noted that they pass simultaneously.

Given the features described above, wxStyledTextCtrl makes it possible to implement a fairly complex lexer in the container.
