
F# syntax highlighting in gedit
Introduction
Syntax highlighting for any language in gedit is provided by the gtksourceview library. Adding highlighting for a new language to gedit comes down to writing a file with the .lang extension, which is an XML file describing the syntax of that language. The .lang files that gtksourceview supports are usually located in the /usr/share/gtksourceview-2.0/language-specs/ directory.
Description of the .lang file format
Like any XML document, a .lang file consists of a root element and the nodes nested inside it. In a .lang file the root element is the <language> tag. The root tag may carry the following attributes:
id - the identifier of the language description. It is used for references from outside this file and must be unique. It may contain letters, digits, and underscores; uppercase letters should not be used in its value.
name - the name of the language shown to the user.
version - the version of the .lang format (gtksourceview uses 2.0).
section - the section of the gedit language menu in which the language appears: script, scientific, etc.
hidden - a hint that the language should be hidden from the user, that is, not listed in the menu.
All attributes except hidden and section are required.
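As a rough illustration of the attributes above, the opening of an F# description might look like the sketch below. The id, name, and section values are assumptions chosen for the example, not values taken from a shipped file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative values only: "fsharp", "F#" and "Sources" are assumed here. -->
<language id="fsharp" name="F#" version="2.0" section="Sources">
  <!-- the metadata, styles and definitions elements go here -->
</language>
```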
Also, the root element may contain the following elements:
metadata - contains the language metadata. It may contain property elements, each with a single attribute, name, whose value can be:
mimetypes - the list of MIME types associated with the language
globs - the file name patterns (extensions) of the language
line-comment-start - the marker that starts a single-line comment
block-comment-start - the marker that starts a multi-line comment
block-comment-end - the marker that ends a multi-line comment
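The properties listed above could be written roughly as follows. The values are the ones that appear in the F# file later in this post; only the surrounding layout is a sketch.

```xml
<metadata>
  <!-- one property element per metadata entry, selected by its name attribute -->
  <property name="mimetypes">text/x-fsharp</property>
  <property name="globs">*.fs;</property>
  <property name="line-comment-start">//</property>
  <property name="block-comment-start">(*</property>
  <property name="block-comment-end">*)</property>
</metadata>
```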
styles - contains the descriptions of the styles used by the language. It contains one kind of element, style.
style - describes the style associated with a specific id. It has 3 attributes:
id - the style identifier
name - the name of the style shown to the user
map-to - the name of the default style this style is mapped to; it determines the font and color used to display it
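A minimal sketch of a styles section might look like this. The style ids and names are assumptions made for the example; def:comment, def:keyword, and def:string are default styles provided by gtksourceview's own def.lang.

```xml
<styles>
  <!-- id: referenced from contexts via style-ref; map-to: default style to borrow font and color from -->
  <style id="comment" name="Comment" map-to="def:comment"/>
  <style id="keyword" name="Keyword" map-to="def:keyword"/>
  <style id="string"  name="String"  map-to="def:string"/>
</styles>
```

Mapping to the def: styles means the language picks up whatever color scheme the user has selected, instead of hard-coding colors.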
definitions - the main element of the root, containing the definition of the language itself. It includes the following elements:
define-regex - defines a regular expression used in describing the language syntax; its id attribute is the name under which the expression is later included as \%{id}
context - the most important element, which holds the actual syntax description. It may contain the following elements:
start - the regular expression that opens the context
end - the regular expression that closes the context
include - the list of nested contexts
keyword - a keyword belonging to the context
The context element can also carry the following attributes:
id - a unique context identifier
style-ref - the highlight style for this context
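To show how these pieces fit together, here is a small sketch of a definitions section. The context ids ("fsharp", "block-comment", "keywords") are assumptions made for the example, and it assumes that styles with the ids comment and keyword are declared in the styles element of the same file. The start and end expressions and the keywords are taken from the F# file later in this post.

```xml
<definitions>
  <!-- main context; its id is assumed to match the id of the language -->
  <context id="fsharp">
    <include>
      <!-- a region that opens at "(*" and closes at "*)" -->
      <context id="block-comment" style-ref="comment">
        <start>\(\*</start>
        <end>\*\)</end>
      </context>
      <!-- a plain list of words highlighted with the keyword style -->
      <context id="keywords" style-ref="keyword">
        <keyword>let</keyword>
        <keyword>match</keyword>
        <keyword>with</keyword>
      </context>
    </include>
  </context>
</definitions>
```

The outer context only groups the inner ones through include, so further contexts can be added to it without touching the existing ones.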
These are not all of the standard elements; only the basic elements of gtksourceview are listed here.
And now, to start with, the .lang file describing the F# language:
text/x-fsharp
*.fs;
//
(*
*)
\b[A-Z][A-Za-z0-9_']*
\b[a-z][A-Za-z0-9_']*
\\((\\|"|'|n|t|b|r)|[0-9]{3}|x[0-9a-fA-F]{2})
^\s*#\s*
[!#$%&*+./>=@:\\^|~-]
\%{char-esc}
//
\(\*
\*\)
\%{preproc-start}(if(n?def)?|else|endif|light|region|endregion)\b
\%{preproc-start}if\s*false\b
\%{preproc-start}(endif|else|elif)\b
\%{preproc-start}if(n?def)?\b
\%{preproc-start}endif\b
`\%{cap-ident}
\%{cap-ident}(\.\%{cap-ident})*(?=\.)
\%{cap-ident}
"
"
('\%{char-esc}')|('[^\\']')
'\%{low-ident}
true
false
(?<!\%{symbolchar})
(?!\%{symbolchar})
\.\.
::
=
@
~
->
|
:?
:?>
^
<-
&&
&
abstract
and
as
assert
asr
begin
class
default
delegate
do
done
downcast
downto
else
end
enum
exception
false
finally
for
fun
function
if
in
inherit
interface
land
lazy
let
lor
lsl
lsr
lxor
match
member
mod
module
mutable
namespace
new
null
of
open
or
override
sig
static
struct
then
to
true
try
type
val
when
inline
upcast
while
with
P.S. The official GtkSourceView website
P.P.S. This is my first post on Habrahabr.