Morthan November 24, 2015 at 08:01

Nim Tutorial (Part 2)


Note from the translator

The first part is here: "Nim Tutorial (Part 1)".

The translation was done for my own use, that is, clumsily and in haste. Some phrases took real agony to word so that they even remotely resembled Russian. If you know how to phrase something better, send me a private message and I will fix it.

Introduction

“Repetition renders the ridiculous reasonable.” - Norman Wildberger

This document is a tutorial on the advanced constructs of the Nim language. Keep in mind that it is somewhat outdated; the manual has far more up-to-date examples of the language's advanced features.

Pragmas

Pragmas are Nim's way of giving the compiler additional information or commands without introducing new keywords. Pragmas are enclosed in special curly brackets with dots: {. and .}. They are not covered in this tutorial; see the manual or the user guide for a list of available pragmas.

Object oriented programming

Although Nim's support for object-oriented programming (OOP) is minimalistic, powerful OOP techniques can still be used. OOP is seen as one way to design a program, but not the only one. Often a procedural approach leads to simpler and more efficient code. In particular, preferring composition over inheritance often results in a better design.

Objects

Like tuples, objects are a means of packing different values together into a single structure. However, objects provide features that tuples do not: inheritance and information hiding. Because objects encapsulate data, the T() object constructor should normally be used only internally, and the programmer should provide a dedicated procedure for initialization (it is called a constructor).

Objects have access to their type at runtime. The of operator can be used to check the type of an object:

type
  Person = ref object of RootObj
    name*: string  # the * means that `name` is accessible from other modules
    age: int       # this field is not accessible from other modules
  Student = ref object of Person # Student inherits from Person
    id: int                      # with an additional id field
var
  student: Student
  person: Person
assert(student of Student) # is true
# object construction:
student = Student(name: "Anton", age: 5, id: 2)
echo student[]

Object fields that should be visible outside the defining module must be marked with an asterisk (*). Unlike tuples, different object types are never equivalent. New object types can only be defined within a type section.
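The combination of a private field and an exported constructor procedure described above can be sketched as follows. This is a minimal illustration only: the names Account, newAccount and balance are hypothetical and do not come from the tutorial.

```nim
type
  Account = ref object of RootObj   # hypothetical type for illustration
    owner*: string                  # exported field
    balance: int                    # private to the defining module

proc newAccount*(owner: string, initial = 0): Account =
  ## Hypothetical constructor proc: validates input before building the object.
  doAssert initial >= 0, "initial balance must not be negative"
  Account(owner: owner, balance: initial)

proc balance*(a: Account): int =
  ## Read-only accessor for the private field.
  a.balance

let acc = newAccount("Anton", 10)
echo acc.owner, " has ", acc.balance
```

Other modules can then create accounts only through newAccount, so the check on the initial balance cannot be bypassed by calling Account() directly with a hidden field.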

Inheritance is done with the object of syntax. Multiple inheritance is currently not supported. If an object type has no suitable ancestor, RootObj can be used as its ancestor, but this is only a convention. Objects that have no ancestor are implicitly final. To introduce a new object root that does not inherit from system.RootObj, use the inheritable pragma (this is done, for example, in the GTK wrapper).

Ref objects should be used whenever inheritance is used. This is not strictly necessary, but with non-ref objects an assignment such as let person: Person = Student(id: 123) truncates the fields of the subclass.

Note: for simple code reuse, composition (a “has-a” relationship) is often preferable to inheritance (an “is-a” relationship). Because objects in Nim are value types, composition is as efficient as inheritance.

Mutually recursive types

Objects, tuples, and references can model fairly complex data structures that depend on each other and are thus mutually recursive. In Nim, such types can only be declared within a single type section. (Anything else would require additional symbol lookahead, which slows down compilation.)

Example:

type
  Node = ref NodeObj # a traced reference to a NodeObj
  NodeObj = object
    le, ri: Node     # left and right subtrees
    sym: ref Sym     # leaves contain a reference to a Sym
  Sym = object       # a symbol
    name: string     # the symbol's name
    line: int        # the line the symbol was declared in
    code: Node       # the symbol's abstract syntax tree

Type conversion

Nim distinguishes between type casts and type conversions. A cast is done with the cast operator and forces the compiler to interpret a bit pattern as the specified type.

A type conversion is a much gentler way to turn one type into another: it checks whether the conversion is possible. If it is not, either the compiler reports an error or an exception is raised.

The syntax for type conversions is destination_type(expression_to_convert), which resembles an ordinary call.

proc getID(x: Person): int =
  Student(x).id

If x is not an instance of Student, an InvalidObjectConversionError exception is raised.
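To avoid the exception, the runtime type can be tested with of before converting. The following is a small sketch under that assumption; getIdOr and its fallback parameter are hypothetical names introduced only for this illustration.

```nim
type
  Person = ref object of RootObj
    name: string
  Student = ref object of Person
    id: int

proc getIdOr(x: Person, fallback: int): int =
  ## Hypothetical helper: converts only after the runtime type check succeeds.
  if x of Student:
    Student(x).id
  else:
    fallback

let s: Person = Student(name: "Anton", id: 2)  # static type Person, runtime type Student
let p = Person(name: "Peter")
doAssert getIdOr(s, -1) == 2
doAssert getIdOr(p, -1) == -1
```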

Variant objects

Often an object hierarchy is overkill, and a simple variant type is all that is needed.

For instance:

# This is an example of how an abstract syntax tree could be
# modelled in Nim
type
  NodeKind = enum  # the different node types
    nkInt,          # a leaf with an integer value
    nkFloat,        # a leaf with a floating-point value
    nkString,       # a leaf with a string value
    nkAdd,          # an addition
    nkSub,          # a subtraction
    nkIf            # an if statement
  Node = ref NodeObj
  NodeObj = object
    case kind: NodeKind  # the ``kind`` field is the discriminator
    of nkInt: intVal: int
    of nkFloat: floatVal: float
    of nkString: strVal: string
    of nkAdd, nkSub:
      leftOp, rightOp: Node
    of nkIf:
      condition, thenPart, elsePart: Node
var n = Node(kind: nkFloat, floatVal: 1.0)
# the following statement raises a `FieldError` exception, because
# the value of n.kind does not fit:
n.strVal = ""

As you can see from the example, in contrast to the object hierarchy, you do not need to make conversions between different object types. However, accessing the wrong fields of an object raises an exception.
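A common way to stay clear of wrong field accesses is to branch on the discriminator with a case statement, so that each branch touches only the fields valid for that kind. The sketch below uses a reduced, hypothetical version of the node type with integer leaves only.

```nim
type
  NodeKind = enum
    nkInt, nkAdd, nkSub
  Node = ref object
    case kind: NodeKind            # discriminator
    of nkInt: intVal: int
    of nkAdd, nkSub: leftOp, rightOp: Node

proc eval(n: Node): int =
  ## Evaluates the tree; every branch reads only the fields of its own kind.
  case n.kind
  of nkInt: n.intVal
  of nkAdd: eval(n.leftOp) + eval(n.rightOp)
  of nkSub: eval(n.leftOp) - eval(n.rightOp)

# Represents 4 + (3 - 1)
let tree = Node(kind: nkAdd,
                leftOp: Node(kind: nkInt, intVal: 4),
                rightOp: Node(kind: nkSub,
                              leftOp: Node(kind: nkInt, intVal: 3),
                              rightOp: Node(kind: nkInt, intVal: 1)))
doAssert eval(tree) == 6
```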

Methods

In ordinary object-oriented languages, procedures (also called methods) are bound to a class. This approach has the following disadvantages:

adding a method to a class that the programmer does not control is either impossible or requires ugly workarounds;
it is often unclear where a method belongs: is join a string method or an array method?

Nim avoids these problems by not binding methods to classes. All methods in Nim are multimethods. As we will see later, multimethods differ from procedures only in that they use dynamic binding.

Method Call Syntax

Nim has special syntactic sugar for calling routines: obj.method(args) means the same as method(obj, args). If there are no remaining arguments, the parentheses can be omitted: obj.len instead of len(obj).

This method invocation syntax is not limited to objects; it can be used for any type:

import strutils  # provides toUpper
echo("abc".len) # same as echo(len("abc"))
echo("abc".toUpper())
echo({'a', 'b', 'c'}.card)
stdout.writeLine("Hallo") # same as writeLine(stdout, "Hallo")

(Another point of view on the syntax of method calls is that it implements the missing postfix notation.)

This makes it easy to write "pure object-oriented code":

import strutils, sequtils
stdout.writeLine("Give a list of numbers (separated by spaces): ")
stdout.write(stdin.readLine.split.map(parseInt).max.`$`)
stdout.writeLine(" is the maximum!")
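The method call syntax works for user-defined procedures as well, which makes it possible to chain them left to right. The helpers shout and wrap below are hypothetical, and toUpperAscii assumes a recent standard library.

```nim
import strutils

# `shout` and `wrap` are hypothetical helpers used only for illustration.
proc shout(s: string): string = s.toUpperAscii & "!"
proc wrap(s, left, right: string): string = left & s & right

let a = wrap(shout("hi"), "[", "]")   # ordinary call syntax, read inside out
let b = "hi".shout.wrap("[", "]")     # method call syntax, read left to right
doAssert a == b
echo b
```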

Properties

As the example above shows, Nim has no need for get-properties: ordinary get-procedures called with the method call syntax achieve the same thing. Setting a value is a different matter, though; it requires a special setter syntax:

type
  Socket* = ref object of RootObj
    host: int # not accessible from outside, no asterisk
proc `host=`*(s: var Socket, value: int) {.inline.} =
  ## setter of the host address
  s.host = value
proc host*(s: Socket): int {.inline.} =
  ## getter of the host address
  s.host
var s: Socket
new s
s.host = 34  # same as `host=`(s, 34)

(The example also shows inline procedures.)

To implement array properties, you can overload the array access operator []:

type
  Vector* = object
    x, y, z: float
proc `[]=`* (v: var Vector, i: int, value: float) =
  # setter
  case i
  of 0: v.x = value
  of 1: v.y = value
  of 2: v.z = value
  else: assert(false)
proc `[]`* (v: Vector, i: int): float =
  # getter
  case i
  of 0: result = v.x
  of 1: result = v.y
  of 2: result = v.z
  else: assert(false)

The example is contrived, since a vector is better modelled as a tuple, which already provides v[] access.
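The remark about tuples can be checked with a few lines. This sketch assumes that indexing a tuple requires an index known at compile time, so it is not a drop-in replacement for the case-based accessors when the index is computed at runtime.

```nim
# A vector modelled as a named tuple; fields are reachable by name and by index.
var v: tuple[x, y, z: float] = (1.0, 2.0, 3.0)
v[1] = 5.0                    # writes the same field as v.y
doAssert v.y == 5.0
doAssert v[0] + v[2] == 4.0   # indices must be compile-time constants
```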

Dynamic binding

Procedures always use static binding. For dynamic binding, replace the proc keyword with method:

type
  PExpr = ref object of RootObj ## abstract base class for an expression
  PLiteral = ref object of PExpr
    x: int
  PPlusExpr = ref object of PExpr
    a, b: PExpr
# note: 'eval' relies on dynamic binding
method eval(e: PExpr): int =
  # override this base method
  quit "to override!"
  quit "to override!"
method eval(e: PLiteral): int = e.x
method eval(e: PPlusExpr): int = eval(e.a) + eval(e.b)
proc newLit(x: int): PLiteral = PLiteral(x: x)
proc newPlus(a, b: PExpr): PPlusExpr = PPlusExpr(a: a, b: b)
echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4)))

Note that in the example the constructors newLit and newPlus are procedures, because static binding suits them better, whereas eval is a method, because it requires dynamic binding.
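The difference between the two kinds of binding becomes visible when a value's static type differs from its runtime type. The sketch below uses hypothetical types Shape and Circle; the {.base.} pragma is an assumption about recent compilers, which expect it on the root method, and does not appear in the tutorial's own listing.

```nim
type
  Shape = ref object of RootObj    # hypothetical types for illustration
  Circle = ref object of Shape

# Procedures: the overload is chosen from the static type of the argument.
proc staticName(s: Shape): string = "shape"
proc staticName(c: Circle): string = "circle"

# Methods: the implementation is chosen from the runtime type.
method dynamicName(s: Shape): string {.base.} = "shape"
method dynamicName(c: Circle): string = "circle"

let s: Shape = Circle()            # static type Shape, runtime type Circle
doAssert staticName(s) == "shape"
doAssert dynamicName(s) == "circle"
```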

In a multimethod, all parameters that have an object type are used for binding:

type
  Thing = ref object of RootObj
  Unit = ref object of Thing
    x: int
method collide(a, b: Thing) {.inline.} =
  quit "to override!"
method collide(a: Thing, b: Unit) {.inline.} =
  echo "1"
method collide(a: Unit, b: Thing) {.inline.} =
  echo "2"
var a, b: Unit
new a
new b
collide(a, b) # output: 2

As the example shows, invoking a multimethod cannot be ambiguous: collide 2 is preferred over collide 1 because resolution works from left to right. Thus (Unit, Thing) is preferred over (Thing, Unit).

Performance note: Nim does not produce a virtual method table but generates dispatch trees. This avoids the expensive indirect branch for method calls and enables inlining. However, other optimizations such as compile-time evaluation or dead code elimination do not work with methods.

Exceptions

In Nim, exceptions are objects. By convention, exception type names end in "Error". The system module defines an exception hierarchy that you can build on. Exceptions derive from system.Exception, which provides the common interface.

Exceptions must be allocated on the heap because their lifetime is unknown. The compiler will not let you raise an exception created on the stack. Every raised exception should at least state the reason it was raised in its msg field.

By convention, exceptions should be raised only in exceptional cases: for example, failing to open a file should not raise an exception, because this is quite common (the file may simply not exist).

The `raise` statement

Exceptions are raised with the raise statement:

var
  e: ref OSError
new(e)
e.msg = "the request to the OS failed"
raise e

If the raise keyword is not followed by an expression, the current exception is re-raised. To avoid writing out the code above, you can use the newException template from the system module:

raise newException(OSError, "the request to the OS failed")
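Building on the existing hierarchy usually means deriving a new exception type from one of the types in system. The following sketch assumes a recent compiler (for the except ... as form); ConfigError and parsePort are hypothetical names.

```nim
type
  ConfigError = object of ValueError   # hypothetical exception type

proc parsePort(text: string): int =
  ## Hypothetical helper that rejects empty input.
  if text.len == 0:
    raise newException(ConfigError, "port is empty")
  result = 8080   # placeholder value; real parsing is out of scope here

var caught = ""
try:
  discard parsePort("")
except ConfigError as e:
  caught = e.msg                       # the message passed to newException
doAssert caught == "port is empty"
```

Because ConfigError derives from ValueError, a handler written for ValueError would catch it as well.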

The `try` statement

The try statement handles exceptions:

import strutils  # for parseInt
# read the first two lines of a text file that should contain numbers
# and try to add them
var
  f: File
if open(f, "numbers.txt"):
  try:
    let a = readLine(f)
    let b = readLine(f)
    echo "sum: ", parseInt(a) + parseInt(b)
  except OverflowError:
    echo "overflow!"
  except ValueError:
    echo "could not convert string to integer"
  except IOError:
    echo "IO error!"
  except:
    echo "Unknown exception!"
    # reraise the unknown exception:
    raise
  finally:
    close(f)

The statements after try are executed in order until an exception is raised. In that case the matching except branch is executed.

The empty except branch is executed if the raised exception is not listed explicitly. It is similar to the else branch of an if statement.

If there is a finally branch, it is always executed after the exception handlers.

The exception is consumed in an except branch. If an exception is not handled, it propagates up the call stack. This means that when an exception occurs, the rest of the procedure that is not inside a finally block is not executed.

If you need the current exception object or its message inside an except branch, you can use the getCurrentException() and getCurrentExceptionMsg() procedures from the system module. Example:

try:
  doSomethingHere()
except:
  let
    e = getCurrentException()
    msg = getCurrentExceptionMsg()
  echo "Got exception ", repr(e), " with message ", msg
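The order in which the branches run can be made visible by recording each step. This is a small sketch; the events sequence and the parseOrZero helper are hypothetical and exist only for this illustration.

```nim
import strutils

var events: seq[string] = @[]   # records which branches were executed

proc parseOrZero(s: string): int =
  ## Hypothetical helper: returns 0 when `s` is not a valid integer.
  try:
    result = parseInt(s)
    events.add "parsed"         # skipped when parseInt raises
  except ValueError:
    events.add "invalid"
    result = 0
  finally:
    events.add "done"           # runs in both cases

doAssert parseOrZero("12") == 12
doAssert parseOrZero("x") == 0
doAssert events == @["parsed", "done", "invalid", "done"]
```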

Annotating Procedures with Raised Exceptions

With the optional {.raises.} pragma you can specify that a procedure may raise a given set of exceptions, or none at all. If the {.raises.} pragma is used, the compiler verifies that it holds. For example, if you declare that a procedure raises IOError and at some point it (or one of the procedures it calls) raises a different exception, the compiler refuses to compile it. Usage example:

proc complexProc() {.raises: [IOError, ArithmeticError].} =
  ...
proc simpleProc() {.raises: [].} =
  ...

Once you have code like this in place, if the list of raised exceptions changes, the compiler stops with an error that points to the line of the procedure that failed pragma validation and names the exception that is not in the list. It also reports the file and line where that exception originated, which helps you find the offending code whose change caused the problem.

If you want to add the {.raises.} pragma to existing code, the compiler can help as well. Add the {.effects.} pragma statement to the procedure and the compiler will output all effects inferred up to that point (exception tracking is part of Nim's effect system). Another, more roundabout way to get the list of exceptions raised by a procedure is the Nim doc2 command, which generates documentation for a whole module and decorates every procedure with the list of exceptions it raises. You can read more about the effect system and the related pragmas in the manual.
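A complete pair of annotated procedures might look like the sketch below. It assumes that parseInt from strutils is declared as raising only ValueError, as in recent standard library versions; parsePositive and safeParse are hypothetical names.

```nim
import strutils

proc parsePositive(s: string): int {.raises: [ValueError].} =
  ## Hypothetical helper: may raise ValueError and nothing else.
  result = parseInt(s)
  if result <= 0:
    raise newException(ValueError, "expected a positive number: " & s)

proc safeParse(s: string): int {.raises: [].} =
  ## Raises nothing: the only possible exception is handled inside.
  try:
    result = parsePositive(s)
  except ValueError:
    result = -1

doAssert safeParse("42") == 42
doAssert safeParse("abc") == -1
```

Removing the try block from safeParse should make the compiler reject it, since ValueError would then escape a procedure declared with an empty raises list.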

Generics

Generics are Nim's means of parameterizing procedures, iterators, or types with type parameters. They are most useful for building efficient, type-safe containers:

type
  BinaryTreeObj[T] = object # BinaryTree is a generic type with the generic
                            # parameter ``T``
    le, ri: BinaryTree[T]   # left and right subtrees; may be nil
    data: T                 # the data stored in a node
  BinaryTree*[T] = ref BinaryTreeObj[T] # the type that is exported
proc newNode*[T](data: T): BinaryTree[T] =
  # constructor for a node
  new(result)
  result.data = data
proc add*[T](root: var BinaryTree[T], n: BinaryTree[T]) =
  # insert a node into the tree
  if root == nil:
    root = n
  else:
    var it = root
    while it != nil:
      # compare the data items; uses the generic ``cmp`` proc
      # that works for any type that has ``==`` and ``<`` operators
      var c = cmp(it.data, n.data)
      if c < 0:
        if it.le == nil:
          it.le = n
          return
        it = it.le
      else:
        if it.ri == nil:
          it.ri = n
          return
        it = it.ri
proc add*[T](root: var BinaryTree[T], data: T) =
  # convenience proc:
  add(root, newNode(data))
iterator preorder*[T](root: BinaryTree[T]): T =
  # Preorder traversal of a binary tree. Since recursive iterators are
  # not yet implemented, this uses an explicit stack (which is more
  # efficient anyway):
  var stack: seq[BinaryTree[T]] = @[root]
  while stack.len > 0:
    var n = stack.pop()
    while n != nil:
      yield n.data
      add(stack, n.ri)  # push the right subtree onto the stack
      n = n.le          # and follow the left pointer
var
  root: BinaryTree[string] # instantiate a BinaryTree with ``string``
add(root, newNode("hello")) # instantiates ``newNode`` and adds it
add(root, "world")          # instantiates the second ``add`` proc
for str in preorder(root):
  stdout.writeLine(str)

The example shows a generic binary tree. Depending on context, square brackets are used either to introduce type parameters or to instantiate a generic procedure, iterator, or type. As the example shows, generics work with overloading: the best match of add is used. The built-in add procedure for sequences is not hidden and is used in the preorder iterator.
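Both uses of the square brackets can be seen in a much smaller example. The maxOf procedure below is a hypothetical helper; it assumes a non-empty input and a type that supports the < operator.

```nim
proc maxOf[T](items: openArray[T]): T =
  ## Hypothetical helper: brackets after the name introduce the type parameter T.
  result = items[0]
  for x in items:
    if result < x: result = x

doAssert maxOf([3, 9, 2]) == 9                 # T inferred as int
doAssert maxOf(["pear", "apple"]) == "pear"    # T inferred as string
doAssert maxOf[float]([1.5, 0.5]) == 1.5       # brackets instantiate explicitly
```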

Templates

Templates are a simple substitution mechanism that operates on Nim's abstract syntax trees (AST). Templates are processed in the semantic pass of the compiler. They integrate well with the rest of the language and share none of the flaws of C's preprocessor macros.

To invoke a template, call it like a procedure.

Example:

template `!=` (a, b: expr): expr =
  # this definition exists in the System module
  not (a == b)
assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))

The !=, >, >=, in, notin, and isnot operators are in fact templates. As a consequence, if you overload the == operator, the != operator becomes available automatically and does the right thing (except for IEEE floating-point numbers, where NaN breaks basic Boolean logic).

a > b is transformed into b < a. a in b is transformed into contains(b, a). notin and isnot have the obvious meanings.
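The effect of overloading == on != can be observed with a type whose equality deliberately ignores one field. Version and its fields are hypothetical names used only for this sketch.

```nim
type
  Version = object          # hypothetical type for illustration
    major, minor: int
    label: string

# Two versions count as equal when their numbers match; `label` is ignored.
proc `==`(a, b: Version): bool =
  a.major == b.major and a.minor == b.minor

let
  v1 = Version(major: 1, minor: 2, label: "stable")
  v2 = Version(major: 1, minor: 2, label: "nightly")
  v3 = Version(major: 2, minor: 0, label: "stable")

doAssert v1 == v2
doAssert not (v1 != v2)   # `!=` is rewritten to not (v1 == v2)
doAssert v1 != v3
```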

Templates are especially useful for lazy evaluation. Consider a simple logging procedure:

const
  debug = true
proc log(msg: string) {.inline.} =
  if debug: stdout.writeLine(msg)
var
  x = 4
log("x has the value: " & $x)

This code has a shortcoming: if debug is set to false someday, the quite expensive $ and & operations are still performed! (Argument evaluation for procedures is eager.)

Turning the log procedure into a template solves this problem:

const
  debug = true
template log(msg: string) =
  if debug: stdout.writeLine(msg)
var
  x = 4
log("x has the value: " & $x)
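The difference between eager and lazy evaluation can be measured by counting how often an argument expression runs. In this sketch debug is set to false, and the names calls, expensive, logProc and logTmpl are hypothetical.

```nim
const debug = false
var calls = 0                 # counts evaluations of the argument expression

proc expensive(): string =
  inc calls
  "payload"

proc logProc(msg: string) =
  if debug: echo msg

template logTmpl(msg: string) =
  if debug: echo msg

logProc(expensive())          # the argument is evaluated before the call
doAssert calls == 1
logTmpl(expensive())          # the expression is substituted into a branch that never runs
doAssert calls == 1
```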

Parameter types can be ordinary types or the metatypes expr (for expressions), stmt (for statements), or typedesc (for type descriptions). If the template has no explicit return type, stmt is used for consistency with procedures and methods.

If there is a stmt parameter, it must be the last one in the template declaration. The reason is that statements are passed to a template with the special colon (:) syntax:

template withFile(f: expr, filename: string, mode: FileMode,
                  body: stmt): stmt {.immediate.} =
  let fn = filename
  var f: File
  if open(f, fn, mode):
    try:
      body
    finally:
      close(f)
  else:
    quit("cannot open: " & fn)
withFile(txt, "ttempl3.txt", fmWrite):
  txt.writeLine("line 1")
  txt.writeLine("line 2")

In the example, the two writeLine statements are bound to the body parameter. The withFile template contains boilerplate code and helps to avoid a common bug: forgetting to close the file. Note that the let fn = filename statement ensures that filename is evaluated only once.
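The same colon syntax works for any template whose last parameter receives a block of statements. The sketch below assumes a recent compiler, where such a parameter is declared as untyped rather than stmt; repeatTimes is a hypothetical name.

```nim
template repeatTimes(n: int, body: untyped) =
  ## Hypothetical helper: runs the statements passed after the colon n times.
  for i in 1..n:
    body

var total = 0
repeatTimes(3):
  total += 2                  # these statements are bound to `body`
doAssert total == 6
```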

Macros

Macros enable advanced compile-time code transformations, but they cannot change Nim's syntax. This is not a serious restriction, because Nim's syntax is flexible enough anyway. Macros have to be implemented in pure Nim code, since the foreign function interface (FFI) is not available in the compiler. Apart from this restriction (which will be lifted at some point in the future), you can write any Nim code and the compiler will run it at compile time.

There are two ways to write a macro: either generate Nim source code and let the compiler parse it, or manually build an abstract syntax tree (AST) and hand it to the compiler. To build an AST, you need to know how a given piece of Nim syntax maps to the abstract syntax tree. The AST is documented in the macros module.

Once your macro is finished, there are two ways to invoke it:

invoking the macro like a procedure call (expression macros)
invoking the macro with the special macrostmt syntax (statement macros)

Expression Macros

The following example implements a powerful debug command that accepts any number of arguments:

# to work with Nim syntax trees we need an API that is defined in the
# ``macros`` module:
import macros
macro debug(n: varargs[expr]): stmt =
  # `n` is a Nim AST that contains the list of expressions;
  # this macro returns a list of statements:
  result = newNimNode(nnkStmtList, n)
  # iterate over the arguments passed to this macro:
  for i in 0..n.len-1:
    # add a call to the statement list that writes the expression;
    # `toStrLit` converts an AST to its string representation:
    result.add(newCall("write", newIdentNode("stdout"), toStrLit(n[i])))
    # add a call to the statement list that writes ": "
    result.add(newCall("write", newIdentNode("stdout"), newStrLitNode(": ")))
    # add a call to the statement list that writes the expression's value:
    result.add(newCall("writeLine", newIdentNode("stdout"), n[i]))
var
  a: array[0..10, int]
  x = "some string"
a[0] = 42
a[1] = 45
debug(a[0], a[1], x)

The macro call expands to:

write(stdout, "a[0]")
write(stdout, ": ")
writeLine(stdout, a[0])
write(stdout, "a[1]")
write(stdout, ": ")
writeLine(stdout, a[1])
write(stdout, "x")
write(stdout, ": ")
writeLine(stdout, x)
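The core trick of the macro, obtaining the source text of an argument next to its value, can be isolated in a few lines. The sketch below assumes a recent compiler, where macro parameters and return types are written as untyped instead of expr and stmt; describe is a hypothetical name.

```nim
import macros

macro describe(e: untyped): untyped =
  ## Hypothetical macro: expands `describe(x + 1)` into
  ## `"x + 1 = " & $(x + 1)`.
  result = newCall("&", newLit(e.repr & " = "), newCall("$", e))

let x = 4
doAssert describe(x + 1) == "x + 1 = 5"
echo describe(x)
```

Unlike the debug macro above, this variant produces a string expression instead of writing to stdout, which makes the result easy to check.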

Statement Macros

Statement macros are defined just like expression macros. However, they are invoked by an expression followed by a colon.

The following example outlines a macro that generates a lexical analyzer from regular expressions:

macro case_token(n: stmt): stmt =
  # creates a lexical analyzer from regular expressions
  # ... (implementation is an exercise for the reader :-)
  discard
case_token: # this colon tells the parser it is a macro statement
of r"[A-Za-z_]+[A-Za-z_0-9]*":
  return tkIdentifier
of r"[0-9]+":
  return tkInteger
of r"[\+\-\*\?]+":
  return tkOperator
else:
  return tkUnknown

Create your first macro

To guide you in writing macros, we will demonstrate how to turn your typical dynamic code into something that can be statically compiled. For example, we use the following code fragment as a starting point:

import strutils, tables
proc readCfgAtRuntime(cfgFilename: string): Table[string, string] =
  let
    inputString = readFile(cfgFilename)
  var
    source = ""
  result = initTable[string, string]()
  for line in inputString.splitLines:
    # Ignore empty lines
    if line.len < 1: continue
    var chunks = split(line, ',')
    if chunks.len != 2:
      quit("Input needs comma split values, got: " & line)
    result[chunks[0]] = chunks[1]
  if result.len < 1: quit("Input file empty!")
let info = readCfgAtRuntime("data.cfg")
when isMainModule:
  echo info["licenseOwner"]
  echo info["licenseKey"]
  echo info["version"]

Presumably this snippet could be used in a commercial program to read a configuration file and display information about who bought the software. This external file would be generated at purchase time to embed the license information in the program:

version,1.1
licenseOwner,Hyori Lee
licenseKey,M1Tl3PjBWO2CC48m

The readCfgAtRuntime procedure opens the given file name and returns a Table from the tables module. The file is parsed (without much care for error handling or corner cases) with the splitLines procedure from the strutils module. Many things can go wrong; remember that this explains how to run code at compile time, not how to implement proper copy protection.

Implementing this code as a compile-time procedure lets us get rid of the data.cfg file, which would otherwise have to be distributed with the binary. Besides, if the information is truly constant, it makes little sense to keep it in a mutable global variable; a constant is the better fit. Finally, and most valuably, we can implement some checks at compile time. You can think of this as a stronger form of unit testing, one that prevents a binary from being built if something does not work. This keeps broken programs, which fail to start because one small critical file is faulty, from ever reaching users.

Source code generation

Let's modify the program so that at compile time it builds a string with the generated source code, which we then pass to the parseStmt procedure from the macros module. Here is the modified source code implementing the macro:

 1  import macros, strutils
 2
 3  macro readCfgAndBuildSource(cfgFilename: string): stmt =
 4    let
 5      inputString = slurp(cfgFilename.strVal)
 6    var
 7      source = ""
 8
 9    for line in inputString.splitLines:
10      # Ignore empty lines
11      if line.len < 1: continue
12      var chunks = split(line, ',')
13      if chunks.len != 2:
14        error("Input needs comma split values, got: " & line)
15      source &= "const cfg" & chunks[0] & "= \"" & chunks[1] & "\"\n"
16
17    if source.len < 1: error("Input file empty!")
18    result = parseStmt(source)
19
20  readCfgAndBuildSource("data.cfg")
21
22  when isMainModule:
23    echo cfglicenseOwner
24    echo cfglicenseKey
25    echo cfgversion

The good news is that not much has changed! First, the handling of the input parameter has changed (line 3). In the dynamic version, the readCfgAtRuntime procedure receives a string parameter. In the macro version, however, although the parameter is declared as string, that is only the macro's external interface. When the macro runs, it actually receives a PNimNode object, not a string, and we have to call the strVal procedure from the macros module (line 5) to obtain the string passed to the macro.

Second, we cannot use the readFile procedure from the system module because of FFI restrictions at compile time. If we try to use this procedure (or any other that depends on the FFI), the compiler reports a "cannot evaluate" error with a dump of the macro's source code, along with a stack trace showing where the compiler was when the error occurred. We can work around this limitation with the slurp procedure from the system module, which was made specifically for compile time (there is also a similar gorge procedure that runs an external program and captures its output).

Interestingly, our macro does not return a runtime Table object. Instead, it builds up Nim source code in the source variable. For each line of the configuration file, one constant is generated (line 15). To avoid conflicts, we prefix these constants with cfg. All the compiler does, in effect, is replace the line calling the macro with the following code fragment:

const cfgversion= "1.1"
const cfglicenseOwner= "Hyori Lee"
const cfglicenseKey= "M1Tl3PjBWO2CC48m"

You can verify this yourself by adding a line that echoes the generated source at the end of the macro and compiling the program. Another difference is that instead of calling the usual quit procedure to abort (which we could still call), this version calls the error procedure (line 14). The error procedure does the same as quit, but it additionally prints the source file and line number where the error occurred, which helps the programmer find the error at compile time. In this case it would point to the line that invokes the macro, not to the line of data.cfg being processed; tracking that is up to us.
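The source generation approach can be tried without any external file by handing the configuration text to the macro directly. The sketch below assumes a recent compiler and uses a static[string] parameter instead of slurp and strVal, which is a deliberate deviation from the listing above; constsFrom is a hypothetical name.

```nim
import macros, strutils

macro constsFrom(data: static[string]): untyped =
  ## Hypothetical variant of the macro above: the configuration text is
  ## passed in directly, so `data` is an ordinary string inside the macro.
  var source = ""
  for line in data.splitLines:
    if line.len == 0: continue            # ignore empty lines
    let chunks = line.split(',')
    if chunks.len != 2:
      error("Input needs comma split values, got: " & line)
    source &= "const cfg" & chunks[0] & " = \"" & chunks[1] & "\"\n"
  if source.len == 0: error("Input is empty!")
  result = parseStmt(source)              # let the compiler parse the generated code

constsFrom("version,1.1\nlicenseOwner,Hyori Lee")

doAssert cfgversion == "1.1"
doAssert cfglicenseOwner == "Hyori Lee"
```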

Manual AST generation

To generate an AST we would, in theory, need to know exactly which structures the Nim compiler uses; they are exposed in the macros module. At first glance this looks like a daunting task. But we can use the dumpTree macro, invoking it as a statement macro rather than an expression macro. Since we know that we want to generate a bunch of const symbols, we can create the following source file and compile it to see what the compiler expects from us:

import macros
dumpTree:
  const cfgversion: string = "1.1"
  const cfglicenseOwner= "Hyori Lee"
  const cfglicenseKey= "M1Tl3PjBWO2CC48m"

In the process of compiling the source code, we should see the output of the following lines (since this is a macro, compilation will be enough, you do not need to run any binaries):

StmtList
  ConstSection
    ConstDef
      Ident !"cfgversion"
      Ident !"string"
      StrLit 1.1
  ConstSection
    ConstDef
      Ident !"cfglicenseOwner"
      Empty
      StrLit Hyori Lee
  ConstSection
    ConstDef
      Ident !"cfglicenseKey"
      Empty
      StrLit M1Tl3PjBWO2CC48m

With this information we have a better idea of the input the compiler needs. We need to generate a list of statements. For each constant in the source code, a ConstSection and a ConstDef are generated. If we moved all the constants into a single const block, we would see only one ConstSection with three children.

You may not have noticed, but in the dumpTree example the first constant explicitly specifies its type. That is why the last two constants have Empty as their second child in the tree output, while the first has a string identifier. In summary, a const definition consists of an identifier, an optional type (which may be an empty node), and a value. Armed with this knowledge, let's look at the finished version of the AST-building macro:

 1  import macros, strutils
 2  
 3  macro readCfgAndBuildAST(cfgFilename: string): stmt =
 4    let
 5      inputString = slurp(cfgFilename.strVal)
 6  
 7    result = newNimNode(nnkStmtList)
 8    for line in inputString.splitLines:
 9      # Ignore empty lines
10      if line.len < 1: continue
11      var chunks = split(line, ',')
12      if chunks.len != 2:
13        error("Input needs comma split values, got: " & line)
14      var
15        section = newNimNode(nnkConstSection)
16        constDef = newNimNode(nnkConstDef)
17      constDef.add(newIdentNode("cfg" & chunks[0]))
18      constDef.add(newEmptyNode())
19      constDef.add(newStrLitNode(chunks[1]))
20      section.add(constDef)
21      result.add(section)
22  
23    if result.len < 1: error("Input file empty!")
24  
25  readCfgAndBuildAST("data.cfg")
26  
27  when isMainModule:
28    echo cfglicenseOwner
29    echo cfglicenseKey
30    echo cfgversion

Since we started from the previous source generation example, we will only point out the differences. Instead of creating a temporary string variable and writing source code into it as if by hand, we use the result variable directly and create a statement list node (nnkStmtList) to hold our children (line 7).

For each input line we create a constant definition (nnkConstDef) and wrap it in a constant section (nnkConstSection). Once these variables are created, we fill them hierarchically (line 17), as shown in the AST dump above: the constant definition is a child of the section and contains an identifier node, an empty node (letting the compiler infer the type), and a string literal with the value.

A final tip for writing macros: if you are not sure whether the AST you built looks right, you may be tempted to use the dumpTree macro. It cannot be used inside the macro you are writing or debugging, though. Instead, echo the string produced by treeRepr. If you add echo treeRepr(result) at the end of this example, you will see the same output as with the dumpTree macro. Calling it at the end is optional; you can call it at any point in the macro where you are having trouble.
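The treeRepr tip can be tried on a reduced macro that builds a single constant from its arguments. This sketch assumes a recent compiler and static[string] parameters; defineConst is a hypothetical name, and the tree is printed by the compiler while it compiles the file.

```nim
import macros

macro defineConst(name, value: static[string]): untyped =
  ## Hypothetical macro: builds `const cfg<name> = "<value>"` node by node.
  let constDef = newNimNode(nnkConstDef)
  constDef.add(newIdentNode("cfg" & name))   # identifier
  constDef.add(newEmptyNode())               # no explicit type
  constDef.add(newStrLitNode(value))         # value
  let section = newNimNode(nnkConstSection)
  section.add(constDef)
  result = newNimNode(nnkStmtList)
  result.add(section)
  echo treeRepr(result)                      # shown during compilation, not at runtime

defineConst("version", "1.1")
doAssert cfgversion == "1.1"
```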
