Executing user code in Go

In fact, this is all about smart contracts.

But if you do not quite know what a smart contract is and are generally far from crypto, you surely know what a stored procedure in a database is. The user creates pieces of code that then run on our server. It is convenient for the user to write and publish them, and it is safe for us to execute them.

Unfortunately, we have not yet finished the security part, so I will not describe it here, but I will give a few hints.

We also write in Go, and its runtime imposes some very specific limitations. The main one is that, by and large, we cannot link against a project that was not written for it: that would stop our runtime every time we execute third-party code. In general, we had the option of using some kind of interpreter, and we found a quite sane Lua and a completely insane WASM. But somehow we do not want to hook clients on Lua, and WASM currently brings more problems than benefits: it is in a draft state that is updated every month, so we will wait for the specification to settle and use it as a second engine.

After lengthy battles with our own conscience, we decided to write smart contracts in Go. The point is that if you build an architecture for executing compiled Go code, you have to run that code in a separate process (as you remember, we care about security), and moving execution into a separate process costs performance on IPC. Later, though, when we understood the volume of code to be executed, we were even glad we had chosen this solution. It is scalable, although it adds latency to each individual call: we can bring up many remote execution environments.

A little more about the decisions made, to make things clear. Each smart contract consists of two parts: the class code and the object data. So, having published the code once, we can create many contracts on top of it that behave the same way but have different settings and different state. Going any further would be about the blockchain, which is not the topic of this story.

So, we run Go

We decided to use the plugin mechanism, which is not exactly finished or polished. It works as follows: we compile what will become the plugin in a special way into a shared library, then load it, look up symbols in it and pass execution there. The snag is that Go has a runtime, which is almost a megabyte of code, and by default this runtime is also built into the library, so we end up with the runtime duplicated everywhere. But for now we decided to go for it, confident that we can overcome this in the future.

Everything is simple: when you build your library, you build it with the -buildmode=plugin flag and get a .so file, which you then open.

p, err := plugin.Open(path)

Then you look up the symbol you are interested in:

symbol, err := p.Lookup(Method)

And now, depending on whether it is a variable or a function, you either call it or use it as a variable.

Under the hood this mechanism is a simple dlopen(3): we load the library, check that it is a plugin and return a wrapper around it. When the wrapper is created, all exported symbols are wrapped in interface{} and remembered. If a symbol is a function, it has to be asserted to the correct function type and simply called; if it is a variable, Lookup returns a pointer to it and it works as a variable.

The main thing to remember is that if a symbol is a variable, it is global to the whole process and you cannot use it carelessly.
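As a minimal sketch of the two cases above, assume a plugin built with -buildmode=plugin that exports a function Greet and a variable Calls. Both names, their types and the file name contract.so are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"plugin"
)

// callGreet opens the plugin at path and uses two hypothetical symbols:
// a function "Greet" and a variable "Calls".
func callGreet(path string) (string, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return "", fmt.Errorf("open plugin: %w", err)
	}

	// A function symbol has to be asserted to its exact function type.
	sym, err := p.Lookup("Greet")
	if err != nil {
		return "", fmt.Errorf("lookup Greet: %w", err)
	}
	greet, ok := sym.(func(string) string)
	if !ok {
		return "", fmt.Errorf("Greet has unexpected type %T", sym)
	}

	// A variable symbol comes back as a pointer to the variable.
	// It is shared by the whole process, so every caller sees this change.
	sym, err = p.Lookup("Calls")
	if err != nil {
		return "", fmt.Errorf("lookup Calls: %w", err)
	}
	calls, ok := sym.(*int)
	if !ok {
		return "", fmt.Errorf("Calls has unexpected type %T", sym)
	}
	*calls++

	return greet("world"), nil
}

func main() {
	out, err := callGreet("./contract.so")
	if err != nil {
		fmt.Println("plugin not available:", err)
		return
	}
	fmt.Println(out)
}
```

Without a real .so file next to the binary the program only reports that the plugin could not be opened.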

If a type was declared in the plugin, it makes sense to move it into a separate package so that the main process can work with it, for example to pass it as an argument to the plugin's functions. This is optional; you can skip the trouble and use reflection.

Our contracts are objects of the corresponding “class”, and at first an instance of this object was stored in our exported variable, so we could create another variable of the same type:

export, err := p.Lookup("EXPORT")
obj := reflect.New(reflect.ValueOf(export).Elem().Type()).Interface()

Then we deserialize the object's state into this local variable of the correct type. Once the object is restored, we can call methods on it. After that the object is serialized and put back into storage: we have called a method on the contract.

If you are curious how, but too lazy to read the documentation:

method := reflect.ValueOf(obj).MethodByName(Method)
res := method.Call(in)

In between you also need to fill the in slice with values of the correct argument types (Call takes them wrapped as reflect.Value). If you are interested, see for yourself how it was done; the source code is open, although finding this place in the history will be difficult.
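The reflection steps above can be put together in a small self-contained sketch. The Wallet type and the callByName helper are hypothetical; the template argument plays the role of the pointer that Lookup returns for the EXPORT variable, and state deserialization is left out.

```go
package main

import (
	"fmt"
	"reflect"
)

// Wallet stands in for a user contract "class" (hypothetical).
type Wallet struct{ Balance int }

// Deposit adds amount to the balance and returns the new balance.
func (w *Wallet) Deposit(amount int) int {
	w.Balance += amount
	return w.Balance
}

// callByName creates a fresh object of the type template points to,
// calls the named method on it and returns the results as plain values.
func callByName(template interface{}, method string, args ...interface{}) ([]interface{}, error) {
	obj := reflect.New(reflect.ValueOf(template).Elem().Type())
	m := obj.MethodByName(method)
	if !m.IsValid() {
		return nil, fmt.Errorf("no method %q", method)
	}
	// Call expects every argument wrapped in a reflect.Value.
	in := make([]reflect.Value, len(args))
	for i, a := range args {
		in[i] = reflect.ValueOf(a)
	}
	out := m.Call(in)
	res := make([]interface{}, len(out))
	for i, v := range out {
		res[i] = v.Interface()
	}
	return res, nil
}

func main() {
	res, err := callByName(&Wallet{}, "Deposit", 5)
	fmt.Println(res, err) // [5] <nil>
}
```

Note that Call panics when the number or types of arguments do not match the method, so real code would have to validate them first.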

In general, everything worked for us: you can write code with something like a class, put it on the blockchain, create a contract of this class (again on the blockchain), call a method on it, and the new state of the contract is written back to the blockchain. Gorgeous! How do you create a new contract from the code? Very simply: we have constructor functions that return a newly created object, which becomes the new contract. So far everything works through reflection, and the user must write:

var EXPORT ContractType

That way we know which symbol represents the contract; in fact it was used as a template.

We did not really like this, so we took drastic measures.

Parsing

First, the user should not have to write anything superfluous. Second, we want contract-to-contract interaction to be simple and testable without bringing up a blockchain; the blockchain is slow and heavy.

Therefore we decided to wrap the contract in a wrapper that is generated from the contract and a wrapper template; in principle, an obvious decision. First, the wrapper creates the export object for us. Second, it replaces the library the contract is built with: while the user is writing a contract, the foundation library with mocks inside is used for testing, and when the contract is published, it is replaced with the production one, which works with the blockchain itself.

To begin with, the code has to be parsed so that we understand what is in it, and we have to find the structure that inherits from BaseContract in order to generate a wrapper around it.

This is done quite simply. We read the file with the code into a []byte. Although the parser can read files itself, it is useful to keep the text that all AST elements refer to: they refer to byte positions in the file, and later we will want the code of the structures exactly as written, so we simply take something like:

func (pf *ParsedFile) codeOfNode(n ast.Node) string {
	// token.Pos values start at 1 for the first file in a FileSet, hence the -1.
	return string(pf.code[n.Pos()-1 : n.End()-1])
}
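A variant that does not rely on the file being the first one in its FileSet can ask the FileSet for byte offsets. This is only a sketch: nodeSource and firstTypeSource are hypothetical helper names, not part of the project described here.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// nodeSource returns the exact source text a node was parsed from,
// using byte offsets resolved through the FileSet.
func nodeSource(fset *token.FileSet, code []byte, n ast.Node) string {
	start := fset.Position(n.Pos()).Offset
	end := fset.Position(n.End()).Offset
	return string(code[start:end])
}

// firstTypeSource parses code and returns the text of the first spec
// of the first general declaration in it.
func firstTypeSource(code []byte) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "contract.go", code, 0)
	if err != nil {
		return "", err
	}
	for _, decl := range file.Decls {
		if gen, ok := decl.(*ast.GenDecl); ok && len(gen.Specs) > 0 {
			return nodeSource(fset, code, gen.Specs[0]), nil
		}
	}
	return "", fmt.Errorf("no declaration found")
}

func main() {
	code := []byte("package p\n\ntype Point struct {\n\tX, Y int\n}\n")
	text, err := firstTypeSource(code)
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```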

We then actually parse the file and get the topmost AST node, from which we will walk the file.

fileSet = token.NewFileSet()
node, err := parser.ParseFile(fileSet, name, code, parser.ParseComments)

Next, we walk the code starting from the top node and collect everything interesting into a separate structure.

for _, decl := range node.Decls {
	switch d := decl.(type) {
	case *ast.GenDecl:
		…
	case *ast.FuncDecl:
		…
	}
}

Decls is the list of everything defined in the file, already parsed into a slice. But it is a slice of Decl interfaces, which say nothing about what is inside, so each element has to be cast to a concrete type. Here the authors of the language departed from their own idea of how interfaces should be used: in go/ast an interface is more of a base class.

We are interested in nodes of the types GenDecl and FuncDecl. GenDecl is a declaration of a variable or a type; you need to check that what is inside is a type and cast once more, to TypeSpec, which you can already work with. FuncDecl is simpler: it is a function, and if its Recv field is filled in, it is a method of the corresponding structure. We collect all of this in a convenient store, because later we use text/template, which does not have much expressive power.

The only thing we need to remember separately is the name of the data type that inherits from BaseContract; it is the one everything else revolves around.
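The walk described above could look roughly like this. It is a simplified sketch, assuming that "inherits from BaseContract" means a struct that embeds a type named BaseContract; findContract and embedsBase are hypothetical names, and the receiver type of methods is not checked.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package wallet

type Wallet struct {
	foundation.BaseContract
	Balance int
}

func (w *Wallet) Deposit(amount int) int { w.Balance += amount; return w.Balance }

func New() *Wallet { return &Wallet{} }
`

// embedsBase reports whether an embedded field type is BaseContract,
// with or without a package qualifier.
func embedsBase(e ast.Expr) bool {
	switch t := e.(type) {
	case *ast.Ident:
		return t.Name == "BaseContract"
	case *ast.SelectorExpr:
		return t.Sel.Name == "BaseContract"
	}
	return false
}

// findContract returns the name of the struct embedding BaseContract,
// the names of all methods and the names of all plain functions.
func findContract(code string) (contract string, methods, funcs []string, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "contract.go", code, parser.ParseComments)
	if err != nil {
		return "", nil, nil, err
	}
	for _, decl := range file.Decls {
		switch d := decl.(type) {
		case *ast.GenDecl:
			if d.Tok != token.TYPE {
				continue // imports, constants and variables are skipped
			}
			for _, spec := range d.Specs {
				ts := spec.(*ast.TypeSpec)
				st, ok := ts.Type.(*ast.StructType)
				if !ok {
					continue
				}
				for _, f := range st.Fields.List {
					// Embedded fields have no names of their own.
					if len(f.Names) == 0 && embedsBase(f.Type) {
						contract = ts.Name.Name
					}
				}
			}
		case *ast.FuncDecl:
			if d.Recv != nil {
				methods = append(methods, d.Name.Name)
			} else {
				funcs = append(funcs, d.Name.Name)
			}
		}
	}
	return contract, methods, funcs, nil
}

func main() {
	contract, methods, funcs, err := findContract(src)
	fmt.Println(contract, methods, funcs, err) // Wallet [Deposit] [New] <nil>
}
```

The parser works purely on syntax, so the missing import of the foundation package in the sample source does not matter here.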

Code Generation

So, we know all the types and functions in our contract, and we need to be able to call a method on an object given an incoming method name and a serialized array of arguments. At code generation time we know the entire structure of the contract, so we put another file next to the contract file, with the same package name, and push all the necessary imports into it; the types are already defined in the main file and need not be repeated.

And most importantly, it contains wrappers over the functions. The wrapper name gets a prefix, so the wrapper is now easy to look up.

symbol, err := p.Lookup("INSMETHOD_" + Method)
wrapper, ok := symbol.(func(ph proxyctx.ProxyHelper, object []byte,
	data []byte) (state []byte, result []byte, err error))

Each wrapper has the same signature, so when we call it from the main program we need no extra reflection. The only difference between function wrappers and method wrappers is that function wrappers neither receive nor return the object's state.

What do we have inside the wrapper?

We create a set of empty variables corresponding to the function's arguments, put pointers to them into a slice of interfaces and deserialize the arguments into it. If we are a method, we also need to deserialize the object's state before the call and serialize it afterwards. In general, something like this:

{{ range $method := .Methods }}
func INSMETHOD_{{ $method.Name }}(ph proxyctx.ProxyHelper, object []byte, data []byte) ([]byte, []byte, error) {
    self := new({{ $.ContractType }})
    err := ph.Deserialize(object, self)
    if err != nil {
        return nil, nil, err
    }
    {{ $method.ArgumentsZeroList }}
    err = ph.Deserialize(data, &args)
    if err != nil {
        return nil, nil, err
    }
{{ if $method.Results }}
    {{ $method.Results }} := self.{{ $method.Name }}( {{ $method.Arguments }} )
{{ else }}
    self.{{ $method.Name }}( {{ $method.Arguments }} )
{{ end }}
    state := []byte{}
    err = ph.Serialize(self, &state)
    if err != nil {
        return nil, nil, err
    }
{{ range $i := $method.ErrorInterfaceInRes }}
    ret{{ $i }} = ph.MakeErrorSerializable(ret{{ $i }})
{{ end }}
    ret := []byte{}
    err = ph.Serialize([]interface{} { {{ $method.Results }} }, &ret)
    return state, ret, err
}
{{ end }}

The attentive reader will ask: what is the proxy helper? It is a do-everything object that we will still need later; for now we only use its ability to serialize and deserialize.

And any reader will ask: “and these arguments of yours, where do they come from?” There is a clear answer here too: text/template is not the sharpest of tools, so we compute these strings in code rather than in the template.

method.ArgumentsZeroList contains something like:

var arg0 int = 0
var arg1 string = ""
var arg2 ackwardType = ackwardType{}
args := []interface{}{&arg0, &arg1, &arg2}

And Arguments respectively contains “arg0, arg1, arg2”.
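Computing these two strings from the AST might look like the sketch below. The helper names argumentStrings and parseFirstFunc are hypothetical, the zero values are omitted because a plain var declaration already zero-initializes, and variadic parameters are not handled.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
	"strings"
)

// argumentStrings builds the declaration lines and the call argument
// list for the parameters of fn.
func argumentStrings(fset *token.FileSet, fn *ast.FuncDecl) (string, string, error) {
	var decls, names, refs []string
	for _, field := range fn.Type.Params.List {
		// Print the parameter type back as source text.
		var typ bytes.Buffer
		if err := printer.Fprint(&typ, fset, field.Type); err != nil {
			return "", "", err
		}
		// One field can declare several names, as in (a, b int);
		// an unnamed parameter has no names at all.
		count := len(field.Names)
		if count == 0 {
			count = 1
		}
		for j := 0; j < count; j++ {
			name := fmt.Sprintf("arg%d", len(names))
			decls = append(decls, fmt.Sprintf("var %s %s", name, typ.String()))
			names = append(names, name)
			refs = append(refs, "&"+name)
		}
	}
	decls = append(decls, "args := []interface{}{"+strings.Join(refs, ", ")+"}")
	return strings.Join(decls, "\n"), strings.Join(names, ", "), nil
}

// parseFirstFunc parses src and returns its first function declaration.
func parseFirstFunc(src string) (*token.FileSet, *ast.FuncDecl, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "contract.go", src, 0)
	if err != nil {
		return nil, nil, err
	}
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			return fset, fn, nil
		}
	}
	return nil, nil, fmt.Errorf("no function found")
}

func main() {
	fset, fn, err := parseFirstFunc("package p\nfunc (w *Wallet) Transfer(amount int, memo, tag string) {}\n")
	if err != nil {
		panic(err)
	}
	zeroList, arguments, err := argumentStrings(fset, fn)
	if err != nil {
		panic(err)
	}
	fmt.Println(zeroList)
	fmt.Println(arguments)
}
```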

In this way we can call anything we like, with any signature.
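For a concrete picture, here is what the generated wrapper could expand to for one method. This is an illustration under assumptions: the Wallet contract is made up, encoding/json stands in for the proxy helper's serializer, and the ProxyHelper parameter is dropped to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Wallet stands in for a user contract (hypothetical).
type Wallet struct{ Balance int }

// Transfer withdraws amount and returns the new balance and a note.
func (w *Wallet) Transfer(amount int, memo string) (int, string) {
	w.Balance -= amount
	return w.Balance, "sent: " + memo
}

// INSMETHOD_Transfer mirrors the shape of a generated method wrapper:
// restore the state, decode the arguments, call the method, then
// encode the new state and the results.
func INSMETHOD_Transfer(object []byte, data []byte) ([]byte, []byte, error) {
	self := new(Wallet)
	if err := json.Unmarshal(object, self); err != nil {
		return nil, nil, err
	}

	// The expanded ArgumentsZeroList: typed variables plus a slice of
	// pointers to them that the deserializer fills in place.
	var arg0 int
	var arg1 string
	args := []interface{}{&arg0, &arg1}
	if err := json.Unmarshal(data, &args); err != nil {
		return nil, nil, err
	}

	ret0, ret1 := self.Transfer(arg0, arg1)

	state, err := json.Marshal(self)
	if err != nil {
		return nil, nil, err
	}
	ret, err := json.Marshal([]interface{}{ret0, ret1})
	return state, ret, err
}

func main() {
	state, ret, err := INSMETHOD_Transfer([]byte(`{"Balance":100}`), []byte(`[30,"rent"]`))
	fmt.Println(string(state), string(ret), err)
	// {"Balance":70} [70,"sent: rent"] <nil>
}
```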

But not every result can be serialized. Serializers work through reflection, which gives no access to unexported struct fields. That is why we have a special proxy helper method that takes an error interface value and creates a foundation.Error object from it. It differs from the usual error in that the error text is kept in an exported field, so we can serialize it, albeit with some loss.

But if we use a code-generating serializer, we will not even need that: we are compiled into the same package, so we have access to unexported fields.

What if we want to call a contract out of a contract?

You do not understand the depth of the problem if you think it is easy to call a contract from a contract. The point is that the correctness of the other contract has to be confirmed by consensus, and the fact of the call has to be signed and put on the blockchain. In short, simply compiling with the other contract and calling its method will not work, much as we would like it to. But we are friends of programmers, so we should let them do everything directly and hide all the tricks under the hood of the system. Thus a contract is developed as if with direct calls, and contracts call each other transparently, but when we build a contract for publication, we substitute a proxy for the other contract, which knows only its address and call signatures.

How could all this be organized? We will have to keep other contracts in a special directory that our generator can recognize, and create a proxy for each imported contract.

That is, if we encounter:

import "ContractsDir/ContractAddress"

we add it to the list of imported contracts.
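Collecting such imports needs only the import section of the file. A minimal sketch, assuming the special directory is literally named ContractsDir and the rest of the import path is the contract address; importedContracts is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

const src = `package wallet

import (
	"fmt"
	"ContractsDir/0xabc"
	other "ContractsDir/0xdef"
)
`

// importedContracts returns the addresses of all contracts imported
// from contractsDir, in order of appearance.
func importedContracts(code, contractsDir string) ([]string, error) {
	fset := token.NewFileSet()
	// ImportsOnly stops parsing right after the import declarations.
	file, err := parser.ParseFile(fset, "contract.go", code, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	prefix := contractsDir + "/"
	var refs []string
	for _, imp := range file.Imports {
		// The path is stored as a quoted string literal.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if strings.HasPrefix(path, prefix) {
			refs = append(refs, strings.TrimPrefix(path, prefix))
		}
	}
	return refs, nil
}

func main() {
	refs, err := importedContracts(src, "ContractsDir")
	fmt.Println(refs, err) // [0xabc 0xdef] <nil>
}
```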

By the way, for this you no longer need the source code of the contract; you only need the description that we have already collected. So if we publish such a description somewhere, and all calls go through the main system, we do not really care what language the other contract is written in: if we can call methods on it, we can write a stub for it in Go that looks like a package with a contract that can be called directly. Napoleonic plans; let's get down to implementation.

In principle, the proxy helper already has a method with this signature:

RouteCall(ref Address, method string, args []byte) ([]byte, error)

This method can be called directly from a contract. It calls the remote contract and returns a serialized response, which we need to parse and return to our contract.

But to the user it has to look like this:

ret := contractPackage.GetObject(Address).Method(arg1, arg2, …)

Let's get started. First, the proxy has to list all the types used in the signatures of the contract's methods. As we remember, we can get the textual representation of any AST node, so the time has come for that mechanism.

Next we need to create the contract type. In principle it already knows its class; only the address is needed.

type {{ .ContractType }} struct {
    Reference Address
}

Next, we need to implement the GetObject function, which returns a proxy instance that can work with the contract at a given blockchain address and that looks to the user like an actual instance of the contract.

func GetObject(ref Address) (r *{{ .ContractType }}) {
    return &{{ .ContractType }}{Reference: ref}
}

Interestingly, in the user's debugging mode GetObject is directly a method of the BaseContract structure, while here it is not; as long as we honor the agreed interface (the SLA), nothing prevents us from doing what is convenient for us. Now we are able to create a proxy contract whose methods we control. It remains to actually create the methods.

{{ range $method := .MethodsProxies }}
func (r *{{ $.ContractType }}) {{ $method.Name }}( {{ $method.Arguments }} ) ( {{ $method.ResultsTypes }} ) {
    {{ $method.InitArgs }}
    var argsSerialized []byte
    err := proxyctx.Current.Serialize(args, &argsSerialized)
    if err != nil {
        panic(err)
    }
    res, err := proxyctx.Current.RouteCall(r.Reference, "{{ $method.Name }}", argsSerialized)
    if err != nil {
        panic(err)
    }
    {{ $method.ResultZeroList }}
    err = proxyctx.Current.Deserialize(res, &resList)
    if err != nil {
        panic(err)
    }
    return {{ $method.Results }}
}
{{ end }}

It is the same story with building the argument list: since we are lazy and store the method's ast.Node as is, the computations require a lot of type conversions that templates cannot do, so everything is prepared in advance. With functions, everything is seriously more complicated, and this is the topic of another article.
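To see the whole round trip in one place, here is a sketch of an expanded proxy together with a fake router. Everything in it is an assumption made for illustration: the Wallet contract, the Router interface, the package-level Current variable that stands in for proxyctx.Current, and encoding/json as the serializer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Address identifies a contract (hypothetical representation).
type Address string

// Router is the part of the proxy helper that the proxy needs.
type Router interface {
	RouteCall(ref Address, method string, args []byte) ([]byte, error)
}

// Current plays the role of the singleton that generated proxies use.
var Current Router

// Wallet is the proxy type: it stores only the address.
type Wallet struct{ Reference Address }

// GetObject returns a proxy for the contract at ref.
func GetObject(ref Address) *Wallet { return &Wallet{Reference: ref} }

// Deposit is an expanded proxy method: serialize the arguments, route
// the call, deserialize the results.
func (r *Wallet) Deposit(amount int) int {
	args := []interface{}{amount}
	argsSerialized, err := json.Marshal(args)
	if err != nil {
		panic(err)
	}
	res, err := Current.RouteCall(r.Reference, "Deposit", argsSerialized)
	if err != nil {
		panic(err)
	}
	var ret0 int
	resList := []interface{}{&ret0}
	if err := json.Unmarshal(res, &resList); err != nil {
		panic(err)
	}
	return ret0
}

// fakeRouter imitates the remote side for local testing.
type fakeRouter struct{ balances map[Address]int }

func (f *fakeRouter) RouteCall(ref Address, method string, args []byte) ([]byte, error) {
	if method != "Deposit" {
		return nil, fmt.Errorf("unknown method %q", method)
	}
	var amount int
	if err := json.Unmarshal(args, &[]interface{}{&amount}); err != nil {
		return nil, err
	}
	f.balances[ref] += amount
	return json.Marshal([]interface{}{f.balances[ref]})
}

func main() {
	Current = &fakeRouter{balances: map[Address]int{"addr1": 10}}
	fmt.Println(GetObject("addr1").Deposit(5)) // 15
}
```

To the caller the last line reads like a direct method call, while the actual work happens behind RouteCall.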

Our functions are object constructors, and a lot hinges on how objects are actually created in our system: the fact of creation is registered on a remote executor, the object is passed to another executor, checked there and actually stored, and there are many ways to store it. This area of knowledge is called crypto. The idea of the proxy itself is simple in principle: a wrapper that stores only the address, plus methods that serialize the call and pull our singleton do-everything object, which does the rest. We cannot use a passed-in proxy helper, because the user did not give it to us, so we had to make it a singleton.

Another trick: we also use a call context. This is an object that stores information about who called our smart contract, when and why. Based on this information the user decides whether to execute at all and, if so, how.

Previously we passed the context in a simple way: it was an unexported field in the BaseContract type with a setter and a getter, and the setter allowed the field to be set only once. Accordingly, the context was set before the contract was executed, and the user could only read it.
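A set-once field of this kind might be sketched as follows. The CallContext fields and the choice to return an error on a second attempt are assumptions; the text above only says that the setter works once.

```go
package main

import (
	"errors"
	"fmt"
)

// CallContext describes who called the contract (hypothetical fields).
type CallContext struct{ Caller string }

// BaseContract keeps the context in an unexported field, so contract
// code in another package can reach it only through the methods below.
type BaseContract struct{ context *CallContext }

// SetContext stores the context on the first call and refuses to
// overwrite it afterwards.
func (bc *BaseContract) SetContext(ctx *CallContext) error {
	if bc.context != nil {
		return errors.New("context already set")
	}
	bc.context = ctx
	return nil
}

// GetContext returns the stored context.
func (bc *BaseContract) GetContext() *CallContext { return bc.context }

// Wallet is a user contract that embeds BaseContract.
type Wallet struct {
	BaseContract
	Balance int
}

func main() {
	w := &Wallet{}
	fmt.Println(w.SetContext(&CallContext{Caller: "addr1"})) // <nil>
	fmt.Println(w.SetContext(&CallContext{Caller: "other"})) // context already set
	fmt.Println(w.GetContext().Caller)                       // addr1
}
```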

But here is the problem: only the user reads this context. If he calls some system function, for example the proxy of another contract, that proxy receives no context, since no one passes it in. This is where goroutine-local storage comes in. We decided not to write our own but to use github.com/tylerb/gls.

It lets you set and get a context for the current goroutine. So, as long as no goroutine was created inside the contract, we simply set the context in gls before launching the contract, and now we give the user not a method but a plain function.

func GetContext() *core.LogicCallContext {
	return gls.Get("ctx").(*core.LogicCallContext)
}

The user happily uses it, and we use it too, for example in RouteCall(), to understand which contract is currently calling someone.

In principle the user can create a goroutine, but if he does, the context will be lost, so we need to do something about it. For example, if the user uses the go keyword, the parser needs to wrap such calls in our wrapper, which remembers the context, creates the goroutine and restores the context in it. But this is the topic of another article.
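Both the idea of goroutine-local storage and the loss of context in a new goroutine can be shown with a toy implementation. This sketch does not use github.com/tylerb/gls and does not claim to work the way that library does; it derives a goroutine ID by parsing runtime.Stack output, which is a well-known hack rather than a supported API, and all names in it are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
	"sync"
)

// CallContext describes the current call (hypothetical fields).
type CallContext struct{ Caller string }

var (
	mu       sync.Mutex
	contexts = map[uint64]*CallContext{}
)

// goroutineID extracts the ID from the stack header, which looks like
// "goroutine 18 [running]:".
func goroutineID() uint64 {
	buf := make([]byte, 64)
	n := runtime.Stack(buf, false)
	fields := strings.Fields(string(buf[:n]))
	if len(fields) < 2 {
		return 0
	}
	id, _ := strconv.ParseUint(fields[1], 10, 64)
	return id
}

// SetContext binds ctx to the calling goroutine.
func SetContext(ctx *CallContext) {
	mu.Lock()
	defer mu.Unlock()
	contexts[goroutineID()] = ctx
}

// GetContext returns the context bound to the calling goroutine, or nil.
func GetContext() *CallContext {
	mu.Lock()
	defer mu.Unlock()
	return contexts[goroutineID()]
}

// ClearContext removes the binding for the calling goroutine.
func ClearContext() {
	mu.Lock()
	defer mu.Unlock()
	delete(contexts, goroutineID())
}

func main() {
	SetContext(&CallContext{Caller: "addr1"})
	defer ClearContext()
	fmt.Println(GetContext().Caller) // addr1

	// A goroutine started by the contract has its own ID and sees nothing.
	lost := make(chan *CallContext)
	go func() { lost <- GetContext() }()
	fmt.Println(<-lost == nil) // true
}
```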

Together

We like how the Go toolchain works: in fact it is a bunch of different commands that each do one thing and are executed together when you run, say, go build. We decided to do the same: one command puts the contract file into a temporary directory, the second generates a wrapper for it and calls the third several times, which creates a proxy for each imported contract, the fourth compiles it all, and the fifth publishes it to the blockchain. And there is one command to run them all in the right order.

Hooray, we now have a toolchain and a runtime to run Go from Go. There are still plenty of problems: for example, unused code needs to be unloaded somehow, and we need to detect that a process has hung and restart it. But these are all tasks that are clear how to solve.

Yes, of course, the code we wrote does not claim to be a library and cannot be used directly, but reading an example of working code generation is always great; I lacked that in my time. Accordingly, the code generation part can be viewed in the compiler, and how it runs in the executor.
