
Programming in the World of Minecraft
Hello, Habr! While everyone is discussing AI in the Pac-Man world, we will start building our own Minecraft AI with the Malmo framework from Microsoft Research. Pac-Man will make an appearance here too. If you love the cubic world, want to start studying artificial intelligence, have children with whom you cannot find a shared hobby, or are simply curious about the topic, read on.

In this article I will try to cover several topics:
- I will share my opinion on why children are so crazy about this cubic game
- I will explain the main idea behind Malmo
- I will show a few code examples and give you a sense of where to go next
- I will describe the idea and the results of the Malmo Challenge
Minecraft: My Background
I discovered the game when I was already a student. That did not stop me from putting aside all my personal, work and academic goals that day and disappearing completely into the cubic universe. It let me go only a month later, but I am still happy to drop by now and then to run around my beloved world for an hour.
For me, Minecraft was a continuation of my favorite childhood toy, Lego, and it fixed Lego's main drawback: the constant shortage of pieces. Lego with an unlimited supply of pieces: what could be better?
I would like to emphasize the absence of cruelty in this game. You can kill zombies or take a running jump off a cliff; nobody argues with that. But the lack of blood is very pleasing, as is the cute visualization of the birth of new life.

Minecraft has a very vague notion of a final goal. Of course, you can level up and kill the dragon, then proudly say that you have completed the game. But hardly anyone does. The main thrill of the Minecraft world is that every time you can come up with your own personal goal: explore the world and find a cave with hidden stashes, build your dream home, learn the basics of electricity, or join a server with a friend and set all kinds of traps for each other. The lack of goals is, in my opinion, the game's main advantage. Minecraft offers huge scope for creativity, with almost no limits.
While studying the subject, I accidentally found out that the world of Minecraft is not limited to the game, merch, let's plays and fan art. Entire series are filmed inside the game, and, unexpectedly, they are quite popular. I find that amusing.
I was very pleased to learn that there is an open-source framework for programming in the Minecraft world. I firmly believe that in the future basic programming skills will be needed in the vast majority of professions. In my opinion, a framework built on a favorite game is a great way to show a child the exciting world of programming.
Malmo: the main idea
The Malmo framework was created through the joint efforts of several researchers whose main goal was to adapt an interesting world for experiments in artificial intelligence. There are still relatively few AI algorithms, and all of them have enormous potential for deeper study and improvement. I really like that Microsoft is creating extra motivation to explore the unknown.
Technical points
Installation
Even if you follow the instructions strictly, you may run into a number of problems during installation. Mine were mostly caused by components that were already installed, but in a different version. All of them can be cured with the help of a well-known site.
Support for OS and programming languages
Despite the bold claim of support for all three popular operating systems, it seemed to me that proper testing was done only on Windows. Once you have beaten the installation problems, your headaches on Windows should be over. On Linux the problems are likely to continue: the running server periodically crashes without giving a reason. If you continue my experiments, be sure to write about your experience in the comments.
The authors tried to support a large number of popular languages and made bindings for C#, C++, Lua, Python 2 and Java. I chose Python.
How to play at programming in Malmo
The main process is as follows: in one window you start the server and client. There is a script for this: ./Minecraft/launchClient.*
Once the server is up, you can run the code with the main logic for controlling the character in another window. How do you know that the server is up? It is all perfectly logical: you will see a running Minecraft instance showing the main menu, and the terminal will proudly display Building 95%.
You can run as many launchClient instances as you like. The first instance launched acts as both the server and a client, that is, one character. Every subsequent instance connects to the already running server and adds another character to the world.
You can implement the logic for each of the characters in code, and you can also control a character yourself with the familiar WASD keys.
In addition to the server with its client and the file with the logic, there is also an xml file describing the initial state of the world. The authors do not insist on it: in their examples they often put the xml in a string and keep it in the code. In my opinion, though, it is more convenient to make it a separate file right away and add pieces to it as needed.
The authors took care of us and provided an impressive number of examples, each with a description.
My advice: do not try to start from scratch; take the first example as your base. Nothing happens in it: we just create the simplest flat world and place the character in it. In the while loop at the end you can add an action of your own. For example, write this there:
agent_host.sendCommand("move 1")
And enjoy your hero's first steps. Note that the default command set is ContinuousMovementCommands. Think of the commands given to the character as changing the position of a lever. By saying "move 1" you take more than one step: you will keep running until you give the command "move 0". In practice, code like this will not budge the character:
agent_host.sendCommand("move 1")
agent_host.sendCommand("move 0")
The two commands are executed within a split second of each other. Do not forget to insert time.sleep(X) lines between commands. I am sure you know where to find information about the rest of the commands (although, in my experience, it is easier to skim the tutorial and then look for what you need in the source code).
In the xml file you can set the game mode:
You can also set the initial time and the character's position, and customize the world: make it flat or close to reality.
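To make these settings concrete, here is a sketch of a mission description kept in a Python string, with a small helper that reads the values back. The element names follow the Malmo tutorial missions as I remember them, so treat them as assumptions and check them against the schema shipped with your Malmo version. The agent name, the summary text and the describe helper are purely illustrative.

```python
import xml.etree.ElementTree as ET

MALMO_NS = "http://ProjectMalmo.microsoft.com"

# Assumed layout, modelled on the tutorial missions: game mode on AgentSection,
# start time under ServerInitialConditions, start position in Placement.
MISSION_XML = """<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<Mission xmlns="http://ProjectMalmo.microsoft.com"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <About>
    <Summary>Flat world, one agent</Summary>
  </About>
  <ServerSection>
    <ServerInitialConditions>
      <Time>
        <StartTime>1000</StartTime>
      </Time>
    </ServerInitialConditions>
    <ServerHandlers>
      <FlatWorldGenerator generatorString="3;7,220*1,5*3,2;3;,biome_1"/>
      <ServerQuitFromTimeUp timeLimitMs="30000"/>
    </ServerHandlers>
  </ServerSection>
  <AgentSection mode="Survival">
    <Name>Hero</Name>
    <AgentStart>
      <Placement x="0.5" y="227.0" z="0.5" yaw="0"/>
    </AgentStart>
    <AgentHandlers>
      <ContinuousMovementCommands/>
    </AgentHandlers>
  </AgentSection>
</Mission>"""


def describe(xml_text):
    """Return the game mode, start time and start position found in a mission."""
    ns = {"m": MALMO_NS}
    root = ET.fromstring(xml_text)
    agent = root.find("m:AgentSection", ns)
    placement = agent.find("m:AgentStart/m:Placement", ns)
    start_time = root.find(
        "m:ServerSection/m:ServerInitialConditions/m:Time/m:StartTime", ns)
    return {
        "mode": agent.get("mode"),
        "start_time": int(start_time.text),
        "position": tuple(float(placement.get(k)) for k in ("x", "y", "z")),
    }


print(describe(MISSION_XML))
```

Replacing FlatWorldGenerator with the default world generator is what turns the flat test field into something close to a real Minecraft landscape.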
This code will draw a Pac-Man that eats balls and walks into a rainbow crater:
Finally, in the xml you can specify the coordinates that give the character a view of its surroundings:
By default we have no way to look around and get information about nearby blocks. However, we can declare that we want to know what is around us. Note that the coordinates here are relative, counted from the cube the hero's feet are in. As a result of executing a line like this:
grid = observations.get(u'floor3x3', 0)
we get an array of strings. Each string is a textual representation of the type of one of the cubes.
floor3x3: ['lava', 'obsidian', 'obsidian', 'lava', 'obsidian', 'obsidian', 'lava', 'obsidian', 'obsidian']
In this way you can create an AI that explores the world, searches for something and does not die for silly reasons. I implemented the simplest version, without any machine learning, here.
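The two ideas from this section, holding the "lever" for a while and looking at the floor around you, can be sketched together as follows. This is not my implementation mentioned above, just an illustration. The grid declaration is written from memory of the tutorials, and I assume that the flat list is ordered with x changing fastest and then z. The names grid_by_offset, hazards and walk_for are hypothetical helpers; only sendCommand is the real agent method used earlier.

```python
import json
import time

# Assumed xml for the observation: a 3x3 patch one block below the feet (y = -1).
GRID_XML = """<ObservationFromGrid>
  <Grid name="floor3x3">
    <min x="-1" y="-1" z="-1"/>
    <max x="1" y="-1" z="1"/>
  </Grid>
</ObservationFromGrid>"""


def grid_by_offset(grid, size=3):
    """Map a flat single-layer grid to {(dx, dz): block_type}.

    Assumption: x varies fastest, then z.
    """
    half = size // 2
    return {(i % size - half, i // size - half): block
            for i, block in enumerate(grid)}


def hazards(grid, dangerous=("lava",)):
    """Return the sorted (dx, dz) offsets of blocks we should not step on."""
    return sorted(offset for offset, block in grid_by_offset(grid).items()
                  if block in dangerous)


def walk_for(agent_host, seconds, speed=1):
    """Push the 'lever', wait, then release it (continuous movement)."""
    agent_host.sendCommand("move %s" % speed)
    time.sleep(seconds)
    agent_host.sendCommand("move 0")


# Observations arrive as JSON text; here a hand-made sample stands in for it.
sample = json.dumps({"floor3x3": ["lava", "obsidian", "obsidian"] * 3})
observations = json.loads(sample)
grid = observations.get(u"floor3x3", 0)
print(hazards(grid))  # [(-1, -1), (-1, 0), (-1, 1)]
```

An agent loop would call hazards on every new observation and only call walk_for when the offset it is about to step onto is not in the list.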
Features for AI
Of course, the first thing I wanted for implementing AI algorithms in Malmo was the ability to move discretely. AI is difficult enough as it is, and I did not want to add constant adjustment of direction and speed on top of everything else.
We enable it in the xml like this:
Unfortunately, that is not enough. To move discretely, your initial position must be exactly in the center of a cube:
Integer coordinates put you at the intersection of cubes, and the character refuses to move without any warning or error. The tutorial does not warn about this either. It took me about 4 hours to get to the bottom of the problem and make the x and z coordinates half-integers (y is the height and plays no part in this story).
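A small sketch of the fix. The handler and Placement element names are what I remember from the Malmo examples, so treat them as assumptions; center_on_block and placement_xml are hypothetical helpers that snap x and z to the center of a cube so that this trap cannot happen again.

```python
import math

# Assumed handler that switches the agent to discrete movement; with it the
# agent accepts commands such as "movenorth 1" or "moveeast 1".
DISCRETE_HANDLER = "<DiscreteMovementCommands/>"


def center_on_block(coord):
    """Snap a coordinate to the center of the cube that contains it."""
    return math.floor(coord) + 0.5


def placement_xml(x, y, z, yaw=0):
    """Build a Placement element with x and z centered; y (height) is kept."""
    return '<Placement x="{}" y="{}" z="{}" yaw="{}"/>'.format(
        center_on_block(x), float(y), center_on_block(z), yaw)


print(placement_xml(0, 227, 0))
# <Placement x="0.5" y="227.0" z="0.5" yaw="0"/>
```

Note that math.floor is used rather than rounding, so negative coordinates also land inside the intended cube: -3 becomes -2.5.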
In addition, the researchers added some nice features for reinforcement learning tasks. Algorithms of this type constantly reward or punish the artificial intelligence for certain actions. The developers thought this through and added the ability to register these actions and events in the xml, which saves the code from repeating the same checks over and over. You can also make the game end when a given event occurs:
For example, here we punish the character slightly for every step that does not lead to victory, reward it strongly for winning and punish it strongly for dying, and finally end the round on either death or victory.
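Such a setup can be sketched as follows. The handler names are reproduced from memory of the Malmo tutorials and the numbers and block types are illustrative assumptions, not my original values. The Python part is not Malmo's API at all: it is an offline model, with hypothetical helpers reward_table and episode_return, that mimics the bookkeeping the framework would do for you.

```python
import xml.etree.ElementTree as ET

# Assumed handlers: a small penalty per command, a big reward or penalty for
# touching certain blocks, and the end of the round on touching either of them.
REWARD_HANDLERS = """<AgentHandlers>
  <DiscreteMovementCommands/>
  <RewardForSendingCommand reward="-1"/>
  <RewardForTouchingBlockType>
    <Block reward="-100.0" type="lava" behaviour="onceOnly"/>
    <Block reward="100.0" type="lapis_block" behaviour="onceOnly"/>
  </RewardForTouchingBlockType>
  <AgentQuitFromTouchingBlockType>
    <Block type="lava"/>
    <Block type="lapis_block"/>
  </AgentQuitFromTouchingBlockType>
</AgentHandlers>"""


def reward_table(xml_text):
    """Extract (step_penalty, {block: reward}, {terminal blocks}) from the xml."""
    root = ET.fromstring(xml_text)
    step = float(root.find("RewardForSendingCommand").get("reward"))
    touch = {b.get("type"): float(b.get("reward"))
             for b in root.find("RewardForTouchingBlockType")}
    terminal = {b.get("type") for b in root.find("AgentQuitFromTouchingBlockType")}
    return step, touch, terminal


def episode_return(blocks_stepped_on, xml_text=REWARD_HANDLERS):
    """Offline model of one round: one command per block stepped on.

    Simplification: every command costs the step penalty, including the last.
    """
    step, touch, terminal = reward_table(xml_text)
    total = 0.0
    for block in blocks_stepped_on:
        total += step + touch.get(block, 0.0)
        if block in terminal:
            break  # the round ends on death or victory
    return total


print(episode_return(["sandstone", "sandstone", "lapis_block"]))  # 97.0
print(episode_return(["sandstone", "lava", "lapis_block"]))       # -102.0
```

In a real agent you would not compute any of this yourself: the rewards arrive together with the world state, which is exactly the point of declaring them in the xml.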
Malmo: conclusion
The authors of the framework gave us an amazing opportunity to dive into our beloved world from a different side. Malmo is still in beta, and in many situations it ... makes you improve your troubleshooting skills. Nevertheless, its advantages outweigh its disadvantages, and because the sources are publicly available on GitHub, you can patch the relevant place yourself or open an issue to get critical bugs fixed.
The authors of the project, for reasons I can understand, do not mention in any of their articles the possibility of teaching children with the framework: a child is unlikely to cope with the struggle against small but frequent bugs. Nevertheless, I am sure that if a parent helps the child and programs together with them, the results will be excellent and the time will be well spent.
Malmo Challenge: history and results
In addition to the framework itself, Microsoft also hosted a competition based on the platform, called the Malmo Challenge. It was intended to encourage scientists and researchers to work on collaborative algorithms. The competition started about six months ago, and the results appeared on June 5.
The essence of the challenge is as follows: we have a flat world and a fence of complex shape; a pig is running around inside the pen and 2 characters are walking there. Our task is to create an AI for one of the characters that can cooperate with the other so that together they drive the pig into a corner it cannot escape from. The second character can behave randomly, or it can be controlled by a person or by another AI; it can even be a second instance of your own AI.

You can earn the maximum number of points by catching the pig, or a small number of points by jumping into a puddle at the side. You get nothing if your partner decides to jump into a puddle, refusing to cooperate with you.
This problem is known as the stag hunt. It was formulated in the 18th century by Jean-Jacques Rousseau. Despite its impressive age, it is still unclear which algorithm solves it most effectively.
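The dilemma can be written down in a few lines. The point values below are made-up placeholders, not the official scores of the challenge, and the helper names are hypothetical; the sketch only shows why the right move depends on how likely your partner is to cooperate.

```python
def expected_payoffs(p_partner_cooperates, pig=25.0, puddle=5.0):
    """Expected score of each option given the chance that the partner helps.

    pig and puddle are illustrative values: catching the pig needs both
    players, while the puddle pays a small amount no matter what.
    """
    return {
        "hunt": p_partner_cooperates * pig,  # nothing if the partner walks away
        "leave": puddle,
    }


def best_action(p_partner_cooperates, pig=25.0, puddle=5.0):
    """Pick the option with the higher expected score."""
    payoffs = expected_payoffs(p_partner_cooperates, pig, puddle)
    return max(payoffs, key=payoffs.get)


print(best_action(0.1))  # leave
print(best_action(0.5))  # hunt
```

With these numbers hunting pays off once you believe the partner cooperates more than 20% of the time, so the whole game turns on estimating what kind of partner you have.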
I am pleased to share the results of the competition with you. I was very surprised by the final standings.
First place was taken by the project.teams from the UK. The authors soberly assessed the severe lack of time and realized that they would hardly manage to adapt complex existing algorithms to the task. They chose Bayesian inference to determine the type of partner and Markov chains for the gameplay itself. And they won.
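To give a feeling for how little code the Bayesian part needs, here is a toy sketch of inferring the partner's type from its moves. This is not the winning team's code: the two partner types, the likelihood numbers and the helper update_belief are my own assumptions for illustration.

```python
def update_belief(prior, likelihoods, observation):
    """One step of Bayes' rule: posterior is proportional to prior * likelihood."""
    unnormalised = {t: prior[t] * likelihoods[t][observation] for t in prior}
    total = sum(unnormalised.values())
    return {t: value / total for t, value in unnormalised.items()}


# Assumed behaviour models: a cooperative partner mostly walks towards the pig,
# a random one does so half of the time.
LIKELIHOODS = {
    "cooperative": {"towards_pig": 0.9, "away": 0.1},
    "random": {"towards_pig": 0.5, "away": 0.5},
}

belief = {"cooperative": 0.5, "random": 0.5}
for move in ["towards_pig", "towards_pig", "towards_pig"]:
    belief = update_belief(belief, LIKELIHOODS, move)

print(belief)
```

After three moves towards the pig the belief in a cooperative partner is already above 80%, which is enough to decide whether to hunt or to head for the puddle.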
The team in second place decided to take the most complex of the existing solutions: they used DNN, reinforcement learning, DQN, the A3C model ... And none of it helped them get past Bayes and Markov chains.
I will sum up the article with the thought that it pays to keep things simple.
If you also want to try creating your own AI, join our Russian-language Telegram chat about neural networks. There you can ask any questions you have and share your achievements.
A video of my talk about Malmo at a St. Petersburg Python meetup has already appeared on my YouTube channel. There you will also find recordings of my other lectures and other chatter about IT.