bbchallenge June 15, 2016 at 12:04

Black Box Challenge Summary

Hello, Habr! Three months ago, we announced the start of the BlackBox Challenge machine learning competition, and it has recently ended. In this post, we, the organizers, will talk about how everything went.

Inspired by Google DeepMind's results in reinforcement learning, we realized how great it is when a system does not rely on human expertise but learns to understand its environment on its own. We decided to run a competition in which participants had to create just such a system.

What was the challenge?

The BlackBox Challenge format is a blend of the classic machine learning competition format (as on Kaggle) and artificial intelligence programming competitions (for example, the Russian AI Cup). Participants were asked to write a bot that plays a game with unknown rules: at each step, the bot is given 36 variables that describe the state of the environment, and it must perform one of four actions.

On the one hand, the competition was interactive: participants had to write an agent that interacts with an external environment. On the other hand, the laws of this environment were unknown to them, which forced them to rely on modern machine learning methods rather than on a priori knowledge about the structure of the game.
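To make the setup concrete, here is a minimal sketch of such an agent loop. It is not the competition's actual API: the `ToyBlackBox` class, its method names, and its reward rule are all invented for illustration. Only the sizes (36 state variables, 4 actions) come from the task description.

```python
import random

N_FEATURES = 36  # state variables per step, as stated in the post
N_ACTIONS = 4    # possible actions, as stated in the post


class ToyBlackBox:
    """Hypothetical stand-in environment; the real game rules were hidden."""

    def __init__(self, n_steps=1000, seed=0):
        self.n_steps = n_steps
        self.seed = seed
        self.reset()

    def reset(self):
        # Re-seeding makes every playthrough of this toy level identical.
        self.rng = random.Random(self.seed)
        self.t = 0
        self.score = 0.0
        self.state = self._sample_state()

    def _sample_state(self):
        return [self.rng.gauss(0, 1) for _ in range(N_FEATURES)]

    def has_next(self):
        return self.t < self.n_steps

    def get_state(self):
        return list(self.state)

    def do_action(self, action):
        if not 0 <= action < N_ACTIONS:
            raise ValueError("action must be 0, 1, 2 or 3")
        # Invented reward rule, only so the loop has something to score.
        self.score += self.state[action]
        self.state = self._sample_state()
        self.t += 1


def play(env, policy):
    """Run one full game: observe the state, choose an action, repeat."""
    env.reset()
    while env.has_next():
        env.do_action(policy(env.get_state()))
    return env.score


def greedy_policy(state):
    # Knows the toy rule above; a real bot had to learn such a rule instead.
    return max(range(N_ACTIONS), key=lambda a: state[a])


if __name__ == "__main__":
    env = ToyBlackBox()
    print("always action 0:", play(env, lambda state: 0))
    print("greedy on toy rule:", play(env, greedy_policy))
```

In the sketch, `greedy_policy` cheats by knowing the reward rule. In the competition that rule was hidden, so a policy of this kind had to be learned from interaction alone.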

Summary

The competition lasted three months. During that time, 3347 solutions were submitted, of which 1459 were non-trivial, that is, they did not match the published example (the baseline agent).

Of the 1360 registered participants, 415 submitted at least one solution.
93 participants managed to beat the baseline at the validation level.

Prizes

The prize fund amounted to 800,000 rubles:

1st place: 300,000 rubles
2nd place: 175,000 rubles
3rd place: 125,000 rubles
4th-8th places: Xbox One
Special prize: 100,000 rubles for the most interesting solution, chosen by DCA experts

In the last weeks of the competition, the fight on the leaderboard was fierce, and the fate of the prizes was decided by just a few points.

The participant insight won the competition by a wide margin, with a score of 4693 points at the final level.

The participants in second to fifth places - 5vision, alexandrbugaychuk, grmel89 and wrwrwr - finished very close to each other. The gap between the results of the 2nd and 5th places is less than 150 points! That is remarkably small, and to understand why, we plotted the results of the best solutions at the validation and final levels (note that the plots are for the solutions that scored best at the final level).

The plots show that these participants' solutions are themselves very close and that the difference in results is due to the randomness inherent in the game. This time, fortune was on the side of 5vision and alexandrbugaychuk, congratulations! The prize-winning 6th-8th places went to VictorGNC, cosionix and AGilmullin (Kesha), who beat the baseline bot by more than 1000 points. This is a great result.

Participants SDil and ottogin round out the top ten; they also beat the baseline bot by more than 1000 points.
A full table of final results is available here.

“Most Interesting Solution” nomination

In addition to the main set of prizes, we also awarded a “Most Interesting Solution” nomination, in which DCA experts judged the elegance and promise of the participants' approaches.

Most of the solutions turned out to be multi-parameter models whose parameters were varied randomly, often with evolutionary algorithms. The quality of a model was measured by its result on one of the game levels. Judging by the results, such approaches were quite effective. Our linear bot (the baseline) was obtained in a similar way.
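As an illustration of this family of approaches, here is a minimal sketch of random search (simple hill climbing) over the weights of a linear policy. It is not any participant's actual solution, nor the way the baseline was really built: the toy level in `play_level`, its reward rule, and all function names and parameters are assumptions made up for the example.

```python
import random

N_FEATURES, N_ACTIONS = 36, 4  # sizes stated in the post


def play_level(weights, n_steps=100, seed=0):
    """Hypothetical toy level: total score of a linear policy.

    The real game rules were hidden; this reward rule is invented.
    """
    rng = random.Random(seed)  # fixed seed: the same level on every call
    total = 0.0
    for _ in range(n_steps):
        state = [rng.gauss(0, 1) for _ in range(N_FEATURES)]
        # One linear score per action; the policy takes the largest one.
        scores = [sum(w * s for w, s in zip(row, state)) for row in weights]
        action = max(range(N_ACTIONS), key=scores.__getitem__)
        total += state[action]  # made-up reward
    return total


def random_search(n_iters=100, sigma=0.1, seed=1):
    """Perturb all weights at random; keep a change if the score does not drop."""
    rng = random.Random(seed)
    best = [[0.0] * N_FEATURES for _ in range(N_ACTIONS)]
    best_score = play_level(best)
    for _ in range(n_iters):
        candidate = [[w + rng.gauss(0, sigma) for w in row] for row in best]
        score = play_level(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    weights, score = random_search()
    print("score on the toy level after random search:", score)
```

Note that the sketch scores every candidate on the same single level, mirroring the setup described above. A model tuned this way can fit the quirks of that one level, which is why results on a separate final level matter.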

There were, however, several participants who took a different route and also achieved good results. It was hard for the DCA experts to choose the most interesting solution, but in the end the winner was the 5vision team, who managed to implement an elegant idea based on policy iteration. The team receives an additional 100,000 rubles.

We would also like to mention the solutions of guillermobarbadillo, the only participant who managed to apply Q-learning; ottogin, for finding a way to train a neural network with supervision; and, of course, insight, for an unusual and effective approach to sampling.

What's next

We have opened the verification system for those who want to solve the black box for fun and test ideas they did not have enough time for.
Judging by the feedback, many people liked this competition format, so we plan to hold another competition soon, with a new and interesting interactive task.

To discuss cooperation, contact us at wow@blackboxchallenge.com

Thank you for participating!
