Developers explain how their AI beat no-limit hold'em professionals over 120,000 hands


    Professional poker player Jason Les talks with Professor Tuomas Sandholm of Carnegie Mellon University during a heads-up match against the bot Libratus. Jason lost nearly a million play-money dollars to the program, more than any other professional.

    Developers of narrow AI systems often measure their programs by pitting them against humans in games. Computers have already defeated humans at checkers, chess, and Go. These are games of perfect information: at every moment, all players know the full state of the game, that is, the position and all possible moves of every player.

    In games of imperfect information, by contrast, part of the game state is hidden from the player, for example the opponent’s cards. No-limit Texas Hold'em is one such game. Beyond the opponent’s hidden hole cards, further uncertainty comes from the fact that each bet can be of arbitrary size. With this in mind, the number of possible outcomes is estimated at 10^161.

    Texas Hold'em is perhaps the world's most popular game of imperfect information. Billions of dollars are wagered online every day. Bots were already strictly forbidden, but poker room owners now have a new reason to monitor the processes running on players’ computers: Libratus reliably wins stacks in heads-up play even from the best professionals.

    Libratus won its match against four poker professionals, held January 11-30, 2017, as part of the “Brains vs. AI” competition.


    The stacks of Libratus and its four opponents over the 20 days of the competition

    The AI played 120,000 heads-up hands and finished 1,766,250 play-money dollars ahead. The players themselves were very impressed by the program's play: it skillfully changed its strategy every day, adapting to their actions.

    Of course, the match was not played for real money, so the players may have been somewhat more relaxed and less careful than they would have been with their own money at stake. They also had to spend many hours at the computer every day, which is physically exhausting. Even so, such a consistent win by the program is impressive: more than 14 big blinds per hundred hands. By the developers' estimate, a win of this size over such a long run rules out luck with 99.7% probability, so the victory is statistically significant.
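
The "more than 14 big blinds per hundred hands" figure can be checked with simple arithmetic. The sketch below assumes a big blind of 100 play-money dollars, which the article does not state; the function name `win_rate_bb_per_100` is hypothetical and used only for illustration.

```python
def win_rate_bb_per_100(total_won, hands, big_blind):
    """Return the win rate in big blinds per 100 hands."""
    big_blinds_won = total_won / big_blind   # convert winnings to big blinds
    return big_blinds_won / hands * 100      # normalize to 100 hands

# Totals reported in the article; the big blind of 100 is an assumption.
rate = win_rate_bb_per_100(total_won=1_766_250, hands=120_000, big_blind=100)
print(f"{rate:.2f} bb/100")
```

Under that assumption the result is a little under 15 big blinds per hundred hands, consistent with the figure quoted above.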

    The program's developers at Carnegie Mellon University have now published a scientific paper explaining the architecture and training principles of the AI that beat the poker professionals.

    In short, to simplify the computation, the program groups the 10^161 possible outcomes by similar hands (for example, a king-high flush and a queen-high flush) and similar bet sizes. Libratus consists of three modules. The first is a strategy computed in advance; it is detailed for the early rounds (for example, the range of hands to raise from each position) and coarser for the later ones. The second refines the strategy during play, depending on how the hand develops: the cards dealt and the opponent's behavior, taking into account the opponent's ranges and statistics. The third handles play against unpredictable opponents, that is, humans, and is continually updated. When a person made a move the program did not expect, the program recorded it and incorporated it into its model, improving itself with the new data.
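
The grouping idea can be illustrated with a toy sketch. This is not Libratus's actual abstraction: the bucket boundaries, the `ABSTRACT_BETS` values, and the function names below are all assumptions chosen only to show how arbitrary bet sizes and similar hands might be collapsed into a small number of groups.

```python
# Hypothetical abstract bet sizes, expressed as fractions of the pot.
ABSTRACT_BETS = (0.25, 0.5, 1.0, 2.0)

def bucket_bet(bet, pot):
    """Map an arbitrary bet to the nearest abstract bet size (fraction of pot)."""
    fraction = bet / pot
    return min(ABSTRACT_BETS, key=lambda size: abs(size - fraction))

def bucket_flush(high_rank):
    """Group flushes by their highest card (ranks 2..14, where 14 is the ace).

    The boundaries are illustrative: king-high (13) and queen-high (12)
    flushes fall into the same bucket, as in the article's example.
    """
    if high_rank == 14:
        return "ace-high flush"
    if high_rank >= 12:
        return "king/queen-high flush"
    return "lower flush"

# A bet of 120 into a pot of 100 is treated like a pot-sized bet.
print(bucket_bet(120, 100))
# King-high and queen-high flushes land in the same bucket.
print(bucket_flush(13) == bucket_flush(12))
```

With a mapping like this, a strategy only needs to be stored per bucket rather than per exact hand and bet size, which is what makes the search space tractable.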

    According to the developers, performing well in situations with imperfect information gives the AI an advantage beyond games. Such situations are ubiquitous in real life: nearly all social and economic relationships are “games” of imperfect information. Having the right tools is therefore extremely important for AI to succeed in the real world. In practice, such programs could be used, for example, to develop effective strategies for security systems, economic models, political models, and other systems with imperfect information.

    The techniques used in Libratus are largely independent of the application domain, so they can be reused in programs built for other purposes.

    The research article was published on December 17 in the journal Science (doi:10.1126/science.aao1733).
