The ML-Agents plug-in for Unity

ML-Agents has recently been updated to v0.4. For those who don't know it: this is an open-source plugin that provides an environment for training agents in Unity. Agents can be trained with reinforcement learning, imitation learning, neuroevolution, or other machine learning methods through a Python API. The plugin also ships a number of modern algorithms (based on TensorFlow) that let you create smarter NPCs for your games.

What do you need to implement machine learning in Unity?


You can download the plugin via the link. You will need the unity-environment folder. First, create an empty project. Add the contents of the Assets folder from unity-environment to your project's Assets folder, and do the same with ProjectSettings. Note that if you want to add the plugin to an existing project, you should still create an empty project first, complete the steps above, export a package (Assets → Export Package), and only then import that package into your existing project. This way you will not lose the existing ProjectSettings of your project.

I will show how to implement machine learning using a soccer game as an example (this scene already exists in the plugin). It has two teams, red and blue. Each team has a striker and a goalie.


Let's get back to the plugin itself. For it to work, the scene needs at least one Academy, one Brain, and one Agent object. The interaction scheme is shown below.



The scene contains the Academy object and its child objects GoalieBrain and StrikerBrain. As the diagram shows, each Academy can have several Brains, but each Brain belongs to exactly one Academy.

Agents are the game objects that we want to train.



Now let's move on to the code. The SoccerAcademy class, which is attached to the Academy GameObject, inherits from the Academy class; add the namespace with using MLAgents; at the top of the file. After that, we override the following methods:

public override void AcademyReset()
{
}

public override void AcademyStep()
{
}

And add references to the Brains we created:

public Brain brainStriker;
public Brain brainGoalie;

Add the Brain component to GoalieBrain and StrikerBrain, and assign them to the Academy.
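Putting these pieces together, a minimal SoccerAcademy could look like the sketch below (assembled from the snippets above under the v0.4 API; the empty overrides are placeholders for per-episode logic):

using MLAgents;

public class SoccerAcademy : Academy
{
    // References to the two Brains that live under the Academy object.
    public Brain brainStriker;
    public Brain brainGoalie;

    // Called when the whole environment is reset.
    public override void AcademyReset()
    {
    }

    // Called on every environment step, before the agents act.
    public override void AcademyStep()
    {
    }
}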



The AgentSoccer class inherits from Agent. Add it to each player and specify which Brain it will use.
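Before going through the individual methods, here is a rough skeleton of the class showing the fields used below (a sketch: the enum values and field types are inferred from the code in this section, and SoccerFieldArea is an assumed type name for the script that manages the playing field):

using MLAgents;
using UnityEngine;

public class AgentSoccer : Agent
{
    public enum Team { red, blue }
    public enum AgentRole { striker, goalie }

    public Team team;
    public AgentRole agentRole;

    public SoccerAcademy academy;   // the Academy described above
    public SoccerFieldArea area;    // provides spawn positions and ball reset

    RayPerception rayPer;           // casts the observation rays
    Rigidbody agentRB;              // used for movement and resets
    float kickPower;                // set while moving, used when kicking

    public override void InitializeAgent()
    {
        base.InitializeAgent();
        agentRB = GetComponent<Rigidbody>();
        rayPer = GetComponent<RayPerception>();
    }
}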



In the AgentSoccer script we describe the player logic. In the overridden CollectObservations() method we add a set of rays that help the players observe and react to the gameplay.

public override void CollectObservations()
{
    float rayDistance = 20f;
    float[] rayAngles = { 0f, 45f, 90f, 135f, 180f, 110f, 70f };
    // The tag order is mirrored per team, so each agent's observation
    // layout is consistent relative to its own team.
    string[] detectableObjects;
    if (team == Team.red)
    {
        detectableObjects = new string[] { "ball", "redGoal", "blueGoal",
            "wall", "redAgent", "blueAgent" };
    }
    else
    {
        detectableObjects = new string[] { "ball", "blueGoal", "redGoal",
            "wall", "blueAgent", "redAgent" };
    }
    AddVectorObs(rayPer.Perceive(rayDistance, rayAngles, detectableObjects, 0f, 0f));
    AddVectorObs(rayPer.Perceive(rayDistance, rayAngles, detectableObjects, 1f, 0f));
}
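The last two arguments of Perceive() are the start and end height offsets of the rays, so the second call casts the same fan of rays one unit higher. Together the two calls give the agent a simple two-layer view of the field: one at ground level and one slightly above it.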

The overridden AgentAction() method is an analogue of Update(). In it we call the movement method and hand out the rewards.

public override void AgentAction(float[] vectorAction, string textAction)
{
    // Existential penalty for strikers.
    if (agentRole == AgentRole.striker)
    {
        AddReward(-1f / 3000f);
    }
    // Existential bonus for goalies.
    if (agentRole == AgentRole.goalie)
    {
        AddReward(1f / 3000f);
    }
    MoveAgent(vectorAction);
}
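MoveAgent() itself is not shown in the snippet above. A simplified sketch of what it can look like is given below; the discrete action mapping, force, and rotation speed are illustrative values, not the plugin's exact code:

public void MoveAgent(float[] act)
{
    Vector3 dirToGo = Vector3.zero;
    Vector3 rotateDir = Vector3.zero;
    kickPower = 0f;

    // The Brain sends a discrete action; map it to a movement.
    int action = Mathf.FloorToInt(act[0]);
    switch (action)
    {
        case 0:                          // run forward, kicking at full power
            dirToGo = transform.forward;
            kickPower = 1f;
            break;
        case 1:                          // run backward
            dirToGo = -transform.forward;
            break;
        case 2:                          // turn right
            rotateDir = transform.up;
            break;
        case 3:                          // turn left
            rotateDir = -transform.up;
            break;
    }

    transform.Rotate(rotateDir, Time.deltaTime * 100f);
    agentRB.AddForce(dirToGo * 2f, ForceMode.VelocityChange);
}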

When a player touches the ball, he kicks it:

void OnCollisionEnter(Collision c)
{
    float force = 2000f * kickPower;
    if (c.gameObject.tag == "ball")
    {
        Vector3 dir = c.contacts[0].point - transform.position;
        dir = dir.normalized;
        c.gameObject.GetComponent<Rigidbody>().AddForce(dir * force);
    }
}

The teams are reset when the objective is achieved (a goal is scored):

public override void AgentReset()
{
    if (academy.randomizePlayersTeamForTraining)
    {
        ChooseRandomTeam();
    }
    if (team == Team.red)
    {
        JoinRedTeam(agentRole);
        transform.rotation = Quaternion.Euler(0f, -90f, 0f);
    }
    else
    {
        JoinBlueTeam(agentRole);
        transform.rotation = Quaternion.Euler(0f, 90f, 0f);
    }
    transform.position = area.GetRandomSpawnPos(team.ToString(),
                                                agentRole.ToString());
    agentRB.velocity = Vector3.zero;
    agentRB.angularVelocity = Vector3.zero;
    area.ResetBall();
}

Training options


You can train agents using the Unity editor or externally through TensorFlow. On the Unity side, in the Academy you need to specify the reward and the punishment for each Brain.
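For example, when the ball crosses a goal line, the script that manages the field can reward the scoring team, punish the conceding one, and end the episode (a hedged sketch; GoalTouched and playerStates are assumed names rather than the plugin's exact API):

// Called by a goal trigger when a team scores.
public void GoalTouched(AgentSoccer.Team scoredTeam)
{
    foreach (var ps in playerStates)
    {
        if (ps.agentScript.team == scoredTeam)
        {
            ps.agentScript.AddReward(1f);   // reward the scoring team
        }
        else
        {
            ps.agentScript.AddReward(-1f);  // punish the conceding team
        }
        ps.agentScript.Done();              // ends the episode and triggers AgentReset()
    }
}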



You should also set the type of each Brain via its Brain Type property, setting it to External for training. The available types are:

  • External - decisions are made via the Python API. The observations and rewards collected by the Brain are forwarded to the Python API through an external communicator, and the Python API returns the action the agent should perform.
  • Internal - decisions are made using an embedded TensorFlow model. The embedded model represents a learned policy, and the Brain uses it directly to determine the action for each Agent.
  • Player - decisions are made using real keyboard or controller input. Here a human player controls the Agent, and the observations and rewards collected by the Brain are not used to control it.
  • Heuristic - decisions are made using hard-coded behavior. This is similar to how most character behavior is defined today, and it can be useful for debugging or for comparing an Agent with hard-coded rules against an Agent whose behavior was trained.
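If you prefer to switch the type from code rather than in the inspector, something like this should work (assuming the v0.4 API, where Brain exposes a public brainType field):

// Sketch: switch every Brain under the Academy to External before training.
foreach (Brain brain in GetComponentsInChildren<Brain>())
{
    brain.brainType = BrainType.External;
}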
