Unity ML-Agents: How Reinforcement Learning Can Be Added to Unity

Feb 2, 2020

Introduction.

ML-Agents is an open-source Unity plugin for training agents with reinforcement learning and imitation learning. Training uses a TensorFlow-based implementation. Reinforcement learning is a type of machine learning that enables an agent to learn from an interactive environment in order to maximize its reward.

An agent lives in an environment and can take any of the available actions. In response to these actions, the environment gives the agent positive or negative rewards. The agent has to learn to maximize its reward by keeping track of the environment state.
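
The loop of acting, observing and collecting rewards can be sketched in a few lines of Python. This is a toy illustration only, not ML-Agents code: the LineWorld environment, its reward values and the two fixed policies are all assumptions made up for the example.

```python
class LineWorld:
    """Toy environment (hypothetical): the agent walks on positions 0..4."""

    def __init__(self):
        self.position = 2  # start in the middle

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        self.position += action
        if self.position >= 4:
            return self.position, 1.0, True    # reached the goal: positive reward
        if self.position <= 0:
            return self.position, -1.0, True   # fell off the edge: negative reward
        return self.position, -0.01, False     # small penalty for every extra step


def run_episode(policy):
    """Run one episode and return the total reward the agent collected."""
    env = LineWorld()
    state, total, done = env.position, 0.0, False
    while not done:
        action = policy(state)                  # the agent acts on the current state
        state, reward, done = env.step(action)  # the environment answers
        total += reward
    return total


total_right = run_episode(lambda state: +1)  # always walk towards the goal
total_left = run_episode(lambda state: -1)   # always walk away from it
```

A learning algorithm's job is to discover, from rewards alone, that the first policy is the better one.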

Brain.

The primary goal of a brain is to learn and to decide what the agent does. We can assign one or more agents to a single brain. For agent training we have three types of brain.

Learning brain.
Heuristic brain.
Player brain.

A learning brain is used for agent training. We create the brain and feed it the agent's observations, which describe the environment state to our algorithm (PPO). The brain learns from these observations and provides actions (continuous or discrete) to our agent.

A heuristic brain is about implementing a decision class: we hand-code the agent's actions and decision-making process.

A player brain can be used to test agent actions. We choose the player brain and assign the actions to keys so that we can verify them by hand. In player mode we cannot train the agent.
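
As a rough analogy, the three brain types differ only in where the action comes from. The sketch below is plain Python with hypothetical names; real brains are configured inside the Unity editor, and the simple value-averaging rule used here is only a stand-in for PPO.

```python
import random


def heuristic_brain(observation):
    """Hand-coded rule: move towards the target."""
    agent_x, target_x = observation
    return 1 if target_x > agent_x else -1


def make_player_brain(key_map, pressed_key):
    """Player brain: the action comes from a key mapping, nothing is learned."""
    def brain(observation):
        return key_map.get(pressed_key, 0)  # 0 = no action for unmapped keys
    return brain


class LearningBrain:
    """Trainable stand-in: keeps a running value estimate per action."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.value = {-1: 0.0, 1: 0.0}

    def decide(self, observation):
        if self.rng.random() < 0.1:          # explore occasionally
            return self.rng.choice([-1, 1])
        return max(self.value, key=self.value.get)

    def learn(self, action, reward, learning_rate=0.5):
        # Move the estimate for this action towards the observed reward.
        self.value[action] += learning_rate * (reward - self.value[action])
```

Only the learning brain changes its behaviour over time; the other two return the same action for the same input every time.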

Proximal policy optimization (PPO)

PPO is a reinforcement learning algorithm. It has become the default reinforcement learning algorithm at OpenAI because of its ease of use and good performance.
In Unity ML-Agents, the algorithm parameters live in a .yaml file, and we can tune them to suit our problem.
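
The core idea of PPO is to limit how far a single update can move the policy. A minimal sketch of the clipped objective for one sample is shown below; the function name and the default clipping range of 0.2 are assumptions for illustration, and this is not the ML-Agents implementation.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO clipped objective for a single sample.

    ratio     -- new policy probability / old policy probability of the action
    advantage -- how much better the action turned out than expected
    epsilon   -- clipping range (assumed default, normally a tunable parameter)
    """
    # Keep the ratio inside [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the clipped and unclipped estimates.
    return min(ratio * advantage, clipped_ratio * advantage)


gain_capped = clipped_surrogate(ratio=1.5, advantage=1.0)    # capped at 1.2
loss_floored = clipped_surrogate(ratio=0.5, advantage=-1.0)  # floored at -0.8
```

With a positive advantage the gain stops growing once the ratio exceeds 1 + epsilon, so the policy has no incentive to change too much in one step.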

Agent & Academy

We add the ml-agents folder to our project directory so we can use its classes. For an agent, we extend the Agent class, which has the following methods to override or call.

InitializeAgent()
CollectObservations()
AgentAction()
AgentReset()
Done()
AddReward()
SetReward()

InitializeAgent() sets up the initial state of the agent when it is created. It is called only once.

CollectObservations() gathers the agent's parameters and feeds them to the brain; the idea is to keep the brain aware of the agent's movement in the environment. This function is called repeatedly to pick up new values.

AgentAction() receives the action values the brain produces in response to the collected observations; early in training these values are effectively random. It is called on every iteration to feed new values to the agent.

AgentReset() is called when we want the agent to reset at some point, which we trigger by calling the Done() function.

AddReward() is used to teach the agent by adding positive or negative rewards under specific conditions.

SetReward() is used to set the agent's reward to a fixed value.
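
How these methods fit together can be sketched with a Python analogue. The class below is hypothetical and only mirrors the method names; the real Agent class is written in C# inside Unity, and this sketch keeps a single running reward for simplicity.

```python
class SketchAgent:
    """Python analogue of the Agent methods (not the Unity API)."""

    def __init__(self):
        self.reward = 0.0
        self.is_done = False
        self.initialize_agent()

    def initialize_agent(self):
        """Runs once at creation: set up state that survives resets."""
        self.start_position = 0.0
        self.position = self.start_position

    def collect_observations(self):
        """Return the values the brain sees at this step."""
        return [self.position]

    def agent_action(self, vector_action):
        """Apply the brain's action, then hand out rewards."""
        self.position += vector_action[0]
        self.add_reward(-0.01)       # accumulate a small time penalty
        if self.position >= 3.0:
            self.set_reward(1.0)     # overwrite with a fixed success reward
            self.done()

    def agent_reset(self):
        """Put the agent back at its starting point for a new episode."""
        self.position = self.start_position
        self.is_done = False

    def add_reward(self, increment):
        self.reward += increment     # adds to the current reward

    def set_reward(self, value):
        self.reward = value          # replaces the current reward

    def done(self):
        self.is_done = True


agent = SketchAgent()
for _ in range(3):
    agent.agent_action([1.0])        # three steps to the right reach the goal
```

The difference between the two reward methods shows in the last step: the accumulated penalties are discarded when the reward is set to the fixed value.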

Academy

The Academy is used to iterate this process. It is the main class governing each iteration. We can extend the Academy class and add custom code if we want to change how the agents are handled.
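
The Academy's role as the driver of each iteration can be pictured with a small Python sketch. Both classes and all their names are hypothetical stand-ins, not the Unity classes: the academy simply steps every agent and resets the ones whose episode has ended.

```python
class CountdownAgent:
    """Minimal stand-in agent: its episode ends after `lifetime` steps."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.steps_left = lifetime
        self.episodes_finished = 0

    def step(self):
        self.steps_left -= 1
        return self.steps_left == 0   # True when the episode is over

    def reset(self):
        self.steps_left = self.lifetime
        self.episodes_finished += 1


class SketchAcademy:
    """Drives every registered agent forward one step at a time."""

    def __init__(self, agents):
        self.agents = agents
        self.step_count = 0

    def environment_step(self):
        self.step_count += 1
        for agent in self.agents:
            if agent.step():          # episode finished for this agent
                agent.reset()


academy = SketchAcademy([CountdownAgent(2), CountdownAgent(3)])
for _ in range(6):
    academy.environment_step()
```

Custom code added to an academy subclass would go into a method like environment_step here, where it can affect all agents at once.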

Installation workflow

Download Anaconda with Python 3.6, install it, and tick the option to add it to the PATH environment variable. Version 3.7 did not work with Unity ML-Agents at the time of writing.
Download and install VS Code once Anaconda is installed.
Run Anaconda to confirm that the Python environment variables have been created.
Create a conda environment, activate it, and install TensorFlow.
Download and install the Visual Studio Unity package (optional for now).
Download ML-Agents from its Git repository and place it in a folder.
Go to the ML-Agents folder and install the pip dependencies.
CUDA installation is optional.
Start Unity, choose Open, and select the Unity SDK folder, which contains the predefined projects.
Install the TensorFlowSharp plugin.
Create the variable ENABLE_BARRACUDA.
Open a scene from the Assets folder and set its brain.
Run this command:

mlagents-learn config/trainer_config.yaml --run-id=firstRun --train
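
If we prefer to start training from a script, the same command can be assembled in Python. This is only a sketch: the helper names are made up, the flags are written with two plain hyphens each, and training is launched only if mlagents-learn is found on the PATH.

```python
import shutil
import subprocess


def build_train_command(config_path, run_id):
    """Assemble the mlagents-learn invocation as an argument list."""
    return ["mlagents-learn", config_path, f"--run-id={run_id}", "--train"]


def launch_training(config_path="config/trainer_config.yaml", run_id="firstRun"):
    """Start training if mlagents-learn is on PATH; return None otherwise."""
    command = build_train_command(config_path, run_id)
    if shutil.which(command[0]) is None:
        return None  # not found: the conda environment is probably not active
    return subprocess.Popen(command)
```

Changing run_id for each experiment keeps the results of different training runs apart.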

End Notes

This article gives an introductory understanding of Unity ML-Agents. The basic examples can be tested with the steps above. For more information about installation and creating your own project, follow the given links.

Let’s Connect on Instagram iamjunaidrana or Fiverr.