Gaming AI: Unity with Deep Reinforcement Learning

Siddhartha Bhattacharya
Published in Nerd For Tech · 8 min read · Mar 11, 2021


Image credit: https://pixelkin.org/2017/08/24/studies-reveal-video-gamings-effects-on-the-brain/

On my journey with Data Science, I heard that Reinforcement Learning is used in the likes of game development and autonomous driving. As per the Wikipedia definition:

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward.[1]

So Non-Player Characters (NPCs) within a game environment become obvious candidates for such intelligent agents. And then I watched this video by Paris Buttfield-Addison at North Bay Python (2019) that got me hooked.

In this post I will describe my experience of developing a simple Unity game with the Unity ML-Agents Toolkit (an open-source project for deep reinforcement learning based on PyTorch). I hope game development and machine learning enthusiasts willing to tread similar paths can benefit from:

  • the highly visual representation of reinforcement learning enabled by ML-Agents
  • the perspective of a gamer-hobbyist who is new to this, like me

What have I built?

A simple 2D game where a ball (the intelligent agent described above) tries to dodge particles without any user input. The finished game looks as follows.

Gaming AI demo

How have I built it?

The diagram below serves as my framework for building Unity games with ML-Agents.

Gaming AI Framework

The above framework translates into the following steps towards my current objective:

  • Create a development environment
  • Create a Unity 2D project
  • Define a single scene as environment for the agent (ball in this case) for training and testing
  • Add observations, rewards and behavior to the ball (agent)
  • Test the game with user inputs (heuristic)
  • Train the ball (learning)
  • Deploy the trained model and draw inferences from it

I will elaborate on each step in the following sections. If you are already familiar with Unity and ML-Agents, feel free to clone the project and try running it yourself.

Create a development environment

I will need the following:

  • Unity — I use version 2019.4.20f1, which I believe is the latest stable version at the time of writing. To download and install, visit the Unity page and choose a plan that suits your requirements.
  • Python — I use Anaconda and created a Python 3.6 virtual environment with conda create --name ml-agents python=3.6
  • Code Editor — I am used to VS Code. To use VS Code from Unity, go to Edit > Preferences > External Tools > External Script Editor. The default editor bundled with Unity should work fine too.
  • ML-Agents — I used release_13. Installation instructions are available in the ml-agents documentation. For the purposes of this exercise, I need the com.unity.ml-agents package (I cloned the repo and added the package from disk after creating the project) plus the PyTorch and ml-agents Python packages (see the sketch after this list).
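
For reference, a minimal sketch of the install sequence, assuming the repo-clone approach from the ml-agents documentation (if a plain pip install of torch does not work on your platform, follow the official PyTorch instructions instead):

```
conda create --name ml-agents python=3.6
conda activate ml-agents

# from the root of the cloned ml-agents repo (release_13)
pip install torch
pip install -e ./ml-agents-envs
pip install -e ./ml-agents
```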

Create a Unity 2D project

Simple — Go to Unity Hub > New > Select Unity Version > 2D Template > Project Name = Unity2DBallML > Create

Note: Check the Project pane to confirm that ML Agents shows up under Packages. If not, add com.unity.ml-agents using the Package Manager.

Define the scene

(1) Add the platform

  • From the GameObject menu, add GameObject > 3D Object > Cube.
  • In Inspector, I set Transform properties of the Cube. Position is set to X=0, Y=-4 and Z=0 and Scale is set to X=12, Y=1 and Z=1.
  • In Inspector, Add Component > Physics 2D > Box Collider 2D. It is required for the ball to roll on the platform.

(2) Add the ball

  • From the GameObject menu, add GameObject > 3D Object > Sphere.
  • In Inspector, I set the Transform properties of the Sphere. Position is set to X=0, Y=0 and Z=0 (to create a ball-falling-from-the-top effect whenever the game starts/resets) and Scale is set to X=1, Y=1 and Z=1.
  • In Inspector, Add Component > Physics 2D > Box Collider 2D.
  • In Inspector, Add Component > Physics 2D > Rigidbody 2D. It is required for the ball to have gravity.

(3) Add the particles that the ball needs to dodge

  • From the GameObject menu, add GameObject > Effects > Particle System.
  • In Inspector, I set the Transform properties of the Particle System to place it well above the visible scene, creating a particles-falling-from-space effect. So the position is set to X=0, Y=7 and Z=-1.
  • By default, particles move along the Z axis, which in 2D space creates a flying-to-the-top effect. I want them to drop along the Y axis instead, so I add a Gravity Modifier of 0.3 in the Particle System properties in the Inspector.
  • By default, particles are generated from the center and move along a cone. I want the particles to be generated randomly across the whole length of the platform, so I go to the Shape module of the Particle System and change Shape to Box, Emit From to Volume and Scale to X=12, Y=1 and Z=1.
  • I want to detect collisions with the ball, so I enable the Collision module of the Particle System with Type set to World and Send Collision Messages ticked, so that the ball's script receives the collision callbacks (the rest can be left at defaults).
  • To allow some time before the particles start appearing, I set Start Delay to 1 in the Particle System properties in the Inspector.
  • Lastly, I change Start Speed to 0.1 in the Particle System properties in the Inspector.

(4) Group all the game objects (this step is optional, but it will be useful if I want to train multiple agents at the same time)

  • Create an empty game object in Hierarchy pane and name it TrainingArea
  • Set its position parameters to X=0, Y=0 and Z=0
  • Drag the ball, platform and particle system into TrainingArea

At this point I have all the game objects that I need. If I hit play, the ball lands on the platform and particles start falling from the top. That's as much graphics as you are going to see here, what a shame 🥱.

Note: If you are interested in learning more about Physics components (like the Collider and Rigidbody used above) or the Particle System, refer to the Unity manual.

Add observations, rewards and behavior to the ball

Now it's time to get to the coding. I select the ball in the Hierarchy pane, go to Inspector > Add Component > New Script > SphereAction and open it up in the code editor. Here's roughly how the C# script looks:
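
A minimal sketch of the script is below. It assumes a single continuous action (push left/right), a public field for the Particle System assigned in the Inspector, and an observation split of 3 + 2 + 2 + 1 = 8 values; treat the exact field names, thresholds and reward placement as assumptions and adapt them to your own scene.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;

public class SphereAction : Agent
{
    public float speed = 10f;                // horizontal force multiplier (assumed value)
    public ParticleSystem particleLauncher;  // drag the Particle System here in the Inspector

    Rigidbody2D ballBody;
    ParticleSystem.Particle[] particles;
    bool hitByParticle;

    public override void Initialize()
    {
        ballBody = GetComponent<Rigidbody2D>();
        particles = new ParticleSystem.Particle[particleLauncher.main.maxParticles];
    }

    public override void OnEpisodeBegin()
    {
        // Drop the ball from its starting position and restart the particle system.
        ballBody.velocity = Vector2.zero;
        transform.localPosition = Vector3.zero;
        hitByParticle = false;
        particleLauncher.Clear();
        particleLauncher.Play();
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // Ball position (3 values) and velocity (2 values).
        sensor.AddObservation(transform.localPosition);
        sensor.AddObservation(ballBody.velocity);

        // Position (2 values) and speed (1 value) of the nearest live particle; zeros if none yet.
        int count = particleLauncher.GetParticles(particles);
        Vector3 nearest = Vector3.zero;
        float nearestSpeed = 0f;
        float best = float.MaxValue;
        for (int i = 0; i < count; i++)
        {
            float d = (particles[i].position - transform.position).sqrMagnitude;
            if (d < best)
            {
                best = d;
                nearest = particles[i].position;
                nearestSpeed = particles[i].velocity.magnitude;
            }
        }
        sensor.AddObservation((Vector2)nearest);
        sensor.AddObservation(nearestSpeed);
    }

    void OnParticleCollision(GameObject other)
    {
        // Unity invokes this on the ball when a particle hits its collider
        // (requires Send Collision Messages on the particle system's Collision module).
        hitByParticle = true;
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // One continuous action: push the ball left or right.
        float move = actions.ContinuousActions[0];
        ballBody.AddForce(new Vector2(move * speed, 0f));

        if (hitByParticle)
        {
            SetReward(-1f);   // heavy penalty for getting hit
            EndEpisode();
        }
        else if (transform.localPosition.y < -5f)
        {
            EndEpisode();     // the ball fell off the platform (assumed threshold)
        }
        else
        {
            AddReward(0.1f);  // small reward for dodging so far
        }
    }

    public override void Heuristic(in ActionBuffers actionsOut)
    {
        // Manual control for testing: left/right arrow keys.
        var continuousActions = actionsOut.ContinuousActions;
        continuousActions[0] = Input.GetAxis("Horizontal");
    }
}
```

This sketch also implies a single continuous action, which needs to be reflected in the action settings of Behavior Parameters, alongside the Space Size of 8 set later in this section.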

Let's inspect the script:

  1. I include the necessary packages.
  2. I define my class and variables.
  3. The Initialize function assigns the ball, its speed, the particle system and the particles within it so that they can be referenced throughout the class.
  4. The OnEpisodeBegin function repositions the ball at its starting point and resets the particle system whenever an episode starts or resets. I will define the end of an episode later.
  5. CollectObservations defines the observations sent to the brain (the ML algorithm or the pretrained model) during training or inference so it can decide on actions based on those inputs. In this example, I collect the position and speed of the ball and the position and speed of the particles.
  6. OnParticleCollision is a built-in function that is invoked when the particle system detects a collision against a rigid body (the ball in our case). I set a counter every time a collision is detected.
  7. The OnActionReceived function defines the actions and rewards. Let's talk about actions first. In my case, the function receives input on where to position the ball, and the ball is moved towards that position via the AddForce function with the predefined speed. What about rewards? I penalize the agent heavily (-1) on a particle collision and end the episode. If the ball falls off the platform, I also end the episode. And finally, I reward it with a small value (0.1) every time it successfully dodges a particle.
  8. The Heuristic function is added so the ball can be moved continuously along the X axis from user input (the right and left arrow keys).

I have covered the hard bits by now (OK, I get it, it wasn't hard, but isn't that the idea 😏). Now there are a couple of things left.

  • Add the Behavior Parameters component to the ball (through the Inspector), set Behavior Name to 2DSphere and Space Size to 8 (I send 8 observations to the brain, made up of the position and velocity values of the ball and of the particles).
  • Add the Decision Requester component to the ball (through the Inspector) and set Decision Period to 10.

Test the game with user inputs (heuristic)

I am all set to test the learning environment. Before I hit run, I need to change Behavior Parameters > Behavior Type to Heuristic to indicate that the agent's actions will be guided manually. And there, I start moving the ball with the left and right arrows. I suck at it, just like any other game I play. I guess that's my skewed measure of a decent game.

Train the ball (learning)

Let's get to the shiny parts. I chose Proximal Policy Optimization (PPO) to train my agent, which is the default deep reinforcement learning algorithm within Unity ML-Agents. I will spare you the maths behind it, but if you are interested in learning more about PPO, visit this page.

The training exercise is executed outside of Unity, as shown in the framework. To start training, I need to:

  • Select the ball and change Behavior Parameters > Behavior Type back to Default in the Inspector (I had set it to Heuristic for the earlier step)
  • Set up hyperparameters — create 2DSphere.yaml as sketched after this list. If you have cloned the ml-agents repo, you may like to save it within ml-agents/config/ppo as a standard practice.
  • Execute training — I run the following command from the ml-agents directory in a command prompt: mlagents-learn config/ppo/2DSphere.yaml --run-id=2DSphereFirstRun. After a bunch of messages and warnings, it prompts me to start the game in Unity. I hit play and training starts.
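
A minimal 2DSphere.yaml in the style of the ml-agents PPO example configs might look like the following. Apart from max_steps, the values here are assumptions to tune, and the behavior name must match the 2DSphere name set in Behavior Parameters.

```yaml
behaviors:
  2DSphere:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 12000
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    time_horizon: 1000
    summary_freq: 12000
```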

To monitor the training process, I use TensorBoard, launched as sketched below. And here's how it looks at the end of training.
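
Assuming the default output layout, TensorBoard can be launched from the ml-agents directory while (or after) training runs:

```
tensorboard --logdir results
```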

Training results

The trends to note are whether reward and episode length are increasing consistently. From the graphs above, I see the lines are not stable, which indicates the model needs more training. I will let it go for now, but you can change max_steps: 500000 in the hyperparameters and try training further.

Deploy the model to draw inferences

The trained model is saved as ml-agents/results/2DSphereFirstRun/2DSphere.onnx. It needs to be moved to my Unity project folder; a standard practice is to copy it to Unity2DBallML/Assets/Models. The model file is now accessible from the Project pane within Unity. I select the ball and drag the model file onto Behavior Parameters > Model in the Inspector.

That’s it! I hit play and the ball draws inferences from the model to dodge particles while staying on the platform all by itself as shown in the video.

If you are building your own project while following this post and get stuck at any step, clone my project from git and add it through Unity Hub to compare. I hope the post helps with your Gaming AI ventures.
