Unity Job System in Practice. How we increased FPS from 15 to 70 in our game

By Yurii Yenko, Lead Unity 3D Developer | Aug 12, 2024

Hi everyone! I’m Yurii Yenko, the lead developer at RetroStyle Games. Our team is currently working on Ocean Keeper, our first game on PC and console. In this article, I want to show how we use Unity’s Job System in our game and talk about multithreading in general. It’s a very powerful tool for solving optimization problems, and it is mentioned in gamedev less often than we would like.

What is multithreading, how does it work, and why do we need a Job System

By default, Unity executes game code on a single thread that is created at the beginning of the game; this is called the Main Thread. The Main Thread can create other threads. The code on these threads is executed in parallel, and the threads synchronize their results when they are done.

Creating a separate thread for each separate task is not a good solution: creating and destroying a thread is a rather expensive operation that should be avoided. It is better to keep a few long-lived threads, and for this purpose you can create a Thread Pool. You take a free thread from the pool and execute your task on it, and when the task is completed you return the thread to the pool.

Here we may face a problem: if there are more threads in the pool than processor cores, we run into Context Switching. This is a situation where processor cores constantly switch between different threads, because one core can process only one thread at a time.

This brings us back to the Job System and how Unity has made it easy for us to work with all of the above. By default, the Job System creates Worker Threads and their handlers, one for each processor core, and thus avoids the Context Switching problem.
The user now only needs to create a task to be executed on another thread and schedule it; the Job System distributes the tasks among free worker threads by itself.

Safety System

Another problem in multithreaded code, no less important and painful, is the Race Condition: an error where the result depends on the order or speed of task execution on different threads. Most often it happens because two or more threads try to write to the same place at the same time, so it is almost impossible to predict the value that will finally be written. The worst thing is that this error is very difficult to catch when writing code, and afterward it is hard to realize it exists at all, because it often does not produce a visible bug. And once a bug has occurred, it is difficult to debug; again, everything depends on the timing and order of task execution. Even breakpoints affect this, so the symptoms often disappear while you are hunting for the bug.

To save us from long debugging sessions, Unity has integrated the Safety System alongside the Job System. It detects and reports potential Race Condition risks and protects the game from the bugs they may cause.

The Safety System isolates data between different Jobs by copying it, and thus eliminates the Race Condition problem. Additionally, it imposes many restrictions on how data is declared and used, and notifies you of any violations it finds.

Job System in Practice

Let’s see how the Job System works in practice. First, we’ll use the most basic example to understand the theory, and then we’ll look at examples from Ocean Keeper and how we used the Job System there.

A little disclaimer: the code further on may be simplified, and some points may be omitted, to make the examples clearer and easier to understand.

Basic introduction and test example

Working with the Job System starts with writing a task (Job) that can be executed on different threads. You do this by declaring a structure that implements one of the interfaces below:

1. IJob: a standard interface for creating a single parallel task.
2. IJobParallelFor: an interface for creating a task that must be repeated for many objects.
3. IJobParallelForTransform: an interface for creating a task that must be repeated for many objects and that uses the Transform component of each object.

First task

Let’s take the standard IJob interface and create our first task. The Execute method contains the code that will be executed on another thread. For now it holds a simple math operation and a console message, so we can see when the task has been executed.

To run a task, you need to create and schedule it somewhere; you can do this anywhere you want. If you execute this code, the console shows the messages from the main thread first, then the messages from the task. This happens because the Schedule() method doesn’t block the main thread, and the scheduled task is executed whenever a worker thread becomes free.

Transferring data to jobs

It is possible to pass data to jobs for them to work with. To do this, you first declare the data inside the Job structure, and then, when creating the job on the main thread, simply assign what you need to it. It should be understood that the Job System and Jobs can work only with blittable types, that is, types that need no conversion between managed and native memory. Here we pass a simple value into the structure, so this restriction causes no problems.

Reading data from completed tasks

Now that we can transfer data to a task, let’s figure out how to get it back. To pass data between the main thread and the worker threads, Unity provides shared memory called a Native Container.
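
The first task and the data passing described above can be sketched roughly as follows. This is a minimal illustration, not the article’s original listing: the names AddJob, FirstJobExample, A and B are hypothetical, and the order in which the three log messages appear depends on when a worker thread picks up the job.

```csharp
using Unity.Jobs;
using UnityEngine;

// Hypothetical job: adds two numbers on a worker thread and logs the sum.
public struct AddJob : IJob
{
    // Blittable value-type fields are copied into the job when it is scheduled.
    public int A;
    public int B;

    public void Execute()
    {
        int sum = A + B;
        Debug.Log($"Job: executed, sum = {sum}");
    }
}

// Hypothetical MonoBehaviour that creates and schedules the job.
public class FirstJobExample : MonoBehaviour
{
    private JobHandle handle;

    private void Start()
    {
        Debug.Log("Main thread: before scheduling");

        var job = new AddJob { A = 2, B = 3 };
        handle = job.Schedule(); // does not block the main thread

        Debug.Log("Main thread: after scheduling");
    }

    private void LateUpdate()
    {
        // Make sure the job has finished; safe to call on a completed handle.
        handle.Complete();
    }
}
```

Because the job works on its own copy of the struct, the main thread cannot read sum back from it; that requires the Native Container mentioned above.
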
A Native Container is a safe wrapper over unmanaged memory: it holds a pointer to the unmanaged allocation and thus lets you bypass the Safety System restriction that keeps the result of a task in an isolated copy.

To send and receive data from the task, we create a NativeArray (an array backed by shared memory) of type int with 2 elements. At index 0 we store the input data, and at index 1 we store the output data.

Note that container data can be accessed on the main thread only when the scheduled task has finished. To make sure of that, we take the JobHandle returned by Schedule() and call its Complete() method, which blocks the main thread until the scheduled task is finished.

In the console, we see a warning that we have not freed the container and that this has caused a memory leak. To avoid this, we call the Dispose() method on the container when we no longer need it.

data.Dispose();

There is not much use in such a task yet, because the main thread waits for the task right after scheduling it, so we have not optimized much. It is recommended to call the Complete() method as late as possible, at the moment when the data from the container is actually needed.

NativeContainer initialization

Let’s take a closer look at the initialization of NativeArray:

NativeArray<int> data = new NativeArray<int>(2, Allocator.TempJob);

The constructor takes two parameters. The first is the size of the array, and the second is the allocator, which determines its lifetime. The main allocators are:

1. Temp: the shortest lifetime (1 frame), but the fastest allocation.
This container cannot be passed to a Job, because its lifetime is too short.
2. TempJob: a container with this allocator exists for 4 frames, which is enough to pass it to a task, but allocation is slower than with Temp.
3. Persistent: the slowest allocation, but its lifetime equals the lifetime of the whole application.

Boids, an almost textbook example of using the Job System

The Job System is most often mentioned in connection with optimizing Boids, a swarm-intelligence algorithm that simulates the behavior of many herd and flock animals (birds, cows, sheep, etc.). In our game, we use this algorithm to simulate the behavior of fish.

Let’s consider the initial example without optimizations. The algorithm is expensive, because for each fish we need to calculate the distance to all other fish, and then run a number of physics queries to work out obstacle avoidance.

Now let’s optimize the algorithm with the Job System. This time we use other interfaces for creating threaded tasks: IJobParallelFor and IJobParallelForTransform.

A detailed introduction to IJobParallelFor

IJobParallelFor is an interface for creating a threaded task that is executed repeatedly. The Job System splits such a task into batches and spreads them across different threads. It is most often used to optimize calculations over large lists, so the Execute method additionally receives an index.

public void Execute(int index){}

IJobParallelForTransform is similar to IJobParallelFor, but it additionally lets you work with the Transform component of an object through TransformAccess. We will look at it a little later.

Let’s return to the optimization of the Boids algorithm and divide it into 3 parts:

1. First step: calculation of fish acceleration.
2. Second step: calculation of obstacle avoidance.
3. Third step: fish movement.
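
To make the IJobParallelFor shape and the NativeArray lifecycle described above more concrete, here is a minimal sketch. It is not the Boids code from Ocean Keeper: ScaleJob, ParallelForExample, the field names, and the batch size of 64 are all assumptions chosen for illustration.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical job: multiplies every element of Input by Factor.
public struct ScaleJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float> Input;   // only read inside the job
    [WriteOnly] public NativeArray<float> Output; // only written inside the job
    public float Factor;

    // Called once per index; batches of indices run on different worker threads.
    public void Execute(int index)
    {
        Output[index] = Input[index] * Factor;
    }
}

// Hypothetical MonoBehaviour that drives the job from the main thread.
public class ParallelForExample : MonoBehaviour
{
    private void Start()
    {
        const int count = 1000;

        // TempJob lives long enough to be handed to a job.
        var input = new NativeArray<float>(count, Allocator.TempJob);
        var output = new NativeArray<float>(count, Allocator.TempJob);
        for (int i = 0; i < count; i++)
        {
            input[i] = i;
        }

        var job = new ScaleJob { Input = input, Output = output, Factor = 2f };

        // Second argument is the batch size (indices handed out per work item).
        JobHandle handle = job.Schedule(count, 64);

        // Block until the job is done; only then is Output safe to read here.
        handle.Complete();
        Debug.Log($"output[10] = {output[10]}");

        // Free the native memory to avoid the leak warning.
        input.Dispose();
        output.Dispose();
    }
}
```

Calling Complete() immediately after Schedule() keeps the sketch short, but in a real frame loop the call would normally be postponed until the results are needed.
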

Calculation of fish acceleration

This task covers the calculations related to neighboring fish and the effect of the various rules (Alignment, Cohesion, Separation) on fish acceleration. The code of the task is almost the same as in the example without optimizations, but for now we only record the acceleration data; we will use it later to change the Transform component of the object.

In this example, it is important to pay attention to the attributes next to each NativeArray. By default, when you declare a variable that is linked to shared memory, the Job System allows both read and write operations on it. This may not be completely safe, and it affects optimization. To improve task performance, you can state up front what access the Job will have to the variable (for example, with [ReadOnly] or [WriteOnly]).

The [NativeDisableParallelForRestriction] attribute lets a job read and write NativeArray elements at indices other than the one passed to the current Execute call, which IJobParallelFor normally forbids because each range of indices is assigned to a separate batch. This is a dangerous solution that can easily lead to a Race Condition, but in our case we agreed to use the attribute for clarity, so that the obstacle-avoidance strength can be calculated.

RayCastDirectionHits is an array that collects the results of the raycasts that check for nearby obstacles. For each fish, there can be multiple directions in which to look for obstacles. Therefore, we record all raycast results in one array (number of fish * number of directions to check) and then index into it to get the results for a particular fish.

Calculating obstacle avoidance

To find the nearest obstacles, we need to run raycast commands. The Unity Job System gives us a convenient API for accessing the physics engine from other threads. To do this, as you can see in the example, we need to declare two arrays: an array of commands and an array of results.
We then use RaycastCommand.ScheduleBatch() to hand the execution of the raycasts over to other threads.

You can see that we also have PrepareRayCastJob: this task creates the commands for each fish, and does so in multithreaded mode.

It’s important to pay attention to the lines of code where we schedule our tasks. As the last argument of each scheduling call, we pass the JobHandle of the previous task; this is called a job dependency. By passing another task’s handle as an argument when scheduling the current task, we tell the Job System that it must finish the earlier task before it starts executing the current one. In this way, we can create chains of Job execution.

Fish movement

There is nothing special about this task, except that it implements IJobParallelForTransform, so the Execute method additionally receives a TransformAccess structure that lets us manipulate the Transform component of the object.

The main thread and how to get this whole chain up and running

First, we need to declare and initialize all the necessary data for the fish. After that, we can move on to creating and scheduling the jobs themselves.

Optimization results

Before optimizations, we had 13 fps when spawning 1000 fish. After our optimizations, we were able to increase the frame rate to 37 fps.

Burst Compiler

Unity has developed the Burst compiler specifically for the Job System, and it speeds up the execution of tasks considerably. For Unity to execute Jobs with Burst, you only need to add the [BurstCompile] attribute to the job structs themselves. After adding the attribute, we get 67 frames per second.
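
The dependency chain, the IJobParallelForTransform step, and the [BurstCompile] attribute described above can be combined in a short sketch. It assumes the Burst package is installed and is not the Ocean Keeper implementation: ComputeVelocityJob, MoveJob, ChainExample, the fish array, and the placeholder velocity update are all hypothetical.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;
using UnityEngine.Jobs;

// Hypothetical first link: updates one velocity per index.
[BurstCompile]
public struct ComputeVelocityJob : IJobParallelFor
{
    public NativeArray<Vector3> Velocities;
    public float DeltaTime;

    public void Execute(int index)
    {
        // Placeholder rule: constant upward acceleration of 1 unit/s^2.
        Velocities[index] += new Vector3(0f, DeltaTime, 0f);
    }
}

// Hypothetical second link: applies the velocities to the Transforms.
[BurstCompile]
public struct MoveJob : IJobParallelForTransform
{
    [ReadOnly] public NativeArray<Vector3> Velocities;
    public float DeltaTime;

    public void Execute(int index, TransformAccess transform)
    {
        transform.position += Velocities[index] * DeltaTime;
    }
}

public class ChainExample : MonoBehaviour
{
    [SerializeField] private Transform[] fish; // assigned in the Inspector

    private NativeArray<Vector3> velocities;
    private TransformAccessArray transforms;
    private JobHandle handle;

    private void OnEnable()
    {
        // Persistent, because the data lives across many frames.
        velocities = new NativeArray<Vector3>(fish.Length, Allocator.Persistent);
        transforms = new TransformAccessArray(fish);
    }

    private void Update()
    {
        var compute = new ComputeVelocityJob { Velocities = velocities, DeltaTime = Time.deltaTime };
        var move = new MoveJob { Velocities = velocities, DeltaTime = Time.deltaTime };

        JobHandle computeHandle = compute.Schedule(velocities.Length, 64);

        // Passing computeHandle makes MoveJob wait for ComputeVelocityJob.
        handle = move.Schedule(transforms, computeHandle);
    }

    private void LateUpdate()
    {
        // Complete as late as possible within the frame.
        handle.Complete();
    }

    private void OnDisable()
    {
        handle.Complete();
        velocities.Dispose();
        transforms.Dispose();
    }
}
```

Completing the chain in LateUpdate rather than right after scheduling gives the worker threads the rest of the frame to do the work while the main thread continues.
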

Spawning Enemies

In Ocean Keeper we need to spawn a lot of enemies, and finding spawn points for them can be a very expensive operation, because each spawn point must:

- be on the NavMesh;
- be under the protective dome (a mechanic in the game that limits the player’s combat zone);
- be at a certain distance from the player.

Finding such positions for a thousand enemies can be a very expensive operation, but the Job System allowed us to find as many as one million such points without any visible stutter.

The main difficulties in writing the task were:

- the random number generator (we created it with the Unity.Mathematics library);
- the NavMesh API inside the Job System.

The NavMesh API was a challenge for us because there is very little information about it. It is not easy to pass something to Jobs, and the package is, in general, experimental. Therefore, it is better to use it only as a last resort.

The task is created and scheduled from the main thread.

Optimization results

Without optimizations, searching for a million positions takes 140 ms. With optimizations, it takes 4 ms.

P.S. The EventSystem.Update method is called because the search starts when a button is pressed.

Consequently, the operation became 35 times faster.

Difficulties when using Job System

The main problem with the Job System is that it has a lot of limitations compared with, for example, the standard ThreadPool. But it is thanks to these limitations that Unity’s developers were able to make such an efficient multithreading tool, and this is why a Job is much faster than an equivalent Task.

Another issue that prevents the Job System from being integrated into projects quickly is the lack of information. The tool is relatively new, many packages are still experimental, and there is little information on forums. Therefore, developing Jobs often takes a lot of iteration and testing.

One last thing

There are no reasons not to use the Job System, and the gain from writing potentially heavy operations as Jobs is very large.
There is a popular practice of optimizing projects at the very end, when the main bottlenecks are clear. However, not all code can be easily rewritten for Jobs, so it is better to spot potentially heavy operations early and move them to multithreading right away. Thanks to the Safety System, the risk of introducing a serious error is very small, so there is nothing to lose here. Moreover, if your project is written on ECS, the Job System can let you port a game even to a calculator.