
Specialization

Thread pool and optimization

Background

While studying at The Game Assembly, we are given a period of 8 weeks, the first four at 10 hours/week and the remaining four at 20 hours/week, to learn about a specific topic of our own choosing.

Optimization

My choice was optimization. I decided to increase the overall performance of an old game project from school, made using an in-house engine. The first improvement I made was adding multithreading to the project via a thread pool. To measure the performance increase, I added enemies to the scene and compared the frame rate with and without threading. I also implemented geometry instancing for static meshes to improve the overall experience.

As you can see, the frame rate increased to roughly 2 to 2.5 times the old frame rate when rendering 1500 enemies at the same time, each with about 17000 vertices, on an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz and an NVIDIA GeForce RTX 2080.

Thread pool.gif

Thread pool running with a frame rate around 50 fps.

Without the thread pool, with a frame rate around 20 fps.

Thread pool

The thread pool uses a job system: jobs are pushed as lambda functions and executed back to front, so the most recently pushed job runs first. A counter tracks unfinished jobs, which makes it possible to halt the program until they are done if needed. I never needed to do this, since the program ran fine without it and all the systems work without depending on each other.
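A minimal sketch of a job system along these lines, not the project's actual code. All names (`ThreadPool`, `PushJob`, `WaitForAll`) are hypothetical, and the single mutex around the job list is an assumption made to keep the example short. Jobs are pushed as lambdas, workers take them from the back of the list, and a counter of unfinished jobs lets the caller wait when it has to.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical names throughout; a sketch, not the engine's real API.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t workerCount) {
        for (std::size_t i = 0; i < workerCount; ++i)
            myWorkers.emplace_back([this] { WorkerLoop(); });
    }

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(myMutex);
            myIsStopping = true;
        }
        myJobAvailable.notify_all();
        for (std::thread& worker : myWorkers) worker.join();
    }

    // Push a job, typically a lambda, and count it as unfinished.
    void PushJob(std::function<void()> job) {
        assert(job && "job must be callable");
        {
            std::lock_guard<std::mutex> lock(myMutex);
            myJobs.push_back(std::move(job));
            ++myUnfinishedJobs;
        }
        myJobAvailable.notify_one();
    }

    // Block the caller until the unfinished-job counter reaches zero.
    void WaitForAll() {
        std::unique_lock<std::mutex> lock(myMutex);
        myAllDone.wait(lock, [this] { return myUnfinishedJobs == 0; });
    }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(myMutex);
                myJobAvailable.wait(lock, [this] { return myIsStopping || !myJobs.empty(); });
                if (myJobs.empty()) return;      // stopping and nothing left to run
                job = std::move(myJobs.back());  // back to front: newest job first
                myJobs.pop_back();
            }
            job();  // run outside the lock so other workers can fetch jobs
            {
                std::lock_guard<std::mutex> lock(myMutex);
                if (--myUnfinishedJobs == 0) myAllDone.notify_all();
            }
        }
    }

    std::mutex myMutex;
    std::condition_variable myJobAvailable;
    std::condition_variable myAllDone;
    std::vector<std::function<void()>> myJobs;
    std::size_t myUnfinishedJobs = 0;
    bool myIsStopping = false;
    std::vector<std::thread> myWorkers;
};
```

A system would call something like `pool.PushJob([&] { UpdateSomething(); });` each frame (`UpdateSomething` being a placeholder) and only call `pool.WaitForAll()` at a point where it depends on the results.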

 

With more time on my hands, I would have implemented job stealing and the ability to attach child jobs to a parent job.

Job funcion.png
Get job.png

Batching jobs

While doing the threading, I came to the conclusion that pushing many small jobs decreases the performance of the game. The solution was to push bigger jobs and partition the enemies so that each thread updates its own share of them.
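As an illustration of the batching idea, here is a small sketch that splits the enemy list into a few contiguous ranges and runs one job per range instead of one job per enemy. The `Enemy` struct, the function names and the movement update are made up for the example, and plain `std::thread` objects stand in for the thread pool so the block is self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the game's enemy type.
struct Enemy {
    float x = 0.0f;
    float speed = 1.0f;
};

struct Range {
    std::size_t begin;
    std::size_t end;  // one past the last index
};

// Split [0, count) into at most batchCount contiguous ranges of near-equal size.
std::vector<Range> PartitionIntoBatches(std::size_t count, std::size_t batchCount) {
    std::vector<Range> ranges;
    if (count == 0 || batchCount == 0) return ranges;
    batchCount = std::min(batchCount, count);
    const std::size_t base = count / batchCount;
    const std::size_t remainder = count % batchCount;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < batchCount; ++i) {
        // The first `remainder` batches take one extra element.
        const std::size_t size = base + (i < remainder ? 1 : 0);
        ranges.push_back({begin, begin + size});
        begin += size;
    }
    assert(begin == count);  // every element belongs to exactly one batch
    return ranges;
}

// One job per batch; std::thread stands in for pushing the job to a pool.
void UpdateEnemiesBatched(std::vector<Enemy>& enemies, float deltaTime, std::size_t batchCount) {
    std::vector<std::thread> workers;
    for (const Range& range : PartitionIntoBatches(enemies.size(), batchCount)) {
        workers.emplace_back([&enemies, range, deltaTime] {
            for (std::size_t i = range.begin; i < range.end; ++i)
                enemies[i].x += enemies[i].speed * deltaTime;
        });
    }
    for (std::thread& worker : workers) worker.join();
}
```

Because the ranges never overlap, the batches need no locking between them, and the per-job overhead is paid once per batch rather than once per enemy.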

Lessons learned

Being keen on increasing performance, I scoured the internet for more possible and fun ways to do this. I came across a technique that writes all the bone matrices of the animated models into a texture. Since this sounded like fun, I implemented it to enable instancing for skeletal meshes as well.
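To make the idea concrete, here is a rough CPU-side sketch of one possible layout, which is an assumption and not necessarily the one used in the project: each texture row belongs to one instance, each bone takes four consecutive RGBA float texels (one per matrix row), and every instance has the same number of bones. `Matrix4x4`, `BoneTexture` and the function names are hypothetical, and the upload to the graphics card is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 4x4 matrix; a stand-in for the engine's own matrix type.
using Matrix4x4 = std::array<float, 16>;

// CPU-side copy of the texture contents, 4 floats (RGBA) per texel.
struct BoneTexture {
    std::size_t widthInTexels = 0;  // boneCount * 4
    std::size_t height = 0;         // one row per instance
    std::vector<float> texels;
};

// Pack the bone matrices of every instance into one flat float array.
// Assumes every instance has the same number of bones.
BoneTexture PackBoneMatrices(const std::vector<std::vector<Matrix4x4>>& bonesPerInstance) {
    BoneTexture texture;
    if (bonesPerInstance.empty()) return texture;
    const std::size_t boneCount = bonesPerInstance.front().size();
    texture.widthInTexels = boneCount * 4;
    texture.height = bonesPerInstance.size();
    texture.texels.reserve(texture.widthInTexels * texture.height * 4);
    for (const std::vector<Matrix4x4>& bones : bonesPerInstance) {
        assert(bones.size() == boneCount);
        for (const Matrix4x4& bone : bones)
            texture.texels.insert(texture.texels.end(), bone.begin(), bone.end());
    }
    return texture;
}

// Index of the first float of one matrix row, mirroring the lookup a
// vertex shader would do from instance id, bone index and row.
std::size_t FloatOffset(const BoneTexture& texture, std::size_t instance,
                        std::size_t bone, std::size_t row) {
    const std::size_t texel = instance * texture.widthInTexels + bone * 4 + row;
    return texel * 4;
}
```

With this layout the whole array has to be rebuilt and re-uploaded whenever any bone changes, which for animated enemies means every frame.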

 

Sadly, it turned out that performance actually got worse with this approach on skeletal meshes, because the texture had to be recalculated each frame and sent to the graphics card.

 

If I were to do it all over again, I would probably implement it only for looping animations whose world position is static, since the texture can then be calculated once and reused for all the animations. But then again, it would be better to use structured buffers to achieve the same thing.


Refactoring a project that is more or less done, and trying to implement a lot of improvements that the code wasn't designed for, is hard. At the same time it's a fun challenge, since you feel a sense of achievement when you improve it.

“I am part of The Game Assembly’s internship program. As per the agreement between the Games Industry and The Game Assembly, neither student nor company may be in contact with one another regarding internships before April 23rd. Any internship offers can be made on May 5th, at the earliest.”

All pixel art by Alexander Andersson
Couldron.gif