(Re)Introducing: The Profiler

Once upon a commit, in a branch far, far away, existed an ancient warrior long forgotten. According to legends, the warrior will return shall the kingdom ever became endangered by the Great Delay. And so the warrior… Sorry, this is boring. Guess who’s back! Yes, the Profiler. How did you…? Right, it’s on the title.

When Crimild was still young and unstable (now it’s just old but still unstable) there was a bit of code used to measure execution time for specific tasks. That was in the days prior to SourceForge, so I’m talking about ancient code. As a matter of fact, that implementation for a real-time profiler was based on the one described in the Game Programming Gems 1 book, including the enhancements proposed by Richard “SuperPig” Fine in his Enginuity series of articles.

But the beauty of that code has to do with its simplicity and how it manages to provide a lot of information in real-time. Take a look at the following image, where we can see the output of the real-time profiler:

Screen Shot 2015-02-17 at 12.42.01 PM

At the right-most column, we can see the names of all the active profiles. Indentation is used to indicate a parent-child relationship. For example, we can see the scene and physics update pass running in the background loop, as well as the different steps involved in rendering the scene using the forward render pass.

The first three columns to the left provide the minimum, average and maximum time for each profiler (measured in milliseconds). The current time is displayed next, along with the “call count” for each profiler. Then, we can see that the message dispatching task is executed 34 times in the background thread, adding to an average of 1.2 milliseconds per iteration. On the other hand, swapping buffers is executed only once per frame but its average time is 10 milliseconds! If we ever need to optimize our simulation, that should be our starting point. If only I knew how to do that…

Anyway, the original profiler code was indeed very old, so I was forced to change it quite a bit before using it. For example, I added support for multi-threading (hence the two roots in the last column) and implemented a text output that can be displayed on screen. By default, the profiler information is show in the terminal window.

Using the profiler in code is very, very simple as show in the next example:

void someFunction( void )
{
   CRIMILD_PROFILE( "My profiler" )
   // do some work here
}

The CRIMILD_PROFILE macro creates a profiler instance, which will be destroyed once the function (or block) is completed. Then, the execution time will be computed and saved to be later displayed on screen. Because of the nature of profiling itself, the code to be measured should be meaningful. There isn’t much information gained while profiling a code that is executed at the nano-second level and called only once.

Going a little bit further, there is a profiling task running in the main loop that, among other things, computes averages and resets the counters. The profiling task is disabled by default, meaning that it won’t interfere with our code unless we explicitly enable it. For that purpose, a couple of messages are provided to enable/disable the task and its output, which in turn can be customized to be rendered on-screen or displayed in the terminal where the simulation is running.

In conclusion, a real-time profiler provides a lot of useful information about what’s going on in our simulation without interfering (too much) with the actual code that is being profiled. It is, by no means, intended to replace professional profilers, but it’s still a great tool and requires minimal effort from the developer. Hopefully, it will help me catch any performance problem in future releases.