Optimizing Render Graphs

Last week we talked about what render graphs are and how their modularity helps us build customizable pipelines for our projects.

But render graphs are not only useful because of their modularity. There are also other benefits when we want to optimize our pipeline.


Since each render pass may generate one or more FBOs (each including several render targets), it would be great if we could find a way to reuse them and/or their attachments. Otherwise, we’ll quickly run out of memory on our GPU.

How do we achieve reusability? Simple. Let’s go back to the simple deferred lighting graph we saw in our previous post.

The Depth attachment is a full-screen 32-bit floating-point texture and it’s pretty much unique, since no other attachment shares that texture format. We will assume that the rest of the attachments (normal, opaque, lighting, etc.) are also full screen, but that they use an RGBA8 color format.

By looking at the graph, it’s clear that the Normal attachment is no longer needed once we’ve accumulated all lighting information (since no other render pass makes use of it). Therefore, if we manage to schedule the passes correctly, we can reuse that attachment for storing the result of the translucent pass, for example.

And that’s it. Thanks to our graph design, we can easily identify which inputs and outputs each render pass has at the time of its execution. We also know how many passes are linked with any given attachment.
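To make the idea more concrete, here is a minimal sketch of how attachments could be assigned to physical textures once the passes are scheduled. It is not Crimild’s actual implementation: the `Attachment`, `Pass` and `Allocation` types and the `assignSlots` function are hypothetical names, and the sketch assumes that passes are already sorted in execution order and that every attachment is full screen, so only the format has to match.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not Crimild's actual classes.
enum class Format { Depth32F, RGBA8 };

struct Attachment {
    std::string name;
    Format format;
};

struct Pass {
    std::string name;
    std::vector<std::size_t> reads;  // indices into the attachment list
    std::vector<std::size_t> writes; // indices into the attachment list
};

struct Allocation {
    std::vector<std::size_t> slotOf; // physical slot backing each attachment
    std::size_t slotCount = 0;       // number of physical textures required
};

constexpr std::size_t kNone = static_cast<std::size_t>(-1);

// Assigns a physical texture slot to every written attachment, recycling a
// slot as soon as no later pass touches the attachment stored in it.
// Assumes `passes` is already sorted in execution order.
Allocation assignSlots(const std::vector<Attachment> &attachments,
                       const std::vector<Pass> &passes)
{
    // Index of the last pass that reads or writes each attachment.
    std::vector<std::size_t> lastUse(attachments.size(), kNone);
    for (std::size_t p = 0; p < passes.size(); ++p) {
        for (std::size_t a : passes[p].reads) {
            assert(a < attachments.size());
            lastUse[a] = p;
        }
        for (std::size_t a : passes[p].writes) {
            assert(a < attachments.size());
            lastUse[a] = p;
        }
    }

    Allocation result;
    result.slotOf.assign(attachments.size(), kNone);
    std::vector<Format> slotFormat; // format of each physical slot
    std::vector<bool> slotFree;     // whether a slot can be handed out again

    for (std::size_t p = 0; p < passes.size(); ++p) {
        for (std::size_t a : passes[p].writes) {
            if (result.slotOf[a] != kNone) {
                continue; // already backed by a texture
            }
            // Look for a free slot with a matching format.
            std::size_t slot = kNone;
            for (std::size_t s = 0; s < slotFormat.size(); ++s) {
                if (slotFree[s] && slotFormat[s] == attachments[a].format) {
                    slot = s;
                    break;
                }
            }
            if (slot == kNone) {
                // Nothing to recycle: a new physical texture is needed.
                slot = slotFormat.size();
                slotFormat.push_back(attachments[a].format);
                slotFree.push_back(false);
            }
            slotFree[slot] = false;
            result.slotOf[a] = slot;
        }
        // Release only after the pass has run, so a pass never writes into
        // a texture it is still reading from.
        for (std::size_t a = 0; a < attachments.size(); ++a) {
            if (lastUse[a] == p && result.slotOf[a] != kNone) {
                slotFree[result.slotOf[a]] = true;
            }
        }
    }

    result.slotCount = slotFormat.size();
    return result;
}
```

Which attachment ends up inheriting the Normal texture depends entirely on how the passes are scheduled: the first RGBA8 attachment written after the last pass that uses Normal gets its slot.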

There’s a catch, though.

Let’s assume we want to generate a debug view like this one:

Top row: depth, normal, opaque and translucent. Left column: opaque+translucent, sepia tint and UI

In order to achieve that image, we need to modify our render graph to make it look like this:

The final frame (Debug Frame) is created by the Debug render pass, which reads from several of the previously created attachments in order to compose the debug frame that is displayed. This prevents us from reusing those attachments at all, because every one of them must now stay alive until the Debug pass runs. That might be an acceptable loss in this scenario because it’s only for debugging, but you definitely need to plan each dependency carefully if you want to maximize reusability.
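A quick way to see the effect is to compute, for each attachment, the index of the last pass that touches it. The sketch below does just that; `PassIO` and `lastUseByAttachment` are hypothetical names, attachments are identified by name, and the passes are assumed to be listed in execution order.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical pass description; attachments are identified by name.
struct PassIO {
    std::string name;
    std::vector<std::string> reads;
    std::vector<std::string> writes;
};

// Maps each attachment name to the index of the last pass that reads or
// writes it. An attachment can only be recycled after that pass has run.
// Assumes `passes` is listed in execution order.
std::map<std::string, std::size_t> lastUseByAttachment(const std::vector<PassIO> &passes)
{
    std::map<std::string, std::size_t> lastUse;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (const std::string &name : passes[i].reads) {
            lastUse[name] = i;
        }
        for (const std::string &name : passes[i].writes) {
            lastUse[name] = i;
        }
    }
    return lastUse;
}
```

If a pass appended at the end of the list reads Normal, the last use of Normal jumps from the lighting pass to that final pass, and the same happens to every other attachment it reads.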

For my implementation, I’ve decided to reuse only attachments, while FBOs are created and discarded on demand. This keeps memory usage low while still providing maximum flexibility for creating offscreen buffers.


Another advantage of using render graphs is that, at graph compilation time, we’re able to identify which nodes are actually relevant to the final frame. That is, of all the nodes in the graph, we’re only interested in keeping and executing those that are actually connected to the final node, which is the resulting frame for the graph.

For this reason, each render graph defines which attachment serves as the resulting frame for the entire process. Depending on which attachment is set as the final frame, some render passes will become irrelevant and should be discarded.
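Discarding irrelevant passes can be sketched as a backward sweep that starts at the final frame and keeps only the passes producing something that is still needed. Again, this is an illustration under assumptions rather than the actual Crimild code: `PassDesc` and `cullPasses` are hypothetical names, attachments are identified by name, and the passes are assumed to be declared in a valid execution order (producers before consumers).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical pass description; attachments are identified by name.
struct PassDesc {
    std::string name;
    std::vector<std::string> reads;
    std::vector<std::string> writes;
};

// Returns the passes that contribute to `finalFrame`, in their original order.
// Assumes `passes` is listed in a valid execution order, so a single
// backward sweep is enough to find every dependency.
std::vector<PassDesc> cullPasses(const std::vector<PassDesc> &passes,
                                 const std::string &finalFrame)
{
    assert(!finalFrame.empty());

    std::set<std::string> needed = { finalFrame };
    std::vector<bool> keep(passes.size(), false);

    for (std::size_t i = passes.size(); i-- > 0;) {
        const PassDesc &pass = passes[i];
        bool producesNeeded = false;
        for (const std::string &out : pass.writes) {
            if (needed.count(out) > 0) {
                producesNeeded = true;
                break;
            }
        }
        if (!producesNeeded) {
            continue; // nothing downstream depends on this pass
        }
        keep[i] = true;
        // Everything this pass reads is now required as well.
        needed.insert(pass.reads.begin(), pass.reads.end());
    }

    std::vector<PassDesc> result;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        if (keep[i]) {
            result.push_back(passes[i]);
        }
    }
    return result;
}
```

Calling `cullPasses` with the scene attachment as the final frame drops any pass whose output only feeds the debug frame, while calling it with the debug attachment keeps everything the debug pass reads.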

Once again, look at the debug render graph above, paying special attention to the debug nodes at the right.

We have two possible final frames. The one that only contains the scene (bottom center) and the debug one (bottom right).

If we set the scene frame as the resulting frame, then the Debug pass will be discarded, since its result is no longer relevant, and the final render graph will look like the one at the very top of this post. Then, after compiling the render graph, the passes will be executed as follows:

That’s great, but why? Why add extra nodes that are going to be discarded anyway? Well, you normally wouldn’t… except that doing so lets you create something like an uber-pipeline, including debug nodes and different branches too. Then, by defining which one is the actual final frame (maybe using configuration flags), you can end up producing different pipelines from the same graph. I know, it might seem counterintuitive at first, but in practice it’s really useful.

Closing comments

I’m going to leave it here for now, since this article has already become much longer than expected. 

Render graphs are kind of an experimental feature at the time of this writing, but I’m hoping they will become one of the key players in the next major version of Crimild. Together with shader graphs, they should help me create entire modular pipelines in plain C++ and forget about OpenGL/Metal/Vulkan (almost) completely. 

Now it’s time to prepare one more release before the year ends 🙂
