Crimild v4.10.0 is out

This is it.

Crimild v4.10.0 is out, and it will be the last of the 4.x versions. From now on, I’ll be focused on the next major release of Crimild, which will bring a lot of changes.

But first let’s talk about what’s included in v4.10.0:

Render Graphs

I talked about them in the last couple of posts. Render graphs are great for creating highly modular render pipelines by combining different nodes representing render passes and attachments.

There are many nodes included in this release and many more will come in future versions.

Shader Graphs

Although shader graphs were actually introduced in v4.9, I ended up refactoring them to work in a similar way as render graphs do, simplifying both the internal implementation as well as the API.

Now, each node in the graph represents either a variable or an expression, and there’s also a way to discard nodes that are not relevant to the end result. 
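To make the idea concrete, here’s a minimal sketch of what “each node is either a variable or an expression” could look like. The names and structure are illustrative only, not Crimild’s actual shader graph API: nodes know their inputs, and anything not reachable from the output node can be discarded.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch (not Crimild's actual API): each node is either a
// variable (no inputs) or an expression built from other nodes.
struct ShaderNode {
    std::string name;
    std::vector<std::shared_ptr<ShaderNode>> inputs; // empty for variables
};

using NodePtr = std::shared_ptr<ShaderNode>;

inline NodePtr variable(const std::string &name) {
    return std::make_shared<ShaderNode>(ShaderNode{name, {}});
}

inline NodePtr expression(const std::string &op, std::vector<NodePtr> inputs) {
    return std::make_shared<ShaderNode>(ShaderNode{op, std::move(inputs)});
}

// Collect every node reachable from the output; anything else is not
// relevant to the end result and can be dropped.
inline void markRelevant(const NodePtr &node, std::vector<std::string> &out) {
    out.push_back(node->name);
    for (auto &in : node->inputs) {
        markRelevant(in, out);
    }
}
```

With this shape, “discarding nodes” is just a reachability walk from the final output node.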

The translation-to-GLSL mechanism has also been simplified, though I guess it could receive a little more love in the future.

Most importantly, this newer API allowed me to create… 

Crimild Shading Language

Well, it’s not an actual programming language, but rather a set of functions that let us write shaders in plain C++, disregarding the actual graphics API used for rendering.

Of all the new features included in v4.10.0, this is the one that got me most excited, and I’m really looking forward to creating shaders this way.
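The core trick behind writing shaders in plain C++ can be sketched roughly like this. The names below are mine, not the actual Crimild Shading Language API: C++ functions and operators build up an expression, and the GLSL text is derived from it rather than written by hand.

```cpp
#include <sstream>
#include <string>

// Illustrative sketch of a C++-embedded shading language (not Crimild's
// actual CSL): each value carries the GLSL translation of its subexpression,
// and C++ operators/functions compose those translations.
struct Expr {
    std::string glsl; // GLSL source for this subexpression
};

inline Expr vec3(float x, float y, float z) {
    std::ostringstream os;
    os << "vec3( " << x << ", " << y << ", " << z << " )";
    return {os.str()};
}

inline Expr operator*(const Expr &a, const Expr &b) {
    return {"( " + a.glsl + " * " + b.glsl + " )"};
}

inline Expr dot(const Expr &a, const Expr &b) {
    return {"dot( " + a.glsl + ", " + b.glsl + " )"};
}
```

A Lambert term written as `dot(normal, lightDir) * lightColor` in C++ would then translate itself to the equivalent GLSL, and in principle another backend (MSL, SPIR-V) could be swapped in without touching the shader code.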

UI Canvas & Layout

Last but not least, I started working on several tools for creating UI elements, either in screen or world space. As it is right now, only basic UI elements can be created, but there’s support for a very expressive set of layout constraints to arrange them in a canvas, whose size is defined independently of the actual screen resolution and aspect ratio.
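The resolution-independent part can be sketched as follows. This is a guess at the mechanics, not the actual Crimild UI API: elements are laid out in virtual canvas units, and a mapping to real pixels happens only at render time.

```cpp
// Hypothetical sketch of resolution-independent canvas layout (illustrative
// names, not the actual Crimild UI API).
struct Rect {
    float x, y, w, h;
};

// Map a rect expressed in virtual canvas units to actual screen pixels.
inline Rect canvasToScreen(const Rect &r, float canvasW, float canvasH,
                           float screenW, float screenH) {
    const float sx = screenW / canvasW;
    const float sy = screenH / canvasH;
    return {r.x * sx, r.y * sy, r.w * sx, r.h * sy};
}

// A simple "centered in parent" layout constraint.
inline Rect centerIn(const Rect &parent, float w, float h) {
    return {parent.x + 0.5f * (parent.w - w),
            parent.y + 0.5f * (parent.h - h),
            w, h};
}
```

The same canvas layout then works unchanged at 1280×720 or 3840×2160; only the final mapping differs.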

Minor fixes and updates

As usual, new releases come with a bunch of fixes and minor updates to existing features, and v4.10.0 is no exception.

There are a couple of new containers: Digraph and Set.
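For the curious, here’s the kind of thing a directed-graph container enables; the interface below is my own minimal sketch, not the actual `Digraph` API from the release.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Minimal directed-graph sketch in the spirit of a Digraph container
// (interface is a guess, not Crimild's actual API).
template <typename T>
class Digraph {
public:
    void addEdge(const T &from, const T &to) {
        _edges[from].insert(to);
        _edges.try_emplace(to); // make sure the target node exists too
    }

    const std::set<T> &outgoing(const T &node) const {
        static const std::set<T> empty;
        auto it = _edges.find(node);
        return it != _edges.end() ? it->second : empty;
    }

    std::size_t size() const { return _edges.size(); }

private:
    std::map<T, std::set<T>> _edges;
};
```

A container like this is the natural backbone for both render graphs and shader graphs, which is presumably why it landed in the same release.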

In addition, many changes have been made to how render resources are internally handled.

And I finally fixed some math bugs that have been causing issues for quite some time.

Please refer to the complete release notes on GitHub.

What’s next?

Crimild v4.10.0 includes a lot of (experimental) features that are going to become critical players in v5, Crimild’s next major release.

The biggest goal for next year will be to refactor the entire rendering system, which has become quite limited and it’s time for it to level up. I’ll be focusing first on improving the existing OpenGL renderer before moving to Vulkan. Not sure what will happen with Metal support, though. 

Indeed, next year is going to be very exciting…

Optimizing Render Graphs

Last week we talked about what render graphs are and how they help us build customizable pipelines for our projects thanks to their modularity.

But render graphs are not only useful because of their modularity. There are also other benefits when we want to optimize our pipeline.

REUSING ATTACHMENTS

Since each render pass may generate one or more FBOs (each including several render targets), it would be great if we could find a way to reuse them and/or their attachments. Otherwise, we’ll quickly run out of memory on the GPU.

How do we achieve reusability? Simple. Let’s go back to the simple deferred lighting graph we saw in the previous post.

The Depth attachment is a full-screen 32-bit floating-point texture, and it’s pretty much unique since no other attachment shares that texture format. We will assume that the rest of the attachments (normal, opaque, lighting, etc.) are also full screen, but they have an RGBA8 color format.

By looking at the graph, it’s clear that the Normal attachment is no longer needed once we’ve accumulated all lighting information (since no other render pass makes use of it). Therefore, if we manage to schedule the passes correctly, we can reuse that attachment for storing the result of the translucent pass, for example.

And that’s it. Thanks to our graph design, we can easily identify which inputs and outputs each render pass has at the time of its execution. We also know how many passes are linked with any given attachment.
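One way to turn that knowledge into actual reuse is simple read counting. The sketch below is illustrative (not the actual Crimild implementation): each attachment tracks how many passes still need to read it, and once that count hits zero its texture goes back to a pool keyed by format, where a later pass can claim it.

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative attachment-reuse sketch, not the actual Crimild code.
struct AttachmentPool {
    std::map<std::string, int> pending;          // attachment -> remaining reads
    std::map<std::string, std::string> formatOf; // attachment -> texture format
    std::map<std::string, std::vector<std::string>> freeByFormat;

    void declare(const std::string &name, const std::string &format, int reads) {
        pending[name] = reads;
        formatOf[name] = format;
    }

    // A pass finished reading `name`; recycle it when nobody else needs it.
    void release(const std::string &name) {
        if (--pending[name] == 0) {
            freeByFormat[formatOf[name]].push_back(name);
        }
    }

    // Try to reuse a recycled attachment with the requested format.
    bool acquire(const std::string &format, std::string &reused) {
        auto &pool = freeByFormat[format];
        if (pool.empty()) return false;
        reused = pool.back();
        pool.pop_back();
        return true;
    }
};
```

In the deferred example, the Normal attachment (RGBA8, one remaining reader after lighting) would be released by the lighting pass and then acquired by the translucent pass, exactly as described above.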

There’s a catch, though.

Let’s assume we want to generate a debug view like this one:

Top row: depth, normal, opaque and translucent. Left column: opaque+translucent, sepia tint and UI

In order to achieve that image, we need to modify our render graph to make it look like this:

The final frame (Debug Frame) is created by the Debug render pass, which reads from several of the previously created attachments in order to compose the debug frame that is displayed. This prevents us from reusing attachments completely because all of them are now needed at all times. That might be an acceptable loss in this scenario because it’s only for debug, but you definitely need to plan each dependency correctly if you want to maximize reusability.

For my implementation, I’ve decided to reuse only attachments, while FBOs are created and discarded on demand. This helps minimize memory bandwidth while providing maximum flexibility for creating offscreen buffers.

DISCARDING NODES

Another advantage of using render graphs is that we’re able to identify, at graph compilation time, which nodes are actually relevant to the final frame. That is, of all the nodes in the graph, we’re only interested in keeping and executing those that are actually connected to the final node, which is the resulting frame for the graph.

For this reason, each render graph defines which attachment serves as the resulting frame for the entire process. Depending on which attachment is set as the final frame, some render passes become irrelevant and should be discarded.
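The discard step is essentially a backwards reachability walk. Here’s a small sketch (illustrative names, not the actual implementation): starting from the chosen final frame, we follow each node back to its producers and keep only what we can reach.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustrative sketch of the discard step: `writtenBy` maps each node
// (attachment or pass) to the nodes that produce or feed it.
inline std::set<std::string> keepRelevant(
    const std::map<std::string, std::vector<std::string>> &writtenBy,
    const std::string &finalFrame) {
    std::set<std::string> keep;
    std::vector<std::string> frontier{finalFrame};
    while (!frontier.empty()) {
        auto node = frontier.back();
        frontier.pop_back();
        if (!keep.insert(node).second) continue; // already visited
        auto it = writtenBy.find(node);
        if (it != writtenBy.end()) {
            for (auto &producer : it->second) frontier.push_back(producer);
        }
    }
    return keep;
}
```

With the scene frame selected as the output, the debug pass is unreachable from it and therefore dropped; with the debug frame selected, everything survives.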

Once again, look at the debug render graph above, paying special attention to the debug nodes at the right.

We have two possible final frames: the one that contains only the scene (bottom center) and the debug one (bottom right).

If we set the scene frame as the resulting frame, then the Debug Pass will be discarded since its result is no longer relevant, and the final render graph will look like the one at the very top of this post. Then, after compiling the render graph, the passes will be executed as follows:

That’s great, but why? Why add extra nodes that are going to be discarded anyway? Well, you shouldn’t do that… except that doing so allows you to create something like an uber-pipeline, including debug nodes and different branches too. Then, by defining which one is the actual final frame (maybe using configuration flags), you can end up producing different pipelines. I know, it might seem counterintuitive at first, but in practice it’s really useful.

Closing comments

I’m going to leave it here for now, since this article has already become much longer than expected. 

Render graphs are kind of an experimental feature at the time of this writing, but I’m hoping they will become one of the key players in the next major version of Crimild. Together with shader graphs, they should help me create entire modular pipelines in plain C++ and forget about OpenGL/Metal/Vulkan (almost) completely. 

Now it’s time to prepare one more release before the year ends 🙂

Customizing render pipelines with render graphs

Attempting to work with advanced visual effects (like SSAO) and post-processing in Crimild has always been very painful. ImageEffects, introduced a while ago, were somewhat useful but limited to whatever information the (few) available shared frame buffers contained after rendering a scene. 

To make things worse, maintaining different render paths (i.e. forward, deferred, mobile) usually required a lot of duplicated logic and/or code, and sooner or later some of them just stopped working (at this point I still don’t know why there’s code for deferred rendering, since it has been broken for at least a year).

Enter Render Graphs…

WHAT ARE RENDER GRAPHS?

Render graphs are a tool for organizing processes that take place when rendering a scene, as well as the resources (i.e. frame buffers) that are required to execute them.

It’s a relatively new rendering paradigm that achieves highly modular render pipelines which can be easily customized and extended. 

WHY ARE Render Graphs HELPFUL?

First of all, they provide high modularity. Processes are connected in a graph-like structure and are pretty much independent of each other. This means we can create pipelines by plugging lots of different nodes together.

Do you need a high fidelity pipeline for AAA games? Then add some nodes for deferred lighting, SSAO, post-processing and multiple shadow casters.

Do you have to run the game on low-end hardware or a mobile phone? Use a couple of forward lighting nodes and simple shadows. Do you really need a depth pre-pass?

In addition, a render graph helps with resource management. Each render pass may produce one or more textures, but do we really need as many textures as passes? Can we reuse some of them? All of them?

Finally, technologies like Vulkan, Metal or DX12 allow us to execute multiple processes in parallel, which is amazing, but it comes at the cost of having to synchronize those processes manually. A render graph helps identify synchronization barriers for those processes based on the resources they consume.
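The basic rule for spotting those barriers can be sketched in a few lines. This is a deliberately simplified model (real Vulkan/Metal barrier placement involves image layouts, stages and queues, none of which appear here): two passes conflict whenever one writes a resource the other touches.

```cpp
#include <set>
#include <string>

// Simplified sketch: a pass conflicts with another when its writes overlap
// the other's reads or writes (write-after-read counts too).
struct PassIO {
    std::set<std::string> reads;
    std::set<std::string> writes;
};

inline bool overlaps(const std::set<std::string> &a,
                     const std::set<std::string> &b) {
    for (auto &x : a)
        if (b.count(x)) return true;
    return false;
}

inline bool needsBarrier(const PassIO &a, const PassIO &b) {
    return overlaps(a.writes, b.reads) || overlaps(a.writes, b.writes) ||
           overlaps(a.reads, b.writes);
}
```

Passes whose resource sets are fully disjoint, like a UI pass and a depth pre-pass, could in principle be recorded in parallel with no barrier between them.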

OK, BUT HOW DO THEY WORK?

Like I said above, a render graph defines a render pipeline by using processes (or render passes) and resources (or attachments), each of them represented as a node in a graph. Here’s a simple render graph implementing a (simplified) deferred lighting pipeline:

The graph is composed of two types of nodes: Render Passes (circles) and Attachments (squares). Passes may read from zero, one or multiple attachments, and write to at least one attachment. Attachments are the only way to connect passes together.

For example, in the image above, the Depth Pass will produce two attachments: Depth and Normal. The latter is only needed for lighting accumulation, but the Depth attachment is used multiple times (by the lighting, opaque and translucent render passes).

Once lighting accumulation is complete, its result is blended together with the one produced by the opaque render pass. Then, we blend the resulting attachment with the one written by the translucent render pass to achieve the final image for the frame.
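The pipeline just described can be expressed as a dependency map and scheduled with a simple topological sort. The sketch below is my own (it assumes the graph is acyclic, and the names are illustrative, not Crimild’s API): each pass lists the passes it depends on, and we repeatedly execute whatever has no unmet dependencies.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustrative scheduling sketch: `deps` maps each pass to the passes it
// depends on. Assumes a DAG (a cycle would loop forever in this toy version).
inline std::vector<std::string> schedule(
    std::map<std::string, std::set<std::string>> deps) {
    std::vector<std::string> order;
    while (!deps.empty()) {
        for (auto it = deps.begin(); it != deps.end();) {
            bool ready = true;
            for (auto &d : it->second) {
                if (deps.count(d)) { ready = false; break; }
            }
            if (ready) {
                order.push_back(it->first); // all dependencies already ran
                it = deps.erase(it);
            } else {
                ++it;
            }
        }
    }
    return order;
}
```

For the simplified deferred pipeline, the depth pass comes out first and the final blend last, with lighting and opaque free to run in either order (or in parallel) in between.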

The following image shows the final rendered frame (big image), as well as each of the intermediate attachments used by this pipeline. Notice that even the UI is rendered in its own texture.

Top row: depth, normal, opaque and translucent. Left column: opaque+translucent, sepia tint and UI

If you want to read more about render graphs, here are a couple of links to articles I used as reference for my own implementation:

In the coming weeks I’m going to explain how render graphs help optimize our pipeline by reusing attachments and discarding irrelevant passes.

Enjoy your coffee!