Customizing render pipelines with render graphs

Working with advanced visual effects (like SSAO) and post-processing in Crimild has always been painful. ImageEffects, introduced a while ago, were somewhat useful, but they were limited to whatever information the (few) available shared frame buffers contained after rendering a scene.

To make things worse, maintaining different render paths (e.g. forward, deferred, mobile) usually required a lot of duplicated logic and/or code, and sooner or later some of them just stopped working (at this point I still don’t know why there’s code for deferred rendering, since it has been broken for at least a year).

Enter Render Graphs…

WHAT ARE RENDER GRAPHS?

Render graphs are a tool for organizing the processes that take place when rendering a scene, as well as the resources (e.g. frame buffers) required to execute them.

It’s a relatively new rendering paradigm that enables highly modular render pipelines, which can be easily customized and extended.
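To make the idea more concrete, here’s a minimal sketch of how those two kinds of nodes could be modeled in C++. The names are hypothetical and don’t necessarily match Crimild’s actual implementation:

#include <functional>
#include <memory>
#include <string>
#include <vector>

// A resource node: a frame buffer attachment (color, depth, etc.)
struct Attachment {
    std::string name;
};

// A process node: a render pass that consumes and produces attachments
struct RenderPass {
    std::string name;
    std::vector< Attachment * > reads;   // zero or more inputs
    std::vector< Attachment * > writes;  // at least one output
    std::function< void( void ) > execute;
};

// The graph owns all nodes; the edges are implied by reads/writes
struct RenderGraph {
    std::vector< std::unique_ptr< Attachment > > attachments;
    std::vector< std::unique_ptr< RenderPass > > passes;
};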

WHY ARE RENDER GRAPHS HELPFUL?

First of all, they provide high modularity. Processes are connected in a graph-like structure and remain pretty much independent of each other. This means we can create pipelines by plugging different nodes together.

Do you need a high fidelity pipeline for AAA games? Then add some nodes for deferred lighting, SSAO, post-processing and multiple shadow casters.

Do you have to run the game on low-end hardware or a mobile phone? Use a couple of forward lighting nodes and simple shadows. Do you really need a depth pre-pass?

In addition, a render graph helps with resource management. Each render pass may produce one or more textures, but do we really need as many textures as passes? Can we reuse some of them? All of them?

Finally, technologies like Vulkan, Metal and DX12 allow us to execute multiple processes in parallel, which is amazing, but it comes at the cost of having to synchronize those processes manually. A render graph helps identify synchronization barriers for those processes based on the resources they consume.
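Since every dependency between two passes goes through an attachment, both a valid execution order and the points where barriers are needed can be derived mechanically from the graph. Building on the sketch above (again, hypothetical code; it assumes the graph is acyclic):

#include <map>
#include <set>

// Order passes so that writers always run before readers.
// Wherever a pass reads an attachment written by a previous
// pass, the backend would insert a synchronization barrier.
std::vector< RenderPass * > sortPasses( RenderGraph &graph )
{
    // map each attachment to the pass that writes it
    std::map< Attachment *, RenderPass * > writerOf;
    for ( auto &pass : graph.passes ) {
        for ( auto *att : pass->writes ) {
            writerOf[ att ] = pass.get();
        }
    }

    std::vector< RenderPass * > order;
    std::set< RenderPass * > scheduled;
    while ( order.size() < graph.passes.size() ) {
        for ( auto &pass : graph.passes ) {
            if ( scheduled.count( pass.get() ) > 0 ) {
                continue;
            }
            // a pass is ready once every attachment it reads
            // already has its writer scheduled
            bool ready = true;
            for ( auto *att : pass->reads ) {
                auto writer = writerOf.find( att );
                if ( writer != writerOf.end() && scheduled.count( writer->second ) == 0 ) {
                    ready = false;
                    break;
                }
            }
            if ( ready ) {
                order.push_back( pass.get() );
                scheduled.insert( pass.get() );
            }
        }
    }
    return order;
}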

OK, BUT HOW DO THEY WORK?

Like I said above, a render graph defines a render pipeline by using processes (or render passes) and resources (or attachments), each of them represented as a node in a graph. Here’s a simple render graph implementing a (simplified) deferred lighting pipeline:

The graph is composed of two types of nodes: Render Passes (circles) and Attachments (squares). Passes may read from zero, one or multiple attachments, and must write to at least one attachment. Attachments are the only way to connect passes together.

For example, in the image above, the Depth Pass produces two attachments: Depth and Normal. The latter is only needed for lighting accumulation, but the Depth attachment is used multiple times (by the lighting, opaque and translucent render passes).

Once lighting accumulation is complete, its result is blended together with the one produced by the opaque render pass. Then, we blend the resulting attachment with the one written by the translucent render pass to achieve the final image for the frame.
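Using the structures sketched earlier, this whole pipeline could be wired up as follows. The helper functions are hypothetical; the point is that the pipeline is just data the engine can analyze:

// Small helpers to populate the graph (illustrative only)
Attachment *addAttachment( RenderGraph &g, std::string name )
{
    g.attachments.push_back( std::make_unique< Attachment >() );
    g.attachments.back()->name = name;
    return g.attachments.back().get();
}

RenderPass *addPass(
    RenderGraph &g, std::string name,
    std::vector< Attachment * > reads,
    std::vector< Attachment * > writes )
{
    auto pass = std::make_unique< RenderPass >();
    pass->name = name;
    pass->reads = reads;
    pass->writes = writes;
    g.passes.push_back( std::move( pass ) );
    return g.passes.back().get();
}

void buildDeferredPipeline( RenderGraph &graph )
{
    auto depth = addAttachment( graph, "depth" );
    auto normal = addAttachment( graph, "normal" );
    auto lighting = addAttachment( graph, "lighting" );
    auto opaque = addAttachment( graph, "opaque" );
    auto translucent = addAttachment( graph, "translucent" );
    auto composite = addAttachment( graph, "composite" );
    auto frame = addAttachment( graph, "frame" );

    addPass( graph, "depthPass", {}, { depth, normal } );
    addPass( graph, "lightingPass", { depth, normal }, { lighting } );
    addPass( graph, "opaquePass", { depth }, { opaque } );
    addPass( graph, "translucentPass", { depth }, { translucent } );
    addPass( graph, "blendOpaque", { lighting, opaque }, { composite } );
    addPass( graph, "blendTranslucent", { composite, translucent }, { frame } );
}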

The following image shows the final rendered frame (big image), as well as each of the intermediate attachments used in this pipeline. Notice that even the UI is rendered into its own texture.

Top row: depth, normal, opaque and translucent. Left column: opaque+translucent, sepia tint and UI

If you want to read more about render graphs, here are a couple of links to articles I used as reference for my own implementation:

Over the next few weeks I’m going to explain how render graphs help optimize our pipeline by reusing attachments and discarding irrelevant passes.

Enjoy your coffee!


Color masks and occluders

One of the newest features that will be included in the next major release of Crimild (coming soon) is support for color masks and invisible occluders in our scenes.

Occluders are objects that block the visibility path, fully or partially hiding whatever is behind them. An invisible occluder behaves in the same way: while the object itself is not drawn, it still prevents objects behind it from being rendered.

For example, in the following scene the teapot (in yellow) is an occluder. Other objects are orbiting around it and the scene is rendered normally.


Original scene with color mask enabled for all channels

By playing around with the color mask and turning it off for all channels, we can prevent the teapot itself from being drawn, yet it still blocks the objects passing behind it. The green plane is not affected by this behavior (that’s on purpose).
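Conceptually, the trick boils down to drawing the occluder first with all color writes masked out while keeping depth writes enabled. Crimild exposes this through its render state, but in raw OpenGL terms the idea looks like this (the draw calls are hypothetical placeholders):

#include <GL/gl.h> // <OpenGL/gl.h> on macOS

void drawTeapot( void );          // hypothetical draw calls
void drawOrbitingObjects( void );

void renderScene( void )
{
    glEnable( GL_DEPTH_TEST );

    // 1. Draw the occluder with all color channels masked out.
    //    It leaves no pixels on screen, but it still writes depth.
    glColorMask( GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE );
    glDepthMask( GL_TRUE );
    drawTeapot();

    // 2. Restore color writes and draw everything else.
    //    Fragments behind the teapot now fail the depth test.
    glColorMask( GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE );
    drawOrbitingObjects();
}

The only caveat is draw order: the occluder must be rendered before the objects it is supposed to hide, otherwise the depth test has nothing to reject against.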

The effect is a pretty cool one and has a lot of applications in games and simulations. It’s especially useful in augmented reality to mix real-life objects with virtual ones.

 

Praise the Metal – Part 6: Post-processing

I knew from the start that Le Voyage needed some kind of distinctive feature in order to be recognized out there. The gameplay was way too simple, so I focused on presentation instead. Since the game is based on early silent films, using some sort of post-processing effect to render noise and scratches was a must-have.

About Image Effects

In Crimild, image effects implement post-processing operations on a whole frame. Ranging from simple color replacement (as in sepia effects) to techniques like SSAO or Depth of Field, image effects are quite powerful things.

Le Voyage makes use of a simple image effect to achieve that old-film look.

Le Voyage with the old-film image effect applied

There are four different techniques applied in the image above:

  1. Sepia tone: All colors are mapped to sepia values, which is that brownish tint. No more blues, reds or grays; just different levels of sepia.
  2. Film grain: Noise produced on film stock by changes in luminosity (or so I was told).
  3. Scratches: Due to film degradation, old movies display scratches, burns and other kinds of artifacts over time.
  4. Vignetting: The corners of the screen look darker than the center, simulating the dimming of light. This technique is usually employed to frame important objects in the center of the screen, as in close-ups.

All of these effects are applied in a single pass for best performance.
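To give an idea of how the four stages compose, here’s a CPU-side C++ sketch of the per-pixel math. In the actual game this lives in a fragment shader, and every constant below is illustrative rather than the value Le Voyage really uses:

#include <algorithm>

struct Color { float r, g, b; };

// All four stages of the old-film look for a single pixel.
// (u, v) are normalized screen coordinates in [0, 1];
// grain and scratch are random, per-frame inputs.
Color oldFilm( Color c, float u, float v, float grain, bool scratch )
{
    // 1. Sepia tone: map luminance to a brownish tint
    float lum = 0.299f * c.r + 0.587f * c.g + 0.114f * c.b;
    Color out = { lum * 1.2f, lum * 1.0f, lum * 0.8f };

    // 2. Film grain: small random offset on every pixel
    out.r += grain;
    out.g += grain;
    out.b += grain;

    // 3. Scratches: darken pixels that fall on a scratch line
    if ( scratch ) {
        out.r *= 0.5f;
        out.g *= 0.5f;
        out.b *= 0.5f;
    }

    // 4. Vignetting: darken toward the corners
    float dx = u - 0.5f;
    float dy = v - 0.5f;
    float vig = 1.0f - 1.2f * ( dx * dx + dy * dy );
    out.r *= vig;
    out.g *= vig;
    out.b *= vig;

    auto clamp01 = []( float x ) {
        return std::min( 1.0f, std::max( 0.0f, x ) );
    };
    return { clamp01( out.r ), clamp01( out.g ), clamp01( out.b ) };
}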

How does it work?

Crimild implements post-processing using a technique called ping-pong, in which two buffers are swapped back and forth, taking turns as source and accumulation while processing image effects.

A scene is rendered into an off-screen framebuffer, which is designated as the source buffer. For each image effect, the source buffer is bound as a texture and used to read the pixel data that serves as input for the effect. The image effect is computed and rendered into a second buffer, known as the accumulation buffer. Then source and accumulation are swapped and, if there are more image effects to apply, the process starts again.

When there are no more image effects to process, the source buffer contains the final image that will be displayed on the screen.

Confused? Then the following image may help you (spoiler alert: it won’t):


Ping-pong buffer. For each image effect, the source and destination buffers are swapped. Once all effects have been processed, the source buffer contains the final image
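If the picture didn’t do the job either, here’s a minimal C++ sketch of the same loop, with hypothetical types (Crimild’s actual API differs):

#include <utility>
#include <vector>

struct Framebuffer { /* color texture, depth buffer, etc. */ };

struct ImageEffect {
    // reads 'source' as a texture, renders into 'destination'
    void apply( Framebuffer &source, Framebuffer &destination )
    {
        // bind 'destination', draw a full-screen quad sampling 'source'
    }
};

// The scene has already been rendered into bufferA (the source).
Framebuffer *applyImageEffects(
    std::vector< ImageEffect > &effects,
    Framebuffer &bufferA, Framebuffer &bufferB )
{
    Framebuffer *source = &bufferA;
    Framebuffer *accum = &bufferB;
    for ( auto &effect : effects ) {
        effect.apply( *source, *accum ); // compute the effect
        std::swap( source, accum );      // result becomes the next source
    }
    // whatever 'source' points to now holds the final image
    return source;
}

Note that when the number of effects is odd, the final image ends up in the second buffer, which is why the sketch returns a pointer instead of assuming it’s always bufferA.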

Le Voyage only uses one image effect, which applies all four stages in a single pass. No additional post-processing is required. If you want to know more about implementing the old-film effect, you can check this great article by Nutty.ca, on which I based mine.

Powered by Metal

Theoretically speaking, there’s no difference in how this effect is applied in Metal or OpenGL, as the same steps are required in both APIs. In practice, the Metal API is a bit simpler when it comes to handling framebuffers, so the code ends up more readable. And there’s no need for all that explicit error checking that OpenGL requires.

If you recall from a previous post where we talked about FBOs, I said that we need to define a texture for our color attachment. At the time, we did something like this:

renderPassDescriptor.colorAttachments[ 0 ].texture = getDrawable().texture;

That code sets the texture associated with a drawable to the first color attachment. This will render everything on the drawable itself, which in our case was the screen.

But in order to perform post-processing, we need an offscreen buffer. That means we need an actual output texture instead:

MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor 
   texture2DDescriptorWithPixelFormat: MTLPixelFormatBGRA8Unorm
                                width: /* screen width */
                               height: /* screen height */
                            mipmapped: NO];
 
id< MTLTexture > texture = [getDevice() newTextureWithDescriptor:textureDescriptor];

renderPassDescriptor.colorAttachments[ 0 ].texture = texture;

The code above is executed when generating the offscreen FBO. It creates a new texture object with the screen dimensions and sets it as the output for the color attachment. Notice that we don’t pass any data for the texture’s image, since we’re going to write into it.

Keep in mind we need two offscreen FBOs created this way, one that will act as source and the other as accumulation.

Once set, binding this offscreen FBO will make our rendering code draw everything into the texture instead of the screen. This becomes our source FBO.

Then we proceed to render the image effect, using a quad primitive with that texture as input, producing the final image into the accumulation FBO, which is then presented on the screen since there are no more image effects.

Again, this is familiar territory if you already know how to do offscreen rendering in OpenGL. Except for a tiny little difference…

Performance

When I started writing this series of posts, I mentioned that one of the reasons for me to work with Metal was that the game’s performance was poor on the Apple TV. More specifically, I assumed that the post-processing step was consuming most of the processing resources for a single frame. And I wasn’t completely wrong about that.

Keep in mind that post-processing is always a very expensive technique, in both memory and processing requirements, regardless of the API. In fact, the game is almost unplayable on older devices like the iPod Touch or iPhone 4s just because of this effect (and those devices don’t support Metal anyway). Other image effects, like SSAO or DOF, are even more expensive, so you should try to keep the post-processing step as small as possible.

I’m getting ahead of myself, since optimizations will be the subject of the next post, but it turns out that Metal’s performance boost over OpenGL not only lets you display more objects on screen, but also allows more complex post-processing effects to be applied in real time. Even without optimizing the image effect’s shaders, I noticed a 50% performance improvement just by switching APIs. That was amazing!

Et Voilà

And so the rendering process is now complete. We started with a blank screen, then proceeded to draw objects into an offscreen buffer. Finally, we applied a post-processing effect to achieve that old-film look, giving the game its unique identity. Quite the trip, right?

In the next and (almost) final entry in this series, we’re going to look at some of the amazing tools for profiling Metal-based applications, as well as some of the most basic optimization techniques that lead to great performance improvements.

To be continued…