Long Live And Render (VII)


Yes! Shadows are finally working (again).

A higher resolution video can be found here

This is a big achievement because it’s the first real use of multiple render passes and shared attachments.

And it makes everything look nicer.

I always said that Vulkan is difficult to work with, but I do like how easy it is to use attachments as textures. I guess I reached the point where I’m actually seeing the benefits of this new API (other than just better performance, of course).

Regarding shadows, they’re created from a single directional light. You might have noticed that the shadows are actually incorrect, because directional lights are supposed to cast parallel shadows using an orthographic projection. I am using a perspective projection instead (shown in the little white rectangle at the bottom right corner), but just because it makes the final effect look nicer. The final implementation will have correct shadows for directional lights, of course.

Let’s talk about descriptor sets

In Vulkan, descriptors are used to bind data to each of the different shader uniforms. Resembling newer OpenGL versions, Vulkan allows us to group multiple values into descriptors (uniform buffers), reducing the number of bind function calls.

But that’s only the beginning. In Vulkan, we can also group several of those descriptors together into descriptor sets, and each set can be bound with a single bind command. So, we only need to create one big set with all the descriptors required for all shaders, then bind it with a single function call and be done with it, right? Well, not really (*).

Where’s the catch, then? We do want to minimize the number of descriptor sets, of course, but as the number of sets decreases, the amount of data we need to send in each of them increases. Therefore, a single, huge set leads to updating and binding the whole set once per object and per render pass. Not ideal.

What we actually need is to group descriptors together depending on how frequently they are updated during a frame.

For example, consider shaders requiring uniforms like model, view and projection matrices to compute a vertex position. The last two of those matrices are only updated whenever the camera changes, which means that their values remain constant for all objects in our scene. On the other hand, the model matrix only needs to be updated when a model changes its pose. If the camera does change but the object itself remains stationary, there’s no need to update the model matrix. This is especially true when rendering the scene multiple times, like when doing shadows or reflections.

Then, we need two different sets, each updated at a different time. The first set contains the view and projection matrices and is updated only if the camera changes. The second set contains only the model matrix and is updated once per object (regardless of which render passes it is used in).
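To make the difference concrete, here is a minimal sketch in plain C++ (no Vulkan calls) that counts how many matrices get written per frame under each strategy. The names `FrameStats`, `renderWithSingleSet` and `renderWithSplitSets` are hypothetical and exist only for this illustration. The sketch also assumes that every render pass uses its own camera and that no object moves between passes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping type: counts uniform writes and set binds for one frame.
struct FrameStats {
    std::size_t matricesWritten = 0;
    std::size_t setBinds = 0;
};

// One big set: view, projection and model matrices are rewritten and the
// set is rebound for every object in every render pass.
inline FrameStats renderWithSingleSet(std::size_t passes, std::size_t objects) {
    FrameStats stats;
    for (std::size_t p = 0; p < passes; ++p) {
        for (std::size_t o = 0; o < objects; ++o) {
            stats.matricesWritten += 3; // view + projection + model
            stats.setBinds += 1;
        }
    }
    return stats;
}

// Two sets split by update frequency: a camera set per pass and a model set per object.
inline FrameStats renderWithSplitSets(std::size_t passes, std::size_t objects) {
    FrameStats stats;
    // Model matrices are written once per object, no matter how many passes use them.
    stats.matricesWritten += objects;
    for (std::size_t p = 0; p < passes; ++p) {
        stats.matricesWritten += 2; // view + projection for this pass
        stats.setBinds += 1;        // bind the camera set once per pass
        for (std::size_t o = 0; o < objects; ++o) {
            stats.setBinds += 1;    // bind the (already written) model set
        }
    }
    return stats;
}
```

With 2 passes and 100 objects, the single set writes 600 matrices while the split version writes 104. The number of bind calls is about the same in both cases (200 versus 202), so in this sketch the saving comes from the amount of data that has to be updated, not from the number of binds.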

In practice, shaders need much more data than just a bunch of matrices. There are colors, textures, timers, bone animation data, lighting information, etc. But we cannot create too many sets either, since each platform defines its own limit for how many descriptor sets can be bound at the same time (the Vulkan spec only guarantees four). Therefore, I’ve considered creating the following groups:

  • Render Pass specific descriptors
    These are all the descriptors that change once per render pass. Things like view/projection matrices, time, camera properties (FOV, near and far planes), each of the lights in the scene, shadow maps, etc.
  • Pipeline/Shader specific Descriptors
    Values that are required for shaders to work, like noise textures, constants, etc.
  • Material specific Descriptors
    These are the values for each property in the material, like colors, textures, normal maps, light maps, ambient occlusion maps, emission color, etc.
  • Geometry Descriptors
    These are values that affect only geometries, like the model matrix, bone indices, light indices, normal maps, etc.

Separating material and geometry descriptors is important. If we’re rendering shadows, for example, we don’t need the object’s colors, just its pose and animations.

Most importantly, these groups can change and be mixed however we like. If we update descriptors for a render pass, the scene will be rendered in a completely different way. We can also change materials without affecting the topology of the objects.
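The four groups above could be mapped to descriptor set indices roughly as follows. This is only a sketch: the enum, the `PassRequirements` type and the helper functions are hypothetical names, and ordering the sets from least to most frequently updated is my assumption rather than a requirement. The only hard number is the limit of four bound sets, which is the minimum the Vulkan spec requires for `maxBoundDescriptorSets`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical set indices, ordered from least to most frequently updated.
enum class DescriptorGroup : std::uint32_t {
    RenderPass = 0, // view/projection, time, lights, shadow maps
    Pipeline = 1,   // noise textures, shader constants
    Material = 2,   // colors, textures, emission
    Geometry = 3,   // model matrix, bone data
};

constexpr std::size_t kGroupCount = 4;

// Minimum value the Vulkan spec requires for maxBoundDescriptorSets.
constexpr std::size_t kGuaranteedBoundSets = 4;
static_assert(kGroupCount <= kGuaranteedBoundSets,
              "more groups than sets guaranteed to be bindable at once");

// Hypothetical description of which groups a render pass actually needs.
struct PassRequirements {
    std::array<bool, kGroupCount> uses;
};

// A regular color pass needs every group.
inline PassRequirements colorPass() { return {{true, true, true, true}}; }

// A shadow pass only needs poses and animations, so it skips materials.
inline PassRequirements shadowPass() { return {{true, true, false, true}}; }

inline bool needs(const PassRequirements &pass, DescriptorGroup group) {
    return pass.uses[static_cast<std::size_t>(group)];
}
```

Here `needs(shadowPass(), DescriptorGroup::Material)` is false, so a renderer built along these lines could skip binding material data entirely while drawing the shadow map.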

Up next…

There are lots of Vulkan features that I haven’t even looked at yet, but there’s one in particular that I need to implement before I’m able to merge this branch into the general development one: compute operations.

I want to be able to execute compute operations on the GPU for image filtering and/or particle systems, but that requires a lot more work.

June is going to be a busy month…

(*) I actually did that a while ago when working on the Metal-based renderer. I did not really understand at the time how uniforms were supposed to be bound, so I made one big object including everything. That’s the reason why there’s only one shader in Le Voyage and no skeletal animation, basically.

Long Live and Render (VI)

In my last post I made it clear that there were several problems with my latest frame graph changes. Here I am today, a couple of weeks later, and I’m going to tell you how I managed to fix all three of them (well, two and a half), as well as making some bonus improvements on the way. 

Removing strong references

I made frame graphs keep strong references to resources and render passes because it made sense at the time. But if any particular resource (like a texture or a vertex buffer) is no longer attached to a scene, there’s no point in keeping it alive since it won’t be rendered anyway, right?

This problem was pretty easy to solve, actually.

I only had to switch strong smart pointers for weak ones in most places, preventing any explicit or implicit strong reference to resources and render passes. Notice that I said *most* places, since the frame graph does allocate some internal objects (in the form of nodes) and I do need strong references for those.
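The idea can be sketched with the standard library’s `std::shared_ptr` and `std::weak_ptr`. This is an illustration under assumptions, not the engine’s code: the `Texture` and `FrameGraph` types below are simplified stand-ins, and the engine may well use its own pointer classes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Texture { int id = 0; }; // hypothetical resource type

class FrameGraph {
public:
    // Only a weak reference is stored, so the graph never keeps a texture alive.
    void add(const std::shared_ptr<Texture> &texture) { m_textures.push_back(texture); }

    // Returns the textures that still exist and forgets the ones already destroyed.
    std::vector<std::shared_ptr<Texture>> collectAlive() {
        std::vector<std::shared_ptr<Texture>> alive;
        for (auto &weak : m_textures) {
            if (auto strong = weak.lock()) {
                alive.push_back(std::move(strong));
            }
        }
        m_textures.erase(
            std::remove_if(m_textures.begin(), m_textures.end(),
                           [](const std::weak_ptr<Texture> &w) { return w.expired(); }),
            m_textures.end());
        return alive;
    }

    std::size_t trackedCount() const { return m_textures.size(); }

private:
    std::vector<std::weak_ptr<Texture>> m_textures;
};
```

Once the last owner of a texture releases it, `lock()` returns an empty pointer and the graph simply skips that entry, instead of keeping the texture alive as it did before.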

There’s an obvious side effect, though. Now it is mandatory for developers to keep track of all created objects in their apps because the engine might not do it automatically. Otherwise you’ll end up with crashes due to null pointers or invalid access to deleted memory. It’s an acceptable price to pay.

I could have added a storage policy to customize this behavior, but I do think that this is the right way. And I can add that policy later if I feel the need for it.

Moving on…

Automatic object registration to frame graphs

Another problem with my latest approach is that we need to add/remove objects to/from the frame graph manually. As I explained before, this is not only cumbersome but also very error prone. Especially now that the frame graph no longer keeps strong references to its objects: forgetting to remove a resource that was previously deleted from the scene will result in a crash that is very difficult to debug.

I made a couple of decisions to solve this one:

First, any object that can be added to a frame graph will attempt to automatically do so during its construction, provided a frame graph instance exists. It will also attempt to remove itself automatically during destruction.

How can we tell if a frame graph exists? Easy: frame graphs are singletons (oh, my God! He said the S… word). After giving it some thought, I realized I don’t need more than one frame graph instance at any given point in time. I might be wrong about this, but I can’t think of any scenario where two or more instances are needed. I guess time will tell.
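A minimal sketch of that registration scheme could look like this. The class layout and the `getInstance()` accessor are assumptions made for the example, not the actual engine API, and the raw pointers stand in for whatever handle type the real frame graph uses.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

class RenderPass;

// Hypothetical singleton: at most one frame graph exists at any time.
class FrameGraph {
public:
    FrameGraph() { s_instance = this; }
    ~FrameGraph() { if (s_instance == this) s_instance = nullptr; }
    FrameGraph(const FrameGraph &) = delete;
    FrameGraph &operator=(const FrameGraph &) = delete;

    // Returns nullptr when no frame graph has been created.
    static FrameGraph *getInstance() { return s_instance; }

    void add(RenderPass *pass) { m_passes.insert(pass); }
    void remove(RenderPass *pass) { m_passes.erase(pass); }
    std::size_t size() const { return m_passes.size(); }

private:
    static FrameGraph *s_instance;
    std::unordered_set<RenderPass *> m_passes;
};

FrameGraph *FrameGraph::s_instance = nullptr;

// Any object that belongs in the frame graph registers itself on construction
// and unregisters on destruction, but only if a frame graph exists.
class RenderPass {
public:
    RenderPass() { if (auto *graph = FrameGraph::getInstance()) graph->add(this); }
    ~RenderPass() { if (auto *graph = FrameGraph::getInstance()) graph->remove(this); }
    RenderPass(const RenderPass &) = delete;
    RenderPass &operator=(const RenderPass &) = delete;
};
```

Objects created while no frame graph exists are simply left unregistered, which matches the "provided a frame graph instance exists" condition above.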

Second problem solved.

Rebuilding the frame graph automatically

To be honest, I didn’t want to spend too much time on this problem at the moment. Fixing this particular issue could easily become a full-time job for several days or weeks, so I went with the easiest solution: whenever we add or remove an object, the frame graph is automatically flagged as dirty and will be completely rebuilt from scratch at some point in the future (i.e. during the next simulation loop).

This “solution” works, but is far from the most efficient one, of course. After all, why rebuild the complete frame graph if we’re only adding a new node to the scene? Do we really need to record every single command buffer again? Well, I guess not. Maybe a change in a scene should not recreate command buffers for post-processing resources, since those are independent of the number of nodes in the 3D space. But the rules for making these decisions are not that simple. What if the nodes that were added to the scene are new lights and they change the way we do tone mapping, for example?

Rebuilding everything from scratch is the safest bet here for now. And, in the end, this behavior is completely hidden inside the frame graph implementation and can (and will) be improved in the future. 
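The dirty-flag approach boils down to something like the sketch below. The method names are hypothetical, and `rebuild()` is only a counter standing in for the expensive part (recompiling the graph and recording command buffers again).

```cpp
#include <cassert>
#include <cstddef>

class FrameGraph {
public:
    // Any change to the set of objects just marks the graph as dirty.
    void addObject() { ++m_objects; m_dirty = true; }
    void removeObject() {
        if (m_objects > 0) {
            --m_objects;
            m_dirty = true;
        }
    }

    // Called once per simulation loop. Rebuilds from scratch only if needed
    // and returns true when a rebuild actually happened.
    bool update() {
        if (!m_dirty) {
            return false;
        }
        rebuild();
        m_dirty = false;
        return true;
    }

    std::size_t rebuildCount() const { return m_rebuilds; }

private:
    // Stand-in for the real work: recompiling and re-recording everything.
    void rebuild() { ++m_rebuilds; }

    std::size_t m_objects = 0;
    std::size_t m_rebuilds = 0;
    bool m_dirty = false;
};
```

One nice property, even in this naive form, is that several additions or removals within the same loop iteration trigger a single rebuild.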

So, I say this is partially fixed, but still acceptable.

Bonus Track: Fixing viewports

I was not expecting to make any more changes, but when I was cleaning up the offscreen rendering demo I noticed a bug in the way viewports were set for render passes.

I wanted to have more control over the resolution at which render passes are rendered. For example, I needed an offscreen scene at a lower resolution but it wasn’t working correctly (ok, it was not working at all).

Now it does, which allows me to have render passes at different resolutions:

Notice how an offscreen rendering at lower resolution produces only a pixelated reflection. I can do the same for the whole scene, too. And I can combine them at will. Pretty neat.

Up Next

I’m quite happy with these fixes and the frame graph feels a lot more robust now.

And what is perhaps more important, the frame graph API can be made completely hidden to end users too. It would be easy to provide a default frame graph instance (as part of a simulation system, for example) and then an application developer can add/remove objects at will from scenes without worrying about the frame graph at all.

The next step will be to improve shader bindings (aka descriptor sets), which is something that is still very cumbersome. Or maybe I’ll do something more visual, like shadows. Or both 🙂

See you next time!

Long Live And Render (V)

Frame graphs…

When I started working on this crusade to support Vulkan and improve the rendering system I knew that this day would come eventually.

Well, the day is finally here, although it wasn’t just a day, but more like a couple of months.

This is the story of how I implemented frame graphs in Crimild. And how I completely missed the point and now I need to start over.

What is a Frame Graph?

There are several interpretations of what Frame Graphs are. And I even have my own one too.

I like to think of them as knowledge. Knowledge about the resources and processes that are required to render a frame on screen. That is, everything from buffers and images to render passes and presentation. And knowledge about how those resources and processes relate to each other. With that knowledge, I’m able to answer questions like:

  • Which processes depend on a given resource?
  • What resources are generated by each render process?
  • What resources are required to generate the final image (or any of the transient ones)?
  • In which order do we need to execute each of the different render passes?

It’s easy to visualize dependencies and execution order when you have only a couple of render passes that generate just a few images. Yet those things become very complex very fast as more and more effects and intermediate resources are needed.
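The last question in that list, the execution order, is a good example of what such knowledge buys us. If every pass declares which attachments it reads and writes, an order can be derived with a topological sort. The sketch below uses Kahn’s algorithm; the `Pass` struct and the attachment names are made up for the example, and it assumes that each attachment is written by a single pass.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical description of a render pass: just names of attachments.
struct Pass {
    std::string name;
    std::vector<std::string> reads;
    std::vector<std::string> writes;
};

// Returns pass names ordered so that every pass runs after the passes that
// produce the attachments it reads. Returns an empty vector on a cycle.
inline std::vector<std::string> executionOrder(const std::vector<Pass> &passes) {
    // Map each attachment to the index of the pass that writes it.
    std::map<std::string, std::size_t> producer;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (const auto &attachment : passes[i].writes) {
            producer[attachment] = i;
        }
    }

    std::vector<std::vector<std::size_t>> dependents(passes.size());
    std::vector<std::size_t> pending(passes.size(), 0);
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (const auto &attachment : passes[i].reads) {
            auto it = producer.find(attachment);
            if (it == producer.end()) {
                continue; // external resource, e.g. an image loaded from a file
            }
            dependents[it->second].push_back(i);
            ++pending[i];
        }
    }

    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        if (pending[i] == 0) {
            ready.push(i);
        }
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::size_t current = ready.front();
        ready.pop();
        order.push_back(passes[current].name);
        for (std::size_t next : dependents[current]) {
            if (--pending[next] == 0) {
                ready.push(next);
            }
        }
    }

    if (order.size() != passes.size()) {
        order.clear(); // some passes depend on each other in a cycle
    }
    return order;
}
```

Given a compose pass that reads the scene and shadow map images, a scene pass that reads the shadow map, and a shadow pass that reads nothing, the function returns shadow, scene, compose, regardless of the order in which the passes were declared.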

Where things started to get complicated

While working on Render Graphs, which were OpenGL-based, things were pretty simple. There were only two concepts: render passes and images (a.k.a. attachments). A render pass connects to another through one or more images, which themselves are the result of other render operations (or a file). That was it. Like I said: simple yet powerful.

Enter Vulkan. All of a sudden we need to keep track of sub-passes, buffers, images, views, etc., as well as dependencies between them and multiple synchronization barriers.

My first approach was to just create an abstract rendering layer, adding new classes for all those new concepts and using an adjacency matrix to handle the dependencies between them.

That was my first mistake.

Things became complicated very fast and I found myself struggling to understand how to connect everything. My notebook was filled with pages looking like this:

I even went as far as designing a set of nodes in draw.io in order to create diagrams to hopefully make things clearer:

It was crazy and I wasn’t getting any results. More importantly, I was getting stuck and becoming very frustrated.


At that point I took some time off to gain some perspective and to rethink what I was doing and where I needed to go. As it turned out, the solution had been in front of me since the beginning: KISS.

I already had a working abstraction for a render graph based on OpenGL, which was not only working correctly but was also simple and very powerful. So, why not use something like that for Vulkan too?

Then, I started over and implemented a simple frame graph using just render passes and attachments. No explicit sub-passes, no explicit framebuffer definitions, no explicit dependencies. And guess what: it worked.

Starting from that simple implementation, I kept adding features as needed instead of trying to tie everything together up front from day one. A few days later I had a working example that used offscreen render passes to simulate reflections:

But even then I had to do a lot of hacks to make it work and the resulting API was not really useful.

And then I realized I had failed again.

Why did I fail?

There are several problems with my current implementation of frame graphs:

  1. The frame graph keeps strong references to resources. Although this made some sense at first, the truth is there is no easy way to delete resources anymore. If a player destroys an enemy in a game, we need not only to destroy some objects in the scene, but also to manually remove each of the render resources from the frame graph, which is not only a pain but could also easily lead to memory leaks if you forget anything.
  2. When constructing the scene, we need to explicitly add resources to the frame graph. While some of them could be added automatically when compiling the graph (I’m doing that right now), others, like render passes and attachments, can’t. This can easily lead to all kinds of problems as well.
  3. We need to rebuild the entire frame graph from scratch whenever resources or passes are added/removed, since we don’t have an automatic way to know when that happens. This is not only a problem for the developer, but also a performance hit.

Back to square one.

Still, I’m happy that at least I have something working, even if now I need to throw it away and start again. Maybe I’ll have more luck next time.