Long Live And Render (VII)


Yes! Shadows are finally working (again).

A higher resolution video can be found here

This is a big achievement because it’s the first real use of multiple render passes and shared attachments.

And it makes everything look nicer.

I always said that Vulkan is difficult to work with, but I do like how easy it is to use attachments as textures. I guess I reached the point where I’m actually seeing the benefits of this new API (other than just better performance, of course).

Regarding shadows, they’re created from a single directional light. You might have noticed that the shadows are actually incorrect, because directional lights are supposed to cast parallel shadows using an orthographic projection. I am using a perspective projection instead (shown in the little white rectangle at the bottom right corner), but just because it makes the final effect look nicer. The final implementation will have correct shadows for directional lights, of course.

Let’s talk about descriptor sets

In Vulkan, descriptors are used to bind data to each of the different shader uniforms. Resembling newer OpenGL versions, Vulkan allows us to group multiple values into descriptors (uniform buffers), reducing the number of bind function calls.

But that’s only the beginning. In Vulkan, we can also group multiple descriptors together into descriptor sets, and each set can be bound with a single command. So, we only need to create one big set with all the descriptors required for all shaders, then bind it with a single function call and be done with it, right? Well, not really (*).

Where’s the catch, then? We do want to minimize the number of descriptor sets, of course, but as the number of sets decreases, the amount of data we need to send in each of them increases. Therefore, a single huge set leads to binding the whole set once per object and per render pass. Not ideal.

What we actually need is to group descriptors together depending on the frequency at which they are updated during a frame.

For example, consider shaders requiring uniforms like model, view and projection matrices to compute a vertex position. The last two of those matrices are only updated whenever the camera changes, which means that their values remain constant for all objects in our scene. On the other hand, the model matrix only needs to be updated when a model changes its pose. If the camera does change but the object itself remains stationary, there’s no need to update the model matrix. This is especially true when rendering the scene multiple times, like when doing shadows or reflections.

So we need two different sets, each updated at a different time. The first set contains the view and projection matrices and is updated only when the camera changes. The second set contains only the model matrix and is updated once per object (regardless of which render passes it is used in).
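To make the two-set idea concrete, here is a minimal sketch of how the commands for one render pass might be recorded. It does not use the real Vulkan API: `Command`, `recordPass` and `countOf` are hypothetical names made up for illustration. In actual Vulkan code, each bind would roughly correspond to a `vkCmdBindDescriptorSets` call with a different `firstSet` value.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical command kinds, standing in for real command-buffer calls.
enum class Command { BindCameraSet, BindObjectSet, Draw };

// Records the commands for one render pass over `objectCount` objects.
std::vector<Command> recordPass(std::size_t objectCount) {
    std::vector<Command> commands;
    // Set 0 (view + projection): bound once, shared by every object in the pass.
    commands.push_back(Command::BindCameraSet);
    for (std::size_t i = 0; i < objectCount; ++i) {
        // Set 1 (model matrix): bound again for each object before drawing it.
        commands.push_back(Command::BindObjectSet);
        commands.push_back(Command::Draw);
    }
    return commands;
}

// Counts how many times `kind` appears in a recorded command list.
std::size_t countOf(const std::vector<Command> &commands, Command kind) {
    std::size_t n = 0;
    for (Command c : commands) {
        if (c == kind) {
            ++n;
        }
    }
    return n;
}
```

Calling `recordPass(3)` yields one camera bind followed by three bind/draw pairs, so the camera data is referenced once no matter how many objects are in the scene.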

In practice, shaders need much more data than just a bunch of matrices. There are colors, textures, timers, bone animation data, lighting information, etc. But we cannot create too many sets either, since each platform defines a different limit for how many descriptor sets we can bind at the same time (the Vulkan spec guarantees a minimum of four). Therefore, I’ve considered creating the following groups:

  • Render Pass specific descriptors
    These are all the descriptors that change once per render pass. Things like view/projection matrices, time, camera properties (FOV, near and far planes), each of the lights in the scene, shadow maps, etc.
  • Pipeline/Shader specific Descriptors
    Values that are required for shaders to work, like noise textures, constants, etc.
  • Material specific Descriptors
    These are the values for each property in the material, like colors, textures, normal maps, light maps, ambient occlusion maps, emission color, etc.
  • Geometry Descriptors
    These are values that affect only geometries, like the model matrix, bone indices, light indices, normal maps, etc.

Separating material and geometry descriptors is important. For example, if we’re rendering shadows, we don’t need the object’s colors, just its pose and animations.
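As a rough sketch of how the four groups could be used, the snippet below picks which groups a pass needs to bind. Everything here is an assumption made for illustration: the `DescriptorGroup` and `PassKind` enums, the `groupsFor` function, and the choice of numbering the groups from least to most frequently updated are not taken from the engine.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical set indices, ordered from least to most frequently updated.
enum class DescriptorGroup : std::uint32_t {
    RenderPass = 0, // view/projection, lights, shadow maps
    Pipeline = 1,   // noise textures, constants
    Material = 2,   // colors, textures
    Geometry = 3,   // model matrix, bone data
};

// Hypothetical pass kinds used only for this example.
enum class PassKind { Shadow, Color };

// Returns the descriptor groups that a pass of the given kind has to bind.
std::vector<DescriptorGroup> groupsFor(PassKind kind) {
    std::vector<DescriptorGroup> groups = {
        DescriptorGroup::RenderPass,
        DescriptorGroup::Pipeline,
    };
    // A shadow pass only writes depth, so material data is skipped.
    if (kind == PassKind::Color) {
        groups.push_back(DescriptorGroup::Material);
    }
    groups.push_back(DescriptorGroup::Geometry);
    return groups;
}
```

With this split, a shadow pass binds three groups and a color pass binds all four, which still fits within the guaranteed minimum of four bound sets.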

Most importantly, these groups can change and be mixed however we like. If we update descriptors for a render pass, the scene will be rendered in a completely different way. We can also change materials without affecting the topology of the objects.

Up next…

There are lots of Vulkan features that I haven’t even looked at yet, but there’s one in particular that I need to implement before I’m able to merge this branch into the general development one: compute operations.

I want to be able to execute compute operations on the GPU for image filtering and/or particle systems, but that requires a lot more work.

June is going to be a busy month…

(*) I actually did that a while ago when working on the Metal-based renderer. I did not really understand at the time how uniforms were supposed to be bound, so I made one big object including everything. That’s the reason why there’s only one shader in Le Voyage and no skeletal animation, basically.

Long Live And Render (V)

Frame graphs…

When I started working on this crusade to support Vulkan and improve the rendering system I knew that this day would come eventually.

Well, the day is finally here, although it wasn’t just a day, but more like a couple of months.

This is the story of how I implemented frame graphs in Crimild. And how I completely missed the point and now I need to start over.

What is a Frame Graph?

There are several interpretations of what Frame Graphs are. And I even have my own one too.

I like to think of them as knowledge. Knowledge about the resources and processes that are required to render a frame on screen. That is, everything from buffers and images to render passes and presentation. And knowledge about how those resources and processes relate to each other. With that knowledge, I’m able to answer questions like:

  • Which processes depend on a given resource?
  • What resources are generated by each render process?
  • What resources are required to generate the final image (or any of the transient ones)?
  • In which order do we need to execute each of the different render passes?

It’s easy to visualize dependencies and execution order when you have only a couple of render passes that generate just a few images. Yet those things become very complex very fast as more and more effects and intermediate resources are needed.
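The last question, execution order, can be answered mechanically once each pass declares what it reads and writes. The following is a minimal sketch using a topological sort (Kahn's algorithm). It is not how Crimild does it: the `Pass` struct, the `executionOrder` function and the idea of identifying resources by name are all assumptions made for this example.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical description of a render pass: resources are identified by name.
struct Pass {
    std::string name;
    std::vector<std::string> reads;
    std::vector<std::string> writes;
};

// Returns pass names ordered so that every producer runs before its consumers.
// Returns an empty vector if the passes contain a cyclic dependency.
std::vector<std::string> executionOrder(const std::vector<Pass> &passes) {
    // Map each resource to the index of the pass that writes it.
    std::map<std::string, std::size_t> producer;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (const auto &resource : passes[i].writes) {
            producer[resource] = i;
        }
    }

    // Build edges producer -> consumer and count unmet dependencies per pass.
    std::vector<std::vector<std::size_t>> consumers(passes.size());
    std::vector<std::size_t> pending(passes.size(), 0);
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (const auto &resource : passes[i].reads) {
            auto it = producer.find(resource);
            if (it == producer.end() || it->second == i) {
                continue; // external resource (e.g. loaded from a file)
            }
            consumers[it->second].push_back(i);
            ++pending[i];
        }
    }

    // Kahn's algorithm: repeatedly emit passes with no unmet dependencies.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        if (pending[i] == 0) {
            ready.push(i);
        }
    }
    std::vector<std::string> order;
    while (!ready.empty()) {
        std::size_t current = ready.front();
        ready.pop();
        order.push_back(passes[current].name);
        for (std::size_t next : consumers[current]) {
            if (--pending[next] == 0) {
                ready.push(next);
            }
        }
    }
    if (order.size() != passes.size()) {
        order.clear(); // a cycle prevented some passes from being scheduled
    }
    return order;
}
```

For instance, a shadow pass that writes a shadow map, a scene pass that reads it, and a presentation pass that reads the scene color come out in that order regardless of the order in which they were declared.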

Where things started to get complicated

While working on Render Graphs, which were OpenGL-based, things were pretty simple. There were only two concepts: render passes and images (a.k.a. attachments). A render pass connects to another through one or more images, which themselves are the result of other render operations (or a file). That was it. Like I said: simple yet powerful.

Enter Vulkan. All of a sudden we need to keep track of sub-passes, buffers, images, views, etc., as well as the dependencies between them and multiple synchronization barriers.

My first approach was to create an abstract rendering layer, adding new classes for all those new concepts and using an adjacency matrix to handle the dependencies between them.

That was my first mistake.

Things became complicated very fast and I found myself struggling to understand how to connect everything. My notebook was filled with pages looking like this:

I even went as far as designing a set of nodes in draw.io in order to create diagrams to hopefully make things clearer:

It was crazy and I wasn’t getting any results. More importantly, I was getting stuck and becoming very frustrated.


At that point I took some time off to gain some perspective and to rethink what I was doing and where I needed to go. As it turned out, the solution had been in front of me since the beginning: KISS.

I already had a working abstraction for a render graph based on OpenGL, which not only worked correctly but was also simple and very powerful. So, why not use something like that for Vulkan too?

Then, I started over and implemented a simple frame graph using just render passes and attachments. No explicit sub-passes, no explicit framebuffer definitions, no explicit dependencies. And guess what: it worked.

Starting from that simple implementation, I kept adding features as needed instead of trying to tie everything up front from day one. A few days later I had a working example that used offscreen render passes to simulate reflections:

But even then I had to do a lot of hacks to make it work and the resulting API was not really useful.

And then I realized I had failed again.

Why did I fail?

There are several problems with my current implementation of frame graphs:

  1. The frame graph keeps strong references to resources. Although this made some sense at first, the truth is there is no easy way to delete resources anymore. If a player destroys an enemy in a game, we need not only to destroy some objects in the scene, but also to manually remove each of the render resources from the frame graph, which is not only a pain, but could also easily lead to memory leaks if you forget anything.
  2. When constructing the scene, we need to explicitly add resources to the frame graph. While some of them could be added automatically when compiling the graph (I’m doing that right now), others, like render passes and attachments, can’t. This can easily lead to all kinds of problems as well.
  3. We need to rebuild the entire frame graph from scratch whenever resources or passes are added/removed, since we don’t have an automatic way to know when that happens. This is not only a problem for the developer, but also a performance hit.
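The first problem can be reproduced in a few lines of standard C++. This is only an illustration of the lifetime issue, not the engine's code: `Texture`, `StrongGraph` and `WeakGraph` are hypothetical types, and the `std::weak_ptr` variant is shown as one common alternative, not as the fix I will end up using.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical render resource.
struct Texture {
    int id;
};

// A graph that owns its resources: they stay alive until removed by hand.
struct StrongGraph {
    std::vector<std::shared_ptr<Texture>> resources;
};

// A graph that only observes its resources: they die with their last owner.
struct WeakGraph {
    std::vector<std::weak_ptr<Texture>> resources;

    // Counts the observed resources that are still alive.
    std::size_t liveCount() const {
        std::size_t n = 0;
        for (const auto &resource : resources) {
            if (!resource.expired()) {
                ++n;
            }
        }
        return n;
    }
};
```

If the scene drops its last reference to a texture, the copy held by `StrongGraph` keeps it alive until someone remembers to erase it, while `WeakGraph::liveCount()` reflects the destruction immediately.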

Back to square one.

Still, I’m happy that at least I have something working, even if now I need to throw it away and start again. Maybe I’ll have more luck next time.

Rendering to Texture

Happy 2017! I know, it’s almost February, but better late than never, right?

[EDIT: I was so excited that this demo actually worked that I didn’t even realize this wasn’t the first post of the year]

Anyway, I just finished this demo and I wanted to share it with you. Simply put, I’m rendering lots of characters on screen using the impostor technique, which relies on rendering a single model in an offscreen buffer. Then, multiple quads are drawn using the output of that buffer as texture.

Et voilà!

In order to achieve this goal, I had to make some changes in how scenes are rendered (is this the first refactor of the year? No). Since we’re using multiple cameras, we needed a way to define which one is the main one (we cannot rely on the Simulation to do that for us any longer). Also, a new render pass is required to draw the model on a texture. And so the OffscreenRenderPass (were you expecting something else?) was born. But that’s pretty much it.

As usual, this is a new feature and therefore… unstable. Do not try it at home (yet).