The Ghost of Refactors Past…

…has come to hunt me once again. Although this time it’s not because of mistakes that I did. Instead, the problem lies in something that I missed completely.

Can you spot the problem in the following code?

void SomeObject::save( Stream &s )
{
   std::size_t count = getElementCount();
   s.write( count );
}

Well, it turns out std::size_t is NOT PLATFORM INDEPENDENT. And here’s the twist: I knew that since, well, forever, but I never paid any attention to it. That is, until it became a problem. And the problem was EVERYWHERE in the code.

First thing first. The C++ standard has this to say about std::size_t:

typedef /*implementation-defined*/ size_t;

 

What’s that supposed to mean? Basically, std::size_t may have different precision depending on the platform. For example, in a 32-bit architecture, std::size_t may be represented as a 32-bit unsigned integer. Something similar happens in a 64-bit platform.

Wait.

std::size_t is supposed to help portability, right? That’s its whole purpose. And that’s true, of course. There’s nothing wrong with std::size_t itself.

Check the code above again. Go on, I’ll wait.

So, whenever we create a [binary] stream to serialize the scene, we can’t use std::size_t because it’s just plain wrong. It will be saved as a 32-bit integer in some platforms and 64-bit in others. Later, when the stream is loaded, the number of bytes read will depend on whatever the precision of the current platform is, regardless of what was used when saving. See the problem?

This means that we can’t share binary streams between different platforms because the data might be interpreted in different ways, leading to errors when generating scenes.

For the past few years, my main setup have been OS X and iOS, both 64-bit platforms. But one day I had to use streaming on 32-bit Android phones and, as you might have guessed by now, all hell break loose…

Entering crimild::types

I had to made a call here: either we can keep using std::size_t everywhere and handle the special case in the Stream class itself; or we can make use of fixed precision types for all values (specially integers) and therefore guaranteeing that the code will be platform independent.

I went for the second approach, which seems to me to be right choice. At the time of this writing, the new types are:

namespace crimild {

   using Int8 = int8_t;
   using Int16 = int16_t;
   using Int32 = int32_t;
   using Int64 = int64_t;

   using UInt8 = uint8_t;
   using UInt16 = uint16_t;
   using UInt32 = uint32_t;
   using UInt64 = uint64_t;

   using Real32 = float;
   using Real64 = double;

   using Bool = bool;
 
   using Size = UInt64;
}

As you can see, crimild::Size is always defined as a 64-bit unsigned integer regardless of the platform.

Yet that means I need to change every single type definition in the current code so it uses the new types. As you might have guessed, it’s a lot of work, so I’m going to do it on the fly. I think I already tackled the critical code (that is, streaming) by the time I’m writing this post, but there’s still much to be reviewed.

New code already makes use of the platform-independent types. For example, the new particle system is employing UInt32, Real64 and other new types and–

Oh, right, I haven’t talked about the new particle system yet… Well, spoiler alert: there’s a new particle system.

You want to know more? Come back next week, then 🙂

 

 

 

The hazards of a binary file format

For years we’ve been told that we have to store our assets using a binary file format (either proprietary or one of the multiple standard ones) and for a long time I followed this rule by heart with Crimild. In fact, the Streaming system currently implemented in Crimild uses a custom binary file format to store all of the assets in a single file. But is this really the best choice for our engine?

We all know the benefits of a binary file format. First, they (usually) load faster than text based files. In many cases, they will require less storage than text files (I’m not talking about compression, just the fact that a number usually requires a handful of bytes depending on its precision regardless of its actual value), which is perfect for things like networking as well. And, depending on the format in which our data is stored, we might not even have the need to transform the data to suit our memory structures.

But despite those benefits, there are a couple of drawbacks that can make our lives miserable, specially during development time.

Why I don’t like binary files anymore?

It turns out I made a huge (and rookie) mistake in Streaming sub-system. As it is right now, you can save an entire 3D scene into a single binary file, including textures, shaders, scripts and pretty much any other asset. While this may seem like a good design decision, it’s definitely a drawback during development time.

Think about this: any scene in our project can become big enough so that it requires at least two people to work with it. The most common scenario is to have an artists for graphical assets and a script programmer to implement the game logic. Then, each of them has his own copy of the scene and from time to time they have to merge their modifications. But in the real world we know for sure that merging binary files can be a nightmare and anyone familiar with any SCM software like SVN or Git knows that a conflict in a binary file probably leads to a lot rework.

How often could this happen? Well, from my experience this becomes a day-to-day issue even for small teams, leading to a lot of wasted time and frustration. And it’s not an issue with games development only. It happens with Word documents, the Xcode project file itself, visual assets, audio assets, you name it. I’ve been using Unity3D for quite some time now and to be honest this is my only complain about it. After all the effort the guys at Unity made to create such an incredible tool, I still don’t understand why they are using binary files for their projects. Recently, they have recognized this as a problem and they’ll try to address it in their next major version, which seems that I’m in the right track too.

XML as an alternative

I’m considering using XML to store scene information, having all assets (like textures, shaders, scripts, etc) as separated files and using some sort of source URL to reference them. Why XML? First, because it’s a text based format. Even in the worst case when an automatic merging tool cannot deal with a conflict, there’s always a chance to do it manually. Plus, we can see the changes made using a diff application.

Secondly, by defining an XML schema developers and artists can modify a scene without having an editor. A simple mechanism can be implemented to reload a scene or parts of it (like reload all shaders) in order to minimize the turnaround time.

Am I ditching the support for binary files?

Of course not. XML can be very effective for development, but of course when things get real and we need speed, it’s not the best choice. Then, I’m still keeping the binary file format support, but I intent to use it only for storing scenes that are ready to be deployed in the final version of our game, since they are not supposed to be changed at that point. A simple tool can be implemented to grab any scene in XML and all of its referenced assets and then compile them into a single binary file.

Final thoughts

So, are we ought to use a binary file format in our projects? Definitely. But we don’t have to use them all the time. During development time, our focus should be on the team and their needs in order to guarantee the best outcome. We always have the chance to develop that set of scripts or tools that will automatically convert all of our assets into the best possible format for our game when actually needed.