The hazards of a binary file format

For years we’ve been told that we have to store our assets in a binary file format (either proprietary or one of the many standard ones), and for a long time I followed this rule to the letter with Crimild. In fact, the Streaming system currently implemented in Crimild uses a custom binary file format to store all of the assets in a single file. But is this really the best choice for our engine?

We all know the benefits of a binary file format. First, binary files (usually) load faster than text-based files. In many cases they also require less storage than text files (I’m not talking about compression, just the fact that a number takes a fixed handful of bytes depending on its precision, regardless of its actual value), which is a good fit for things like networking as well. And, depending on the format in which our data is stored, we might not even need to transform the data to suit our in-memory structures.
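To make the storage point concrete, here is a minimal sketch that compares the size of a number stored as a 32-bit binary float with the same number stored as decimal text. The helper names `binary_size` and `text_size` are made up for this illustration and have nothing to do with Crimild's actual Streaming code.

```python
import struct

def binary_size(value):
    """Bytes needed to store value as a 32-bit little-endian float."""
    return len(struct.pack("<f", value))

def text_size(value):
    """Bytes needed to store value as UTF-8 decimal text."""
    return len(repr(value).encode("utf-8"))

# The binary size is constant; the text size depends on the digits written.
for v in (1.0, 3.14159, 123456.789):
    print(v, binary_size(v), text_size(v))
```

Note that a short value such as 1.0 is actually smaller as text, and that a 32-bit float cannot hold every digit of a value like 123456.789, so this is a size comparison only, not a claim that the two encodings are interchangeable.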

But despite those benefits, there are a couple of drawbacks that can make our lives miserable, especially during development.

Why don’t I like binary files anymore?

It turns out I made a huge (and rookie) mistake in the Streaming subsystem. As it is right now, you can save an entire 3D scene into a single binary file, including textures, shaders, scripts and pretty much any other asset. While this may seem like a good design decision, it’s definitely a drawback during development.

Think about this: any scene in our project can become big enough that it takes at least two people to work on it. The most common scenario is to have an artist for the graphical assets and a script programmer to implement the game logic. Each of them has their own copy of the scene, and from time to time they have to merge their modifications. But in the real world, merging binary files can be a nightmare, and anyone familiar with SCM software like SVN or Git knows that a conflict in a binary file probably leads to a lot of rework.

How often could this happen? Well, in my experience this becomes a day-to-day issue even for small teams, leading to a lot of wasted time and frustration. And it’s not an issue with game development only. It happens with Word documents, the Xcode project file itself, visual assets, audio assets, you name it. I’ve been using Unity3D for quite some time now and, to be honest, this is my only complaint about it. After all the effort the guys at Unity made to create such an incredible tool, I still don’t understand why they are using binary files for their projects. Recently, they have recognized this as a problem and they’ll try to address it in their next major version, which suggests that I’m on the right track too.

XML as an alternative

I’m considering using XML to store scene information, keeping all assets (like textures, shaders, scripts, etc.) as separate files and using some sort of source URL to reference them. Why XML? First, because it’s a text-based format. Even in the worst case, when an automatic merging tool cannot deal with a conflict, there’s always a chance to resolve it manually. Plus, we can see the changes made using a diff application.
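As a rough sketch of what such a scene file could look like, the snippet below parses a small XML scene and collects the external assets it references. The element names (`scene`, `node`, `texture`, `shader`, `script`), the `src` attribute and all file paths are assumptions invented for this example; they are not an existing Crimild schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical scene description: assets live in separate files
# and are referenced through a `src` attribute.
SCENE_XML = """\
<scene name="level1">
  <node name="crate">
    <texture src="textures/crate.png"/>
    <shader src="shaders/phong.glsl"/>
    <script src="scripts/crate.lua"/>
  </node>
  <node name="lamp">
    <texture src="textures/lamp.png"/>
    <shader src="shaders/phong.glsl"/>
  </node>
</scene>
"""

def referenced_assets(scene_xml):
    """Return the unique asset paths found in `src` attributes, in document order."""
    root = ET.fromstring(scene_xml)
    seen = []
    for element in root.iter():
        src = element.get("src")
        if src is not None and src not in seen:
            seen.append(src)
    return seen

print(referenced_assets(SCENE_XML))
```

Because the scene is plain text and every asset is its own file, an artist swapping a texture and a programmer editing a script touch different files, and a change to the scene itself shows up as a readable line-level diff.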

Secondly, by defining an XML schema, developers and artists can modify a scene without needing an editor. A simple mechanism can be implemented to reload a scene or parts of it (for example, all of its shaders) in order to minimize the turnaround time.
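One possible shape for that reload mechanism is to poll the modification times of the files a scene references and reload only the ones that changed. The `AssetWatcher` class below is a hypothetical sketch of that idea, not something that exists in Crimild; a real engine might prefer file system notifications over polling.

```python
import os

class AssetWatcher:
    """Tracks file modification times and reports which assets changed."""

    def __init__(self, paths):
        # Remember the modification time of every watched file.
        self._mtimes = {path: os.stat(path).st_mtime_ns for path in paths}

    def changed(self):
        """Return the paths modified since the last call (or since construction)."""
        modified = []
        for path, last_seen in self._mtimes.items():
            current = os.stat(path).st_mtime_ns
            if current != last_seen:
                self._mtimes[path] = current
                modified.append(path)
        return modified
```

The engine could call `changed()` once per frame, or on a key press, and re-read only the returned files, so editing a shader in a text editor would not require reloading the whole scene.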

Am I ditching the support for binary files?

Of course not. XML can be very effective during development, but when things get real and we need speed, it’s not the best choice. So I’m keeping the binary file format support, but I intend to use it only for storing scenes that are ready to be deployed in the final version of our game, since they are not supposed to change at that point. A simple tool can be implemented to grab any scene in XML along with all of its referenced assets and compile them into a single binary file.
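Such a tool could be as small as the sketch below, which packs a scene description and its assets into one length-prefixed blob and reads it back. The layout (a 4-byte `SCNB` signature, an entry count, then name and data records) is an assumption made up for this example and is not the binary format Crimild actually uses; the entries are passed in as in-memory bytes to keep the example self-contained.

```python
import struct

MAGIC = b"SCNB"  # hypothetical 4-byte signature, not Crimild's real format

def pack_scene(entries):
    """Pack {name: bytes} into one blob: magic, count, then length-prefixed records."""
    blob = bytearray(MAGIC)
    blob += struct.pack("<I", len(entries))
    for name, data in entries.items():
        encoded = name.encode("utf-8")
        blob += struct.pack("<H", len(encoded)) + encoded  # name record
        blob += struct.pack("<I", len(data)) + data        # data record
    return bytes(blob)

def unpack_scene(blob):
    """Inverse of pack_scene: rebuild the {name: bytes} mapping from a blob."""
    if blob[:4] != MAGIC:
        raise ValueError("not a packed scene")
    (count,) = struct.unpack_from("<I", blob, 4)
    offset = 8
    entries = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (data_len,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        entries[name] = blob[offset:offset + data_len]
        offset += data_len
    return entries
```

In a build step, the tool would read the XML scene and each referenced file from disk, call `pack_scene` once, and write the result next to the game executable, while the text sources stay under version control.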

Final thoughts

So, should we use a binary file format in our projects? Definitely. But we don’t have to use it all the time. During development, our focus should be on the team and their needs in order to guarantee the best outcome. We can always develop the set of scripts or tools that will automatically convert all of our assets into the best possible format for our game when it is actually needed.

5 thoughts on “The hazards of a binary file format”

  1. Eduard Gibert

    Hello. I’m interested in your project and I would like to contact you to ask some questions about your engine.

    Thanks for your time.

    1. Indeed. Although from my experience you have to start working on the binary format (or whatever your production code requires) as soon as possible. When streaming support was added to the engine I was forced to refactor most of the low-level classes in Crimild and it was extremely complex.

      The strategy that I’m considering at the moment is to implement a complete development cycle (develop->integrate assets->build->export to platform X->deploy) with the minimum amount of features (including both binary and XML formats) and then iterate as requirements start to grow. Nothing new here, but it’s the best way to ensure that everything is working.
