Game Development by Sean

Resource Pipelines (Part 3 - References & Handles)

A fundamental choice to make for an engine and its associated pipeline is the method by which resources will be referenced.

Reference Types in Engine Code

Inside the core of a game engine, it’s important to think about how a resource identifier is represented. Note that we’re talking about the identifier, not a handle; that is, we’re talking about the way in which an unloaded resource is identified and the “name” that will be used to locate the resource.

The obvious choice here would be to use a primitive type like a string or an integer that directly stores a resource identifier. Plenty of real-life big engines get by using string names, for example, and there’s nothing inherently wrong with that approach. Unity for instance even exposes GUIDs as strings (instead of more condensed 128-bit opaque value types) in many of its public interfaces. Strings will definitely work.

There are of course numerous potential downsides to the naive approach of using such simple types.

From a purely programming-interface perspective, statically typing resource identifiers brings with it all the usual benefits of static typing. For an example of how a lack of resource reference types can hurt, let’s look at Unity again (though Unity is by no means the only example; it’s just one of the better-known ones).

Unity uses both GUIDs and string paths as resource identifiers; the new Addressables package uses an arbitrary user-provided string. The Unity GUIDs are represented as strings rather than System.Guid objects or the like. The end result is that it’s very easy to accidentally pass a string containing a GUID to an API expecting a path, or vice versa. Code can be written defensively using variable naming, like Apps Hungarian or even just name suffixes like assetGuid vs assetPath, but this falls apart in generic code like LINQ queries where variable names may not even be present.
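To make the mix-up concrete, here is a minimal C++ sketch of the alternative: two distinct wrapper types around the same underlying string. The names AssetGuid, AssetPath, and file_extension are hypothetical and not taken from Unity or any other engine; the point is only that the compiler, rather than a naming convention, rejects a GUID passed where a path is expected.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper types; the names are illustrative only.
class AssetGuid {
public:
    explicit AssetGuid(std::string text) : text_(std::move(text)) {
        assert(!text_.empty());  // an empty identifier is treated as a bug
    }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

class AssetPath {
public:
    explicit AssetPath(std::string text) : text_(std::move(text)) {
        assert(!text_.empty());
    }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Hypothetical helper that only makes sense for paths.
// Naive on purpose: it assumes the last '.' belongs to the file name.
inline std::string file_extension(const AssetPath& path) {
    const std::string& s = path.text();
    const std::string::size_type dot = s.rfind('.');
    return dot == std::string::npos ? std::string() : s.substr(dot);
}

// Neither a GUID nor a raw string silently becomes a path.
static_assert(!std::is_convertible<AssetGuid, AssetPath>::value,
              "a GUID must not be usable where a path is expected");
static_assert(!std::is_convertible<std::string, AssetPath>::value,
              "raw strings must be wrapped explicitly");
```

With these types, a call such as file_extension(some_guid) fails to compile, and that holds even in generic code where no variable name is around to hint at the meaning.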

That alone is reason enough to shy away from using primitive types. However, that’s not even the biggest problem with using primitives for resource identifiers in the engine.

Reflection

The real problem with a lack of reference types in the engine is the inability for tooling to automatically do the right thing.

Let’s look once more at Unity. A key feature of the Unity engine is the automatic serialization of objects and their properties. This serialization is used to store component data on disk as well as used to auto-generate reasonably high-quality editing UIs for that game data. Quite a few games never need to write any extensive custom tooling within Unity since the default serialization-based editing experience is fairly complete for most use cases.

With this serialization support, however, Unity has no way to know the semantics of a particular field implicitly. By that I mean that Unity can’t know if a string property represents text to display to the user, the name of a bone on a rig, or the path to an asset. The editing experience falls back to the simplest case of presenting the string as an unconstrained text box.

If that string really were to be an asset path, Unity would not know that. The asset reference would be “invisible” to the engine’s dependency tooling, which defeats various built-in features like the ability to generate complete asset bundles.

If one looks under the hood, we see that when using (for example) a GameObject property, Unity is actually serializing the reference in a much richer fashion. The generated UI displays a nice picker that makes it easy to find objects or prefabs and link them via selection or drag-n-drop. The on-disk format is serialized with the GUID of the asset, which is more durable than the raw asset path (GUIDs are Unity’s solution to the file renaming problem as outlined in the previous installment of this article series). The same is true for various other types of properties, such as Texture2D or Mesh properties.

When we want to extend to custom resource types that aren’t easily or ideally expressed as one of Unity’s native types (or as a MonoBehaviour child), a custom solution is required. Unity offers solutions, and unsurprisingly, one of those is to use a custom type with a custom “picker” and serializer defined via reflection; with some elbow grease, that allows developers to make their own custom resources that are just as richly supported in the UI as any of Unity’s built-in types (unfortunately, Unity today does not allow for custom dependencies, but that should be coming in future updates).

It’s worth noting perhaps that the Unity Addressables package is basically just an expression of this very idea: it’s C# code that exposes new wrapper types with custom pickers and serialization support that enables rich UI, stable on-disk serialization, and some dependency tracking.

Now don’t let all that Unity talk muddle our conversation too much, though. Just about everything above applies just as well to other engines and to other languages besides C#.

A bespoke C++ engine still likely has some form of reflection system used to serialize game assets to files or to render debug or editor UI. That serialization system may not be integrated into the language in the same fashion as with C#, but such systems are still possible and are very common throughout the games industry.

In C++, serialization and reflection might require the use of a code generator, maybe some kind of IDL tool, or maybe just a lot of templates and macros and other generic programming tricks, but few industrial-grade engines are lacking a solution. With a primitive int or a vocabulary std::string, the burden still rests on the programmer to do things right, just as it does for C# code in Unity.

Optimization

Even without reflection, using explicit types can be a benefit to making code Just Work(tm) more often. Using an explicit type ensures that function overloads can automatically select the correct behavior and ensures that automatic reflection and other binding code can select the right behavior. All of that is true of C++, C#, and Rust, and even dynamic languages like Python or JavaScript can benefit from such typing in their APIs.
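As a rough illustration of overloads selecting the right behavior, the sketch below picks an editor widget from a field’s type. Everything in it is an assumption made for the example: asset_ref, widget, sign_component, and the hand-written visit_fields function, which stands in for whatever field list a real reflection system or code generator would supply.

```cpp
#include <string>
#include <vector>

// Hypothetical reference type; a plain string stands in for the stored identifier.
struct asset_ref {
    std::string id;
};

enum class widget { text_box, asset_picker };

// The field's type, not its name, decides which editor widget is used.
inline widget widget_for(const std::string&) { return widget::text_box; }
inline widget widget_for(const asset_ref&) { return widget::asset_picker; }

struct sign_component {
    std::string caption;   // text shown to the player
    asset_ref background;  // reference to another resource
};

// Hand-written stand-in for the field list a reflection system would supply.
template <typename Visitor>
void visit_fields(const sign_component& c, Visitor&& visit) {
    visit("caption", c.caption);
    visit("background", c.background);
}

// Builds the editor layout without any per-field special cases.
inline std::vector<widget> widgets_for(const sign_component& c) {
    std::vector<widget> out;
    visit_fields(c, [&out](const char*, const auto& field) {
        out.push_back(widget_for(field));
    });
    return out;
}
```

Had background been declared as a std::string, it would have quietly fallen back to a text box, which is exactly the failure mode described in the Reflection section.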

Types also allow variants in different build modes. I previously mentioned that some engines which use paths as their resource identifiers will actually use path hashes at run-time. Those engines will keep string paths around for debugging in some form. Keeping those strings in the same object as the identifier makes that debugging even easier, since the user’s IDE will typically allow viewing both the hash integer and path string together.

In such engines, using a custom reference type ensures that the string can be bundled with the hash in non-production builds while the reference types might be just plain POD integer wrappers in production builds.
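A minimal sketch of such a build-dependent type might look like the following. The RESOURCE_ID_KEEP_PATHS switch, the ResourceId class, and the choice of 64-bit FNV-1a as the hash are all assumptions made for illustration; a real engine would use its own hash and set the switch from its build system.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical switch; a real engine would set this from its build system.
#ifndef RESOURCE_ID_KEEP_PATHS
#define RESOURCE_ID_KEEP_PATHS 1
#endif

// 64-bit FNV-1a, used only as a stand-in for whatever hash an engine prefers.
constexpr std::uint64_t hash_path(std::string_view path) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (char c : path) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ull;  // FNV prime
    }
    return h;
}

class ResourceId {
public:
    explicit ResourceId(std::string_view path)
        : hash_(hash_path(path))
#if RESOURCE_ID_KEEP_PATHS
        , debug_path_(path)
#endif
    {}

    std::uint64_t hash() const { return hash_; }

    friend bool operator==(const ResourceId& a, const ResourceId& b) {
#if RESOURCE_ID_KEEP_PATHS
        // Equal hashes with different paths would mean a hash collision.
        assert(a.hash_ != b.hash_ || a.debug_path_ == b.debug_path_);
#endif
        return a.hash_ == b.hash_;
    }

#if RESOURCE_ID_KEEP_PATHS
    const std::string& debug_path() const { return debug_path_; }
#endif

private:
    std::uint64_t hash_;
#if RESOURCE_ID_KEEP_PATHS
    std::string debug_path_;  // shows up next to the hash in a debugger
#endif
};
```

Code that only constructs, compares, and hashes identifiers compiles unchanged in both modes. Keeping the path around also lets a debug build check for hash collisions, as the comparison operator does here.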

Platform could also be a factor; maybe the PC build keeps the extra debugging-friendly data while builds for more constrained mobile or console devices are kept svelte.

Using custom types makes this a lot easier. C or C++ users should note that using macros, typedefs, or type aliases can accomplish many of the same goals; those don’t enable function overloading selection but, used carefully, they can still at least make it easy to change the type of resource identifiers in different build modes. I’ll strongly suggest just using concrete types, though, as the extra richness helps both developers and the developers’ tools to catch mistakes.

Dependency Contexts

There’s one more reason to prefer more typing around resource identifiers. That reason is dependency contexts. We’ll talk about this a bit more in-depth in the next installment; the key thing to note for this topic is that not all dependencies are equal. Specifically, while some dependencies are mandatory (aka “hard”), others are merely optional (aka “soft”).

For example, a character game object probably has a dependency on a mesh and some textures (and animations, FX, sounds, etc.). The character cannot be instantiated into the game without these resources being present; otherwise the character would not be renderable, for example. These dependencies are mandatory (at run-time, anyway) for the character to be useful.

Other dependencies may be less important. The character might reference a table of possible loot that the character drops if defeated by a player. This loot doesn’t need to be loaded into memory until the moment the character is defeated, and even then only the randomly-selected loot needs to be loaded and not all the rest.

This distinction impacts various elements of tooling (and we’ll talk about that next time). For today, the key thing to keep in mind is the preloading an engine might want to do at run-time. That is, the engine might want to know that when it loads the character, it should go ahead and also start loading those mandatory graphical resources.

Such support thus either requires that developers explicitly write preloading logic for every possible resource type in the game (characters, equipment, levels, props, vehicles, etc.), or that the engine be able to automatically detect resource references and their dependency context.

Using explicit resource identifier types allows this. It allows the developer to express the difference between hard_resource<texture> and soft_resource<texture>, for example. That allows either automatic reflection-based serialization systems or even simple ol’ fashioned function overloading to enable smarter logic in the engine and hence allows the developers’ code to be both more concise and more correct.
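Here is one way the function-overloading variant could be sketched. Only the names hard_resource and soft_resource come from the text above; the string identifier, the preload_list, the collect overloads, and the character struct are assumptions made for the example, and a reflection system could call the same overloads per field instead of the hand-written preload_for.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Placeholder resource type for the example.
struct texture {};

// Hypothetical identifier wrappers; a string stands in for whatever
// representation the engine really stores.
template <typename T>
struct hard_resource {
    std::string id;
};

template <typename T>
struct soft_resource {
    std::string id;
};

// Identifiers the engine should start loading alongside the owner.
struct preload_list {
    std::vector<std::string> ids;
};

// Hard references are queued for preloading...
template <typename T>
void collect(preload_list& out, const hard_resource<T>& ref) {
    assert(!ref.id.empty());  // an unset mandatory reference is treated as a bug
    out.ids.push_back(ref.id);
}

// ...while soft references are deliberately skipped.
template <typename T>
void collect(preload_list&, const soft_resource<T>&) {}

struct character {
    hard_resource<texture> skin;       // needed before the character can render
    soft_resource<texture> loot_icon;  // only needed if loot is ever shown
};

// Written by hand here; reflection could generate the same calls per field.
inline preload_list preload_for(const character& c) {
    preload_list out;
    collect(out, c.skin);
    collect(out, c.loot_icon);
    return out;
}
```

The gameplay programmer only chooses the field type; whether a reference is preloaded follows from that choice without any per-resource-type preloading code.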

Ultimately, what I’m really saying with this whole post - and which shouldn’t be terribly surprising to any fan of static typing - is that using types allows code to express the semantics of its data and allows tools and library code to act intelligently.

Dynamic Identifiers and Scripts

A funny circumstance I’ve spotted in the wild multiple times is a game code base that follows all of the advice above, uses clear semantically-rich static types for resource identifiers, has big rich libraries and tool suites that handle dependencies and editing UIs and all that jazz, but then throws it all away in the gameplay or scripting layer.

In other words, their scripts are filled with things like:

loot = spawn_object(loot_base_resource_identifier + random(0, 10))

That might actually be in a script like a Lua file, or it may even be in the engine’s native code (e.g. C++). The latter has come in the form of things like physics_body = load_havok(get_my_name() + ".hkx") for example. Either way, it’s bad.

The underlying reason why that kind of construct is bad is that it loses all those semantics gained by using types. The meaning of loot_base_resource_identifier is essentially inscrutable to the engine or tools; they don’t know that there are 10 possibilities, or that the code might spawn one of those 10 possibilities.

When resource identifiers are present, never generate them at run-time. That hides them from the engine. Instead, if automatic generation is required, do so either in the tools at authoring time or in the pipeline at resource conversion time.

If random selection of a resource is required, use something like a table of possible resource identifiers. Require scripts or gameplay code to explicitly reference that table (which might be its own resource and hence have its own unique resource identifier). In that way, the reference to the table and the references within the table itself are both visible to the engine and supporting tooling.
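The table approach might be sketched like this. The resource_id, loot_table, pick_loot, and referenced_ids names are hypothetical, and the random roll is passed in by the caller so the example stays independent of any particular random number generator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical typed identifier; a string stands in for the real representation.
struct resource_id {
    std::string value;
};

// The table is plain data (possibly its own resource) that lists every
// possible drop explicitly.
struct loot_table {
    std::vector<resource_id> entries;
};

// Selects an existing entry from a caller-supplied roll; it never builds
// an identifier out of pieces at run-time.
inline const resource_id& pick_loot(const loot_table& table, std::size_t roll) {
    assert(!table.entries.empty());
    return table.entries[roll % table.entries.size()];
}

// What tooling gains: every resource the table can produce is enumerable.
inline std::vector<std::string> referenced_ids(const loot_table& table) {
    std::vector<std::string> out;
    for (const resource_id& id : table.entries) {
        out.push_back(id.value);
    }
    return out;
}
```

Compared with concatenating a base name and a random number, the set of possible results is now data that tools can walk, rather than something implied by script logic.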

Resource preloading may not be a concern for every engine. In the next article, we’ll cover other important reasons to care about ensuring all these resource references are visible to tooling.

Summary

Independent of the actual form resource identifiers take in an engine, prefer to use concrete distinct types to wrap those identifiers. This exposes the identifiers themselves and their other semantic context to the engine and supporting tooling and allows them to be differentiated from other strings or integers.