(this)</code> to be safe. Without this, creating two shared_ptr
s that reference the same object will cause a crash or worse when one of the two shared reference counts reaches zero but the other is still alive.</p></strike>
Update: While this implementation detail was true of some of the shared_ptr
implementations I had used, it is not true of Boost nor of Microsoft's implementation, and given the feedback I've received on this post, it's also not true of GNU's libstdc++ nor Clang's libc++. That'll teach me for not double-checking the most popular implementations. The only way to get the fused allocation on these modern implementations is to use make_shared
, and using enable_shared_from_this
unnecessarily will actually just.
This at least showcases one important point: if shared ownership is needed with shared_ptr
, that use must be encoded into the type of the object. This is true of most other shared ownership semantics possible in C++. shared_ptr
simply hides this fact and lets the wrong thing be done far too easily.
Reference Bouncing
The copy constructor is a problem. This is a general theme of many C++ types, especially in light of C++11 and its move semantics. Because copy construction is implicit, copies can be made in all kinds of scenarios where it's not ideal. The specific problem with shared_ptr
is that every copy must dereference the pointer to the reference count to increment the value. Memory accesses aren't free, especially in cases where the copies are part of ownership transfer and the object has no other need to be dereferenced.
Related to the copy constructor is the destructor. Many naive pieces of code that involve copy constructors will also necessarily invoke a destructor. Where the copy constructor has to increment the reference count, the destructor must decrement it.
A safer handle type could only allow implicit moves (in the few cases moves are implicit) and require explicit copies via a .copy()
method or the like. Even before C++11, the copy constructor can be made private and unimplemented to ensure it cannot be misused.
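As a rough sketch of what such a handle might look like: the names Handle, Counted, Widget, and copy() below are made up for illustration and do not come from any library. It assumes an intrusive, non-atomic count and C++11 or later.

```cpp
#include <cassert>
#include <utility>

// Hypothetical intrusive count; deliberately not thread-safe.
struct Counted {
    int refs = 0;
};

// Hypothetical handle: moves are allowed, copies must be spelled out.
template <typename T>  // T is assumed to derive from Counted
class Handle {
public:
    Handle() = default;
    explicit Handle(T* p) : ptr_(p) { if (ptr_) ++ptr_->refs; }

    // No implicit copies: `Handle b = a;` does not compile.
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    // Moving transfers ownership without touching the count at all.
    Handle(Handle&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    Handle& operator=(Handle&& other) noexcept {
        std::swap(ptr_, other.ptr_);  // old value is released when other dies
        return *this;
    }

    ~Handle() { release(); }

    // The only way to take an additional reference.
    Handle copy() const { return Handle(ptr_); }

    T* get() const { return ptr_; }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    void release() {
        if (ptr_ == nullptr) return;
        assert(ptr_->refs > 0);
        if (--ptr_->refs == 0) delete ptr_;
        ptr_ = nullptr;
    }

    T* ptr_ = nullptr;
};

// Example payload type, also hypothetical.
struct Widget : Counted {
    int value = 7;
};
```

With this, handing a Handle<Widget> to a new owner via std::move never touches the count, while every extra reference shows up in the code as a visible .copy() call.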
Atomics Overhead
The shared_ptr
semantics require that the type be safe to use with multiple threads. This in turn requires that the reference count itself be managed using atomic integers such that increments and decrements will be synchronized properly. If the object being shared does not need to be shared between threads, the overhead of an atomic is unnecessary.
Even when a shared object is passed between threads, the ideal case is one in which the object is created on one thread and consumed on another. The transfer of the ownership can be expressed with a move rather than a copy. There is again no need for a shared object in this case. Not only can a custom shared_ptr
replacement avoid unnecessary reference modifications, it could also detect any cases where the object is bouncing between threads in debug builds and raise errors.
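A minimal sketch of that idea, assuming a plain int count plus a recorded owner thread. LocalRefCount and its member names are invented for this example; a real implementation would likely compile the owner tracking out of release builds.

```cpp
#include <cassert>
#include <thread>

// Hypothetical single-threaded reference count. The count is a plain int,
// so increments and decrements involve no atomic operations.
class LocalRefCount {
public:
    void add_ref() {
        // Debug check: fires if the count is touched from the wrong thread.
        assert(on_owner_thread() && "reference touched from another thread");
        ++count_;
    }

    // Returns true when the last reference has been dropped.
    bool release() {
        assert(on_owner_thread() && "reference touched from another thread");
        assert(count_ > 0);
        return --count_ == 0;
    }

    // Explicit hand-off: called on the receiving thread after ownership
    // has been moved there.
    void adopt_by_current_thread() { owner_ = std::this_thread::get_id(); }

    bool on_owner_thread() const {
        return owner_ == std::this_thread::get_id();
    }

    int count() const { return count_; }

private:
    int count_ = 0;
    std::thread::id owner_ = std::this_thread::get_id();
};
```

The hand-off is a single explicit call at the point of transfer. Any accidental bouncing between threads trips an assert in debug builds instead of silently paying for atomics everywhere.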
Safe Weak Pointers
One of the perceived advantages of shared_ptr
is that it enables the use of weak_ptr
. There are certainly a great many potential use cases for a safe weak reference. A Spaceships-style game for instance may want to allow a homing missile to hold a reference to the target it has locked on to but in a way that allows the target to be destroyed safely without causing the homing missile code to crash.
There are other ways of achieving this goal than weak_ptr
. Games have been using weak object handles long before even Boost's version of shared_ptr
and weak_ptr
came around.
weak_ptr
can have its own disadvantages, too. If it's used with the enable_shared_from_this
base class then the memory of the shared resource will not be freed until the last weak_ptr
is reset. Of course, as we went over before, not using enable_shared_from_this
can be problematic in other ways.
If shared_ptr
is being used, weak_ptr
is an essential tool. It's a fallacy, however, to think that weak_ptr
is essential in general or that as a result shared_ptr
is essential. Alternatives to weak_ptr
will be covered later when I go over alternatives to shared_ptr
.
More on Ownership
The problems with shared_ptr
go beyond the implementation in C++. Shared ownership semantics are problematic even in languages like C# where resources are automatically reclaimed once unused. It's useful to examine automatic GC'd languages as a case study on shared ownership problems.
Automatic GC Woes
Automatic GC systems are often billed as solving issues with memory leaks. This is not quite true, however. Sure, a large class of potential leaks is out of the picture once an automatic GC is used (especially one that doesn't use reference counting and has no issues with cyclic references). Logical errors can still result in an excess of objects - ones with live references - being kept around despite never being used, however.
A simple thought exercise that clearly illustrates the issue is to think of a list of recently used files in a GUI application. These lists are often capped to 10 entries or so; if the data structure has more than the UI will display, those entries are inaccessible. An application could use a linked list or dynamic array and simply append an element every time a file is opened. This data structure will grow without limit as the user opens files despite only a small subset of it ever actually being accessed by the program's logic. This thought exercise is very similar to a class of bugs I've actually seen in real software; the point is simply that having a GC does not remove the need for a software engineer to actually think about what they're doing. It doesn't solve all ownership problems. The problems outlined previously with the conceptual foils of shared ownership apply as much to a language like C# as they do to C++.
The debugging issue mentioned above, tracking down the owner of a shared reference that keeps an object alive, is a very real problem in C#, especially for games. Good debugging tools for C# let the developer see the memory usage characteristics of their application. They might show call stacks of when an object was allocated, but I've yet to ever see a C# memory debugger that can show all objects holding references to the target object. If there is a memory spike/exhaustion/leak of some kind, the developer's only recourse is to follow the code and find every place that could refer to the target object and painstakingly track the behavior of each.
It's important to remember that all these issues with GC and shared_ptr
are not limited to just memory. File handles, sockets, application state, and so on are all resources that must be cleaned up. The automatic GC systems are all about memory. Some support "finalization" so that objects holding a file handle or the like will properly clean up those handles. In many cases, however, it is important to have deterministic ordering to the release of these resources, or at the very least to ensure that they are cleaned up as soon as possible after the object becomes unreferenced. Leaving a socket handle alive for too long can exhaust the file/socket handle namespace - which is a much more limited resource than memory - or even cause some unintentional network behavior if the socket is left open. Internal application state can also be an issue, and non-deterministic release of certain state-managing objects could lead to a variety of bugs.
RAII
RAII (Resource Acquisition Is Initialization) is a paradigm common to C++ that deals with many of the same issues as an automatic GC. unique_ptr
and shared_ptr
are both applications of RAII.
A variety of the problems noted in the previous section are solved by using RAII. It offers deterministic release of resources at well-defined times. Languages like C# or Python have added an incomplete form of RAII by way of using
or with
statements. These statements have the shortcoming that they only operate at a scope level and hence only through a single execution tree in the code. A C# IDisposable
object's Dispose
method will not automatically invoke the Dispose
methods of any sub-objects.
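For contrast, here is a small C++ sketch of how RAII releases nested resources without any hand-written cleanup code. Resource, Connection, and open_and_close are hypothetical names; the log vector exists only so the release order can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource that records its release in a shared log.
class Resource {
public:
    Resource(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(&log) {}
    ~Resource() { log_->push_back("release " + name_); }

    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    std::string name_;
    std::vector<std::string>* log_;
};

// Owns two sub-resources as plain members. No destructor is written here:
// members are destroyed automatically, in reverse order of declaration.
class Connection {
public:
    explicit Connection(std::vector<std::string>& log)
        : socket_("socket", log), buffer_("buffer", log) {}

private:
    Resource socket_;
    Resource buffer_;
};

// Usage: the release happens at the closing brace of the inner scope,
// not at some later collection pass.
inline void open_and_close(std::vector<std::string>& log) {
    const std::size_t before = log.size();
    {
        Connection c(log);
        assert(log.size() == before);  // nothing released while c is alive
    }
    assert(log.size() == before + 2);  // both members released at scope exit
}
```

The sub-objects are cleaned up because they are members, not because anyone remembered to forward a Dispose call to them.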
RAII is not a panacea. shared_ptr
is an application of RAII for example. It's simply important to note that RAII - if used correctly - allows problems to be solved that automatic GC does not.
Alternatives to shared_ptr
Having gone over all the problems with shared_ptr
, it would be useful to understand some alternatives.
Borrowed References vs Ownership
The first thing to realize is that shared ownership is unnecessary, even where a resource may be used by multiple objects which determine a minimum lifetime of that resource. The term I've found useful to use is "borrowed reference."
A simple example would be textures in a graphics engine. A texture might be used by multiple game objects. The lower bound on the lifetime of the texture object is determined by those game objects; it wouldn't be good to free the texture while an object is still using it. The texture can and should be owned uniquely, however. A texture manager object would be an ideal candidate for ownership of all textures.
Each game object then can simply "borrow" a reference to the texture. This would be implemented with some kind of soft-owning pointer. In optimized release builds of the game it might even be a reference count almost identical to what shared_ptr
uses. The difference is that a borrowed reference is more of a request and the texture manager is still in full control of the lifetime of the texture. If the texture manager is asked to release all textures while the game is still running, the textures are released. The borrowed references in the game objects will either be invalidated (which the objects must be ready to deal with) or the handles will automatically start referring to a "missing" or "incomplete" texture object (handy for visual debugging of the game).
Game engines have long supported this kind of borrowed reference using things like unique ID handles. Instead of storing a pointer (smart or otherwise) to an object, store a numeric ID instead (possibly wrapped in a templated type to ensure type safety). Accessing the actual object is as simple as asking the owning manager object to look up the object referenced by that ID. Lifetime can be requested by notifying the manager that the resource is in use (again, possibly with a simple reference count). A smart handle type can make management of all this much easier.
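A minimal sketch of an ID-based borrowed reference, assuming a manager that never reuses slots and hands back a placeholder for released textures. Texture, Id, TextureId, and TextureManager are illustrative names, not an existing engine's API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Texture {
    std::string name;
};

// Typed wrapper so a texture ID cannot be confused with, say, a mesh ID.
template <typename Tag>
struct Id {
    std::uint32_t index;
};
using TextureId = Id<Texture>;

// The manager is the unique owner of every texture.
class TextureManager {
public:
    TextureId load(const std::string& name) {
        assert(slots_.size() < 0xFFFFFFFFu);  // IDs must fit in 32 bits
        slots_.push_back(Slot{Texture{name}, true});
        return TextureId{static_cast<std::uint32_t>(slots_.size() - 1)};
    }

    // The owner stays in control: it may release everything at any time,
    // regardless of how many IDs are still held elsewhere.
    void release_all() {
        for (Slot& s : slots_) {
            s.texture = Texture{};  // a real manager would free GPU memory here
            s.live = false;
        }
    }

    // Lookup never dangles: a released or unknown ID yields the placeholder.
    const Texture& get(TextureId id) const {
        if (id.index < slots_.size() && slots_[id.index].live)
            return slots_[id.index].texture;
        return missing_;
    }

private:
    struct Slot {
        Texture texture;
        bool live;
    };

    std::vector<Slot> slots_;
    Texture missing_{"<missing>"};
};
```

A game object stores only the TextureId and asks the manager for the texture when it needs it. Slot reuse, which would need something like a generation counter to catch stale IDs, is left out to keep the sketch short.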
In place of a request model, a borrowed reference can instead be a strong guarantee made by the engine. With this approach the objects holding a borrowed reference are guaranteed that the resource will remain available. The owner is not allowed to release the resource while a borrowed reference exists. By itself, this is little different than shared ownership or shared_ptr
; the difference is explained in the following section.
Debugging Support
An additional feature that is relatively easy to add to the lifetime management of a borrowed reference is a back reference. This is a reverse registration of the object holding a borrowed reference. Each resource has a list of the objects that have a reference to it. The list may be an actual set of pointers to the objects holding the references or it may be a simple string or just a pointer back to the handle (which in C++ could be used to find the real object with a slight bit of manual work in the debugger). I personally prefer the string approach.
The smart handle should disallow copies or implicit moves and only allow explicit moving. This move would require setting the debug name. This can be used to register the name of the object holding the reference.
The owner of the resource can dump out the list of these borrowed references on request. If borrowed references are chosen to be strong references, an error that prints out the list of reference holders can be raised when the owner of the resource tries to release it. The ownership isn't shared, it's checked.
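Here is one way the string approach could be sketched. Resource, Borrowed, can_release(), and describe_holders() are hypothetical names; the two-argument constructor stands in for the "explicit move that requires a debug name" described above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class Borrowed;

// Hypothetical resource that tracks who is borrowing it, by debug name.
class Resource {
public:
    // Checked ownership: releasing is only legal when nobody is borrowing.
    bool can_release() const { return holders_.empty(); }

    // What the owner would print when a release is refused.
    std::string describe_holders() const {
        std::string out;
        for (const std::string& h : holders_) {
            if (!out.empty()) out += ", ";
            out += h;
        }
        return out;
    }

private:
    friend class Borrowed;

    void add(const std::string& holder) { holders_.push_back(holder); }
    void remove(const std::string& holder) {
        auto it = std::find(holders_.begin(), holders_.end(), holder);
        assert(it != holders_.end());
        holders_.erase(it);
    }

    std::vector<std::string> holders_;
};

// Borrowed reference: no copies, no implicit moves.
class Borrowed {
public:
    Borrowed(Resource& res, std::string holder)
        : res_(&res), holder_(std::move(holder)) {
        res_->add(holder_);
    }

    // Explicit move only: the new holder must supply its own debug name.
    Borrowed(Borrowed&& from, std::string new_holder)
        : res_(from.res_), holder_(std::move(new_holder)) {
        if (res_) {
            res_->remove(from.holder_);
            res_->add(holder_);
        }
        from.res_ = nullptr;
    }

    Borrowed(const Borrowed&) = delete;
    Borrowed& operator=(const Borrowed&) = delete;

    ~Borrowed() {
        if (res_) res_->remove(holder_);
    }

private:
    Resource* res_;
    std::string holder_;
};
```

When can_release() returns false, the owner can log describe_holders() and immediately see which objects are keeping the resource alive.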
A simplification of a list of back-references is a hard-coded selection of possible references. A threaded job system for example needs to keep track of jobs in its queue. These jobs may belong to job groups that are used to signal the completion of a batch of jobs. In such a design, the job group can be in use by both the job system itself (each job belonging to the group holds a reference to it) and the user code (which sets up the job group). A reference count could solve the problem. The job group can have at most two kinds of owner, however. A simple alternative is to hard-code this knowledge and use two flags instead of a reference count. This has the advantage that debug code can display which of the two flags is set, indicating why the group is kept alive. In this particular example, the count of non-complete jobs could be used in place of one flag, essentially forming a pair of a reference count from the jobs themselves and a liveness flag from the user code. If the list of owners that are worth keeping track of is large, a plain reference count is certainly simpler, but the argument can be made that any objects with that kind of lifetime needs should be redesigned.
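The flag-plus-count variant might be sketched as follows. JobGroup and its members are invented for illustration, and the sketch is single-threaded; a real threaded job system would need to synchronize these fields.

```cpp
#include <cassert>
#include <string>

// Hypothetical job group kept alive by exactly two kinds of owner:
// the user code that set it up and the jobs that have not completed yet.
class JobGroup {
public:
    void add_job() { ++pending_jobs_; }

    void job_finished() {
        assert(pending_jobs_ > 0);
        --pending_jobs_;
    }

    void user_release() {
        assert(held_by_user_);
        held_by_user_ = false;
    }

    bool alive() const { return held_by_user_ || pending_jobs_ > 0; }

    // Debug aid: reports which owner is keeping the group alive.
    std::string why_alive() const {
        if (held_by_user_ && pending_jobs_ > 0) return "user code and pending jobs";
        if (held_by_user_) return "user code";
        if (pending_jobs_ > 0) return "pending jobs";
        return "nothing";
    }

private:
    bool held_by_user_ = true;  // set when the user code creates the group
    int pending_jobs_ = 0;      // doubles as the job system's liveness flag
};
```

Unlike an anonymous count of 2, why_alive() answers the debugging question directly.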
Efficiency
For an ID-based alternative to shared_ptr
, certain optimization opportunities become available.
In the example of a graphics system, it's often critical to be able to sort a list of visible objects by the graphics resources they use in order to perform batching and minimize state changes. One of the most efficient ways to do this sorting is to use an integer key for each object that encodes all resources it uses (material, shader package, mesh data, animation state, etc.). Since shared_ptr
is essentially just a wrapper around a pointer, it's difficult to impossible to squeeze more than one of them into a single machine integer. With an ID system it becomes feasible to ensure that each ID remains within some limit, making it much easier to pack all the IDs into a single machine integer (at least on a 64-bit machine).
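To make the packing concrete, here is a small sketch. The field widths, the DrawIds struct, and the choice to put the shader in the most significant bits are all assumptions picked for the example, not values from any particular engine.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit budget per resource ID; the widths sum to 64.
constexpr unsigned kShaderBits = 12;
constexpr unsigned kMaterialBits = 16;
constexpr unsigned kMeshBits = 20;
constexpr unsigned kAnimBits = 16;

// Hypothetical set of resource IDs used by one visible object.
struct DrawIds {
    std::uint32_t shader;
    std::uint32_t material;
    std::uint32_t mesh;
    std::uint32_t anim;
};

// Packs all four IDs into one integer. The field placed highest dominates
// the sort order, so objects sharing a shader end up adjacent after sorting.
inline std::uint64_t make_sort_key(const DrawIds& d) {
    // Each ID must stay within its bit budget or it would corrupt a neighbor.
    assert(d.shader < (1u << kShaderBits));
    assert(d.material < (1u << kMaterialBits));
    assert(d.mesh < (1u << kMeshBits));
    assert(d.anim < (1u << kAnimBits));

    std::uint64_t key = d.shader;
    key = (key << kMaterialBits) | d.material;
    key = (key << kMeshBits) | d.mesh;
    key = (key << kAnimBits) | d.anim;
    return key;
}
```

Sorting a list of draw calls then reduces to sorting plain 64-bit integers.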
Another potential efficiency gain with an ID system that is stable is that it could be used for lookups of resources. Files can refer to the ID instead of a string name or the like. This is a really minor gain that's unlikely to be important in many scenarios at all, but it could be useful here and there. It's particularly handy for pre-packed data sources that are intended to be copied into memory and used directly with no post-processing or decoding.