Smart Pointers overview
This page was created by Boris Belousov in 2011 for MDSP project. While it was published in GoogleCode, it had a great success on the Internet. I have updated things as C++11 was released since. — Pavel Kryukov
Smart Pointers can greatly simplify C++ development. Chiefly, they provide automatic memory management close to more restrictive languages (like C# or VB), but there is much more they can do.
The first part of this article, General Overview, answers three questions about smart pointers: what, why and which. If you already know what smart pointers are and are interested in using Boost smart pointers in your programs, look through the second part of the article, Smart Pointers In The Boost Library.
A Smart Pointer is a C++ object that acts like a pointer, but additionally deletes the object when it is no longer needed. Probably the most common bugs in C++ (and C) are related to pointers and memory management: dangling pointers, memory leaks, allocation failures and other joys. Having a smart pointer take care of these things can save a lot of aspirin...
The simplest example of a smart pointer is unique_ptr, which is included in the standard C++ library. You can find it in the header <memory>. Here is a simplified version of unique_ptr's implementation, to illustrate what it does:
template <class T> class unique_ptr
{
    T* ptr;
public:
    explicit unique_ptr(T* p = 0) : ptr(p) {}
    ~unique_ptr() { delete ptr; }
    T& operator*() { return *ptr; }
    T* operator->() { return ptr; }
    // ...
};
As you can see, unique_ptr is a simple wrapper around a regular pointer. It forwards all meaningful operations to this pointer (dereferencing and indirection). Its smartness is in the destructor: the destructor takes care of deleting the pointed object.
For the user of unique_ptr, this means that instead of writing:
void foo()
{
    MyClass* p(new MyClass);
    p->DoSomething();
    delete p;
}
You can write:
void foo()
{
    unique_ptr<MyClass> p(new MyClass);
    p->DoSomething();
}
And trust p to clean up after itself.
- Automatic cleanup. You don't need to remember to free the pointer, and so there is no chance you will forget about it.
- Automatic initialization. You don't need to initialize the unique_ptr to nullptr, since the default constructor does that for you.
- Dangling pointers. A common pitfall of regular pointers is the dangling pointer: a pointer that points to an object that is already deleted.
There are several possible strategies for handling the statement q = p, where p and q are smart pointers, to avoid dangling pointers:
- Nullify p when its value is handed to q (unique_ptr works this way, but only through an explicit q = std::move(p); a plain q = p does not compile).
- Create a new copy of the object pointed to by p, and have q point to this copy.
- Ownership transfer: Let both p and q point to the same object, but transfer the responsibility for cleaning up ("ownership") from p to q.
- Reference counting: Maintain a count of the smart pointers that point to the same object, and delete the object when this count becomes zero.
- Reference linking: The same as reference counting, only instead of a count, maintain a circular doubly linked list of all smart pointers that point to the same object.
- Copy on write: Use reference counting or linking as long as the pointed object is not modified. When it is about to be modified, copy it and modify the copy.
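Two of these strategies can be tried directly with the standard pointers. The following is a minimal sketch, not a complete survey: Widget and the two function names are made up for the illustration, and only the "nullify" and "reference counting" strategies are shown.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type used only for this illustration.
struct Widget { int value = 42; };

// Strategy "nullify p": unique_ptr hands the object over on an explicit move.
// Returns true if p is empty and q owns the object afterwards.
bool move_nullifies_source()
{
    std::unique_ptr<Widget> p(new Widget);
    std::unique_ptr<Widget> q;
    q = std::move(p);              // a plain "q = p;" would not compile
    return p == nullptr && q != nullptr && q->value == 42;
}

// Strategy "reference counting": shared_ptr lets p and q share the object.
// Returns the number of owners after the copy.
long copy_shares_object()
{
    std::shared_ptr<Widget> p(new Widget);
    std::shared_ptr<Widget> q;
    q = p;                         // both now point to the same Widget
    assert(p.get() == q.get());
    return p.use_count();
}
```

In both cases no pointer is left dangling: either the source is emptied, or the object stays alive until the last owner is gone.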
Let's take another look at this simple example:
void foo()
{
    MyClass* p(new MyClass);
    p->DoSomething();
    delete p;
}
What happens if DoSomething() throws an exception? All the lines after it will not get executed and p will never get deleted! If we're lucky, this leads only to memory leaks. However, MyClass may free some other resources in its destructor (file handles, threads, transactions, COM references, mutexes) and so not calling it may cause severe resource locks.
If we use a smart pointer, however, p will be cleaned up whenever it gets out of scope, whether it was during the normal path of execution or during the stack unwinding caused by throwing an exception.
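The difference can be observed with a small experiment. This is a sketch under stated assumptions: Tracked, its instance counter and the helper alive_after are hypothetical names introduced here, and DoSomething() always throws to force the error path.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical class: counts live instances so the cleanup can be observed.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
    void DoSomething() { throw std::runtime_error("failure"); }
};
int Tracked::alive = 0;

void foo_raw()
{
    Tracked* p(new Tracked);
    p->DoSomething();   // throws, so the next line is skipped
    delete p;
}

void foo_smart()
{
    std::unique_ptr<Tracked> p(new Tracked);
    p->DoSomething();   // throws; stack unwinding still destroys p
}

// Runs f, swallows the exception and reports how many objects are still alive.
int alive_after(void (*f)())
{
    Tracked::alive = 0;
    try { f(); } catch (const std::runtime_error&) {}
    return Tracked::alive;
}
```

Here alive_after(foo_raw) reports one leaked object, while alive_after(foo_smart) reports none.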
The simplest garbage collection scheme is reference counting or reference linking, but it is quite possible to implement more sophisticated garbage collection schemes with smart pointers. For more information see the garbage collection FAQ.
A common strategy for using memory more efficiently is copy on write (COW). This means that the same object is shared by many COW pointers as long as it is only read and not modified. When some part of the program tries to modify the object ("write"), the COW pointer creates a new copy of the object and modifies this copy instead of the original object. Before C++11, the standard string class was commonly implemented using COW semantics (see the <string> header); the C++11 requirements on std::string effectively rule COW out, so current implementations copy eagerly. With a COW string, the following holds:
string s("Hello");
string t = s; // t and s point to the same buffer of characters
t += " there!"; // a new buffer is allocated for t before
// appending " there!", so s is unchanged.
The C++ standard library includes a set of containers and algorithms known as the standard template library (STL). STL is designed to be generic (can be used with any kind of object) and efficient (does not incur time overhead compared to alternatives). To achieve these two design goals, STL containers store their objects by value. This means that if you have an STL container that stores objects of class Base, it cannot store objects of classes derived from Base. To have a collection of objects from different classes you can use a collection of pointers and since the smart pointer automatically cleans up after itself, there is no need to manually delete the pointed objects.
NOTE: STL containers may copy and delete their elements behind the scenes (for example, when they resize themselves). Therefore, all copies of an element must be equivalent, or the wrong copy may be the one to survive all this copying and deleting. This means that pointers which transfer ownership on copy, such as the old std::auto_ptr (deprecated in C++11, removed in C++17), cannot be used within STL containers. The standard unique_ptr is move-only, so it can be stored in a container, but its elements can only be moved, never copied. For more info about this issue, see C++ Guru of the Week #25.
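The by-value limitation and the pointer-based workaround can be sketched as follows. Base and Derived are hypothetical classes, and shared_ptr is chosen here only as one reference-counted option that is safe to copy inside a container.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical class hierarchy used only for this illustration.
struct Base
{
    virtual ~Base() {}                          // virtual: objects are deleted through Base*
    virtual std::string name() const { return "Base"; }
};
struct Derived : Base
{
    std::string name() const override { return "Derived"; }
};

// Storing by value slices the Derived part away.
std::string stored_by_value()
{
    std::vector<Base> v;
    v.push_back(Derived());                     // only the Base subobject is copied
    return v[0].name();
}

// Storing smart pointers keeps the dynamic type, and cleanup stays automatic.
std::string stored_by_pointer()
{
    std::vector<std::shared_ptr<Base>> v;
    v.push_back(std::make_shared<Derived>());   // shared_ptr<Derived> converts to shared_ptr<Base>
    return v[0]->name();
}
```

The first function yields "Base" because the element was sliced; the second yields "Derived".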
The standard unique_ptr is the simplest smart pointer, and it is also, well, standard. If there are no special requirements, you should use it.
Although you can use unique_ptr as a class member (and save yourself the trouble of freeing objects in the destructor), it makes the enclosing class non-copyable, and moving one object to another nullifies the pointer in the source; using a copied pointer instead of unique_ptr solves this problem.
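The standard library has no ready-made "copied pointer", so the behaviour is usually written by hand. One possible sketch, assuming a copyable pointee, is shown below; Engine, Car and get_engine are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource held through a pointer.
struct Engine { int power = 100; };

// A class that owns an Engine and deep-copies it when the class is copied.
class Car
{
    std::unique_ptr<Engine> engine;   // still frees the Engine automatically
public:
    Car() : engine(new Engine) {}
    Car(const Car& other) : engine(new Engine(*other.engine)) {}
    Car& operator=(const Car& other)
    {
        if (this != &other)
            engine.reset(new Engine(*other.engine));
        return *this;
    }
    Engine& get_engine() { return *engine; }
};

// Returns true if modifying the copy leaves the original untouched.
bool copies_are_independent()
{
    Car a;
    Car b = a;                        // deep copy: b gets its own Engine
    b.get_engine().power = 250;
    return a.get_engine().power == 100 && b.get_engine().power == 250;
}
```

Each Car keeps its own Engine, so neither copy is left with an empty pointer.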
As explained above, using garbage-collected pointers with STL containers lets you store objects from different classes in the same container.
It is important to consider the characteristics of the specific garbage collection scheme used. Specifically, reference counting/linking can leak in the case of circular references (i.e., when the pointed object itself contains a counted pointer, which points to an object that contains the original counted pointer).
Reference counting might be:
- Intrusive - the pointed object itself contains the count.
- Non-intrusive - requires an allocation of a count for each counted object.
Reference linking does not require any changes to be made to the pointed objects, nor does it require any additional allocations. A reference linked pointer takes a little more space than a reference counted pointer - just enough to store one or two more pointers.
If you have objects that take a lot of space, you can save some of this space by using COW pointers. This way, an object will be copied only when necessary, and shared otherwise. The sharing is implemented using some garbage collection scheme, like reference counting or linking.
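A COW pointer can be sketched on top of shared_ptr, which supplies the reference counting. CowPtr, read(), write() and shares_with() are names invented for this illustration, and the sketch is not thread safe.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Minimal copy-on-write pointer sketch (hypothetical class, not thread safe).
template <class T>
class CowPtr
{
    std::shared_ptr<T> ptr;
public:
    explicit CowPtr(T* p) : ptr(p) {}

    // Read access never copies.
    const T& read() const { return *ptr; }

    // Write access clones the object first if it is shared.
    T& write()
    {
        if (ptr.use_count() > 1)
            ptr = std::make_shared<T>(*ptr);
        return *ptr;
    }

    // True if both pointers currently refer to the same object.
    bool shares_with(const CowPtr& other) const { return ptr == other.ptr; }
};

bool cow_demo()
{
    CowPtr<std::string> s(new std::string("Hello"));
    CowPtr<std::string> t = s;                 // shared, no copy yet
    bool shared_before = s.shares_with(t);
    t.write() += " there!";                    // t gets its own copy here
    return shared_before && !s.shares_with(t)
        && s.read() == "Hello" && t.read() == "Hello there!";
}
```

The design choice is to make the caller state the intent: read() is cheap and shared, while write() pays for the copy only when another owner exists.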
For this: | Use that: |
---|---|
local variables | unique_ptr |
class members | copied pointer |
STL containers | garbage collected pointer (e.g. reference counting/linking) |
big objects | copy on write |
Since C++11, STL provides the following smart pointer implementations. Analogs of them can be found in Boost.
shared_ptr<T> | a pointer to T using a reference count to determine when the object is no longer needed. shared_ptr is the generic, most versatile smart pointer on offer |
---|---|
weak_ptr<T> | a weak pointer, working in conjunction with shared_ptr to avoid circular references |
Boost provides the following smart pointer implementations:
intrusive_ptr<T> | another reference counting pointer. It provides better performance than shared_ptr, but requires the type T to provide its own reference counting mechanism |
---|---|
unique_ptr is the simplest smart pointer provided by STL (boost::scoped_ptr is its Boost counterpart). It guarantees automatic deletion when the pointer goes out of scope. Sample:
#include <memory>
void Sample1_ScopedPtr()
{
    std::unique_ptr<CSample> samplePtr(new CSample);
    if (!samplePtr->Query())
    {
        // just some function...
        return;
    }
    samplePtr->Use();
}
use for | automatic deletion of local objects or class members1, Delayed Instantiation, implementing PIMPL and RAII (see below) |
---|---|
not good for | element in an STL container, multiple pointers to the same object |
performance | unique_ptr adds little (if any) overhead to a "plain" pointer, it performs |
1 For this purpose, boost::scoped_ptr (or a const std::unique_ptr) is more expressive than a plain std::unique_ptr: it indicates that ownership transfer is not intended or allowed.
The "normal" reference counted pointer provided by STL is shared_ptr (the name indicates that multiple pointers can share the same object). Sample:
#include <cstdio>
#include <memory>
void Sample2_Shared()
{
    // (A) create a new CSample instance with one reference
    std::shared_ptr<CSample> mySample(new CSample);
    // use_count() returns long, hence %ld
    printf("The Sample now has %ld references\n", mySample.use_count()); // should be 1
    // (B) assign a second pointer to it:
    std::shared_ptr<CSample> mySample2 = mySample; // should be 2 refs by now
    printf("The Sample now has %ld references\n", mySample.use_count());
    // (C) set the first pointer to NULL
    mySample.reset();
    printf("The Sample now has %ld references\n", mySample2.use_count()); // 1
    // the object allocated in (A) is deleted automatically
    // when mySample2 goes out of scope
}
Here are some use cases for a shared_ptr:
- use in containers
- using the pointer-to-implementation idiom (PIMPL)
- Resource-Acquisition-Is-Initialization idiom (RAII)
- Separating Interface from Implementation
- shared_ptr<T> works with an incomplete type: when declaring or using a shared_ptr<T>, T may be an "incomplete type". E.g., you make only a forward declaration using class T; but do not yet define what T really looks like. Only where you dereference the pointer does the compiler need to know "everything".
- shared_ptr<T> works with any type: there are virtually no requirements towards T (such as deriving from a base class).
- shared_ptr<T> supports a custom deleter: so you can store objects that need a different cleanup than delete p. For more information, see the std documentation.
- Implicit conversion: if a type U* can be implicitly converted to T* (e.g., because T is a base class of U), a shared_ptr<U> can also be converted to shared_ptr<T> implicitly.
- shared_ptr is thread safe in a limited sense: the reference count is updated atomically, so distinct shared_ptr instances that share an object can be copied and destroyed from different threads. Access to the pointed object, and concurrent modification of the same shared_ptr instance, still need synchronization. This is a design choice rather than an advantage; however, it is a necessity in multithreaded programs, and the overhead is low.
- Works on many platforms, proven and peer-reviewed, the usual things.
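Two of the features above, implicit conversion and the custom deleter, can be illustrated briefly. This is only a sketch: Shape, Square and the function names are hypothetical, and the deleter merely records that it ran instead of releasing a real resource.

```cpp
#include <cassert>
#include <memory>

// Hypothetical hierarchy used only for this illustration.
struct Shape  { virtual ~Shape() {} virtual int sides() const { return 0; } };
struct Square : Shape { int sides() const override { return 4; } };

// Takes the base-class pointer type.
int sides_of(std::shared_ptr<Shape> s) { return s->sides(); }

// Implicit conversion: shared_ptr<Square> converts to shared_ptr<Shape>.
int conversion_demo()
{
    std::shared_ptr<Square> sq(new Square);
    return sides_of(sq);
}

// Custom deleter: called instead of "delete p" when the last owner goes away.
bool deleter_demo()
{
    bool released = false;
    {
        int resource = 7;   // not heap allocated, so "delete" would be wrong here
        std::shared_ptr<int> p(&resource, [&released](int*) { released = true; });
        assert(*p == 7);
    }                       // p is destroyed here and runs the deleter
    return released;
}
```

The same deleter mechanism is what lets a shared_ptr manage handles that must be closed with a dedicated function rather than deleted.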
Many container classes, including the STL containers, require copy operations (e.g., when inserting an existing element into a list, vector, or other container). However, when these copy operations are expensive (or are even unavailable), the typical solution is to use a container of pointers. Sample with the shared_ptr:
typedef std::shared_ptr<CMyLargeClass> CMyLargeClassPtr;
std::vector<CMyLargeClassPtr> vec;
vec.push_back( CMyLargeClassPtr(new CMyLargeClass("bigString")) );
If you use shared_ptr, the elements get destroyed automatically when the vector is destroyed - unless, of course, there's another smart pointer still holding a reference. Let's have a look at sample 3:
void Sample3_Container()
{
    typedef std::shared_ptr<CSample> CSamplePtr;
    // (A) create a container of CSample pointers:
    std::vector<CSamplePtr> vec;
    // (B) add three elements
    vec.push_back(CSamplePtr(new CSample));
    vec.push_back(CSamplePtr(new CSample));
    vec.push_back(CSamplePtr(new CSample));
    // (C) "keep" a pointer to the second:
    CSamplePtr anElement = vec[1];
    // (D) destroy the vector:
    vec.clear();
    // (E) the second element still exists
    anElement->Use();
    printf("done. cleanup is automatic\n");
    // (F) anElement goes out of scope, deleting the last CSample instance
}
A few things can go wrong with smart pointers (most prominent is an invalid reference count, which deletes the object too early, or not at all). The boost implementation promotes safety, making all potentially dangerous operations explicit. So, with a few rules to remember, you are safe.
Rule 1: Assign and keep - Assign a newly constructed instance to a smart pointer immediately, and then keep it there. The smart pointer(s) now own the object, you must not delete it manually, nor can you take it away again. This helps to not accidentally delete an object that is still referenced by a smart pointer, or end up with an invalid reference count.
Rule 2: a _ptr<T> is not a T* - More correctly, there are no implicit conversions between a T* and a smart pointer to type T.
This means:
- When creating a smart pointer, you explicitly have to write ..._ptr<T> myPtr(new T)
- You cannot assign a T* to a smart pointer
- Prefer ptr.reset() over ptr = NULL for clearing a smart pointer (the std smart pointers also accept ptr = nullptr).
- To retrieve the raw pointer, use ptr.get(). Of course, you must not delete this pointer, or use it after the smart pointer it comes from is destroyed, reset or reassigned. Use get() only when you have to pass the pointer to a function that expects a raw pointer.
- You cannot pass a T* to a function that expects a _ptr<T> directly. You have to construct a smart pointer explicitly, which also makes it clear that you transfer ownership of the raw pointer to the smart pointer. (See also Rule 4)
- There is no generic way to find the smart pointer that "holds" a given raw pointer. However, the boost: smart pointer programming techniques illustrate solutions for many common cases.
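The points of Rule 2 look roughly like this with std::shared_ptr. Item, read_id and owners are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type used only for this illustration.
struct Item { int id = 5; };

// Function expecting a raw pointer; it must not delete or store it.
int read_id(const Item* item) { return item->id; }

// Function expecting a smart pointer; reports the owner count it sees.
long owners(std::shared_ptr<Item> item) { return item.use_count(); }

bool rule2_demo()
{
    // std::shared_ptr<Item> bad = new Item;  // does not compile: the constructor is explicit
    std::shared_ptr<Item> ptr(new Item);      // explicit construction

    int id = read_id(ptr.get());              // get() lends the raw pointer
    long n = owners(ptr);                     // passed by value: two owners inside the call

    ptr.reset();                              // releases the object
    return id == 5 && n == 2 && !ptr;
}
```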
Rule 3: No circular references - If you have two objects referencing each other through a reference counting pointer, they are never deleted. Boost provides weak_ptr to break such cycles (see below).
Rule 4: No temporary shared_ptr - Do not construct a temporary shared_ptr to pass it to a function; always use a named (local) variable. (This makes your code safe in case of exceptions. See the boost: shared_ptr best practices for a detailed explanation.)
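A sketch of the situation Rule 4 guards against, assuming a compiler that follows pre-C++17 evaluation rules. Task, compute_weight and schedule are hypothetical; std::make_shared is shown as a commonly used alternative that avoids the raw new altogether.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type used only for this illustration.
struct Task { int priority = 3; };

int compute_weight() { return 10; }   // stands in for any call that might throw

int schedule(std::shared_ptr<Task> task, int weight)
{
    return task->priority * weight;
}

int rule4_demo()
{
    // Risky before C++17:
    //   schedule(std::shared_ptr<Task>(new Task), compute_weight());
    // The compiler could run "new Task", then compute_weight(), and only then
    // build the shared_ptr, leaking the Task if compute_weight() threw.

    std::shared_ptr<Task> task(new Task);   // the named variable owns the Task first
    int a = schedule(task, compute_weight());

    // std::make_shared never exposes a raw pointer to begin with.
    int b = schedule(std::make_shared<Task>(), compute_weight());
    return a + b;
}
```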
A strong reference keeps the referenced object alive (i.e., as long as there is at least one strong reference to the object, it is not deleted). std::shared_ptr acts as a strong reference. In contrast, a weak reference does not keep the object alive, it merely references it as long as it lives.
Note that a raw C++ pointer in this sense is a weak reference. However, if you have just the pointer, you have no ability to detect whether the object still lives.
std::weak_ptr<T> is a smart pointer acting as a weak reference. When you need it, you can request a strong (shared) pointer from it with lock(). (This can be NULL if the object was already deleted.) Of course, the strong pointer should be released immediately after use.
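Both uses of weak_ptr, observing an object without owning it and breaking a reference cycle, can be sketched as follows. Node and the two function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical doubly linked node: strong forward link, weak back link.
struct Node
{
    std::shared_ptr<Node> next;   // strong: keeps the next node alive
    std::weak_ptr<Node>   prev;   // weak: does not keep the previous node alive
};

// Requests a strong pointer while the object lives, then checks it after deletion.
bool lock_demo()
{
    std::weak_ptr<Node> weak;
    {
        std::shared_ptr<Node> strong(new Node);
        weak = strong;
        std::shared_ptr<Node> locked = weak.lock();   // object alive: non-null
        if (!locked) return false;
    }                                                 // last strong reference is gone
    return weak.expired() && weak.lock() == nullptr;
}

// Two nodes reference each other, yet both are deleted at the end of the scope.
bool cycle_demo()
{
    std::weak_ptr<Node> watch;
    {
        std::shared_ptr<Node> a(new Node);
        std::shared_ptr<Node> b(new Node);
        a->next = b;
        b->prev = a;              // weak back link, so there is no ownership cycle
        watch = a;
    }
    return watch.expired();
}
```

If prev were a shared_ptr as well, the two nodes would keep each other alive and cycle_demo would report that the object still exists.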
shared_ptr offers quite some services beyond a "normal" pointer. This has a little price: the size of a shared pointer is larger than a normal pointer, and for each object held in a shared pointer, there is a tracking object holding the reference count and the deleter. In most cases, this is negligible.
intrusive_ptr provides an interesting tradeoff: it provides the "lightest possible" reference counting pointer, if the object implements the reference count itself. This isn't so bad after all, when designing your own classes to work with smart pointers; it is easy to embed the reference count in the class itself, to get less memory footprint and better performance.
To use a type T with intrusive_ptr, you need to define two functions: intrusive_ptr_add_ref and intrusive_ptr_release. The following sample shows how to do that for a custom class:
#include "boost/intrusive_ptr.hpp"
// forward declarations
class CRefCounted;
namespace boost
{
void intrusive_ptr_add_ref(CRefCounted * p);
void intrusive_ptr_release(CRefCounted * p);
};
// My Class
class CRefCounted
{
private:
long references;
friend void ::boost::intrusive_ptr_add_ref(CRefCounted * p);
friend void ::boost::intrusive_ptr_release(CRefCounted * p);
public:
CRefCounted() : references(0) {} // initialize references to 0
};
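// class specific addref/release implementation
// on compilers without argument-dependent lookup the two function overloads
// must be in the boost namespace; otherwise they may instead be defined in
// the namespace of CRefCounted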
namespace boost
{
inline void intrusive_ptr_add_ref(CRefCounted * p)
{
// increment reference count of object *p
++(p->references);
}
inline void intrusive_ptr_release(CRefCounted * p)
{
// decrement reference count, and delete object when reference count reaches 0
if (--(p->references) == 0)
delete p;
}
} // namespace boost
unique_ptr<T[]> and shared_ptr<T[]> are almost identical to unique_ptr and shared_ptr - only they act like pointers to arrays, i.e., like pointers that were allocated using operator new[]. They provide an overloaded operator[]. Note that neither of them knows the length initially allocated. unique_ptr<T[]> is available since C++11, while shared_ptr<T[]> requires C++17.
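A short sketch of the array form, using only unique_ptr<T[]> so that it compiles as C++11. The function name sum_of_squares is made up for the illustration; note that the length has to be tracked by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Fills an array owned by unique_ptr<int[]> and returns the sum of its elements.
// The length n is tracked by the caller; the pointer does not store it.
int sum_of_squares(std::size_t n)
{
    std::unique_ptr<int[]> data(new int[n]);   // released with delete[]
    for (std::size_t i = 0; i < n; ++i)
        data[i] = static_cast<int>(i * i);     // overloaded operator[]

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}
```

When the length must travel with the data, std::vector is usually the simpler choice.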
Smart pointers are useful tools for writing safe and efficient code in C++. Like any tool, they should be used with appropriate care, thought and knowledge.
The main source of information: http://www.boost.org/doc/libs/1_46_1/libs/smart_ptr/smart_ptr.htm
About smart pointers in general: http://ootips.org/yonat/4dev/smart-pointers.html
About implementation of smart pointers in details: http://www.davethehat.com/articles/smartp.htm
About the Boost smart pointers: http://www.codeproject.com/KB/stl/boostsmartptr.aspx