Thursday, March 29, 2012

Improving on shared_ptr

In this post I will describe an optimization that I recently discovered and successfully applied in my project. It is very simple to apply but yields very good results in terms of both performance and memory footprint.

Let's start with the usual way a shared_ptr is created:
#include <memory>

class Test
{
public:
    Test(int dummy) {}
};
...


std::shared_ptr<Test> pTest(new Test(1));


Now let's look step by step at what is going on here:
  1. A new object of class Test is allocated and constructed. Note that it is allocated on the heap, and each heap allocation incurs memory overhead beyond sizeof(Test).
  2. The pointer is then passed to the constructor of std::shared_ptr<Test>, which takes ownership of the newly created object.
That constructor basically does two things: it stores the pointer to the object it will hold, and it allocates (again on the heap) a shared buffer, usually called the control block, that holds the reference counters.
As you can see, that is a second memory allocation, and another hit to performance and memory footprint.

So the more efficient way of doing it is:
std::shared_ptr<Test> pTest = std::make_shared<Test>(1);


There are a few benefits of doing this:
  • There is only one memory allocation: both the reference counters and the object itself are allocated in one buffer (this is implementation-dependent, but it is the case for Visual C++ 10)
  • std::shared_ptr<T> has a move constructor (one that takes an r-value reference), so the temporary returned by make_shared can be moved into pTest without touching the reference counters, making this construct potentially more efficient
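
The single-allocation claim can be checked by counting calls to the global operator new. The following is only a sketch: the counter g_allocations, the global g_keepAlive and the helper functions are hypothetical names introduced for this illustration, and the result for make_shared depends on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical counter, bumped by the replacement global operator new below.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size)
{
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept
{
    std::free(p);
}

class Test
{
public:
    Test(int /*dummy*/) {}
};

// Holds the most recently created object so its allocations stay observable.
std::shared_ptr<Test> g_keepAlive;

// Runs `create` and returns how many times operator new was called meanwhile.
template <typename Factory>
std::size_t countAllocations(Factory create)
{
    const std::size_t before = g_allocations;
    g_keepAlive = create();
    return g_allocations - before;
}

std::size_t allocationsWithNew()
{
    return countAllocations([] { return std::shared_ptr<Test>(new Test(1)); });
}

std::size_t allocationsWithMakeShared()
{
    return countAllocations([] { return std::make_shared<Test>(1); });
}

void checkAllocationCounts()
{
    // One allocation for the object plus one for the reference counters.
    assert(allocationsWithNew() == 2);
    // Typically a single combined allocation, but this is not guaranteed.
    assert(allocationsWithMakeShared() < allocationsWithNew());
}
```

Calling checkAllocationCounts() from main is enough to run the comparison. Replacing operator new globally affects the whole program, so this belongs in a small test program rather than in production code.
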
What are the drawbacks, you might ask? There are only a few:
  • Visual C++ 10 (and the beta of VC11) does not support variadic templates, so the number of arguments passed to the object's constructor is limited to 10 (and in VC11 to 5 with default settings)
  • As far as I could tell, boost::make_shared from the Boost library does not apply this optimization: even in the latest version available now (1.49.0) it appears to allocate two separate buffers, one for the object and one for the reference counters
So if you have boost::shared_ptr's in your project and you are using Boost 1.36.0 or later (the version in which make_shared was added to the Boost library), it might still make sense to migrate to make_shared even if you won't get the benefits right away.

Is it that simple?

No, there is one small pitfall I hit when I migrated my project to std::make_shared.
When an object is created by a factory class, its constructor is typically declared private and the factory class is declared a friend.
The problem is that when the factory creates a new object with make_shared, the constructor is called from some implementation-specific function or class inside the standard library, and this call fails to compile because we can't declare that code a friend of our class.
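
To make the pitfall concrete, here is a minimal sketch with a hypothetical Widget class and WidgetFactory (neither is from my project). The friend factory can still use a plain new-expression, but code that is not a friend, which includes the internals of make_shared, cannot reach the private constructor.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical class with a private constructor and a friend factory.
class Widget
{
    friend class WidgetFactory;
    explicit Widget(int value) : m_value(value) {}  // private by default
    int m_value;
public:
    int value() const { return m_value; }
};

class WidgetFactory
{
public:
    static std::shared_ptr<Widget> Create(int value)
    {
        // Compiles: the new-expression is evaluated inside the friend class.
        return std::shared_ptr<Widget>(new Widget(value));
        // Does not compile: the constructor call would happen inside the
        // library, which is not a friend of Widget.
        // return std::make_shared<Widget>(value);
    }
};

// From any non-friend context the constructor is not usable.
static_assert(!std::is_constructible<Widget, int>::value,
              "Widget(int) is private");

void checkWidgetFactory()
{
    std::shared_ptr<Widget> w = WidgetFactory::Create(7);
    assert(w->value() == 7);
}
```

This version compiles, but it is back to two allocations per object, which is exactly what the migration was meant to avoid.
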

The solution is not very elegant (at least for my taste), but it works, and it lets this optimization be applied where it makes a real difference in performance.
Here is some sample code:

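#include <memory>

class Obj;
typedef std::shared_ptr<Obj> ObjPtr;

class Obj
{
...
private:
    struct HideMe {};  // private tag type: only Obj and its friends can name it
    friend class ObjFactory;
public:
    Obj(HideMe, int param)
    { ... }
};

class ObjFactory
{
public:
    static ObjPtr CreateObj(int param)
    {
        // Obj::HideMe is accessible here because ObjFactory is a friend of Obj
        return std::make_shared<Obj>(Obj::HideMe(), param);
    }
};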
So the constructor that ObjFactory actually uses is public, but other code cannot call it because it has no access to the hidden structure Obj::HideMe.
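
For reference, here is a complete variant of the same idea that compiles as written. The names Session, SessionFactory and checkSessionFactory are hypothetical and exist only for this sketch; the members left out of the listing above are not reconstructed here.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

class Session;  // hypothetical class used only for this sketch
typedef std::shared_ptr<Session> SessionPtr;

class Session
{
private:
    struct HideMe {};  // private tag type
    friend class SessionFactory;
public:
    // Public constructor, guarded by the private tag type.
    Session(HideMe, int id) : m_id(id) {}
    int id() const { return m_id; }
private:
    int m_id;
};

class SessionFactory
{
public:
    static SessionPtr Create(int id)
    {
        // The factory is a friend, so it can name Session::HideMe.
        return std::make_shared<Session>(Session::HideMe(), id);
    }
};

// Without the tag argument there is no matching constructor.
static_assert(!std::is_constructible<Session, int>::value,
              "Session cannot be built from an int alone");

void checkSessionFactory()
{
    SessionPtr s = SessionFactory::Create(42);
    assert(s->id() == 42);
    assert(s.use_count() == 1);
}
```

The tag object is empty, so passing it by value should cost next to nothing.
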

Happy optimizing!