Sunday, June 16, 2013

On pooling of objects that contain vectors

Recently I had an interesting exercise in de-leaking our rather complex linguistic analysis system. The system was actually quite solid and barely leaked, but under heavy load in production its memory usage grew slowly but steadily.

My first approach was to use free leak-detection tools such as Visual Leak Detector, but they didn't show any leaks: after application termination all objects were freed.

My next attempt was to analyze the growth pattern of memory allocations. I wrote a small script to dump memory allocations every five minutes with UMDH (which can be found in Debugging Tools for Windows). After running it overnight, I parsed the output with another script and dumped it into a huge Excel spreadsheet for further analysis.

I was surprised to see that only a few allocations showed a growing trend, and they all belonged to the memory allocator within std::vector.

After a few days I finally realized that it was not the size of the vectors that was growing, but their capacity. The problem was with a rather complex, heavy-to-construct object that had a few vectors in it. As the object was really complex and the algorithm had to run very fast, objects were not constructed each time but rather taken from a pool.

Of course, before an object was put back into the pool it was cleared (i.e. the vector was empty after the call to clear()), but the underlying memory buffer was not released: clear() destroys the elements and leaves the capacity as it is. This would not be a problem if the vector had a similar size in all cases; in our case, however, the average size was about 1.5, while the maximum sometimes went up to 100.

This means that eventually, after running long enough, each pooled object will contain an empty vector that still holds a buffer large enough for 100 elements.
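The effect is easy to reproduce in isolation. The following is only a minimal sketch: PooledObject, its items member and capacity_after_use are hypothetical names standing in for the real pooled object, which is far more complex. It shows that once a single large input has passed through the object, the buffer stays large no matter how small the later inputs are.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the heavy pooled object (not the real class).
struct PooledObject {
    std::vector<int> items;

    // What happened before the object went back to the pool:
    // the elements are destroyed, the buffer is kept.
    void reset() { items.clear(); }
};

// Simulates one use of a pooled object with n elements, followed by
// returning it to the pool. Returns the capacity it still holds afterwards.
std::size_t capacity_after_use(PooledObject& obj, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        obj.items.push_back(static_cast<int>(i));
    }
    obj.reset();
    return obj.items.capacity();
}
```

Calling capacity_after_use(obj, 2), then capacity_after_use(obj, 100), then capacity_after_use(obj, 2) again on the same object should report a capacity of at least 100 for the last call, even though the vector is empty and the last input was tiny.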

Of course the fix was trivial, and the technique (the "swap trick") is described in Effective STL by Scott Meyers. Instead of calling std::vector::clear() I did the following:
std::vector<MyType>().swap(m_vMyTypes);
which swaps a newly constructed (and thus empty) temporary vector with the existing one. The old buffer ends up in the temporary and is freed when the temporary is destroyed at the end of the statement. A similar effect can be observed with the std::deque collection: the standard does not require clear() to release the allocated memory there either, so whether it does depends on the implementation.
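Wrapped in a small helper, the swap trick can be sketched as follows. The name release_storage is made up for this illustration and is not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Empties v and gives up its buffer by swapping it with a temporary
// empty vector. The temporary takes ownership of the old buffer and
// frees it when it is destroyed at the end of the statement.
template <typename T>
void release_storage(std::vector<T>& v) {
    std::vector<T>().swap(v);
}
```

With a C++11 compiler, calling clear() followed by shrink_to_fit() expresses the same intent, but the standard defines shrink_to_fit() as a non-binding request, so the swap remains the more predictable option.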

All this means that you should be very careful when putting complex objects that contain STL collections into object pools.
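One possible way to keep most of the benefit of pooling is to release only the buffers that have grown unusually large and keep the small ones for reuse. This is a variation on the fix above, not what I actually did, and both the reset_for_pool name and the threshold value are assumptions that would need tuning for a real workload.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical threshold: buffers larger than this are not worth keeping
// around in the pool. The value is an assumption, not a measured number.
const std::size_t kMaxRetainedCapacity = 16;

// Prepares a vector for going back into the pool.
template <typename T>
void reset_for_pool(std::vector<T>& v) {
    if (v.capacity() > kMaxRetainedCapacity) {
        // Oversized buffer: drop it with the swap trick.
        std::vector<T>().swap(v);
    } else {
        // Typical size: keep the buffer so the next user avoids an allocation.
        v.clear();
    }
}
```

With an average size of about 1.5 elements, most objects would keep their small buffers, and only the rare object that handled a 100-element input would pay for a reallocation on its next use.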