11/06/2012

11-06-12 - Using your own malloc with operator new

In Oodle, I use my own allocator, and I wish to be able to still construct & destruct classes. (Oodle's public API is C only, but I use C++ internally).

The traditional way to do this is to write your own "operator new" implementation which will link in place of the library implementation. This way sucks for various reasons. The important one to me is that it changes all the news of any other statically-linked code, which is just not an okay thing for a library to do. You may want to have different mallocs for different purposes; the whole idea of a single global allocator is kind of broken in the modern world.

(the presence of global state in the C standard lib is part of what makes C code so hard to share. The entire C stdlib should be a passed-in vtable argument. Perhaps more on this in a later post.)

Anyway, what I want is a way to do a "new" without interfering with client code or other libraries. It's relatively straightforward (*), but there are a few little details that took me a while to get right, so here they are.

(* = ADDENDUM = not straightforward at all if multiple-inheritance is used and deletion can be done on arbitrary parts of the MI class)

//==================================================================

/*

subtlety : just calling placement new can be problematic; it's safer to make an explicit selection
of placement new.  This is how we call the constructor.

*/

enum EnumForPlacementNew { ePlacementNew };

// explicit access to placement new when there's ambiguity :
//  if there are *any* custom overrides to new() then placement new becomes ambiguous   
inline void* operator new   (size_t, EnumForPlacementNew, void* pReturn) { return pReturn; }
inline void  operator delete(void*,  EnumForPlacementNew, void*) { }

#ifdef __STRICT_ALIGNED
// subtlety : some stdlibs have a non-standard operator new with alignment (second arg is alignment)
//  note that the alignment is not getting passed to our malloc here, so you must ensure you are
//    getting it in some other way
inline void* operator new   (size_t , size_t, EnumForPlacementNew, void* pReturn) { return pReturn; }
#endif

// "MyNew" macro is how you do construction

/*

subtlety : trailing the arg list off the macro means we don't need to do this kind of nonsense :

    template <typename Entry,typename t_arg1,typename t_arg2,typename t_arg3,typename t_arg4,typename t_arg5,typename t_arg6,typename t_arg7,typename t_arg8,typename t_arg9>
    static inline Entry * construct(Entry * pEntry, t_arg1 arg1, t_arg2 arg2, t_arg3 arg3, t_arg4 arg4, t_arg5 arg5, t_arg6 arg6, t_arg7 arg7, t_arg8 arg8, t_arg9 arg9)

*/

//  Stuff * ptr = MyNew(Stuff) (constructor args); 
//  eg. for void args :
//  Stuff * ptr = MyNew(Stuff) ();
#define MyNew(t_type)   new (ePlacementNew, (t_type *) MyMalloc(sizeof(t_type)) ) t_type 

// call the destructor :
template <typename t_type>
static inline t_type * destruct(t_type * ptr)
{
    RR_ASSERT( ptr != NULL );
    ptr->~t_type();
    return ptr;
}

// MyDelete is how you kill a class

/*

subtlety : I like to use a Free() which takes the size of the object.  This is a big optimization
for the allocator in some cases (or lets you not store the memory size in a header of the allocation).
*But* if you do this, you must ensure that you don't use sizeof() if the object is polymorphic.
Here I use MSVC's nice __has_virtual_destructor() extension to detect if a type is polymorphic.

*/

template <typename t_type>
void MyDeleteNonVirtual(t_type * ptr)
{
    RR_ASSERT( ptr != NULL );
    #ifdef _MSC_VER
    RR_COMPILER_ASSERT( ! __has_virtual_destructor(t_type) );
    #endif
    destruct(ptr);
    MyFree_Sized((void *)ptr,sizeof(t_type));
}

template <typename t_type>
void MyDeleteVirtual(t_type * ptr)
{
    RR_ASSERT( ptr != NULL );
    destruct(ptr);
    // can't use size :
    MyFree_NoSize((void *)ptr);
}

#ifdef _MSC_VER

// on MSVC , MyDelete can select the right call at compile time

template <typename t_type>
void MyDelete(t_type * ptr)
{
    if ( __has_virtual_destructor(t_type) )
    {
        MyDeleteVirtual(ptr);
    }
    else
    {
        MyDeleteNonVirtual(ptr);
    }
}

#else

// must be safe and use the polymorphic call :

#define MyDelete    MyDeleteVirtual

#endif

and the end result is that you can do :

    foo * f = MyNew(foo) ();

    MyDelete(f);

and you get normal construction and destruction but with your own allocator, and without polluting (or depending on) the global linker space. Yay.

11 comments:

Arseny Kapoulkine said...

This does not work with multiple inheritance, i.e.

struct A { virtual ~A() {} char a; };
struct B { virtual ~B() {} char b; };
struct C: A, B { };

B* b = MyNew(C)();
MyFree(b);

As far as I know, unless you have RTTI and are willing to use dynamic_cast (ugh), you have to resort to overriding the operator delete, locally (inside the class) or globally. For global overrides, at one company we used a complicated system to be able to specify a per-allocation pool (i.e. MyNew(pool, type) + MyDelete(pool, ptr)) - it stored the pool pointer on a thread-local deallocation stack and then called the global operator delete, which inspected the stack to get the pool pointer (the stack is needed since MyDelete operations can cascade in case they are called from the dtor).

God, I hate C++.

Arseny Kapoulkine said...

dynamic_cast was meant to be dynamic_cast < void* >

Cat said...

If it really fails with MI, can you add a static_assert to detect MI use?

cbloom said...

Yeah it definitely fails with MI. The problem is you need to be able to find the base allocation pointer from any of the types in the class hierarchy.

That didn't even occur to me cuz I just never use MI any more.

I believe it's fundamentally impossible to fix without adding meta-language.

The problem is when you are given a B* to free, there's nothing at compile-time that tells you that B* is a standalone object, or part of a multiple inheritance.

It's easy to fix with meta-language. Two possibilities :

1. Require that all polymorphic objects have a "get_base" virtual member function that provides the allocation address.

2. Make MyDeleteVirtual take an extra template parameter in which the caller explicitly specifies the base type.

3. The only time I've used MI in a large code base, we required that every polymorphic MI class derive from a virtual base Root class, so that you could always cast to Root to get a consistent address for an object.

4. When the destructors run, make them register their pointer on a stack.

5. (abusing the language) You should be able to grab the address of the destructor from your vtable and thus find your most-derived type.

It would be nice to at least detect MI at runtime and assert. (the underlying allocator can probably do this)

Anonymous said...

If you want to do any optimization of your allocator based on passing in the block size on free, but you don't want to HAVE to pass in the block size on free, don't you need to call a different malloc depending on whether you will free it with the block size or not? Or are you sucking up an O(N) free in the no-size case?

cbloom said...

When no size is provided I do a hash table lookup, so technically O(1), but yeah.

I presume that small/non-polymorphic allocs are common (and should be fast and minimum overhead), and large/polymorphic are rare.

Surely it would be better to just have two completely different allocators, one for small blocks and one for large/polymorphic, and make the caller use a totally different malloc/free pair for those two cases.

In my personal code I like to do mallocs by reserving chunks of virtual address range, but I figured that was too messy in a library.

  said...

new [], delete []?

  said...
This comment has been removed by the author.
  said...

by the way (off topic) is there a way to get blogger to email you when somebody replies to you or comments on one of your blog posts?

cbloom said...

" new [], delete []? "

Assuming non-polymorphic, those are straightforward.

It's up to you whether or not you want to shift the pointer and store the count before the head of the array. (that's needed if you want delete [] to work without passing in the count).

Personally I don't do that, I pass in the size to MyDeleteArray().

Actually I almost never new/delete arrays, if I have an array I just use some kind of template array container.

cbloom said...

You use something like :

template < typename t_type >
static inline t_type * construct(t_type * pArray,const size_t size)
{
for(size_t i=0;i<size;i++)
new (ePlacementNew, pArray+i) t_type;

return pArray;
}

old rants