Friday, October 14, 2011

Curiously recurring template Pattern, CRTP - Static Polymorphism in C++

Let us start this discussion with some minimalistic C++ classes, which doesn't convey any meaning but serves the purpose of discussion.

class Base
{
public:
void PrintMe()
{
std::cout << "Print: Base" << std::endl;
}
};

class Derived : public Base
{
public:
void PrintMe()
{
std::cout << "Print: Derived" << std::endl;
}
};

void TestPolymorphism()
{
Base *ptrBase = new Derived();
ptrBase->PrintMe(); // This would print "Print: Base"
}

In the above code, ptrBase is initialized with an object of Derived. Ideally, PrintMe should have printed "Print: Derived". But it would print "Print: Base". Any C++ programmer could identify the issue; early binding according to the type. It means when we say, ptrBase->PrintMe(), compiler would check whether the function called is virtual or not. If it is not virtual, it would bound this call to the address of function defined in the calling type; In this case calling type is Base.

Dynamic binding can be achieved using "Virtual" specifier, in method declaration. When we mark a function with Virtual, compiler wouldn't make early binding; it would resolve the same using V-Table. Every class which has a virtual method, would have a table of function pointers. And a hidden pointer for the table, will be inserted into the class and will be initialized during construction of objects of the class. So, during a real method invocation on objects, two things would happen. Getting the right function address from V-Table and calling the function pointed by the same. Even though this extra indirection doesn't cause much performance degradation, in several scenarios this could be easily avoided. Please find the snippet below, after adding virtual specifiers.

class Base
{
public:
virtual void PrintMe()
{
std::cout << "Print: Base" << std::endl;
}
};

class Derived : public Base
{
public:
virtual void PrintMe()
{
std::cout << "Print: Derived" << std::endl;
}
};

void TestPolymorphism()
{
Base *ptrBase = new Derived();
ptrBase->PrintMe(); // This would print "Print: Base"
}

As we discussed early, even though dynamic polymorphism doesn't cause much performance overhead, it would make a function not "in-line"able. Sometimes, in-lining a simple function would improve performance especially if it is being called several times in code. Since normal function call breaks code execution flow, CPU level, caching like optimizations, are not possible. With static polymorphism, we could achieve the in-lining capability. Now let us get into the topic of Static polymorphism.

Let us try to make Base class function a bit intelligent.

In Base class, if we could cast the "this" pointer's type to the right Derived class, we could be able to solve the issue of Dynamic binding; Means avoiding Virtual specifier.

The below code snippet won't compile, but it gives the idea of what to do.

class Base
{
public:
void PrintMe()
{
static_cast<Derived*>(this)->PrintMe(); // This would call "Print: Derived"
}
};

We had successfully removed Virtual Keyword; So, V-Table wouldn't be created and function can be in-lined. There are some problems; First off, it wouldn't compile, as compiler doesn't know the Derived class yet. Secondly, this is specialized for only one derived class. This specialization can be removed with templates. Yes. that is the whole idea.

template<Derived>
class Base
{
public:
void PrintMe()
{
static_cast<Derived*>(this)->PrintMe(); // This would call "Print: Derived"
}
};

Now, you could derive classes from the above class like this,

class Derived : public Base<Derived>{ };

Deriving from template classes, specialized with the same Derived class type, is called as CRTP, curiously recurring template pattern. Even though the above example is very minimalistic, there are several uses for this pattern. One of the great examples, as given by Wikipedia is Counter base class. In order to get the statistics of objects of a particular type, we can implement a Counter class as below.

template<T>
class Counter
{
public:
static int GetTotalObjectsCreated()
{
return nObjectsCreated;
}
protected:
Counter() { nObjectsCreated++;}
~Counter() { nObjectsCreated--; }
private:
static int nObjectsCreated;
};

template<T>
int Counter<T>::nObjectsCreated = 0;

Let us assume we need to get the number of objects created for a particular type, Test. Derive Test from Counter, specialized with Test itself.

Class Test : private Counter<Test>{ };
// Private derivation is to show that it is not 'is a' relationship.

Now at any point of time, in program execution, we could get the number of objects alive using the call like this.

int nObjects = Counter<Test>::GetTotalObjectsCreated();

CRTP is cool and positive side effect of Code replication based generics mechanism, unlike Java and .NET.



7 comments:

  1. thanks for sharing this. Good learning though. But correct my understanding , so, the main purpose & advantage of CRTP, is that, Object Counter example, say for classes X ,Y etc. If not CRTP, the counter logic would have to be replicated in all of the classes separately , right ?
    Pls.explain more on this:

    CRTP is cool and positive side effect of Code replication based generics mechanism, unlike Java and .NET.

    ReplyDelete
  2. Yes Sudhakar. Counter implementation should not be bound with code at all. Here we do derive Counter template, privately. And for the second query, There are two types of Generics implementation; Code sharing and Code replication. In code sharing, code wouldn't be replicated, one for each type as in C++ templates. In Java, Code sharing would happen with Generics. This is possible in Java as all classes are derived from Object, base class. Anyway I will completely the explain the relationship between Code replication based generics and CRTP in a separate post.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Nice article! This is the best explanation I've seen of this so far. One question: When you say

    Now, you could derive classes from the above class like this,

    class Derived : public Base{ };

    How would you derive another class, such as "Derived2"?

    ReplyDelete
    Replies
    1. Anthony, Base class is templatized. So, when you derive from that you should give a type name and in this case, the name of class which derives. So, Derived2 could be declared as follows.

      class Derived2 : public Base.

      Does it make sense? Or Have I misunderstood your question?

      Delete
  5. class Base
    {
    public:
    virtual void PrintMe()
    {
    std::cout << "Print: Base" << std::endl;
    }
    };

    class Derived : public Base
    {
    public:
    virtual void PrintMe()
    {
    std::cout << "Print: Derived" << std::endl;
    }
    };

    void TestPolymorphism()
    {
    Base *ptrBase = new Derived();
    ptrBase->PrintMe(); // This would print "Print: Base"
    }

    will it not print "Print:Derived" ???

    ReplyDelete