What's Wrong with C++ Templates?

[I wrote this article for Kuro5hin.org, where it appeared in May 2003. There was a big discussion about it there, and there was another at OSNews.com where a link to it was posted. This version is identical to the version posted at Kuro5hin.org except that I've reformatted all the code samples to be indented properly.]

If you've read The Hitchhiker's Guide to the Galaxy and its sequels, you probably remember the Vogons, the incredibly ugly, disgusting, and bad-tempered aliens charged with destroying Earth to clear the path for an intergalactic highway. The Vogons' brains, it turns out, were "originally a badly deformed, misplaced and dyspeptic liver" -- and that explains their demeanor. In this article, I'll explain why I think C++ has a badly deformed, misplaced and dyspeptic liver of its own: its template system.

Before I make my case, I want to make sure my position on templates is clear. For the record: if you're going to program in C++, templates are unquestionably useful, and I hope you won't mistake me for one of those people who say that templates aren't necessary and we should all be using inheritance instead or some gobbledygook like that -- I'm not. If you want to program in C++ there are lots of times when templates are absolutely the best option the language gives you for writing generic, reusable code.

On the other hand, just because templates are the best C++ has to offer doesn't mean they're good. It turns out that all the problems templates solve were already solved better, before templates were ever introduced, by other languages. In fact, the kludgey way templates work cuts C++ programmers off from a lot of the good stuff that those nicer solutions provide.

What are templates, and why should I care?

Templates are a widely-used C++ feature with simple behavior and complicated consequences. They let you mark a fragment of code that might otherwise look like a function, method, or class as being, in fact, a template for such functions, methods, or classes: a pattern with holes in it that can later be filled in with particular type names or compile-time constants such as numbers. For example, the function

int myCleverFunction() {
  return 4;
}

is just a regular function, but

template <int N>
int myCleverFunction() {
  return N;
}

isn't a function at all, but a pattern for many different functions that the user can make just by supplying a concrete value for N.
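For instance, supplying concrete values for N stamps out concrete functions you can call like any other. Here's a minimal sketch (the main function and the printf call are just for illustration):

#include <cstdio>

template <int N>
int myCleverFunction() {
  return N;
}

int main() {
  // Two instantiations, two distinct ordinary functions.
  std::printf("%d %d\n", myCleverFunction<4>(), myCleverFunction<42>());
  return 0;
}

As far as the compiler is concerned, myCleverFunction<4> and myCleverFunction<42> are two unrelated functions that happen to share a pattern.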

Sound useless? It's not. There are a couple of very useful things people do with templates: one is writing code that is abstracted over types, and another is a clever trick called template metaprogramming in which programmers use templates to actually make decisions or perform calculations during compilation, sometimes with amazing effects.

In the rest of this article, we'll look at the various ways people use C++ templates and we'll see what features other languages provide to let programmers achieve the same effects. We'll look at basic and advanced topics, starting with the original and simplest use of templates: generics, also known as parametric polymorphism or just "writing the same code to work with multiple types of data."

Generics: write once, run anywhere

Most programmers who use templates use them to write data structures like lists and other containers. Templates are a natural match for lists (and container classes in general) because they not only let programmers write one List implementation rather than a separate list class for every type of value a list might need to store, but also let them write down statically-checkable rules like "this particular list must contain ints only."

For instance, in C++ you could write a simple linked list as follows:

template <class T>
class ListNode {
public:
  ListNode(T it, ListNode* next) {
    this->it = it;
    this->next = next;
  }
  T getItem() { return it; }
  ListNode* nextNode() { return next; }
private:
  T it;
  ListNode* next;
};

When the compiler sees this code, it remembers the definition but emits no assembly instructions. Later, when it sees a use of the template instantiated with a particular type (say, int) that it hasn't seen before, it generates a fresh code fragment by replacing T with int everywhere in the body of the class definition and changing the class name to be unique, and then rewrites the usage to refer to the newly-generated code. So the code would allow you to write type-safe lists of any type:

// fine
ListNode<int>* il = new ListNode<int>(2, new ListNode<int>(4, NULL));
// also fine
ListNode<string>* sl = new ListNode<string>("hi", new ListNode<string>("bye", NULL));
// type error
ListNode<int>* il2 = new ListNode<int>(3, new ListNode<string>("hi", NULL));
// fine
int i = il->getItem();
// also fine
string s = sl->getItem();
// type error
string s2 = il->getItem();
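Conceptually, the instantiation ListNode<int> behaves as if the compiler had written out a class like the following by hand (the name ListNode_int is made up for illustration; real compilers keep the name ListNode<int> and mangle it internally):

// Roughly what the copy-and-paste instantiation of ListNode<int> looks like;
// every T in the template body has been replaced with int.
class ListNode_int {
public:
  ListNode_int(int it, ListNode_int* next) {
    this->it = it;
    this->next = next;
  }
  int getItem() { return it; }
  ListNode_int* nextNode() { return next; }
private:
  int it;
  ListNode_int* next;
};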

This is a very handy trick, and one that you can't get any other way in C++: neither void pointers nor a single-rooted class hierarchy provides type safety.
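For contrast, here is a hedged sketch of the void-pointer alternative (the class name is invented); it compiles, but nothing stops you from reading an item back out as the wrong type:

#include <cstddef>
#include <string>

// A non-template list that stores untyped pointers.
class VoidListNode {
public:
  VoidListNode(void* it, VoidListNode* next) : it(it), next(next) {}
  void* getItem() { return it; }
  VoidListNode* nextNode() { return next; }
private:
  void* it;
  VoidListNode* next;
};

int main() {
  std::string hi = "hi";
  VoidListNode* l = new VoidListNode(&hi, NULL);
  // The compiler cheerfully lets us treat the stored string as an int:
  // no error at compile time, garbage (or worse) at run time.
  int* wrong = (int*)l->getItem();
  (void)wrong;
  return 0;
}

A single-rooted hierarchy has the same weakness: the downcast from the common base class back to the element type is never checked at compile time.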

So handy, in fact, that it's hard to believe that nobody had thought of the idea before C++ templates were introduced in the mid-80's. You might know that C++ got the idea from Ada, but what you may not know is that the idea predates both -- in fact, the earliest versions of ML in the mid-seventies used a type-inference scheme that explicitly allowed functions to have polymorphic types. The notion had been around in research literature earlier than that, but ML was the first real programming language to have the feature.

ML's approach to the problem of writing a function that works on arbitrary types is very different from C++'s. Rather than a pre-type-checking phase that stamps out a new copy of the code for every different type of value the code gets used with, polymorphism in ML is a feature of the type-checker itself, which makes clever deductions about how functions behave. It tries to infer types for functions based on their source code: for instance, if the SML/NJ compiler sees the function definition

fun f(a,b) = a + 2*b

it is smart enough to realize that a and b must be numbers and the result type must also be a number -- even though the programmer didn't have to type that in, the type-checker realizes it anyway. On the other hand, if it sees

fun g(x) = x

it will conclude that x can be anything and the return type will be whatever was input. This is a perfectly sensible type, called a polymorphic type, and the type-checker can reason about it just fine. For example, if somewhere else in the same program it sees the code fragment

fun h(a,b) = g(f(a,b))

it will know that h takes two numbers and returns a number.
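For comparison, a C++ function template can express the same identity function, and the compiler will even deduce T at each call site; the difference is that the body is re-checked for every instantiation instead of being assigned one polymorphic type up front. A rough sketch, with the function named g to mirror the ML version:

// The C++ counterpart of ML's g: T is deduced separately at every call.
template <class T>
T g(T x) {
  return x;
}

int main() {
  int i = g(3);       // instantiates g<int>
  double d = g(2.5);  // instantiates g<double>
  return (i == 3 && d == 2.5) ? 0 : 1;
}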

ML's type-checker gives ML programmers every bit as much power to write type-independent programs as C++ templates give C++ programmers: for example, we could write the SML/NJ version of the linked-list template above like so:

datatype 'a List = ListNode of 'a * 'a List | Empty
exception EmptyList

fun getItem (ListNode (i,_)) = i
  | getItem (Empty) = raise EmptyList

fun nextNode (ListNode(_,rest)) = rest
  | nextNode (Empty) = raise EmptyList

and the same lists will typecheck:

- val il = ListNode(2, ListNode(4, Empty));
val il = ListNode (2,ListNode (4,Empty)) : int List

- val sl = ListNode("hi", ListNode("bye", Empty));
val sl = ListNode ("hi",ListNode ("bye",Empty)) : string List

- val il2 = ListNode(3, ListNode("hi",Empty));
stdIn:3.1-3.36 Error: operator and operand don't agree [literal]
  operator domain: int * int List
  operand:         int * string List
  in expression:
    ListNode (3,ListNode ("hi",Empty))

- val i = getItem(il);
val i = 2 : int

- val s = getItem(sl);
val s = "hi" : string

- val s2 : string = getItem(il);
stdIn:5.1-5.30 Error: pattern and expression in val dec don't agree [tycon mismatch]
  pattern:    string
  expression: int
  in declaration:
    s2 : string = getItem il

Aside from the syntactic differences and the fact that ML deduces types on its own, the C++ version and the SML/NJ version appear pretty similar. But the ML way offers a few tangible benefits: first, the ML compiler can check to ensure that a polymorphically-typed function has no type errors even if you never call it (C++ can't check your template for type errors until you instantiate it, and must check each instantiation separately), which is a big advantage for incremental development and for library authors. Furthermore, if ML says your polymorphic function is safe, then it's guaranteed to be safe no matter what types anybody uses with it: in C++, just because your template compiles with types A, B, and C doesn't say anything about whether it will compile if you instantiate it with type D later. This strategy also allows an ML compiler's code generator to make tradeoffs between the size of the code it generates and that code's efficiency -- C++ compilers get no such choice.
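Here's a hedged sketch of the C++ side of that trade-off (the function name is invented): this compiles cleanly on its own even though the body assumes an operator that many types lack, and the error only shows up when someone finally instantiates it with such a type:

#include <string>

// Nobody has instantiated this yet, so the compiler never asks
// whether T actually supports operator%.
template <class T>
T roundDownToEven(T x) {
  return x - (x % 2);
}

int main() {
  int a = roundDownToEven(7);   // fine: int supports % and -
  // std::string s = roundDownToEven(std::string("7"));
  // Uncommenting the line above produces errors inside the template body,
  // and only now -- possibly long after the template shipped.
  return a == 6 ? 0 : 1;
}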

Interfaces: the virtue of (implementation) ignorance

As cool as ML's polymorphic functions are, you may have already realized that they have a pretty major drawback compared to templates: you can't rely on universally-quantified types to have any properties at all. That means that while you could write a function that took a list of any type and computed its length (because the length of a list doesn't depend on anything about the type of elements it holds), you couldn't write a function that sorted that list in any meaningful way without doing some extra work (because the proper sorting of a list depends on properties of the elements it holds).

You never have to worry about this problem when you write C++ templates. You just use any functions, methods, or operators you want, and when C++ fills in the template for you and recompiles the template body you'll automatically get the right thing (provided it exists, of course). For instance, you could add the following method to the C++ list example above with no problem:

template <class T>
class ListNode {
  // ... as before ...
  ostream & print(ostream &o) { return o << it; }
  // ...
};

What happens when you apply this thing to a type? Well, if the type you used has an appropriate << operator defined for it, exactly what you'd expect happens, and print works fine. On the other hand, if it doesn't, you'll get an explosion of template error messages that don't really point at the source of the problem. The worst thing about this situation is that it can cause insidious lurking bugs: the code works fine for a year, then one day the new guy uses it in a way that's not quite expected and all of a sudden everything breaks for no obvious reason.
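Here's a hedged sketch of how that bites (Widget is an invented type with no operator<<): everything compiles and runs until the commented-out line is enabled, at which point the errors point deep inside the template rather than at the real culprit:

#include <cstddef>
#include <iostream>
using namespace std;

template <class T>
class ListNode {
public:
  ListNode(T it, ListNode* next) : it(it), next(next) {}
  T getItem() { return it; }
  ostream & print(ostream &o) { return o << it; }
private:
  T it;
  ListNode* next;
};

struct Widget { int id; };          // no operator<< defined anywhere

int main() {
  ListNode<int> ok(4, NULL);
  ok.print(cout) << endl;           // fine: int has an operator<<
  Widget w = { 7 };
  ListNode<Widget> oops(w, NULL);   // still fine: print is never instantiated
  // oops.print(cout);              // uncommenting this unleashes a pile of
  // errors about operator<< deep inside the template, not at this line.
  return 0;
}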

This is the sort of bug type systems were invented to catch, so it's not surprising that there are type systems that will catch it. The one you've most likely heard of is Java's interface system, which allows a programmer to declare a bundle of method signatures apart from any class and then use that bundle as a type, meaning that a method can accept objects of any class that announces it has a method for each signature in the bundle.
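C++ has no separate interface construct, but the closest analogue (and a reasonable sketch of how the Java mechanism works) is an abstract class used purely as a bundle of signatures; all names here are invented:

#include <iostream>
using namespace std;

// The "bundle of method signatures", used only as a type.
class Printable {
public:
  virtual void print(ostream &o) = 0;
  virtual ~Printable() {}
};

// Accepts an object of any class that declares itself Printable.
void printTwice(Printable &p) {
  p.print(cout);
  p.print(cout);
}

// Score must announce, in its own definition, that it implements Printable.
class Score : public Printable {
public:
  Score(int v) : v(v) {}
  void print(ostream &o) { o << v << "\n"; }
private:
  int v;
};

int main() {
  Score s(42);
  printTwice(s);
  return 0;
}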

This system works well, but unfortunately it requires every class you want to use to declare ahead of time that it implements a particular bundle of functionality. ML's functor system (no relation to the things C++ calls functors, which in ML terms would simply be first-class functions) deals with this problem nicely using the concept of a parameterized module system.

What's that? It's like a normal C++ library, but with some details (like types of things, for instance) sucked out so they have to be specified later. That may not make much sense, but hopefully an example will clarify: to add a print feature to the ML version of the list introduced above, we could rewrite it as a functor in the following way:

signature PRINTABLE = sig
  type t
  val printT : t -> unit
end

functor List(T : PRINTABLE) =
struct
  datatype List = ListNode of T.t * List | Empty
  exception EmptyList
  (* ... other functions as before ... *)

  fun print (ListNode (i,_)) = T.printT i
end

Now we can make new lists more-or-less on the fly, even holding types that don't have a print function explicitly declared for them, like so:

- structure L = List(struct 
    type t = int
    val printT = print o (Int.toString)
  end);
structure L :
  sig
    val getItem : List -> T.t
    val nextNode : List -> List
    val print : List -> unit
    exception EmptyList
    datatype List = Empty | ListNode of T.t * List
  end

- L.print (L.ListNode (5, L.ListNode(6, L.Empty)));
5
val it = () : unit

This system gives you more abstraction power than C++ templates or Java interfaces while preserving type safety.
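The nearest C++ approximation I know of is to pass the "how to print" knowledge in as an extra template parameter, a policy class. Here's a rough sketch (all names invented) that works, but only if somebody writes the policy and names it explicitly at every instantiation:

#include <cstddef>
#include <iostream>
using namespace std;

// Plays the role of the PRINTABLE structure in the ML version.
struct IntPrinter {
  static void printT(int i) { cout << i; }
};

template <class T, class Printer>
class PrintableListNode {
public:
  PrintableListNode(T it, PrintableListNode* next) : it(it), next(next) {}
  T getItem() { return it; }
  void print() { Printer::printT(it); }
private:
  T it;
  PrintableListNode* next;
};

int main() {
  PrintableListNode<int, IntPrinter>
    node(5, new PrintableListNode<int, IntPrinter>(6, NULL));
  node.print();   // prints 5
  cout << "\n";
  return 0;
}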

Metaprogramming: the art of good timing

Another purpose for which particularly devious programmers can use C++ templates is "template metaprogramming," which means writing pieces of code that run while the main program gets compiled rather than when it runs. Here's an example of a program that computes the factorials of 4, 5, 6, and 7 (which are 24, 120, 720, and 5040) at compile-time:

#include <stdio.h>

template <int n>
class Fact {
public:
  static const int val = Fact<n-1>::val * n;
};

template <>
class Fact<0> { public: static const int val = 1; };

int main() {
  printf("fact 4 = %d\n", Fact<4>::val);
  printf("fact 5 = %d\n", Fact<5>::val);
  printf("fact 6 = %d\n", Fact<6>::val);
  printf("fact 7 = %d\n", Fact<7>::val);

  return 0;
}

If you look at the assembly code g++ or any other reasonable compiler produces for this code, you'll see that the compiler has inserted 24, 120, 720, and 5040 as immediate values in the arguments to printf, so there's absolutely no runtime cost to the computation. (I really encourage you to do this if you never have before: save the code as template.cc and compile with g++ -S template.cc. Now template.s is assembly code you can look over.) As the example suggests, it turns out that you can get the compiler to solve any problem a Turing machine can solve by means of template metaprogramming.

This technique might sound like some strange abuse of C++ that's primarily useful for code obfuscation, but it turns out to have some practical applications. For one thing, you can improve the speed of your programs by doing extra work in the compile phases, as the example shows. In addition to that, it turns out that you can actually use the same technique to provide convenient syntax for complicated operations while allowing them to achieve high performance (matrix-manipulation libraries, for instance, can be written using templates). If you're clever, you can even get effects like changing the order in which C++ evaluates expressions for particular chunks of code to produce closures or lazy evaluation.

Again, it turns out that this ability existed, in the form of Lisp macros, long before templates were a glimmer in Bjarne Stroustrup's eye. You may recoil at the word "macro," but don't worry: Lisp macros are much more pleasant to work with than their higher-profile cousins. At about the same time Kernighan and Ritchie were inventing C and C preprocessor macros, a group at MIT's Project MAC was developing a system called MacLisp that introduced Lisp macros, a totally different implementation of the macro concept that survives to this day in Common Lisp and Scheme as well as in a number of offshoots and related languages.

As they exist today, Lisp macros and C macros do similar things: they allow the programmer to substitute one fragment of code with another before the program gets run. The big difference between the two is that while C macros work by scanning for and replacing literal text phrases within source code, Lisp macros replace portions of a parse-tree instead. That might not sound revolutionary, but it turns out to be the difference between a system that gurus recommend you never use and one that goes a long way towards defining a language.
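The canonical illustration of the difference (using the C preprocessor, which C++ inherits): because the macro below pastes raw text, operator precedence quietly changes its meaning, a failure mode a parse-tree macro simply can't have.

#include <stdio.h>

#define SQUARE(x) x * x   /* textual paste: no parentheses, no parse tree */

int main(void) {
  /* Expands to 2 + 3 * 2 + 3, which prints 11, not the expected 25. */
  printf("%d\n", SQUARE(2 + 3));
  return 0;
}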

Lisp macros offer a kind of compile-time computation that goes one step above C++ template metaprogramming by allowing you to actually write your compile-time programs in Lisp itself. The same code you'd use to write a regular program, put in the proper place, runs at compile-time instead, and its result gets inserted into the source code of your program. For instance, here's how you could write a normal factorial function in Scheme (this is PLT Scheme version 203):

(define (fact n)
  (cond
    [(= n 0) 1]
    [else (* n (fact (- n 1)))]))

If you wanted to make a version of the same function that was guaranteed to run at compile-time, you could just write:

(define-syntax (ctfact stx)
  (define (fact n)
    (cond
      [(= n 0) 1]
      [else (* n (fact (- n 1)))]))

  (syntax-case stx ()
    [(_ n) (datum->syntax-object stx (fact (syntax-object->datum #'n)))]))

Aside from some mumbo-jumbo telling Scheme that this is a macro, how to read its argument, and how to package up its result, it's exactly the same as the original Scheme function. That's true in general of Lisp macros: they're just regular functions that you tell the language to run at compile-time rather than runtime. While that may not sound all that important, it actually makes a huge practical difference: it allows your macros to use parts of your code that also run at runtime, to load libraries and make library calls at compile time -- in PLT Scheme, you could even write a macro that popped up a GUI dialog box asking the user how to compile a particular expression! -- and more, all with no extra effort. C++ templates can't use normal run-time C++ code in the process of expanding, and they suffer for it: for instance, the C++ factorial program is limited in that it produces 32-bit integers rather than arbitrary-length bignums. If we had that problem in a Lisp system, it would be no problem: just load a bignum package, rewrite your macro to use it, and everything works out (and still all happens at compile-time). In C++, though, the bignum library is no use to us at all, and we'd have to implement a separate "compile-time bignum" library to make the fix.
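To make that last point concrete, here's a sketch of where the template version above hits the wall: 13! no longer fits in a 32-bit int, and depending on the compiler you get either a silent wrap-around or an "overflow in constant expression" complaint; either way, no runtime bignum library can help at that stage.

#include <stdio.h>

template <int n>
class Fact {
public:
  static const int val = Fact<n-1>::val * n;
};

template <>
class Fact<0> { public: static const int val = 1; };

int main() {
  printf("fact 12 = %d\n", Fact<12>::val);   // 479001600 still fits in an int
  // printf("fact 13 = %d\n", Fact<13>::val);
  // 13! = 6227020800 overflows a 32-bit int; most compilers warn about or
  // reject this line, and there is no "load a bignum library" escape hatch.
  return 0;
}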

Of course, compile-time metaprogramming is good for much more than computing little mathematical functions ahead of time, and Lisp macros have quite a few other uses too. In fact, they were made specifically for extending Lisp's syntax with new constructs. For instance, PLT Scheme has no equivalent of C++'s while loop, but you can add one in just a few lines of code:

(define-syntax (while stx)
  (syntax-case stx ()
    [(_ test body)
     #'(letrec ((loop (lambda () (if test (begin body (loop))))))
         (loop))]
    [else (raise-syntax-error 'while "Illegal syntax in while loop" stx)]))

Notice in that code fragment how obvious it is what's happening, even if you don't know Scheme: whenever Scheme sees (while <test> <body>) for any code fragments test and body, it should replace that bit with

(letrec ((loop (lambda () (if test (begin body (loop))))))
  (loop))

which is Scheme code that performs the loop properly. Otherwise, the user has used bad syntax, so the macro signals a syntax error.

Even with a simple example like while, you can begin to see how these macros are more powerful than C++ templates: since they perform the same copy-and-paste job that templates perform, they can clearly fill the same role, but they also have a lot more built-in support for making your metaprograms play well with your normal program. User-defined syntax errors, for example, would have been an easy way for the STL authors to produce helpful, meaningful error messages rather than the notoriously unhelpful ones the STL produces now.

In fact, whole syntax systems can easily be built out of this mechanism, particularly when you remember that you can transform your syntax trees not just with pattern matching but with any arbitrary code you want. A good example of a big system built using macros is PLT Scheme's object-oriented programming system: it's a normal, unprivileged library that adds seamless object-oriented programming to PLT Scheme, which has no built-in object-oriented features, and you get syntax forms, natural error messages, and everything else an in-language system would provide. In the Lisp world this is standard practice, and many large-scale Lisp and Scheme projects use macros -- a quick check of the standard libraries included with the PLT Scheme distribution shows 292 uses of define-syntax in about 200,000 lines of Scheme. What's more amazing is that this count doesn't include the many macros PLT Scheme uses to define the core of Scheme itself: cond, define, let, and so on are all macros in PLT Scheme. It might surprise you to learn that the only syntax forms in any of the PLT Scheme examples in this article that are not macros expanding into some simpler form are the begin and if forms I used to implement the while loop above.

So what?

So some other languages invented some features before C++, and they implemented them in arguably better ways. So what?

For one thing, templates hurt the C++ compiler's ability to generate efficient code. It might surprise you to hear that, considering that C++ is "efficient" while functional languages like ML and Scheme are "inefficient," but it's true. C++ templates hide your intentions from the compiler: did you mean type abstraction? Did you mean a syntax extension? Did you mean constant-folding or compile-time code generation? The compiler doesn't know, so it has to just blindly apply the copy-and-paste strategy and then see what happens. In ML or Scheme, though, aside from the individual benefits described above, the simple fact that you're telling the compiler what you want to achieve lets it optimize for you much more effectively.

Another reason to care is that if you understand the context in which templates exist, you'll be able to make more effective use of them and you'll be able to make more intelligent decisions about when to use them.

But from a broader perspective, realizing that templates are really just a C++ version of Lisp macros geared towards generating type declarations rather than extending C++'s syntax helps you understand the wider history of programming languages rather than just knowing the flavor of the month (which is rapidly becoming the flavor of last month!).

This article is copyright © 2003 Jacob B. Matthews.

