I've been playing around with generics in C# today. Simply put,
generics are akin to templates in C++, only much better. My aim in this
post is not to explain the ins and outs of generics in the new 2.0
version of the .NET Framework, but to highlight some key points of
interest.
For a primer on generics, read part 1 and part 2 of Jason Clark's
article in MSDN Magazine.
Q: So, what's the point of generics?
A: They provide a mechanism for writing classes and methods without committing to concrete types; the types are supplied as parameters when the code is used.
The typical example is the System.ArrayList problem. Let's say that
you want to store integers in an ArrayList because it offers dynamic
sizing, sorting and the ability to remove elements at arbitrary
positions in the collection. The problem is that ArrayList stores its
data as System.Object, a reference type. Adding an integer to an
ArrayList involves boxing the integer value type before insertion, and
unboxing it again on the way out, which incurs a noticeable
performance overhead. Storing everything as System.Object also
sidesteps type checking, so the compiler will not help you out if
instances of the wrong type are added to the collection, which means
developers have to add their own type checking code.
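To make the problem concrete, here is a minimal sketch of the ArrayList situation described above. The class and method names (ArrayListDemo, Sum) are made up for illustration; only ArrayList itself comes from the framework.

```csharp
using System;
using System.Collections;

public static class ArrayListDemo
{
    // Sums an ArrayList that is *supposed* to hold only ints.
    // Every element comes back as object, so each read needs a cast (unboxing).
    public static int Sum(ArrayList values)
    {
        int total = 0;
        foreach (object item in values)
        {
            total += (int)item; // throws InvalidCastException if item is not a boxed int
        }
        return total;
    }

    public static void Main()
    {
        ArrayList numbers = new ArrayList();
        numbers.Add(1);                  // the int is boxed into an object
        numbers.Add(2);
        Console.WriteLine(Sum(numbers)); // 3

        numbers.Add("three");            // compiles fine: a string is an object too
        try
        {
            Sum(numbers);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Wrong type found, but only at run time.");
        }
    }
}
```

The mistake on the "three" line is invisible to the compiler and only surfaces when the element is read back.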
Generics provide a neat way to define type-agnostic classes. Collection
classes are the obvious use, but generics are not restricted to them.
The following example defines two instances of the
System.Collections.Generic.List<T> class, one
that contains strings and one that contains integers.
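A minimal sketch of the two lists might look like this. The helper names (ListDemo, MakeNames, MakeNumbers) and the sample values are my own placeholders.

```csharp
using System;
using System.Collections.Generic;

public static class ListDemo
{
    // A list that can only ever hold strings.
    public static List<string> MakeNames()
    {
        List<string> names = new List<string>();
        names.Add("Alice");
        names.Add("Bob");
        // names.Add(42);  // would not compile: an int is not a string
        return names;
    }

    // A list that can only ever hold ints; the values are stored unboxed.
    public static List<int> MakeNumbers()
    {
        List<int> numbers = new List<int>();
        numbers.Add(3);
        numbers.Add(1);
        numbers.Add(2);
        numbers.Sort(); // now 1, 2, 3
        return numbers;
    }

    public static void Main()
    {
        List<string> names = MakeNames();
        List<int> numbers = MakeNumbers();

        string first = names[0];   // no cast needed
        int smallest = numbers[0]; // no unboxing needed
        Console.WriteLine(first);    // Alice
        Console.WriteLine(smallest); // 1
    }
}
```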
Q: How do generics work in the CLR?
A:
In C++, templates are processed by the compiler for each use of the
template. Each template is expanded into a concrete type at compile time
when a caller requires it. So, using a single template multiple
times with different types creates a new concrete version of the
template for each type. Misuse of templates in C++ can very quickly cause code bloat.
Code in a C++ template that depends on the template parameter is not
type checked until the template is used. So, the compiler will let you
write reams of code, and won't fully check any of it until it is
specifically used in a template instantiation.
The C++ compiler does not validate template code against the types it
will eventually be used with. The example below is a template function
to compare two instances of type 'T'. This function works well when 'T'
is an integer, but what if 'T' is a class type
that does not define the less than operator? The developer will see
an obscure compiler error from inside the template, not an error
indicating the real problem.
Generics deal with both code bloat and type validation. Code
bloat is limited because the JIT compiler shares one copy of the
native code between instantiations over reference types, and generates
specialized code only for each value type used. Expansion of generic
types is performed at run time; all that exists of a constructed type
at compile time is a type reference.
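One way to see that constructed types are created at run time is to build one through reflection. This is only an illustrative sketch: the names RuntimeConstructionDemo and MakeListType are hypothetical, while typeof(List<>), Type.MakeGenericType and Activator.CreateInstance are standard framework calls.

```csharp
using System;
using System.Collections.Generic;

public static class RuntimeConstructionDemo
{
    // Builds the constructed type List<elementType> from the open
    // generic type List<> while the program is running.
    public static Type MakeListType(Type elementType)
    {
        Type open = typeof(List<>);
        return open.MakeGenericType(elementType);
    }

    public static void Main()
    {
        Type listOfInt = MakeListType(typeof(int));

        // The type built at run time is the same type the compiler refers to.
        Console.WriteLine(listOfInt == typeof(List<int>));    // True
        Console.WriteLine(listOfInt == typeof(List<string>)); // False

        object instance = Activator.CreateInstance(listOfInt);
        Console.WriteLine(instance is List<int>);             // True
    }
}
```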
Generic code is type checked at compile time, and constraints tell the
compiler what it may assume. In their bare form (without constraints),
generics don't enable the developer to do much. The
example below is the C# version of the C++ template above. Compiling
this code will cause an error. The compiler doesn't know anything
about type 'T' and cannot deduce whether a less than comparison is
legal, so it doesn't permit it. The same goes for the CompareTo method.
In fact, pretty much the only operations and method calls that can be
performed on an unconstrained type parameter are those associated with
System.Object - GetType, ToString etc.
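Here is a sketch of what that looks like. The failing comparison is left in as a comment so the rest of the sample still compiles, and Describe is a hypothetical helper showing the kind of thing an unconstrained 'T' does allow.

```csharp
using System;

public static class BareGenericDemo
{
    // This is the version the compiler rejects:
    //
    // public static bool IsLessThan<T>(T a, T b)
    // {
    //     return a < b;                 // error: '<' is not defined for 'T'
    //     // return a.CompareTo(b) < 0; // error: 'T' has no CompareTo method
    // }

    // With no constraints, only the members of System.Object are available.
    public static string Describe<T>(T value)
    {
        return value.GetType().Name + ": " + value.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Describe<int>(42));         // Int32: 42
        Console.WriteLine(Describe<string>("hello")); // String: hello
    }
}
```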
Q: So what are these constraints?
A:
The following example is the correct way to achieve the comparison we
were looking for above. The where keyword is followed by the
constraints. In this case I have indicated that 'T' must implement the
IComparable interface, so the compiler knows that values passed to the
generic method can always be compared.
Constraints tell the compiler what is guaranteed about a type
parameter, and so determine the operations and methods that generic
code can call on it.
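A minimal sketch of the constrained comparison follows. I have used the generic IComparable<T> here rather than plain IComparable, which is my own choice: it avoids boxing the argument when 'T' is a value type, and the non-generic interface would work in the same way.

```csharp
using System;

public static class ConstrainedCompareDemo
{
    // The constraint guarantees that every T has a CompareTo(T) method.
    public static bool IsLessThan<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) < 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsLessThan<int>(1, 2));               // True
        Console.WriteLine(IsLessThan<string>("pear", "apple")); // False

        // IsLessThan<object>(new object(), new object());
        // would not compile: object does not implement IComparable<object>
    }
}
```

The error for an unsuitable type now appears at the call site and names the missing interface, rather than pointing somewhere inside the generic method.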
Another constraint is new(). This constraint tells the compiler
that the code in the generic method can create new object instances of
the parameter type. This is especially useful when using helper
classes to construct derived class types. I ran into this
constraint today when I was developing a utility class. I was
surprised to find out that the new() constraint only enables
construction of types that have a public parameterless
constructor. At the time I didn't know why this was so, but I
later found out it is because the compiler cannot check a call to a
parameterized constructor without breaking validation
checking. For example, if 'T' is a class whose constructor requires a
parameter that is itself a complex type, the compiler
has no way of knowing if the generic will work for all the different
types bound to 'T'. The generic may work great if 'T' is an employee
with a parameterized constructor taking a name as a string, but would
not work if 'T' is then an integer. My example below highlights this
explanation:
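Something along these lines shows the restriction. Employee and Factory are hypothetical types invented for this sketch, not the utility class I was actually working on.

```csharp
using System;

public class Employee
{
    public string Name = "unnamed";

    public Employee() { }                         // required by new()
    public Employee(string name) { Name = name; } // not reachable through T
}

public static class Factory
{
    // new() promises only one thing: T has a public parameterless constructor.
    public static T Create<T>() where T : new()
    {
        // return new T("Alice"); // compile error: cannot pass arguments when creating a 'T'
        return new T();
    }

    public static void Main()
    {
        Employee e = Create<Employee>();
        int n = Create<int>(); // value types always satisfy new()

        Console.WriteLine(e.Name); // unnamed
        Console.WriteLine(n);      // 0
    }
}
```

If new T("Alice") were allowed, Create<int>() would have nothing sensible to call, which is exactly the case the compiler has to rule out.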
Q: What is inference?
A:
The C# compiler is smart enough to determine the type parameter of a
generic method when a variable of that type is passed to the method as
an argument.
The example code below defines a generic method. When calling this
method, type 'T' can be inferred from the arguments passed to the method.
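As a small sketch, here is a generic Swap method (my own choice of example) called once with the type spelled out and once with it inferred.

```csharp
using System;

public static class InferenceDemo
{
    // Exchanges the values of two variables of the same type.
    public static void Swap<T>(ref T a, ref T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }

    public static void Main()
    {
        int x = 1, y = 2;
        Swap<int>(ref x, ref y);    // type argument given explicitly
        Console.WriteLine(x + " " + y);          // 2 1

        string left = "left", right = "right";
        Swap(ref left, ref right);  // T inferred as string from the arguments
        Console.WriteLine(left + " " + right);   // right left

        // Swap(ref x, ref left);   // would not compile: no single T fits both
    }
}
```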