New features in C++11 for LHCb Physicists

Learning Objectives

"Learn a few of the most useful tools in C++11."

C++11 was the largest change ever made to C++; and due to the changed release schedule, probably will remain the largest single change. It is a well thought out, mostly backward-compatible change that can completely change the way you write code in C++. It is best thought of as almost a new language, a sort of (C++)++ language. There are too many changes to list here, and there are excellent resources available, so this is meant to just give you a taste of some of the most useful changes.

Many of the features work best together, or are related. There already are great resources for learning about C++11 (listed at the bottom of this lesson), and C++11 is already in use in current LHCb software. Therefore, the remainder of this lesson will cover a few of the common idioms in C++11 that a programmer experienced with the older C++ might not immediately think of.

Types

Typing in C++11 is much simpler. If the type is deductible, auto allows you to avoid writing it out. A few examples of uses of auto:

// Avoiding double writing on pointers
auto pointer = new SomeNamespace::MyLongType(arg1, arg2);

// Does anyone know the type of an iterator? It's ugly!
auto iterator = some_vector.begin();

// This is useful for prototyping, but probably should be explictly typed in real code to help coders
auto type = function_returns_something();

// Worst use of auto; don't do this
auto a = 1;

ROOT macros have a syntax that precedes auto; you can simply assign (x = Something();) and if x does not already exist, it does the equivalent of auto x = Something();. This has the drawback that it is not valid C++ code, but unfortunately is still useful in a RootBook, where you may want to rerun a cell.

The NULL keyword, which is equivalent to 0, causes typing issues. A new nullptr keyword was added, and is not equivalent to zero. This is significantly better type-safety, and should be used instead of NULL.

The definitions of const and mutable changed a little; see this video. In short: const actually means bit-wise constant or thread-safe; mutable means the object is already thread-safe (atomics, mutexs, some queues). This is probably what you thought they meant (but didn't) in C++98.

Containers and iterators

While iterators existed in previous versions, using them is now part of the language, with the iterating for statement (for each). It allows a unintuitive iteration loop to be written more cleanly and compactly. Compare the following two methods of setting the values of a vector to zero:

for(vector<double>::iterator iter = vector.begin(); iter != vector.end(); ++iter)
    *value = 0.0;

for(auto &value : vector)
    value = 0.0;

To make something iterable, you should define a begin and end method or function. There are several options, as well. Adding an & will give you a reference, to save memory copies and to allow mutation of the original iterable. Adding const & avoids the copy but ensures that you don't change the original iterable.

A related improvement is the addition of container constructors. Which means you can finally do this:

std::vector<int> values = {1,2,3,4,5,6};

C++ also has a variety of different initializers; C++11 added a uniform initializer syntax, so it has even more different initializers. The benefit is that it works in cases that the old syntax had issues. It solves the Most Vexing Parse, where a new object cannot be followed by parenthesis, since that is a function definition. It also does not explicitly narrow.

Due to the addition of initialiser lists, though, this can be confusing. What does the following statement produce, a list of 42 integers, or one integer with the value 42?

std::vector<int> vector{42};

AnswerClick to expand

Example

Functional programming

Functions are now easier to refer to and create. A std::function type is useful, but usually will be hidden with auto unless you are crafting a function to take functions as arguments. The lambda function allows an inline function definition, with some perks. The syntax is [](){}, which looks like a normal function definition with the function name and type replaced by the square brackets. For example:

auto square = [](double x){return x*x;};
double squared_five = square(5.0);

The lambda function gets interesting when you add something to the square brackets; this is called "capture" and allows you to capture the surrounding variables. For example:

int i = 0;
auto counter = [&i](){return i++;};
counter(); // returns 0
counter(); // returns 1

You can capture by value, by reference, etc. If you use [=] or [&], the lambda function will automatically capture (by value or reference, respectively) any variables mentioned inside the function. Note that if you do use capture, you will lose the ability to convert to a old-style C function pointer.

Example

Class improvements

Default values for members can be declared in the definition now (as I'm sure you've tried to do in older C++ at least once). You can also call a previous constructor in the initializer list (delegating constructors).

Compile time improvements

The slow removal of the ugly, error-prone macro programming has started in C++11, with constexpr. A function or class with this modifier (and lots of restrictions as to what it can contain) can be used by the compiler at compile time to produce a result in your compiled code.

The move operation was added, as well. This allows moving a value through a special std::move, and also can be invoked by the compiler automatically if you return a value from a function. This reduces the need to return pointers from functions.

Std library improvements

The powerful std::shared_ptr and std::unique_ptr remove most of the reasons to fear pointers, though they don't work that well with toolkits like ROOT that try to do their own memory management. A chrono library was added for consistent timekeeping on all platforms. A threading library provides tools that work with the new functional tools and makes threading easy, and also a mutex and atomic library to support it. A regular expression library was added. More algorithms were added. A sized integer library was finally added with stdint.h (a benefit of being based on C99). Random number generation is finally properly supported, with a good set of algorithms and distributions.

Several container libraries were added. The most notable container library addition is the array library, which provides a compile time replacement for C-arrays, with bounds features. Unordered versions several containers were added. Containers became much more powerful, now with move semantics, as well as emplace_back, which can construct a value directly inside a container. Most containers support initializer lists.

A functional library was added to give an explicit type to function pointers, and is used for lambdas as well. A bind function allows you to add function arguments to a function (currying in functional programming terms).

Variadic templates

Variadic templates allow a function or constructor to take an unlimited number of arguments of any type. This allows the std::tuple feature, Gaudi's functional algorithms, and other very powerful features. Although the implementation may seem unusual at first, especially if you are used to a scripting language like Python, the implementation allows the parameter expansion to happen at compile time, meaning there is no runtime penalty for using variadic templates.

The syntax uses the following constructs:

Declaring a parameter pack. This is denoted by an ellipses to the left of a parameter name. The ellipses are usually placed next to the proceeding type, such as in template<typename... Ts> or Ts... values, but the meaning is the same. For a class this must be the last parameter; for a function it can occur earlier (but usually does not).
Expanding a parameter pack. This is denoted by an ellipses to the right of a parameter pack or expression containing a parameter pack. This is often seen in the body of the functions. These are commonly used to call functions, funct(values...), or perform an expression then call a function funct(do_something(values)...). In the second example, do_something is called on each value before calling the funct function.
Counting the parameters in the parameter pack. This is done with the sizeof...() function, which is a constexpr function that returns the number of parameters inside the parameter pack.

Variadic template function exampleClick to expand

An example of a Python style print function:

void print() {
    std::cout << std::endl;
}
template<typename T, typename... Ts>
void print(T value, Ts... values) {
    std::cout << value << " ";
    print(values...);
}

Here, the ellipses serves the same role the first two times it is seen; it is declaring a parameter pack values with type Ts. Inside the function, the ellipses is expanding the call to print(first value, second value, etc). This recursively calls itself, ending in the final empty argument form; a very common pattern for variadic templates.

Note

There is an old feature called variadic functions from C, which allows an unlimited number of arguments, but is not type safe. This is how the unlimited argument printf function is defined. This uses an ellipsis at the end of the parameter list, optionally after a comma.

For more on variadic templates and what they can do, see Cpp Reference's page on paramter packs.

Move semantics

One of the more fundamental changes in the language was the promotion of move semantics to a language feature, as well as stronger guidelines on auto-optimization. This can fundamentally change the way functions are written. What is the problem with this statement in C++03?

GiantObject item = GiantObject_returning_function();

Here, you create a GiantObject inside the function, and then copy it to a new object item, then delete the old object. It's horribly wasteful in both time and memory; if you don't have enough memory for two separate copies of GiantObject, you can crash your program. In C++11, not only is the idea of a move instead of a copy added, the compiler is generally recommended to do that for you if it can. So the value will simply be moved, with no changes to either the function or the line above, as long as there are no references retained to the object inside the function (though globals, members, parameters, etc). This is a compiler recommendation, rather than a formal requirement of the language, but most compilers implement it. (In C++17, it becomes an official requirement of the language in most cases.) This is part of C++11's move away from the use of pointers, and is huge step in the right direction.

Moving is also a part of the language, but using it explicitly requires some understanding of a fundamental feature of C++ that was not usually talked about before: value categories.

Explanation of value categories

The names rvalues and lvalues historically refer to where the expression tends to be relative to an assignment operation. So for the expression:

x = 1+2;

The x is an lvalue, and the 1+2 is an rvalue. The names can be slighltly misleading, since an l-value could occur on either side of the assignment, and there is no requirement that an assignment be made in every line of C, but every expression (or sub-expression) can be classified under these two categories in C++03. The difference is in memory; the L-value is given a real location in memory, while the rvalue is temporary and cannot be used past the current expression (and the compiler might optimize it away in some cases). An example of an expression that is an rvalue is *(x+1), where x is a pointer. It is quite possible for a function to return an lvalue.

In C++11, rvalues are further complicated because of the desire to move object. To properly grasp the syntax, it is important to understand all 3 unique categories in C++11, from the C++ standard n3055:

lvalue: Anything that can be on the left side of equal sign.
xvalue: An rvalue that about to expire (something being returned from a function, for example). An xvalue can be moved.
prvalue: An rvalue that is not about to expire, like a literal (12, true) or the result of a non-reference return of a function.

C++11 also defines two combination categories, glvalue is an lvalue or xvalue, and the classic rvalue, which is a prvalue or an xvalue.

Syntax for the value categories

Functions can return any of the three C++11 categories:

int returns_prvalue();
int& returns_lvalue();
int&& returns_xvalue();

And, functions can take either of the values:

void foo(int any_value);
void foo(int& lvalue);
void foo(int&& rvalue); // rvalue takes an rvalue, but inside the function it is a named lvalue

This hopefully gives you an idea of what a move function must look like. Its signature must be:

T&& move(T&& input_rvalue);

Here, it takes an rvalue and returns an xvalue. It is the same thing as doing a static cast to an rvalue. This xvalue is therefore movable by the compiler.

Example of explicit moveClick to expand

std::string a = "original";
std::string b = "new";
b = std::move(a); // a now empty
std::cout << "[" << a << " " << b << "]";
// Will output: [ orignal]

The most common uses tend to be in overloads, such as for move constructors, to specialize behavior for moves.

Other features

Tuples allow multiple return values, albeit through std::tie. In C++17 they are elevated to a more fundamental part of the language through new syntax.

C++ now supports user defined literals, for strings and for numbers. These were not included in the standard library for common types until C++14, though. One use was added, though; raw string literals R"..." were added, with similar meaning and semantics to the Python raw strings, were intended for the same use (regular expressions).

Attributes were added, allowing arbitrary tags to be added to expressions. There are very few official ones, though some get added for each version. These are intended mostly for compiler specific functions, like OpenMP. They are indicated by double square brackets.

The parser is smarter, finally understanding <one, <two>> correctly without forcing a space between the greater-thans.

Suffix return type allows the return type to be listed after the function definition line, instead of before; this allows templated functions with complex return types to be written more easily, since the parameters exist after the definition line, but not before.

The override keyword indicates a function is intended to override a virtual function, making the intention clear (and compiler can check and issue errors). The final keyword is also added, enforcing a virtual function to be non-overridable.

What's new in C++11