Sunday, 22 December 2013

STL Tuples

The STL tuple class is a bit like the STL pair, except the tuple supports any number of types to be tied together in a single object. This can be achieved using variadic templates, although many tuple implementations were completed before variadic templates were available (e.g. VC++ 2012). They implemented the tuple by using different macros for different numbers of arguments.


STL pairs are useful but sometimes not sufficient

STL pairs are useful when two pieces of data are always coupled together, for example:

std::pair<std::string, float> raceWinner{ "M40", 17.20 };

However, in the past, to include any more pieces of data required some more work, for example creating a simple struct or class:

struct triathlonResult {
std::string ageCategory;
float swim;
float cycle;
float run;
};

which (if uniform initialization is supported) can then be used with:

triathlonResult triathlonWinner{ "M35", 13.45, 35.5, 17.3 };

STL tuples extend the pair to any number of arguments

The introduction of the tuple to the standard library means that a custom class is no longer needed, instead:

std::tuple<std::string, float, float, float> triathlonWinner( "F35", 13.45, 35.5, 17.3 );

can be used to create an object that holds the same information.

Initialization can be simplified using auto and make_tuple like this:

auto triathlonWinner = make_tuple("F35", 13.45, 35.5, 17.3);

An added convenience is that the comparison operators (==, !=, >, >=, < and <=) are included in the standard library. From the standard [tuple.rel] - "The elementary comparisons are performed in order from the zeroth index upwards. No comparisons or element accesses are performed after the first equality comparison that evaluates to false".


Retrieving elements from a tuple

Retrieving elements from a tuple may differ from how you're used to getting them from a pair. One way to retrieve members from a pair is to use the public members first and second. For example:

auto raceWinnersCategory = raceWinner.first;
auto raceWinnersTime = raceWinner.second;

However, there is no tuple member function to retrieve elements, instead they must be retrieved using the non-member function get (defined in <tuple>). The reason is that the element to retrieve must be indexed, and if this were to happen on a member function it would need to be invoked using the template keyword like this:

auto triWinnersCategory = triathlonWinner.template get<0>();
(for more information see [tuple.helper] note 11 from the standard)

To avoid this a non-member get function is provided, to be used as shown:
auto triWinnersCategory = get<0>(triathlonWinner);
auto triWinnersSwimTime = get<1>(triathlonWinner);
auto triWinnersCycleTime = get<2>(triathlonWinner);
auto triWinnersRunTime = get<3>(triathlonWinner);

This may not seem very natural to code at first but it is easy to read. Note that this get function can also be used for pairs, so if you suspect your pairs may one day be promoted to tuples then it's a good idea to use get with pairs too.

Sunday, 15 December 2013

Uniform Initialization in VC++ 2013

Before uniform initialization was introduced to C++ there were many ways to initialize variables. For example:

int myInt(3);
int myInt = 3;
Musician guitarist; // Zero argument constructor
Musician drummer("Black", 10.01, true);

Uniform initialization provides a uniform way to initialize objects using curly braces. For example:

int myInt {4};
Musician piper {}; // Zero argument constructor
Musician jazzFlautist {"Black", 12.34, true};

Uniform initialization provides a standard method for initializing objects and for consistency alone should be the first preference. However, there are further advantages.

Further advantages of Uniform Initialization

The uniform initialization syntax can be used with the class std::initializer_list<> to pass a variable number of similar arguments to a constructor. This is useful for creating std containers, for example:

vector<int> myInts { 1, 3, 2, 4};

will create a vector of ints with four elements: 1, 2, 3, and 4To make use of std::initializer_list<> in your own classes you'll need to write an initializer list constructor, i.e. a constructor that takes an std::initializer_list<> of the desired type and parses it as required. If you do this users will be able to use the { } notation to invoke that constructor.

Uniform initialization can also be nested, for example:

vector<Musician> band { guitarist ,
                        jazzFlautist,
                        {"Blue", 43.21, false} };

In the above case guitarist and jazzFlautist are added to the vector, as is a third anonymous object that is created and copied (or hopefully moved) into the vector.

Uniform initialization also brings a way to avoid Most Vexing Parse confusion. For a simple illustration consider the following code:

Musician harpist();

This may be considered an acceptable way to create a Musician object called harpist with the zero argument constructor, but in fact it declares a function called harpist that takes zero arguments and returns a Musician. This ambiguity can take more subtle forms, for example where a function call is intended as an input parameter to a function call, but the compiler reads this as a function declaration that takes a function type as a parameter. It's so called the most vexing parse for good reason, but it is avoided using by the curly braces uniform initialization notation, i.e.

Musician harpist{};

Another advantage of uniform initialization is that it is a convenient way to initialise Aggregate classes. For example given the following definition:

struct number {
int asInteger;
std::string asString;
bool isPrime;
};

a number can be initialized with:

number theNumberFour{ 4, string("four"), false };

Reasons to still use the old constructor syntax

There may still be times when you wish to use the old constructor syntax. If a class has an initializer list constructor then it will take precedence over other constructors. If you wish to call the other constructors you'll need to use the old constructor syntax to be explicit.

For example, if you wish to create a std::vector of ints and you wish to call the constructor that takes a single parameter specifying the vector's size then writing:

std::vector<int> myInts{4};

won't work. The above code would create a vector of ints and initialize it with the single element 4. Instead you need to use the round bracket notation :

std::vector<int> myInts(4);

However, in general you should default to using uniform initialization, and only break that pattern when you've got a specific reason.

Sunday, 8 December 2013

Writing an Arbitrarily Large Integer Class in C++

Project Euler Problem 25 is to find the first 1000 digit Fibonacci number. I'm implementing a solution in C++ so the problem here is that the largest number that can be represented with a built-in type is 264A few of the Project Euler problems prior to this one required handling numbers greater than 264 - for example Problem 13 is to find the 10 least significant digits of a sum of one-hundred 50-digit numbers - but in those cases it was possible to work around the 264 limitation by discarding digits that aren't needed in the final answer or using doubles.

However, this problem prompted me to write my own class that could deal with arbitrarily large integers. To this end I have created embiguint.


Main goals of EMBigUInt

The main goals of the class are as follows:
  • The class should work naturally with the build in types. For example it should support both:
    • embiguint = int + embiguint, and
    • embiguint = embituint + int
  • The class should provide implicit construction from the built in types.
  • Like the built-in types, the class should work with all the standard arithmetic operators (+, -, /, *, +=, *=, etc...) and comparison operators (==, >, >=, !=, etc...). These operators should work with embiguints and between embiguints and the built-in types.
  • The class should provide some additional functionality beyond what the built-in types, for example:
    • Construction from a string of digits to support constructing an embiguint greater than 264 e.g. 
      embiguint myBigUInt("1234567890123456789012")
    • Return a count of the number of digits.
There's still much more work to do to make the class I want, but here are some notes on design decisions I've taken so far:
  • Efficiency is not my first concern. My initial aim it to get a working class. When I've got all the functionality I need with unit tests and performance tests in place I can work on efficiency in the confidence that I won't break existing functionality and I can measure any speed gains. Some outcomes from this are:
    • I'm using standard library vectors, which are easy to use, provide lots of debugging help, and come with lots of standard library algorithms. It's worth investigating if raw arrays could improve efficiency, but for now this would slow down development.
    • I'm relying on implicit conversion from built-in types to embiguint for some of the functions. I could implement more efficient functions if I overloaded specifically for the built-in types but, again, this can be done as a refactoring.
  • So far, most of the functions in the library are non-member functions. This follows from the advise in Item 24 of Scott Meyers' Effective C++: Declare non-member functions when type conversion should apply to all parameters, and it allows me to do
    embiguint = int + embiguint

    Here I haven't overloaded
    operator+ for the type int, instead the non-member function 
    const embiguint operator+ (const embiguint& lhs, const embiguint& rhs)

    is being used with implicit type conversion converting the
    int to an embiguint.

Using EMBigUInt for Project Euler Problem 25

This class in its current state enables Project Euler Problem 25 to be solved very simply with the following:

embiguint lastValue = 0;
embiguint currentValue = 1;
emuint counter = 1;

while (currentValue.GetNoDigits() < 1000) {
  embiguint temp = currentValue;
  currentValue += lastValue;
  lastValue = temp;
  counter++;
}

cout << "Answer is " << counter << endl;