Sunday 8 December 2013

Writing an Arbitrarily Large Integer Class in C++

Project Euler Problem 25 is to find the first 1000 digit Fibonacci number. I'm implementing a solution in C++ so the problem here is that the largest number that can be represented with a built-in type is 264A few of the Project Euler problems prior to this one required handling numbers greater than 264 - for example Problem 13 is to find the 10 least significant digits of a sum of one-hundred 50-digit numbers - but in those cases it was possible to work around the 264 limitation by discarding digits that aren't needed in the final answer or using doubles.

However, this problem prompted me to write my own class that could deal with arbitrarily large integers. To this end I have created embiguint.


Main goals of EMBigUInt

The main goals of the class are as follows:
  • The class should work naturally with the build in types. For example it should support both:
    • embiguint = int + embiguint, and
    • embiguint = embituint + int
  • The class should provide implicit construction from the built in types.
  • Like the built-in types, the class should work with all the standard arithmetic operators (+, -, /, *, +=, *=, etc...) and comparison operators (==, >, >=, !=, etc...). These operators should work with embiguints and between embiguints and the built-in types.
  • The class should provide some additional functionality beyond what the built-in types, for example:
    • Construction from a string of digits to support constructing an embiguint greater than 264 e.g. 
      embiguint myBigUInt("1234567890123456789012")
    • Return a count of the number of digits.
There's still much more work to do to make the class I want, but here are some notes on design decisions I've taken so far:
  • Efficiency is not my first concern. My initial aim it to get a working class. When I've got all the functionality I need with unit tests and performance tests in place I can work on efficiency in the confidence that I won't break existing functionality and I can measure any speed gains. Some outcomes from this are:
    • I'm using standard library vectors, which are easy to use, provide lots of debugging help, and come with lots of standard library algorithms. It's worth investigating if raw arrays could improve efficiency, but for now this would slow down development.
    • I'm relying on implicit conversion from built-in types to embiguint for some of the functions. I could implement more efficient functions if I overloaded specifically for the built-in types but, again, this can be done as a refactoring.
  • So far, most of the functions in the library are non-member functions. This follows from the advise in Item 24 of Scott Meyers' Effective C++: Declare non-member functions when type conversion should apply to all parameters, and it allows me to do
    embiguint = int + embiguint

    Here I haven't overloaded
    operator+ for the type int, instead the non-member function 
    const embiguint operator+ (const embiguint& lhs, const embiguint& rhs)

    is being used with implicit type conversion converting the
    int to an embiguint.

Using EMBigUInt for Project Euler Problem 25

This class in its current state enables Project Euler Problem 25 to be solved very simply with the following:

embiguint lastValue = 0;
embiguint currentValue = 1;
emuint counter = 1;

while (currentValue.GetNoDigits() < 1000) {
  embiguint temp = currentValue;
  currentValue += lastValue;
  lastValue = temp;
  counter++;
}

cout << "Answer is " << counter << endl;

No comments: