Eoin's Programming Blog: Haskell

Monday, 31 March 2014

Haskell Prime Number Generator (part 2 of 3)

In my previous post I detailed the following elegant but inefficient Haskell prime number generator:

primes (x:xs) = x : primes (filter ((/= 0) . (`mod` x)) xs)

This algorithm was taken from user gauchopuro in the Project Euler Problem 7 forum. In this post I'm going to look at a prime number generator also submitted in that forum.

Again, as I'm currently learning Haskell, I'm going to explain this function at the beginner level.

Prime Number Generator from user vineet:

isPrime x = isPrimeHelper x primes

isPrimeHelper x (p:ps)
  | p*p > x = True
  | x `mod` p == 0 = False
  | otherwise = isPrimeHelper x ps

primes = 2 : filter isPrime [3..]

This prime number generator is a bit longer than the previous one. It doesn't use a sieve, and it involves three functions, each of which, as I will show, is quite simple. The above code uses recursion, pattern matching, the filter function, the cons operator, and lazy evaluation, all of which are explained in the previous post. See that post for more details.

In addition, to parse the above code, you'll need to understand Haskell guards. The isPrimeHelper function is defined by three cases using the guard notation '|'. If the predicate after a guard evaluates to True, then the expression to the right of the corresponding equals sign is used. If more than one predicate evaluates to True, only the first one is used. So, in the above example, if p*p > x then isPrimeHelper returns True. If p*p > x is False, then x `mod` p == 0 is evaluated next. If that's True then isPrimeHelper returns False. Otherwise isPrimeHelper returns the result of isPrimeHelper x ps.
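To see the "first match wins" rule in isolation, here is a minimal sketch. The function describe and its thresholds are made up for illustration and are not part of the forum solution:

```haskell
-- Hypothetical example: guards are tried from top to bottom,
-- and the first predicate that evaluates to True selects the result.
describe :: Int -> String
describe n
  | n < 0     = "negative"
  | n < 10    = "small"
  | n < 100   = "medium"
  | otherwise = "large"
```

For example, describe 5 evaluates to "small". Both n < 10 and n < 100 are True for 5, but only the first matching guard is used.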

Now to parse the above code. Firstly consider that the list of prime numbers is defined as:

primes = 2 : filter isPrime [3..]

This line is quite simple. It says that the list of prime numbers starts with 2, and is followed by all the numbers from 3 to infinity that return True when passed to isPrime. The isPrime function, again, looks like this:

isPrime x = isPrimeHelper x primes

This function uses the final component of this prime number generator, isPrimeHelper, to determine if the argument x is prime or not. Notice that it also uses primes which we've defined as the list of prime numbers. The list primes is infinite and so it's never complete, and you may think this would be a problem. However, as will be explained, this works thanks to Haskell's lazy evaluation and the way isPrimeHelper is written and called. Here's isPrimeHelper again:

isPrimeHelper x (p:ps)
  | p*p > x = True
  | x `mod` p == 0 = False
  | otherwise = isPrimeHelper x ps

Before explaining this code, here is an explanation of how the algorithm for isPrimeHelper works. It takes a number to check, x, and a list of primes (p:ps). If any number in the list of primes divides evenly into x then x is not prime, and the function returns False. The primes are checked one by one beginning with the smallest. If the algorithm reaches a prime that is greater than the square root of x then there's no need to check any further, as x is prime (because any number x can have at most one prime factor greater than the square root of x, and in the case of a prime number that factor is x itself).
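The stopping rule described above can be made visible with a small tracing sketch. The helper primesTried and its hard-coded list of small primes are my own additions for illustration (the list only covers inputs below 29*29 = 841); they are not part of vineet's code:

```haskell
-- Hypothetical helper: returns the primes whose divisibility into x
-- is actually tested before the algorithm stops.
primesTried :: Int -> [Int]
primesTried x = go smallPrimes
  where
    -- Assumption: a fixed list, sufficient only for x < 841.
    smallPrimes = [2, 3, 5, 7, 11, 13, 17, 19, 23]
    go (p:ps)
      | p*p > x        = []        -- p is past the square root: stop, x is prime
      | x `mod` p == 0 = [p]       -- p divides x: stop, x is not prime
      | otherwise      = p : go ps -- keep going with the next prime
    go [] = []
```

For example, primesTried 97 gives [2,3,5,7]: the next prime, 11, has 11*11 = 121 > 97, so checking stops there and 97 is prime. primesTried 35 gives [2,3,5], stopping as soon as the factor 5 is found.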

The code for isPrimeHelper works as follows. The function uses guards (explained earlier). The first element from the list of primes passed in is p. If p*p > x, then p is greater than the square root of x, so x is a prime number and the function returns True. Otherwise the function checks if p is a factor of x. If so then x is not prime and isPrimeHelper returns False. Finally, if p isn't a factor of x then isPrimeHelper calls itself recursively with p being removed from the front of the list of primes.

Remember that the list primes initially has only its first element, 2, evaluated, and the rest is computed on demand as the algorithm runs. Due to lazy evaluation isPrimeHelper only requires the elements one at a time until it returns, and it never runs out of elements because one of the conditions that lead to the function returning will always occur before the end of the evaluated part of the list is reached.
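Putting the three definitions together gives the following sketch, which can be loaded into GHCi. The type signatures, and the choice of Int, are my assumption; the forum code had none:

```haskell
-- The list of primes: 2, followed by every larger integer that passes isPrime.
primes :: [Int]
primes = 2 : filter isPrime [3..]

-- x is prime if no known prime up to its square root divides it.
isPrime :: Int -> Bool
isPrime x = isPrimeHelper x primes

-- There is deliberately no equation for the empty list: primes is infinite,
-- and one of the first two guards fires before the evaluated part runs out.
isPrimeHelper :: Int -> [Int] -> Bool
isPrimeHelper x (p:ps)
  | p*p > x        = True
  | x `mod` p == 0 = False
  | otherwise      = isPrimeHelper x ps
```

With this loaded, take 10 primes gives [2,3,5,7,11,13,17,19,23,29], and primes !! 10000 picks out the 10 001st prime asked for in Problem 7.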

This prime number generator is much faster than the previous one. Although it's a bit longer, each of the three functions it uses is quite simple to understand. In another post I'll show how to make this function a bit more performant, and in my opinion a bit more readable.

Friday, 21 February 2014

Haskell Prime Number Generator (part 1 of 3)

Project Euler Problem 7 reads "What is the 10 001st prime number?". Solving this problem requires a prime number generator. I solved this using Haskell, which I'm currently learning, but after I'd done so I learned a lot more by looking at others' solutions. Here I present some of the Haskell prime number generators being discussed in the Project Euler forum thread for Problem 7 (I can't link to the thread, as you need to have an account and have solved Problem 7 to access it).

As I'm just learning Haskell my aim is to annotate these examples at my own level, i.e. for beginners. Note, the examples I'm referring to were meant to solve a particular problem, so in some cases I've tweaked them slightly to make them general prime number generators.

Prime Number Generator from user gauchopuro:

primes (x:xs) = x : primes (filter ((/= 0) . (`mod` x)) xs)

This function must be supplied with a list of the positive integers starting at 2 and it uses the Sieve of Eratosthenes to calculate the list of the primes. For example a function that uses the above to get the first n prime numbers would look like this:

getFirstNPrimes n = take n $ primes [2..]

Or to get the nth prime:

getNthPrime n = primes [2..] !! (n-1)

Although very concise, once you understand the notation involved, primes is also very expressive. Here's what you need to know to parse it:

primes uses pattern matching. It expects to be supplied with a list of integers, and the (x:xs) to the left of the equals sign means that within the function (i.e. to the right of the equals sign) x will refer to the first element in the list and xs (pronounced exes, as in the plural of x) refers to the remainder of the list.
primes is recursive. The list that it generates consists of the first element of the list passed into it and the output of primes when passed the remainder of the list after some filtering.
The filter function expects two arguments. The first argument is a predicate and the second is a list. filter returns a new list consisting of the elements from the initial list that yield True when passed to the predicate.
The . operator composes two functions i.e. (f . g) x == f (g x).
The cons operator ':' prepends a new element to the start of a list.
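The building blocks listed above can each be tried on their own. Here is a minimal sketch; the names notDivisibleBy, oddsUpTo20 and consExample are made up for illustration:

```haskell
-- Function composition: first take the remainder modulo x,
-- then test whether that remainder is non-zero.
notDivisibleBy :: Int -> Int -> Bool
notDivisibleBy x = (/= 0) . (`mod` x)

-- filter keeps only the elements for which the predicate returns True.
oddsUpTo20 :: [Int]
oddsUpTo20 = filter (notDivisibleBy 2) [3..20]

-- The cons operator puts a single element on the front of an existing list.
consExample :: [Int]
consExample = 2 : oddsUpTo20
```

Here oddsUpTo20 is [3,5,7,9,11,13,15,17,19] and consExample is the same list with 2 on the front.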

Haskell Lazy Evaluation

Although not strictly required by the primes function shown, it's also useful to understand Haskell's lazy evaluation. It means that Haskell will do just enough work to get to the result it needs. For example, consider the following function:

fib :: Int -> Int -> [Int]
fib a b = a : fib b (a+b)

The first line tells us that this function takes two integers and returns a list of integers. The second line defines the function and says that the list to return is made up of the first integer argument followed by the list produced by calling fib again recursively, with the original second argument as the new first argument, and the sum of the original arguments as the new second argument. This function can be called with 1 1 to generate the Fibonacci sequence as follows:

fib 1 1 = 1 : fib 1 2
= 1 : 1 : fib 2 3
= 1 : 1 : 2 : fib 3 5
= 1 : 1 : 2 : 3 : fib 5 8 etc...

You can see how this generates the Fibonacci sequence, but if you're not familiar with lazy evaluation you'd expect this function to call itself recursively indefinitely and never return. However, lazy evaluation ensures that the function does just enough work to get the result it needs. So if you just want the first 10 Fibonacci numbers you call the function with:

take 10 (fib 1 1)

and the result is:

[1,1,2,3,5,8,13,21,34,55]

After the first 10 elements are generated Haskell has the result it needs and so fib returns without continuing to call recursively.
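Laziness also means you don't have to know in advance how many elements you need. As a small sketch, the helper fibsBelow (a name I've made up for this example) uses takeWhile to stop as soon as an element reaches a limit:

```haskell
fib :: Int -> Int -> [Int]
fib a b = a : fib b (a+b)

-- Hypothetical helper: all Fibonacci numbers below the given limit.
-- takeWhile stops at the first element that fails the test, so only
-- a finite prefix of the infinite list is ever computed.
fibsBelow :: Int -> [Int]
fibsBelow limit = takeWhile (< limit) (fib 1 1)
```

For example, fibsBelow 100 gives [1,1,2,3,5,8,13,21,34,55,89].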

Understanding primes

Here is primes again:

primes (x:xs) = x : primes (filter ((/= 0) . (`mod` x)) xs)

and, again, it is to be called with the list of integers starting at two, for example like:

primes [2..] !! 99 -- To get the 100th prime

With the above knowledge you can now understand primes as implementing the Sieve of Eratosthenes as follows:

It's originally passed a list of integers starting at 2 (2, 3, 4, 5...). The first element of that list, x, will originally be 2. This becomes the first element in the list of prime numbers to be returned.
The remainder of the list, xs, is the integers from three upwards. They are filtered with the predicate ((/= 0) . (`mod` x)) to create a new list. Remember x is still 2, so this predicate first finds the remainder when the input is divided by 2, and returns True if that remainder is non-zero. As True is returned for all odd elements in xs, the even numbers are filtered out of the list. This filtered list (3, 5, 7, 9...) is passed into primes recursively.
This time the first element in the list is 3, so x is 3, and this becomes the next prime number that is returned. Now xs is the list of odd integers starting at 5. They are filtered as before with x=3 so that the filtered list is the integers from 5 that don't divide evenly by 2 or 3 (5, 7, 11, 13, 17...). This list is passed recursively into primes.
This process continues adding another prime number to the list each time.

This process can be illustrated with the following equivalences:

primes [2, 3, 4, 5, 6...] = 2 : primes [3, 5, 7, 9, 11...]

= 2 : 3 : primes [5, 7, 11, 13, 17...]

= 2 : 3 : 5 : primes [7, 11, 13, 17, 19...]

= 2 : 3 : 5 : 7 : primes [11, 13, 17, 19, 23...]

= 2 : 3 : 5 : 7 : 11 : primes [13, 17, 19, 23, 29...]
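These equivalences can be reproduced by applying one filtering step at a time. The following is only a sketch; sieveStep, sieveStages and showStages are names I've introduced for illustration:

```haskell
-- One step of the process: drop the head x and remove its multiples
-- from the rest of the list, using the same predicate as primes.
sieveStep :: [Int] -> [Int]
sieveStep (x:xs) = filter ((/= 0) . (`mod` x)) xs
sieveStep []     = []

-- The successive lists passed to primes: [2..], then the odds from 3, etc.
sieveStages :: [[Int]]
sieveStages = iterate sieveStep [2..]

-- The first five elements of each of the first n stages.
showStages :: Int -> [[Int]]
showStages n = map (take 5) (take n sieveStages)
```

Here showStages 3 gives [[2,3,4,5,6],[3,5,7,9,11],[5,7,11,13,17]], matching the lists in the equivalences, and the head of each stage is the next prime.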

Although not the most efficient prime number generator, I did find this one to be expressive, and a good learning aid.

I'll follow up in future posts with more prime number generators.

Monday, 3 February 2014

Comparison of C++ and Haskell for Project Euler

I've already solved quite a few of the problems on Project Euler using C++, but as I've recently been learning Haskell, I decided to redo them all using it. Solving real problems is a good way to learn a new language and as I've already done them in C++ I can make some comparisons between the two languages.

Haskell for Maths Problems

Without using Haskell for very long it's easy to see that it is well suited to maths problems. This is because the syntax is optimised for expressing mathematical statements and it often resembles maths notation. For example, consider the following definition of the absolute value function, absolute:

absolute n
  | n >= 0 = n
  | otherwise = -n

You don't need to know Haskell to figure out what this function does. Another frequently used example is the factorial function. It can be defined, for example, in the following three ways:

factorial n
  | n == 0 = 1
  | otherwise = n * factorial (n-1)

factorial2 0 = 1
factorial2 n = n * factorial2 (n-1)

factorial3 n = product [1..n]

Not only are these more succinct than in a procedural language, but they also read more naturally.

Haskell for Project Euler, Problem 5

Given Haskell's suitability for maths problems, it should be well suited to Project Euler. I'll demonstrate that here using Problem 5, "What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?"

Firstly, here is an explanation of my approach. The solution must be a multiple of all the prime numbers below 20, and as they don't share any common factors the solution must be a multiple of the product of the primes below 20. Therefore I calculate this product and test multiples of it until I find the first one that the non-primes below 20 also divide evenly into. (20 itself needs no separate test: any number divisible by both 4 and 5 is also divisible by 20.)

Here is the solution implemented in C++

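#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

using namespace std;
using u64 = uint64_t;

// Returns true if every non-prime below 20 divides numberToCheck evenly.
bool isValidAnswer(u64 numberToCheck) {
  vector<u64> nonPrimesTo20{ 4, 6, 8, 9, 10, 12, 14, 15, 16, 18 };
  auto isMultiple = true;
  for_each(nonPrimesTo20.begin(), nonPrimesTo20.end(), [&](u64 n) {
    if (numberToCheck % n != 0) {
      isMultiple = false;
    }
  });
  return isMultiple;
}

int main(void) {
  const u64 productOfPrimesTo20 = 2*3*5*7*11*13*17*19;
  auto numberToCheck = productOfPrimesTo20;

  // Step through multiples of the product until one passes the check.
  while (!isValidAnswer(numberToCheck)) {
    numberToCheck += productOfPrimesTo20;
  }
  cout << "Answer is " << numberToCheck << endl;
  return 0;
}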

and here is the same algorithm implemented in Haskell:

productOfPrimesTo20 = 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19
nonPrimesTo20 = [4, 6, 8, 9, 10, 12, 14, 15, 16, 18]
isValidAnswer x = and [mod x n == 0 | n <- nonPrimesTo20]
answer = head [x | x <- [productOfPrimesTo20, productOfPrimesTo20*2..], isValidAnswer x]

Again, it is clear that the Haskell solution is much more succinct than in C++. Not only that, but if you understand the Haskell notation, the Haskell solution is also more expressive, and it's easier to read as you don't need to keep track of a loop or any variables.
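As a sanity check on the approach, the answer should equal the least common multiple of the numbers 1 to 20. The sketch below repeats the Haskell solution with type signatures (my addition) and compares it against a fold over the Prelude's lcm function. The name answerViaLcm is made up for this check and is not part of my original solution:

```haskell
productOfPrimesTo20 :: Int
productOfPrimesTo20 = 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19

nonPrimesTo20 :: [Int]
nonPrimesTo20 = [4, 6, 8, 9, 10, 12, 14, 15, 16, 18]

isValidAnswer :: Int -> Bool
isValidAnswer x = and [mod x n == 0 | n <- nonPrimesTo20]

-- First multiple of the product that all the non-primes divide evenly.
answer :: Int
answer = head [x | x <- [productOfPrimesTo20, productOfPrimesTo20*2..], isValidAnswer x]

-- Cross-check: the least common multiple of 1..20, built up pairwise.
answerViaLcm :: Int
answerViaLcm = foldr1 lcm [1..20]
```

Evaluating answer == answerViaLcm in GHCi gives True.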

If you're not familiar with Haskell then this code will look pretty obfuscated, but I've got posts coming up on Haskell prime number generation that explain many of the techniques used above.