### Haskell for noobs, written by a noob

Thu, 25 December 2014 :: #haskell

Haskell is a purely functional language. It means that it's different from "normal" imperative languages like `C/C++`, `Ruby` or `Python`.

In a functional language, the thinking process is different. Instead of specifying the steps needed to perform an operation, you specify the result you'll want to get. This switch of thinking process to a different paradigm is the majority of the learning curve.

Recently I've started to learn this functional methodology. By writing down my progression, maybe it will be helpful to you as well, as I came from the imperative world without prior knowledge about functional approach.

The first problem is a Project Euler problem number 1. It goes like this:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.

The algorithm I've used isn't the best or efficient, but it's easy to understand.

The problem states that we need to find numbers that are divisable by 3 or 5 from the pool of numbers from 0 to 1000. So, first we need to generate this pool. We start by building an infinite list, from 0 to infinity:

``````[0..]
``````

This will build a list from 0 to infinity. It will be a lazy list, so it will reserve a small amount of memory to store only the generator code, not the resulting list contents.

Later, if you'll want to reference any element from this list, it will be computed on demand. So, indexing is slower, but at the same time it requires a lot less memory than a non-lazy method.

You can see how this list looks like in `ghci`, the Glasgow Haskell Compiler Interactive mode, but prepare to use `^C` to cancel the printing process:

``````\$ ghci
...

*Main> [0..]
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,...
``````

However, we need to get only first 1000 numbers from this list. There is a function in Haskell that takes first `n` elements of user-supplied list, named `take`.

``````take 1000 [0..]
``````

The syntax is: `function-name arg1 arg2`.

So, `take` takes first `1000` elements from an infinite list of numbers from 0 to infinity.

The resulting list is a list of numbers from 0 to 999.

Since we have a list of numbers, we need to filter out those which are not divisable by 3 nor 5. Normally I'd just use an iterator that would allow me to iterate on every item in this list. This however implies that I would use a state. The "state" can be also thought as a class field, mutable local variable, mutable global variable, etc. Haskell is a pure functional language, this means that its functions are state-less. You can't have a state in your functions, unless you're using `monads`, but that is a topic for another day. We won't use it, so we have to build a function that will not use any state.

One way to generate a list with these numbers is to simply loop over the input list, get each item from the input list, and if this item is divisable by 3 or 5, append it to the output list. When repeating this for each element of the input list, we should have a proper output list with proper elements.

Before writing the body of the function, a good practice would be to declare the types on which the function is operating.

``````generateList :: [Int] -> [Int]
``````

This means: there is a function named `generateList`. I would like to specify type information for this function (`::`). My first argument is `[Int]`, so it's a list of Int's. OK, since I've specified the type of the first argument, I'm going to specify the type of next argument now (`->`). Oh wait, it will be last type in my definition, so it will actually be a type of the return value instead of a type of the second argument. So, the return value is also `[Int]`. So, my function will take a list of Int's as an argument, and will return a list of Int's as a return value.

The input list will be `take 10 [0..]`, and the output list will be list of numbers divisable by 3 or 5.

``````generateList (x:xs) = div35 x ++ generateList xs
generateList [] = []
``````

Before we go analyzing the body of the function and why it's written the way it is, there are multiple other items here that are worth explaining. Function definitions in Haskell are defined by using pattern matches. In the example above, we have 2 patterns that Haskell needs to match to invoke the proper function.

First match is defined as:

``````generateList (x:xs) = ...
``````

which is a standard notation for matching a non-empty list as an argument. I will get back to it in a minute.

Second match is defined as:

``````generateList [] = ...
``````

This is matched when `generateList` is called with an argument that is an empty list. So, if you'd call `generateList` function like this:

``````generateList []
``````

Then the second function body would be executed. Second function is equal to `[]`, so this means that if `generateList` will be called with an argument that is an empty list, it will return an empty list.

But, if `generateList` will be invoked with an argument that is not an empty list, first match will be triggered, and first body will be executed.

Now, there is a small shortcut notation being utilised here. `(x:xs)` matches a non-empty list, and automatically creates two variables named `x` and `xs`, that contain: first element of the list, and the rest of the list -- respectively. This means, that if `generateList` will be called as:

``````generateList [1,2,3,4,5]
``````

then `x` will contain `1`, and `xs` will contain this list: `[2,3,4,5]`.

By using these two matches: `(x:xs)` and `[]`, we are covering 100% of cases, because we are covering lists that are empty, and lists that are not empty. There are no other types of lists.

The body of the function invoked for non-empty lists is:

``````div35 x ++ generateList xs
``````

This means: invoke the `div35` function, and give `x` as its first parameter. Operator `++` is used as a list concatenation operator. This means that it can append multiple lists to one bigger list. Example:

`````` ++  => [1,2]
[1,2] ++  => [1,2,3]
 ++ [2,3] => [1,2,3]
[1,2] ++ [2,3] => [1,2,2,3]
``````

Then, the function recursively calls itself on the remainding part ("tail") of the input list. When it will finish, we will get a list. This list gets merged through the usage of the `++` operator with the result of div35 function to one list. This list is returned as a return value.

By quickly looking at the `div35` function, it's relatively clear what it does. First, let's look at the type of the function. It requires an Int as an argument, and will return a list of Ints -- or an empty list.

``````div35 :: Int -> [Int]
div35 x
| x == 0            = []
| x `mod` 3 == 0    = [x]
| x `mod` 5 == 0    = [x]
| otherwise         = []
``````

If `x` is `0`, the function returns `[]`. If `x` is divisable by `3`, it will return a list with one element inside: `x`. Same thing with `x` divisable by `5` -- it will return a list with that number as a sole element. In case nothing is matched, it returns an empty list.

Yes, this looks like a normal `switch` statement, from C/C++ or similar.

To get more clear view of this process, let's debug it.

Let's invoke the `generateList` function with the list of `[1,2,3]` as an argument:

``````generateList [1,2,3]
``````

It got a non-empty list as an argument, so first function body is matched. `x` is `1`, `xs` is `[2,3]`. Following operations are being done:

``````div35 1
generateList [2,3]
return new list
``````

`div35` for `1` will return an empty list, because `1` is not divisable neither by `3` nor `5`.

Then, `generateList` recursively calls itself with an argument of `[2,3]`. Lets step into it.

In a new frame, `x` is `2` and `xs` is ``. Again, we have these operations to perform:

``````div35 2
generateList 
return new list
``````

`div35` will return an empty list again. And we enter ourselves again, specifying a list `` as an argument.

So, it's a new frame of our function again. `x` is 3, and `xs` is an empty list: `[]`. Yet again, we have these operations to perform:

``````div35 3
generateList []
return new list
``````

`div35` will return ``! Because `3` is divisable by `3` (obviously), `div35` built a new list with only one item inside -- the same number as was specified in the argument. And yet again, we recursively enter our function again.

But this time, we match the second function pattern -- the one that specifies an empty list as a match pattern. This function implementation doesn't recursively call itself anymore, it merely returns an empty list. So we have our result immediately. So, `div35` returned this list: ``, `generateList` returned an empty list `[]`. We have both lists, so we can concatenate them:

`````` ++ [] => 
``````

and we return a new list constructed this way to the previous frame.

In the previous frame, `div35 2` returned an empty list, and `generateList ` returned ``. This means that we can merge them:

``````[] ++  => 
``````

and return this new list to our previous frame.

In the previous frame, `div35 1` returned an empty list as well, and `generateList [2,3]` returned `` (we just calculated this). By concatenating `[]` and ``, we get ``

So, we're returning it, and we're in the caller frame. `generateList [1,2,3]` has just returned a list containing ``. And this is a valid result: this is a list of numbers from an input list that are divisable by 3 or 5. Only 3 meets this criteria, so our function works well.

By using exactly the same method, we can build a function that will sum all of the numbers in the input list:

``````sumList :: [Int] -> Int
sumList (x:xs) = x + sumList xs
sumList [] = 0
``````

This is exactly the same method as above, so I'll just skip the analysis.

Now we have all the functions we need to have to solve the problem. We just need to use them.

Let's define a function that will use them to calculate the proper solution. In short words, we need to generate an input list of numbers, so that `generateList` will filter out all the numbers that are not divisable by 3 or 5. Then we will use the new list (that will be generated by `generateList`) as an argument to `sumList` function, that will return one Int value, being the sum of all elements in its input list.

``````result = sumList (generateList (take 1000 [0..]))
``````

This can be shortened to:

``````result = sumList \$ generateList \$ take 1000 [0..]
``````

And we're done! `result` will return one number, being the sum of all elements of a list generated by `generateList`.

Our work is done ;).

But wait, there's more.

Actually, our list generation method isn't very nice. Instead of taking 1000 numbers from an infinite list, we can simply generate a finite list by using this syntax:

``````[0..999]
``````

This will generate a finite list with elements from 0 to 999. So, why we've used an infinite list in the first place? To show you that Haskell has no problem with infinity, and by default it's using lazy evaluation!

## Bad code: checking for divisors

While we can't get over with division of elements, we can surely compress some code. We don't really need `div35` and `generateList` functions. They're reinventing the wheel. Instead of them, we can use two features from Haskell: the `filter` function and a lambda function.

You probably already know what is a lambda function. From Python's perspective it's a small function that can be inlined as an argument. C++ also has lambda functions in the form of `auto func = [&] (Args...) { body; }`. For Java people, you can imagine that a lambda function is an anonymous class containing just one (default) method inside.

One of the purposes of lambda functions is to provide a mechanism for delayed function invocation. If a higher-order function requires a function as one argument, you can either put a name of the function that is defined elsewhere, but you can also put a lambda function as this argument -- this way you will specify the body of the function in the place of its declaration. Neat.

The `filter` function works like this: it takes a function as the first argument, and an input list as the second argument. Then, for each element of this list, `filter` will invoke supplied function, putting the element as an argument to this function. Consider the following example -- you invoke the `filter` function with arguments `f` (name of a function) and `[1,2,3,4]` (a list of numbers):

``````filter f [1,2,3,4]
``````

What you will get is another list with new content. This new content will be based on the input list, but will contain only elements for which the `f` function returned `True`. So if you have an `f` function defined as:

``````-- A function named `f`, taking one argument `x`, that is defined as
-- `x == 1`.

f x = x == 1

-- This function will return True if `x` is equal to 1. If it's not equal to 1,
-- the function will return False.
``````

after calling `filter`, the resulting list will contain only one element:

``````
``````

because `f` function only returns `True` for `1`. Let's try a different function as an example:

``````f x = x < 3
``````

After invoking `filter f [1,2,3,4]` we will create this list:

``````[1,2]
``````

because only for `1` and `2` the condition `x < 3` will yield `True`.

So, to reiterate, we can build a list with numbers that are divisable by 3 or 5 by using the following Haskell equation:

``````generateList = filter (\ x -> x `mod` 3 == 0 || x `mod` 5 == 0) [0..999]
``````

It uses a lambda function in the form of `(\ x -> [...])`. The `(\` sign supposedly looks similar to a lambda character, but honestly speaking I fail to see the similarity. Important thing to remember is that `(\` is used to begin the definition of a lambda function. `x ->` is a list of argments -- in our case we're using just one argument, `x`, and `->` simply begins the body of the function. The body is:

``````x `mod` 3 == 0 || x `mod` 5 == 0
``````

This one is easy to interpret, it will return `True` if `x` is divisable by `3` or `5`. There is one peculiarity here in the form of the backtick notation for `mod` operator, but as this seems to be just a syntactical sugar, I'll leave the description of this until another day.

So our invocation of the `filter` function will invoke our lambda function for each element of the list of numbers from 0 to 999. If the lambda function will return True for an element, this element will be included in the output list.

This means that we have just built a list of elements that are divisable by 3 or 5.

Here is the code:

``````sumList :: [Int] -> Int
sumList (x:xs) = x + sumList xs
sumList [] = 0

result = sumList \$ filter (\ x -> x `mod` 3 == 0 || x `mod` 5 == 0) [0..999]
``````

So why did I use `div35` and `generateList` functions in the first place? To demonstrate a recursive looping operation, which is quite common in Haskell. You would need to learn it anyway!

To run the code, simply save the above program into a file (i.e. `1.hs`), invoke `ghci` and load the program into your current session:

``````:l 1.hs
``````

Then, just run the `result` function:

``````Main*> result
``````

`ghci` should produce a valid result. 