Advent of code and 25 days of haskell

On 30 Dec 2017

This year, I decided to do the puzzles from advent of code for the first time, in a new language. I have read some parts of the amazing Learn You a Haskell For Great Good! about two years ago, but I haven't solved any real problems nor build anything with it.

The very interesting and bite-sized problems of advent of code seemed like a perfect place to play with it. In this post, I wrote about the most interesting experiences I had with haskell over the last month. Let me know what you think!

Type signature & purity

Adding type signatures to functions before implementing them was something I found very rewarding. It helped me think about what's the concern of that piece of code and also pushed me to write smaller, self contained functions that were easier for me to reason about.

For instance, in day 11, the challenge included walking in an hexagonal grid. These are the type signatures of the functions step and walk I implemented to solve it:

 type Direction = String
type Coord = (Int, Int)

step :: Coord -> Direction -> Coord

walk :: Coord -> [Direction] -> Coord  

Making type aliases with type also made the signatures easier to read and reason about. This was specially nice for more convoluted types. In day 25 we were asked to simulate a turing machine:

 type State = String
type Pos   = Integer
type Tape  = Map.Map Integer Bool
data Rule = Rule State (Bool, Integer, State) (Bool, Integer, State)
type RuleBook = Map.Map State Rule

step :: State -> Pos -> Tape -> RuleBook -> (State, Pos, Tape) 

Here, step receives a state, a position, a tape and a book of rules and returns tuple containing the next state, position and tape. Being a pure function, we can also count on the fact that no funny business will take place when the function is invoked, since nothing changes in the world: stuff in, some other stuff out. No side effects, no IO, no mutation - that's it.


I've struggled a little bit with debugging in the beginning. printing stuff, purity and laziness have a non-trivial interplay. Let's say we want to inspect the value of the next let-bind inside the function inc:

 inc :: Int -> Int
inc n =
  let next = (n + 1)
    in next 

In principle, all I wanted was to print the value of next. So maybe something like this might work:

 inc :: Int -> Int
inc n =
  let next     = (n + 1)
      whatever = print $ "The value of next is: " ++ show next
    in next 

This is valid code, but nothing is printed. The reason is that the result whatever is never used, so the call to print is never evaluated. Forcing its eager evaluation would also defeat the purity of this function, since it would talk to the external world.

There are some more powerful debugging options, but in my case it sufficed to use Debug.Trace.trace. Under the hood, the trace function uses the unsafePerformIO, which apparently provides a escape hatch for doing IO in a sneaky way.

 inc :: Int -> Int
inc n =
  let next      = (n + 1)
      !whatever = trace ("The value of next is: " ++ show next) 1
    in next 

Note that I needed to add an exclamation point before the bind so that the call to trace is evaluated strictly. Otherwise we'd still have problems with the laziness.

Functionally handling state

On multiple occasions, it was easier for me to visualize a solution in the form of a variation of a python snippet like this:

def solve(state, n_iterations):
    accumulator = 0
    for i in range(n_iterations):
      accumulator += f(state)
      state = compute_next_state(state)
    return accumulator

This includes mutations in the state and accumulator variables. Do do away with the mutations, the recipe I applied over and over was to make the function solve recursive, in a way it calls itself again, with new arguments, instead of mutating any variables:

def solve2(state, n_iterations):
    if n_iterations == 0:
        return 0
      new_state = compute_next_state(state)
      return f(state) + solve2(new_state, n_iterations - 1)

If the number of iterations is large, this approach can lead to a stack overflow, since the caller cannot return until all recursive calls have finished. In languages that implement tail call optimization (CPython does not, and probably will not), we can solve this issue by making solve2 tail recursive:

def solve3(state, n_iterations, accumulator=0):
    if n_iterations == 0:
        return accumulator
      new_state = compute_next_state(state)
    return solve3(new_state, n_iterations - 1, accumulator + f(state))

Or, in haskell:

solve :: String -> Integer -> Integer -> Integer
solve _ 0 accumulator = accumulator
solve state iterations accumulator =
  solve (nextState state) (iterations-1) (accumulator + f(state))

I used this recipe extensively, along with reducing lists with foldl (for simple expressions) to handle loops and mutation. Haskell, in turn, does more than tail call optimization, but there's a small caveat: arguments are not evaluated until needed.

Lazy argument evaluation and stack overflows

Let's take a look at this function:

count :: Integer -> Integer -> Integer
count 0     acc = acc
count times acc = count (times-1) (acc+1)

This is a tail recursive function that returns the second parameter acc when the first parameter is egual to 0. This is a simplification of the function solve above. At first, I was surprised to see that this function does fail with a stack overflow if the argument times is large.

After a while, I found out that the reason has to do with how haskell evaluates the arguments of the recursive call. In order to delay the evaluation of the arguments, haskell wraps the expression such as (acc+1) in a thunk.

In other words, the recursive call does not receive the evaluation of the argument, but a computation that tells it how to obtain it. So, in a sense, we're still carrying the context of the caller function deep into the recursive calls, which causes the stack to grow and overflow. I was more than happy to remember that I have seen this before.

The fix for this is to ask haskell to evaluate the argument strictly, using the obviously semantically named $! function:

count :: Integer -> Integer -> Integer
count 0     acc = acc
count times acc = count (times-1) $! (acc+1)

I had to use this trick in a couple of problems, like day 5. The usage in the solution is here.

Infinite lists

Laziness make it easy to work with infinite lists. I remember having my mind blown when i first saw this piece of code:

ones :: [Integer]
ones = 1:ones

To be fair, it's not difficult to do something analogous in python land, altough it might be harder to architect a non-trivial program around it:

def ones():
  while True:
    yield 1

I was also happy to recall this stream processing programming paradigm. I used this trick in day 8, in which we were asked to simulate a cpu:

generateStates :: State -> [Instruction] -> [State]
generateStates _ [] = []
generateStates state (instr:restInstr) =
  state : generateStates (execInstr state instr) restInstr

The function generateStates receives a state (current state of the registers), a list of instructions and returns a stream of states, each resulting from applying the current instruction to the previous. If we were only interested in the last state, we could also get away with a foldl (reduce) construct. It gets more interesting if we engineer our program so that functions can act on streams and produce new streams too.

Parsing input

Almost every challange required some form of input parsing. Some were very straight forward, like a text file with one value per line. For these, using the read function was all it took.

For some inputs, I needed to split a string in a specific delimiter. For instance:

"foo,bar,baz" => ["foo", "bar", "baz"]

I was a little disappointed by the fact that it seems like there is no easy, built-in way of doing this. For some problems, I implemented my own splitting function, but then I installed the split library that provides the Data.List.Split module.

The real surprise came when I turned to regexes for parsing some inputs. I was (and am) under the impression that the regex environment in haskell is... overwhelming. There isn't a straight forward choice and I spent a few hours looking into possible packages and understanding which one would better attend my use case. I'm still not sure.

In day 25, we had to parse some structured rules like this:

 In state A:
  If the current value is 0:
    - Write the value 1.
    - Move one slot to the right.
    - Continue with state B.
  If the current value is 1:
    - Write the value 0.
    - Move one slot to the left.
    - Continue with state B.

I ended up using Text.Regex.Posix, and the Perl-like =~ operator:

parseRule :: [String] -> Rule
parseRule ls =
  let s   = unlines ls
      res = (filter (/= '\n') s) =~ patt :: [[String]]
      (state:_:w1:m1:s1:_:w2:m2:s2:_) = tail $ head res
    in Rule state ((w1 == "1"), (if m1 == "right" then 1 else -1), s1)
                  ((w2 == "1"), (if m2 == "right" then 1 else -1), s2)


Let me know what you think of this post. Emails are always welcome!