Thursday, January 23, 2014

Functional Programming 101 - With Haskell

In this blog post, I'll attempt to explain some basic concepts of Functional Programming, using Haskell. This blog post isn't about Haskell per-se, but about one way of approaching this problem, which demonstrates the benefits of functional programming.

You can run most of these examples in ghci, by saving the contents of the example into a .hs file, loading up ghci and running :load file.hs.

Many thanks to Mattox Beckman for coming up with the programming exercise, and Junjie Ying for coming finding a better data structure for this explanation than I came up with.

The Problem

You are Hercules, about to fight the dreaded Hydra. The Hydra has 9 heads. When a head is chopped off, it spawns 8 more heads. When one of these 8 heads is cut off, each one spawns out 7 more heads. Chopping one of these spawns 6 more heads, and so on until the weakest head of the hydra will not spawn out any more heads.

Our job is to figure out how many chops Hercules needs to make in order to kill all heads of the Hydra. And no, it's not n!.

Prelude: Simple Overview Of Haskell Syntax

Before we dive into code, i'll explain a few constructs which are unique to Haskell, so it's easy for non Haskellers.

  • List Creation: You can create a list / array using the : operator. This can even be done lazily to get an infinite list.
  • Defining Function: Looks just like defining a variable, but it takes parameters. One way they are different from other languages is the ability to do pattern matching to simplify your code. Here, I define a method that sums all the elements of a list.
  • More List Foo: Adding lists can be done with ++. Checking if a list is empty can be done with null. You can use replicate to create a list with the same elements over and over.

Choosing a data structure

Let's choose a simple data structure to represent the hydra. We'll pick an array to represent the heads of the Hydra, using the level of each head as the value. The initial state of the Hydra (with 9 level 9 heads) can be represented as follows: [9, 9, 9, 9, 9, 9, 9, 9, 9].

Chopping off a head

The whole point of functional programming is to build small functions and compose them later. We'll build a few functions, specific to our domain, and a few more general ones to orchestrate.

Let's first build a specific function to chop off the Hydra's head. We know that chopping off one level 9 head should result in 8 level 8 heads (and 8 of the original level 9 heads). This is represented as [8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9]

Let's build the chop function. It takes a single argument, and the current levels of the all live heads. It will return the state of the heads after chopping the first one.

The three lines of code below map to these rules:

  • If there are no heads left, just return []
  • If we find that there is a level 1 head at the start of our list, just chop it off and return the rest of the array
  • If we find that there is a higher level head at the start of our list, spawn n - 1 heads in it's place

chop []       = []
chop (1:xs)   = xs
chop (n:xs)   = (replicate (n - 1) (n - 1)) ++ xs
----------------------------------------------------
chop [1]
-- []
chop [4]
-- [3, 3, 3]
chop [3, 3, 3]
-- [2, 2, 3, 3]
chop [9,9,9,9,9,9,9,9,9]
-- [8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9]

Repeatedly chopping heads

Our function chop is a pure function as, given some input, it always returns the same output, without any sort of side effects. Side effects is a general term for modifying inputs / IO Operations / DB Calls, and so on.

Since chop is pure function, we can safely call it over and over. Let's build a list where each element is the result of chopping the previous element.

repeatedlyChop heads = heads:repeatedlyChop (chop heads)
----------------------------------------------------------
repeatedlyChop [3]
-- [[3],[2,2],[1,2],[2],[1], [], [], [] ...]
-- this is an infinite list

This paradigm is so common, that we have a functional construct that does this: iterate. We can replace the above code with the following:

repeatedlyChop heads = iterate chop heads

Truncate that infinite list

Great, we now have built a list of all the states the Hydra is in while Hercules is busy chopping away at it. However, this list goes on forever (we never put in a termination condition in the earlier code), so let's do that now.

We can use a simple empty check (null) to test if the hydra is still alive. Let's keep items as long as the Hydra is alive

takeWhileAlive (x:xs) = if null x then [] else x:(takeWhileAlive xs)

Putting the two together

repeatedlyChopTillDead heads = takeWhileAlive (repeatedlyChopTillDead heads)
----------------------------------------------------------------------------
repeatedlyChopTillDead [4]
-- [[4],[3,3,3],[2,2,3,3],[1,2,3,3],[2,3,3],[1,3,3],[3,3],[2,2,3],[1,2,3],[2,3],[1,3],[3],[2,2],[1,2],[2],[1]]

Again, these patterns are so common, that we can replace the entire thing with a single line. takeWhile keeps things in the list until the first element that doesn't match.

repeatedlyChopTillDead heads = takeWhile (not.null) (iterate chop heads)

Finishing up

Now that we have the sequence of chops needed to kill that Hydra, figuring out the number of chops is just a matter of figuring out how long the sequence is.

countOfChops heads = length (repeatedlyChopTillDead heads)
--------------------------------------------------
countOfChops [1] -- 1
countOfChops [3] -- 5
countOfChops [9,9,9,9,9,9,9,9,9] -- 986409 (this takes a while)

Extending

Now that we've solved the problem, what next? How easy is it to extend this? Let's add a new requirement: Hercules, though a half god, can only fight at most n Hydra heads at a time. If the number of Hydra heads goes above n, then hercules dies. Let's make a function willHerculesDie which takes two parameters, n and the Hydra.

Turns out, this is pretty simple. We just need to count all the heads that are alive. If the count is more than n at any point, then we return true, and Hercules dies.

willHerculesDie n heads = any (> n) (map length (repeatedlyChopTillDead heads))
----------------------------------------------------------------------------
willHerculesDie 37 [9,9,9,9,9,9,9,9,9] -- False
willHerculesDie 36 [9,9,9,9,9,9,9,9,9] -- True

So what next?

We've built a bunch of really composable functions, and we can look at each one independently to tune the system.

You can get Haskell set up with the Haskell Platform and play around with the code here.

Some great books you can check out:


If you liked this post, you could:

upvote it on Hacker News
or just leave a comment

13 comments:

  1. Do you really need an infinite list here? It seems that a simple accumulator would be more appropriate (I am not familiar with Haskell, so the below is Scheme):

    ;; hydra
    ;; creates a Hydra with n heads
    (define (hydra n)
    (make-list n n)
    )

    ;; chop
    ;; chops off a single Hydra head with value X
    ;; head X is replaced with X - 1 heads of value X - 1
    (define (chop h)
    (cond
    ((null? h) '())
    ((= 1 (car h)) (cdr h))
    (else (append (hydra (- (car h) 1)) (cdr h))))
    )

    ;; count-chops
    ;; count the number of chops it takes to kill a Hydra with N heads
    (define (count-chops heads)
    (let loop ((h heads)
    (acc 0))
    (if (null? h)
    acc
    (loop (chop h) (+ 1 acc)))))

    ;; computes 986409 in 1.57 seconds

    ReplyDelete
    Replies
    1. Good Question :-). There are multiple ways of doing it. The solution you posted is the tail recursive version of the solution, which corresponds to the following haskell code (reusing the chop method from above)

      countToSlay heads = chopsCount 0 heads
      where chopsCount acc [] = acc
      chopsCount acc heads = chopsCount (acc + 1) (chop heads)

      Most functional programming languages, the performance of the two solutions should be similar. Uncompiled, the infinite list solution takes 1.01 seconds and the tail recursive one takes 1.63 seconds on my laptop. Compiled, they take 0.090s and 0.218 respectively.

      The reason for this is because lists are lazy, and calculated just in time. Functional programming languages are incredibly good at managing memory and space for this. In fact, since `length' is the eventual consumer of the infinite list, I suspect that there would only be a few objects that actually exist at any given point, the rest would be marked for GC.

      As far as intuitiveness goes, I think there are merits for both ways of doing things. Personally I prefer the iterative way of an infinite list as it's easier for me to extend on the solution (with the willHerculesDie method). The willHerculesDie method would also need to do the null check on heads, which for me is a form of duplication.

      Delete
    2. mcmillhj please see my comment below, the updated Haskell version does this in 0.003s

      I'm sure the scheme version would do comparably fast without using an intermediary list too :P

      Delete
  2. How long does your solution take for a 10 and 11 headed Hydra? Even the tail recursive version doesnt seem to perform that "well" for those cases.

    ReplyDelete
    Replies
    1. Without compiling, it takes me quite a while. Running it in GHCI takes 73.01 seconds.

      Compiling the output, I can do (countOfChops [12]) in 9.01 seconds.

      And yes, sometimes the iterative solution is in fact faster than the tail recursive one.

      The problem is not polynomial, so it's expected to grow fast with the input.

      Delete
    2. Hmm interesting. Well Scheme doesn't have the option of compiling that I know of so there is always going to be a little more run-time overhead. I am sure there are other optimizations that I didn't include. I wrote up a small thing about my solution if you interested: http://blog.hjm.im/how-many-chops-does-it-take-hercules-to-kill-an-n-headed-hydra/

      Delete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Hi, why are you using lists? I love the tutorial, but the code is very slow. With the help of #haskell I was show you can do this without lists:

    chops 1 = 1
    chops n = 1 + (n-1) * chops (n-1)

    countToSlay = sum . map chops

    main = print $ countToSlay [12]

    Also, you could use a mutable vector if you need some type of container:

    import Control.Applicative
    import Control.Monad.Primitive
    import Data.Vector.Mutable
    import Prelude hiding (read, length)

    chop :: MVector (PrimState IO) Int -> Int -> IO Int
    chop heads 0 = read heads 0
    chop heads n = do
    count <- read heads n
    lower <- read heads (pred n)
    write heads (pred n) (lower + n * count)
    (+count) <$> chop heads (pred n)

    main :: IO ()
    main = do
    heads <- new 12
    set heads 0
    write heads (pred 12) 1 -- there is 1 12-headed hydra
    print =<< chop heads (pred 12)



    https://gist.github.com/codygman/1a85d548e1fc1e1445a6#file-haskell-slays-hydras
    https://gist.github.com/jwiegley/9383231

    ReplyDelete
    Replies
    1. I hope I don't come off as rude or unappreciative of the tutorial, I just don't want people to think Haskell is slow :)

      Delete
    2. The counting solution will always be faster regardless of the language because you don't need to allocate space for data structures.

      Delete
    3. @Cody no worries, always appreciate feedback.

      I suppose the goal of this post is more of how to approach functional programming, and not necessarily to solve this particular problem (which as you mentioned, does have a quick, closed form solution).

      Concepts of immutability, mapping and reducing are what I was hoping to put forth in this blog post, hoping that it will give a taste for people to learn more.

      Delete