102

I've seen people talking about Free Monad with Interpreter, particularly in the context of data-access. What is this pattern? When might I want to use it? How does it work, and how would I implement it?

I understand (from posts such as this) that it's about separating model from data-access. How does it differ from the well-known Repository pattern? They appear to have the same motivation.

Benjamin Hodgson
  • 4,488
  • 2
  • 25
  • 34

2 Answers2

148

The actual pattern is actually significantly more general than just data access. It's a lightweight way of creating a domain-specific language that gives you an AST, and then having one or more interpreters to "execute" the AST however you like.

The free monad part is just a handy way to get an AST that you can assemble using Haskell's standard monad facilities (like do-notation) without having to write lots of custom code. This also ensures that your DSL is composable: you can define it in parts and then put the parts together in a structured way, letting you take advantage of Haskell's normal abstractions like functions.

Using a free monad gives you the structure of a composable DSL; all you have to do is specify the pieces. You just write a data type that encompasses all of the actions in your DSL. These actions could be doing anything, not just data access. However, if you specified all your data accesses as actions, you would get an AST that specifies all the queries and commands to the data store. You could then interpret this however you like: run it against a live database, run it against a mock, just log the commands for debugging or even try optimizing the queries.

Lets look at a very simple example for, say, a key value store. For now, we'll just treat both keys and values as strings, but you could add types with a bit of effort.

data DSL next = Get String (String -> next)
              | Set String String next
              | End

The next parameter lets us combine actions. We can use this to write a program that gets "foo" and sets "bar" with that value:

p1 = Get "foo" $ \ foo -> Set "bar" foo End

Unfortunately, this is not enough for a meaningful DSL. Since we used next for composition, the type of p1 is the same length as our program (ie 3 commands):

p1 :: DSL (DSL (DSL next))

In this particular example, using next like this seems a little odd, but it's important if we want our actions to have different type variables. We might want a typed get and set, for example.

Note how the next field is different for each action. This hints that we can use it to make DSL a functor:

instance Functor DSL where
  fmap f (Get name k)          = Get name (f . k)
  fmap f (Set name value next) = Set name value (f next)
  fmap f End                   = End

In fact, this is the only valid way to make it a Functor, so we can use deriving to create the instance automatically by enabling the DeriveFunctor extension.

The next step is the Free type itself. That's what we use to represent our AST structure, build on top of the DSL type. You can think of it like a list at the type level, where "cons" is just nesting a functor like DSL:

-- compare the two types:
data Free f a = Free (f (Free f a)) | Return a
data List a   = Cons a (List a)     | Nil

So we can use Free DSL next to give programs of different sizes the same types:

p2 = Free (Get "foo" $ \ foo -> Free (Set "bar" foo (Free End)))

Which has the much nicer type:

p2 :: Free DSL a

However, the actual expression with all of its constructors is still very awkward to use! This is where the monad part comes in. As the name "free monad" implies, Free is a monad—as long as f (in this case DSL) is a functor:

instance Functor f => Monad (Free f) where
  return         = Return
  Free a >>= f   = Free (fmap (>>= f) a)
  Return a >>= f = f a

Now we're getting somewhere: we can use do notation to make our DSL expressions nicer. The only question is what to put in for next? Well, the idea is to use the Free structure for composition, so we will just put Return for each next field and let the do-notation do all the plumbing:

p3 = do foo <- Free (Get "foo" Return)
        Free (Set "bar" foo (Return ()))
        Free End

This is better, but it's still a bit awkward. We have Free and Return all over the place. Happily, there's a pattern we can exploit: the way we "lift" a DSL action into Free is always the same—we wrap it in Free and apply Return for next:

liftFree :: Functor f => f a -> Free f a
liftFree action = Free (fmap Return action)

Now, using this, we can write nice versions of each of our commands and have a full DSL:

get key       = liftFree (Get key id)
set key value = liftFree (Set key value ())
end           = liftFree End

Using this, here's how we can write our program:

p4 :: Free DSL a
p4 = do foo <- get "foo"
        set "bar" foo
        end

The neat trick is that while p4 looks just like a little imperative program, it's actually an expression that has the value

Free (Get "foo" $ \ foo -> Free (Set "bar" foo (Free End)))

So, the free monad part of the pattern has gotten us a DSL that produces syntax trees with nice syntax. We can also write composable sub-trees by not using End; for example, we could have follow which takes a key, gets its value and then uses that as a key itself:

follow :: String -> Free DSL String
follow key = do key' <- get key
                get key'

Now follow can be used in our programs just like get or set:

p5 = do foo <- follow "foo"
        set "bar" foo
        end

So we get some nice composition and abstraction for our DSL as well.

Now that we have a tree, we get to the second half of the pattern: the interpreter. We can interpret the tree however we like just by pattern-matching on it. This would let us write code against a real data store in IO, as well as other things. Here's an example against a hypothetical data store:

runIO :: Free DSL a -> IO ()
runIO (Free (Get key k)) =
  do res <- getKey key
     runIO $ k res
runIO (Free (Set key value next)) =
  do setKey key value
     runIO next
runIO (Free End) = close
runIO (Return _) = return ()

This will happily evaluate any DSL fragment, even one that isn't ended with end. Happily, we can make a "safe" version of the function that only accepts programs closed with end by setting the input type signature to (forall a. Free DSL a) -> IO (). While the old signature accepts a Free DSL a for any a (like Free DSL String, Free DSL Int and so on), this version only accepts a Free DSL a that works for every possible a—which we can only create with end. This guarantees we won't forget to close the connection when we're done.

safeRunIO :: (forall a. Free DSL a) -> IO ()
safeRunIO = runIO

(We can't just start by giving runIO this type because it won't work properly for our recursive call. However, we could move the definition of runIO into a where block in safeRunIO and get the same effect without exposing both versions of the function.)

Running our code in IO is not the only thing we could do. For testing, we might want to run it against a pure State Map instead. Writing out that code is a good exercise.

So this is the free monad + interpreter pattern. We make a DSL, taking advantage of the free monad structure to do all the plumbing. We can use do-notation and the standard monad functions with our DSL. Then, to actually use it, we have to interpret it somehow; since the tree is ultimately just a data structure, we can interpret it however we like for different purposes.

When we use this to manage accesses to an external data store, it is indeed similar to the Repository pattern. It intermediates between our data store and our code, separating the two out. In some ways, though, it's more specific: the "repository" is always a DSL with an explicit AST which we can then use however we like.

However, the pattern itself is more general than that. It can be used for lots of things which do not necessarily involve external databases or storage. It makes sense wherever you want fine control of effects or multiple targets for a DSL.

Tikhon Jelvis
  • 5,206
  • 1
  • 24
  • 20
  • 8
    Why's it called a 'free' monad? – Benjamin Hodgson Jun 03 '14 at 20:13
  • 16
    The "free" name comes from category theory: http://ncatlab.org/nlab/show/free+object but it sort of means that it is "minimal" monad -- that only valid operations on it are the monad operations, as it has "forgotten" all it's other structure. – Boyd Stephen Smith Jr. Jun 03 '14 at 21:04
  • 3
    @BenjaminHodgson: Boyd is completely right. I wouldn't worry about it too much unless you're just curious. Dan Piponi gave a [great talk](http://www.youtube.com/watch?v=OGUuGL0AgYs&t=5m00s) about what "free" means at BayHac, which is worth a look. Try following along with his [slides](http://www.youtube.com/watch?v=OGUuGL0AgYs&t=5m00s) because the visual in the video is completely useless. – Tikhon Jelvis Jun 03 '14 at 21:56
  • 3
    A nitpick: "The free monad part is *just* [my emphasis] a handy way to get an AST that you can assemble using Haskell's standard monad facilities (like do-notation) without having to write lots of custom code." It's more than "just" that (as I'm certain you know). Free monads are also a normalized program representation that makes it impossible for the interpreter to distinguish between programs whose `do`-notation is different but actually "mean the same." – sacundim Jun 04 '14 at 00:45
  • 5
    @sacundim: Could you elaborate on your comment? Especially the sentence 'Free monads are also a normalized program representation that makes it impossible for the interpreter to distinguish between programs whose do-notation is different but actually "mean the same."'. – Giorgio Jan 02 '15 at 22:17
  • 5
    Just re-read this answer and I have a question: what does `Free` give you that you can't get from using an explicit recursive datatype? My crude understanding is that `Free` separates the recursive structure of the type from the type itself (a bit like `Fix`); it sort-of fills in the `next` type parameter with `Free DSL`. Is there some advantage to this technique that we miss out on by omitting that type parameter and writing the `Monad` instance ourselves? Something like `data DSL a = Get String (String -> (DSL a)) | Set String String (DSL a) | Return a` – Benjamin Hodgson Apr 06 '15 at 19:24
  • I haven't tried cranking the handle on that data definition so it might turn out that it's impossible and I've asked a stupid question. Feel free to shoot me down if so ;) – Benjamin Hodgson Apr 06 '15 at 19:28
  • 1
    @BenjaminHodgson: It makes the pattern more explicit, allows us to write reusable code and makes reasoning about it simpler. (In a sense, we can reuse reasoning just like we can reuse code!) Your intuition that it's just a matter of factoring the recursion out is correct, and it's useful just like any other sort of factoring with the added benefit of being mathematically grounded. The [free](https://hackage.haskell.org/package/free-4.11/docs/Control-Monad-Free.html) package has some useful functions, for example. – Tikhon Jelvis Apr 06 '15 at 21:07
  • This is a great answer. However, I noticed that `runIO` appears to be a partial function. It doesn't handle `Return` and should be amended to do so. – Andrew Thaddeus Martin May 05 '15 at 20:30
  • @AndrewThaddeusMartin: Ah, fair point. I'll add that case in. – Tikhon Jelvis May 05 '15 at 23:20
  • @AndrewThaddeusMartin: Actually, thinking about it, I'm not sure a `Return` case makes sense. The only way to produce a `Free DSL a` with `Return` is `Return undefined`, so it would be partial no matter what. I'd have to think a bit more about what that case should actually be. – Tikhon Jelvis May 05 '15 at 23:23
  • @TikhonJelvis: Yeah, I think you're right. I don't normally have an "end" piece in the `Functor`s I use for free monads, so I didn't really consider that `Return` sort of becomes unreachable. Maybe, just to make it more clear for future readers, you could add something like:`runIO (Return _) = error "Should be unreachable"`. – Andrew Thaddeus Martin May 06 '15 at 12:27
  • @AndrewThaddeusMartin: Yeah, that sounds like the best solution. – Tikhon Jelvis May 06 '15 at 17:08
  • Shouldn't `Free DSL a -> IO ()` be `(forall a. Free DSL a) -> IO ()` to force usage of `end`? – porges Aug 18 '15 at 05:22
  • @Porges: Good eye! I'll fix my code snippet, thanks. – Tikhon Jelvis Aug 18 '15 at 17:03
  • `Note how the input is a Free DSL a, for any type a. The only way to produce that with out particular DSL is by ending the expression with End—which guarantees that we can't forget to close the connection once we're done.` Not true as shown above. In this form the caller can pick any type for a. – massysett May 20 '15 at 22:31
  • Is this supposed to work under GHC 7.10? What are the extensions required? I don't seem to be able to work my way thru getting the type checker to accept it: just `Free DSL a -> IO ()` works fine but not `(forall a. Free DSL a) -> IO ()` with `ExistentialQuantification` and `RankNTypes`/`Rank2Types`). – Erik Kaplun Nov 10 '15 at 16:03
  • @ErikAllik: Ah, I think there was a mistake in my code with the `(forall a. Free DSL a)` argument. Either I only want it for the outside of the `runIO` function (with a helper function of type `Free DSL a -> IO ()`) or I might be able to make something similar work with `ImpredicativeTypes` and `Free DSL (forall a. a)`. Since `ImpredicativeTypes` is poorly supported and people recommend not to use it, I'll change the code using my first approach. – Tikhon Jelvis Nov 10 '15 at 20:08
  • @ErikAllik: I fixed up the code and tried to explain what's going on, but I think I just made it more confusing :/. – Tikhon Jelvis Nov 10 '15 at 20:18
  • 2
    I believe you could even use `foldFree` from `Control.Monad.Free` to implement `runIO`. Such an implementation would not require explicit recursion but rather you only need to specify the implementation for the individual constructors. – michid Apr 18 '20 at 13:32
15

A free monad is basically a monad that builds a data structure in the same "shape" as the computation rather than doing anything more complicated. (There are examples to be found online.) This data structure is then passed to a piece of code which consumes it and carries out the operations.* I am not entirely familiar with the repository pattern, but from what I've read it appears to be a higher level architecture, and a free monad + interpreter could be used to implement it. On the other hand, the free monad + interpreter could also be used to implement entirely different things, such as parsers.

*It is worth noting that this pattern is not exclusive to monads, and in fact can produce more efficient code with free applicatives, or free arrows. (Parsers are another example of this.)

Dan
  • 1,508
  • 2
  • 10
  • 12
  • Apologies, I should've been clearer about Repository. (I forgot that not everyone has a business systems/OO/DDD background!) A Repository basically encapsulates data access and rehydrates domain objects for you. It's often used alongside Dependency Inversion - you can 'plug in' different implementations of the Repo (useful for testing, or if you need to switch database or ORM). Domain code just calls `repository.Get()` with no knowledge of _where_ it's getting the domain object from. – Benjamin Hodgson Jun 02 '14 at 21:24