Critique of the IO monad being viewed as a state monad operating on the world

Question

The IO monad in Haskell is often explained as a state monad where the state is the world. So a value of type IO a monad is viewed as something like worldState -> (a, worldState).

Some time ago I read an article (or a blog/mailing list post) that criticized this view and gave several reasons why it's not correct. But I cannot remember neither the article nor the reasons. Anybody knows?

Edit: The article seems lost, so let's start gathering various arguments here. I'm starting a bounty to make things more interesting.

Edit: The article I was looking for is Tackling the awkward squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell by Simon Peyton Jones. (Thanks to TacTics's answer.)

Is it [this article](http://kawagner.blogspot.co.at/2007/02/why-monads-are-evil.html) (or this [older version of it](http://kawagner.blogspot.co.at/2007/01/real-functional-programming-or-why-io.html))? — Joachim Sauer, Aug 20 '12 at 07:45
@JoachimSauer Thanks, it's also an interesting article, but it's not the one I'm looking for. That one was focused on the state-of-the-world paradigm. — Petr, Aug 20 '12 at 08:11
Off the top of my head, the comments [here](http://blog.ezyang.com/2011/05/unraveling-the-mystery-of-the-io-monad/) are a good start — Adam, Aug 20 '12 at 18:55
What does "world" mean in this context? I presume it does not mean "The Earth." Is it some kind of global scope? The author who penned this is selling himself short. If he wants to simultaneously confuse and crush the egos of his readers, he should call it "The State is the Universe" or "The God State." World. Pah! You young people don't aspire high enough these days! — GlenPeterson, Sep 04 '12 at 16:37
Also informative: http://stackoverflow.com/a/2754244/1333025 — Petr, Sep 04 '12 at 20:11

score 34 · Accepted Answer · answered Sep 05 '12 at 19:51

34

The problem with IO a = worldState -> (a, worldState) is that if this were true then we could prove that forever (putStrLn "Hello") :: IO a and undefined :: IO a are equal. Here is the proof courtesy of dolio (2010, irc):

forever m
 =
m >> forever m
 =
fix (\r -> m >> r)
 = {definition of >> for worldState -> (a, worldState)}
fix (\r -> \w -> r (snd $ m w))

Lemma: (\r w -> r (snd $ m w)) ⊥ = ⊥

(\r w -> r (snd $ m w)) ⊥
  =
\w -> ⊥ (snd $ m w))
  =
⊥ . snd . m
  =
⊥

Therefore forever m = fix (\r -> \w -> r (snd $ m w)) = ⊥

In particular forever (putStrLn "Hello") = ⊥ and hence forever (putStrLn "Hello") and undefined are equivalent programs. However, clearly they are not supposed to be considered equivalent programs, in theory or in practice.

Notice that this model is wrong even without invoking concurrency.

answered Sep 05 '12 at 19:51

Russell O'Connor

506
3
2

8

Is anyone surprised that a nonterminating program is equivalent to `undefined` in the pure semantics of Haskell? Different ⊥s are supposed to be indistinguishable in Haskell's pure semantics! But when we think *operationally* about our programs we want to distinguish different kinds of ⊥ too, even when `IO` is not involved; **I** care whether my program is throwing an exception or entering an infinite loop, even if you can prove that those are equal by proving that they're both ⊥. That's not actually a contradiction though. – Ben Jan 24 '13 at 20:40
4

The denotations of ⊥ and [0,1..] are distinct even though they are both "non-terminating". The difference is that ⊥ is denotes non-terminating *and* non-productive computations, while [0,1..] is non-terminating but productive. We expect (forever (putStrLn "Hello")) to have a similar non-terminating but productive denotation. – Russell O'Connor Feb 09 '14 at 17:20
2

But `forever (putStrLn "Hello")` isn't like `[0,1..]`, surely. Your proof isn't particular to `worldState`, therefore it also applies to the regular state monad. So `forever (someModificationWith "Hello")` is also denotationally equivalent to ⊥. I'm completely unsurprised by that result; it *isn't* productive in the denotational semantics, and what the computer is doing *operationally* while we wait forever is irrelevant. Same thing for `forever (putStrLn "Hello")`; it doesn't **and shouldn't** produce a new world state we can somehow consume lazily. – Ben Feb 09 '14 at 23:31
Are programming languages like Mercury and Clean that use *explicit* world-state passing to provide a declarative model for IO fundamentally wrong? – Ben Feb 09 '14 at 23:35
@Ben are you referring to how does world passing work with concurrency? Have you seen the rosetta code for Mercury's concurrency? I have wondered what that means semantically too. – CMCDragonkai Feb 20 '15 at 04:27
@Ben - The whole point of the IO monad is that the output (and input) of a program *is* part of the value of the program, as far as the language is concerned, not just the result! The denotational semantics is supposed to capture everything you need to know to predict the behavior of the system (on an infinitely fast computer). And, yes, that means that Mercury and Clean *aren't* actually purely functional, since it's impossible to describe I/O in those languages without re-introducing side effects. – Jonathan Cast Aug 18 '17 at 12:26

Christopher Done · Answer 2 · 2012-09-04T15:11:55.387

12

Here's a trivial answer: any change to the state monad's state is due to any actions ran in the monad. If indeed the “WorldState -> (a, WorldState)” explanation claims the same property, with WorldState being a pure value that only the IO monad changes, it's wrong. Time changes, the content of files, the state of handles, etc. can change independently of what happens in the IO monad. That's the point of the IO monad. The fact that GHC passes around a RealWorld value (or w/e it was) underneath is to guarantee things are ran in order, as far as I know, if that (may just be something to put in the ST value).

edited Sep 04 '12 at 15:11

answered Sep 04 '12 at 14:43

Christopher Done

229
1
4

8

that's actually not a problem. you can model the bind operation as performing a modification to the world state derived from some fixed-but-unknowable rule store. – sclv Sep 04 '12 at 15:51
1

@sclv: yes, but this fixed-but-unknowable rule store is the differentiating factor that makes the IO *not* the state monad, this inconsistency is not found in the state monad – Jimmy Hoffa Sep 04 '12 at 16:43
An argument I've heard against the WorldState state is related to concurrency, though I cannot remember the exact argument. But even still, I hypothesize that WorldState could encode the future into it as well, so I still don't really see the problem. Of course, I suppose I'm missing something. – Thomas Eding Sep 04 '12 at 17:07
@JimmyHoffa: You can carry around the rule store in state though. – sclv Sep 04 '12 at 18:22
@sclv: Which would make state similar to IO if that was automatically a piece of state. State can in your way mimic the IO monad, but this behavior is the default and only reasonable implementation of the IO monad, not true of the state monad. Moreover, in your example how unknowable is the rule store? Arguably in the case of the IO monad, there is an amount of it being truly unknowable, where as in your example the uknowable rule store could actually be known to the developer or through reverse engineering. No such ability is present in the case of IO. – Jimmy Hoffa Sep 04 '12 at 19:03
Well, this makes look the IO monad looking like a measurement/interaction in quantum physics. The world is a unknown state and only what you're directly interacting with gets a "real" value. – datenwolf Sep 04 '12 at 21:00
@datenwolf: yep. even in the cps style implementation of IO there's a notion in which evaluation is "collapsing the waveform." This actually relates deeply to models of concurrency that use symmetry to reduce the state space. – sclv Sep 05 '12 at 14:41
1

@JimmyHoffa: this is the purpose of abstraction. Also, to follow up on my initial comment, Clean models IO as world-passing explicitly and happily, using uniqueness types to ensure that you don't cheat and "duplicate" the world. This is one way of enforcing the abstraction. – sclv Sep 05 '12 at 14:42
That's not a problem at all. The **type** `WorldState -> (a, WorldState)` only claims to accept a `WorldState` and give you back another one. It "does something" to the world. In the pure *model* of IO, that includes clocks advancing, stars decaying, network packets arriving, etc. Operationally of course it *implements* that by just observing how the real world actually changes. – Ben Jan 24 '13 at 20:25
But the interface `WorldState -> (a, WorldState)` is deliberately abstract, because the whole point is to model something that can't actually be directly expressed in Haskell, and it never breaks its contract. Mercury also uses *explicit* world-passing. The monad bit of Haskell's IO interface is only needed to ensure you don't reuse the world; understanding it as a State-monad-like abstraction for world-passing *does* work, because world-passing *does* work **directly**. – Ben Jan 24 '13 at 20:27
One thing I haven't understood is how does unique world passing methods work with concurrency? If I'm spinning up multiple concurrent threads which may concurrently modify the world, how does the world passing ensure uniqueness? I saw an example of concurrency on Mercury, and all it did was pass IO_in and immediately take the IO_out and pass it to the next thread constructor. How can they do this without "duplicating" the world? – CMCDragonkai Feb 20 '15 at 04:25
@CMCDragonkai The implementation of the `io` passing is just a no-op. But the *model* that makes you able to do IO with pure code says that whenever you do a "primitive IO operation", the `io` it gives you back is a world that incorporates the effects of the primitive **and arbitrary other effects**. The "other effects" come from other threads in your program, as well as totally unrelated processes going on in the real world (like stars decaying). – Ben Feb 20 '15 at 05:42
@CMCDragonkai Each thread does have its own "world", but within each thread the world must be used uniquely. They sync up when you call primitives. The pure model says *nothing at all* about exactly when each thread can observe what results from another thread's IO actions... but that's exactly how concurrency with threads works! Unless you explicitly synchronise with locks or other methods, the effects of threads are interleaved in a completely unspecified order. This is consistent with the semantics we'd have if we really could pass around representations of the entire universe. – Ben Feb 20 '15 at 05:42
So to visualise it, it's as if IO splits of into multiple threads and never comes back together unless we explicitly create some sort of synchronisation primitive? Does that kind of mean merging the independent IO values back together? (So we get an IO with the world that now incorporates the effects of all the other worlds produced by the other IOs?) How does a synchronisation work? Can you give an syntactic example, something like (sync(IO1, IO2, IO3))? – CMCDragonkai Feb 20 '15 at 06:55

score 12 · Answer 3 · answered Sep 04 '12 at 16:18

12

I wrote a blog post on the topic of how to model IO as a form of asymmetric coroutine communicating with the runtime system for your language. (It is admittedly the third part of a series)

http://comonad.com/reader/2011/free-monads-for-less-3/

That post covers a bit of why it is awkward to reason about the semantics of 'world-passing'.

answered Sep 04 '12 at 16:18

Edward Kmett

364
2
7

+1 - especially interesting, as I have long planned on implementing the IO of the language I'm designing in a similar manner to this! :) – Jules Jun 18 '16 at 11:40

score 8 · Answer 4 · edited Aug 04 '17 at 09:13

8

See Tackling the Awkward Squad.

The big reason is RealWorld state models of the IO monad don't work well with concurrency. SPJ in this readable classic favors using an operational semantics to understand it.

edited Aug 04 '17 at 09:13

MasterMastic

268
3
13

answered Sep 04 '12 at 16:25

TacTics

89
1

I believe this is the original article I was looking for, mainly section 3.1. If you had posted it before I edited the question, I'd have accepted your answer, but now I think it'll be more fair to wait until the end, to see all ideas that others will post. – Petr Sep 05 '12 at 07:27

score 5 · Answer 5 · answered Sep 04 '12 at 16:37

The main complaint about RealWorld state models is that, as TacTics says, world-passing doesn't necessarily work with concurrency. But Wouter Swierstra and Thorsten Altenkirch showed how to reason about concurrency as "world-passing" effect, with a fixed-but-arbitrary sequence of interleaving threads in their paper "Beauty in the Beast: A Functional Sematics for the Awkward Squad": http://www.staff.science.uu.nl/~swier004/Publications/BeautyInTheBeast.pdf

The code corresponding to this is on Hackage as IOSpec: http://hackage.haskell.org/package/IOSpec

I think Wouter's thesis goes into more detail: http://www.staff.science.uu.nl/~swier004/Publications/Thesis.pdf

Critique of the IO monad being viewed as a state monad operating on the world

5 Answers5