5

I am trying to get a firm grasp of exceptions, so that I can improve my conditional loop implementation. To this end, I am staging various experiments, throwing stuff and seeing what gets caught.

This one surprises me to no end:

% cat X.hs
module Main where

import Control.Exception
import Control.Applicative

main = do
    throw (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc X.hs && ./X
...
X: user error (I am an IO error.)
% cat Y.hs
module Main where

import Control.Exception
import Control.Applicative

main = do
    throwIO (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc Y.hs && ./Y
...
"Odd error ignored."

I thought that the Alternative should ignore exactly IO errors. (Not sure where I got this idea from, but I certainly could not offer a non-IO exception that would be ignored in an Alternative chain.) So I figured I can handcraft and deliver an IO error. Turns out, whether it gets ignored depends on the packaging as much as the contents: if I throw an IO error with throw rather than throwIO, it is somehow no longer an IO error.

I am completely lost. Why does it work this way? Is it intended? The definitions lead deep into the GHC internal modules; while I can more or less understand the meaning of disparate fragments of code by themselves, I am having a hard time seeing the whole picture.

Should one even use this Alternative instance if it is so difficult to predict? Would it not be better if it silenced any synchronous exception, not just some small subset of exceptions that are defined in a specific way and thrown in a specific way?

duplode
Ignat Insarov
  • The docs state that "[the two functions are subtly different](https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Exception.html#v:throwIO)". – chi Aug 07 '19 at 20:13
  • But whoever wrote that had weird ideas about the meaning of "subtly". – K. A. Buhr Aug 07 '19 at 20:16
  • It is too subtle for me, I do not get the difference. Where in my code is `seq`? – Ignat Insarov Aug 07 '19 at 20:19
  • Using the `Alternative` instance of `IO` for that kind of loop is unidiomatic. In Java, it would be like throwing an exception at the end of a for loop's body in order to advance to the next iteration. – danidiaz Aug 07 '19 at 20:37
  • @danidiaz Well then what is idiomatic with Alternative IO? – Ignat Insarov Aug 07 '19 at 20:38
  • @IgnatInsarov In my opinion, not a lot. Perhaps retries when the action we want to retry *already* throws exceptions, we don't want to inspect or analyze the exceptions thrown, and we want to exit with an exception if all the retries fail. – danidiaz Aug 07 '19 at 20:47
  • In my opinion, the really annoying thing is that we don't have separate `catch`, `catchIO`, and `catchFrom` corresponding to `throw`, `throwIO`, and `throwTo`. Imprecise, precise, and asynchronous exceptions are all quite different. The most annoying thing is that there's no reliable way to distinguish asynchronous exceptions from imprecise ones in catching, when they have important differences. In particular, the distinction affects thunk resumption, and there's no reliable way to rethrow a caught exception with the same synchronous/asynchronous status. – dfeuer Aug 07 '19 at 22:39
  • @dfeuer I am thinking `catches action [someAsyncRethrow, someHandle]` is your way to distinguish asynchronous exceptions. I cannot understand the part about thunks, and I do not understand why you say there is no reliable way to rethrow an exception with the same status. I thought whether an exception is synchronous or asynchronous is written on the box? When it matches `SomeAsyncException`, it is asynchronous, and if you rethrow it, it is still asynchronous. Am I missing something big here? – Ignat Insarov Aug 08 '19 at 06:56

3 Answers

6

throw is a generalization of undefined and error: it is meant to throw an exception in pure code. When the value of the exception does not matter (which is most of the time), such an expression is denoted by the symbol ⟘, for an "undefined value".

throwIO is an IO action which throws an exception, but is not itself an undefined value.

The documentation of throwIO thus illustrates the difference:

throw e   `seq` x  ===> throw e
throwIO e `seq` x  ===> x

The catch is that (<|>) for IO is defined as mplusIO, which uses catchException, a strict variant of catch: it forces its action argument before the handler is installed. That strictness is summarized as follows:

⟘ <|> x = ⟘

hence you get an exception (and x is never run) in the throw variant.
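To make the strictness rule concrete, here is a minimal sketch that rebuilds the two behaviours by hand. The names orElseStrict, orElseLazy and compareVariants are hypothetical and are not part of base; orElseStrict only imitates the "force the action first" behaviour described above, it is not the library's actual definition.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (IOException, catch, throw, try)

-- Hypothetical stand-in for the strict behaviour: the left action is
-- forced to WHNF *before* the handler is installed, so ⟘ escapes.
orElseStrict :: IO a -> IO a -> IO a
orElseStrict m n = m `seq` (m `catch` \(_ :: IOException) -> n)

-- Hypothetical lazy variant: the left action is only forced while
-- running inside catch, so the handler gets to see the exception.
orElseLazy :: IO a -> IO a -> IO a
orElseLazy m n = m `catch` \(_ :: IOException) -> n

-- Feed both variants a left operand built with the pure throw.
compareVariants :: IO (String, Either IOException String)
compareVariants = do
  viaLazy   <- orElseLazy (throw (userError "boom")) (pure "fallback")
  viaStrict <- try (orElseStrict (throw (userError "boom")) (pure "fallback"))
  pure (viaLazy, viaStrict)
```

Running compareVariants should give "fallback" for the lazy variant and a Left carrying the user error for the strict one, which mirrors the X.hs result from the question.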

Note that, without strictness, an "undefined action" (i.e., throw ... :: IO a) actually behaves like an action that throws from the point of view of catch:

-- (the type annotations in these handlers require the ScopedTypeVariables extension)
catch (throw   (userError "oops")) (\(e :: SomeException) -> putStrLn "caught")  -- caught
catch (throwIO (userError "oops")) (\(e :: SomeException) -> putStrLn "caught")  -- caught
catch (pure    (error     "oops")) (\(e :: SomeException) -> putStrLn "caught")  -- not caught
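The last line is not caught because pure never forces its argument: the error is still an unevaluated thunk when catch returns. A small sketch of the contrast with evaluate, which forces the value inside the action, follows. The helper name handlerFired (and viaPure, viaEvaluate) is made up for illustration.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, catch, evaluate)

-- Hypothetical helper: True if the handler ran, False if the action
-- finished without catch ever seeing an exception.
handlerFired :: IO a -> IO Bool
handlerFired action =
  (action >> pure False) `catch` \(_ :: SomeException) -> pure True

-- pure just wraps the thunk; nothing forces it, so nothing is thrown.
viaPure :: IO Bool
viaPure = handlerFired (pure (error "oops") :: IO ())

-- evaluate forces the value to WHNF as part of running the action,
-- so the exception is raised while the handler is in scope.
viaEvaluate :: IO Bool
viaEvaluate = handlerFired (evaluate (error "oops") :: IO ())
```

Here viaPure should return False and viaEvaluate should return True.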
Li-yao Xia
  • 31,896
  • 2
  • 33
  • 56
  • @IgnatInsarov The documentation for `throwIO` states that it should be preferred to `throw` when working in an `IO` context. – danidiaz Aug 07 '19 at 20:43
  • If anything, `throw` is the abomination. If you have exceptions, throwing them is a perfectly fine action to do; that's `throwIO`. – Li-yao Xia Aug 07 '19 at 20:45
  • @danidiaz Does it per chance also explain why? I have kinda read what it says on the box, — maybe twice, — but I am still not seeing the picture. – Ignat Insarov Aug 07 '19 at 20:45
  • @Li-yaoXia But I can catch `throw`n exceptions just as well. We have exceptions occurring in pure computations anyway, have we not? Like `ArithException` for instance? – Ignat Insarov Aug 07 '19 at 20:48
  • But we also can't really get rid of `throw` (or `error` or `undefined`) either. In a non-strict language with general recursion (Haskell), you will have "undefined values" at least because of nontermination, so you might as well keep the ability of throwing exceptions instead of going into an infinite loop and make the best out of it. – Li-yao Xia Aug 07 '19 at 20:48
  • `ArithException` isn't really an exception; it's an ordinary data type that gets used in a pure *simulation* of exceptions. `throw`, on the other hand, raises a real "catch it or the runtime gives up" exception (which are called IO exceptions only because they only get generated during the execution of an IO action, not because they are values some `IO` type). – chepner Aug 07 '19 at 20:51
  • What do you mean @chepner? You can `throw (Overflow :: ArithException)`. – Li-yao Xia Aug 07 '19 at 20:52
  • @Li-yaoXia I think of it as `throw` *generating* an exception using information carried by `Overflow`, rather than `Overflow` *being* the actual exception. – chepner Aug 07 '19 at 20:56
  • There's also an `IOException` type that's distinct from `ArithException`, so I wouldn't take the denomination "IO exception" so lightly. – Li-yao Xia Aug 07 '19 at 20:59
5

Say you have

x :: Integer

That means that x should be an integer, of course.

x = throw _whatever

What does that mean? It means that there was supposed to be an Integer, but instead there’s just a mistake.

Now consider

x :: IO ()

That means x should be an I/O-performing program that returns no useful value. Remember, IO values are just values. They are values that just happen to represent imperative programs. So now consider

x = throw _whatever

That means that there was supposed to be an I/O-performing program there, but there is instead just a mistake. x is not a program that throws an error—there is no program. Regardless of whether you’ve used an IOError, x isn’t a valid IO program. When you try to execute the program

x <|> _whatever

you have to execute x to see whether it throws an error. But you can’t execute x, because it’s not a program; it’s a mistake. Instead, everything explodes.

This differs significantly from

x = throwIO _whatever

Now x is a valid program. It is a valid program that always happens to throw an error, but it’s still a valid program that can actually be executed. When you try to execute

x <|> _whatever

now, x is executed, the error produced is discarded, and _whatever is executed in its place. You can also think of there being a difference between computing a program/figuring out what to execute and actually executing it. throw throws the error while computing the program to execute (it is a "pure exception"), while throwIO throws it during execution (it is an "impure exception"). This also explains their types: throw returns any type because all types can be "computed", but throwIO is restricted to IO because only programs can be executed.
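The split between computing a program and executing it can be observed directly. In this sketch, evaluate stands in for "compute the program" (force the IO value to WHNF without running it) and a second step actually runs it. The names whereItFails and stages are hypothetical helpers, not library functions.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, evaluate, throw, throwIO, try)

-- Hypothetical helper: report at which stage an action blows up.
whereItFails :: IO a -> IO String
whereItFails action = do
  -- Stage 1: compute the program (force it to WHNF) without running it.
  computed <- try (evaluate action)
  case computed of
    Left (_ :: SomeException) -> pure "failed while computing the program"
    Right program -> do
      -- Stage 2: actually execute the program we obtained.
      ran <- try program
      case ran of
        Left (_ :: SomeException) -> pure "failed while executing the program"
        Right _                   -> pure "ran to completion"

-- One action per case: pure exception, impure exception, no exception.
stages :: IO [String]
stages = mapM whereItFails
  [ throw   (userError "boom")
  , throwIO (userError "boom")
  , pure ()
  ]
```

The throw case should be reported as failing while computing, the throwIO case as failing while executing, and pure () as running to completion.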

This is further complicated by the fact that you can catch the pure exceptions that occur while executing IO programs. I believe this is a design compromise. From a theoretical perspective, you shouldn't be able to catch pure exceptions, because their presence should always be taken to indicate programmer error, but that can be rather embarrassing, because then you can only handle external errors, while programmer errors cause everything to blow up. If we were perfect programmers, that would be fine, but we aren't. Therefore, you are allowed to catch pure exceptions.

is :: [Int]
is = []

-- fails: executing print forces head is, which raises a pure exception
-- it was a programmer error to call head on is without checking that it,
-- in fact, had a head in the first place
-- (the exception is an ErrorCall rather than an IOException, so <|> lets it through)
main1 = print (head is) <|> putStrLn "Oops"
-- throws exception

-- catch creates a program that computes and executes the program print (head is)
-- and catches both impure and pure exceptions
-- the pure exception surfaces while the program on the left runs, and the
-- SomeException handler intercepts it
-- really, that shouldn't happen, but this behavior is useful
-- (the type annotation in the handler needs ScopedTypeVariables)
main2 = print (head is) `catch` (\(_ :: SomeException) -> putStrLn "Oops")
-- prints "Oops"
HTNW
3

The rest of this answer may not be entirely correct. But fundamentally, the difference is this: throwIO terminates and returns an IO action, while throw does not terminate.


As soon as you try to evaluate throw (userError "..."), your program aborts. <|> never gets a chance to look at its first argument to decide if the second argument should be evaluated; in fact, it never gets the first argument, because throw didn't return a value.

With throwIO, <|> isn't evaluating anything; it's creating a new IO action which, when it does get executed, will first look at its first argument. The runtime can "safely" execute the IO action and see that it does not, in fact, provide a value, at which point it can stop and try the other "half" of the <|> expression.
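As a rough illustration of "trying the other half", the sketch below runs (<|>) on IO with two different exceptions on the left. It relies on the IO instance's handler matching only IOException, which is my reading of mplusIO in base rather than something stated in this answer. The names recovered and notRecovered are made up.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (ArithException (..), throwIO, try)

-- An IOException thrown during execution is discarded and the
-- right-hand side runs instead.
recovered :: IO String
recovered = throwIO (userError "I am an IO error.") <|> pure "right-hand side ran"

-- DivideByZero is an ArithException, not an IOException, so (<|>)
-- does not intercept it; try is what catches it here.
notRecovered :: IO (Either ArithException String)
notRecovered = try (throwIO DivideByZero <|> pure "right-hand side ran")
```

So recovered should yield "right-hand side ran", while notRecovered should yield Left DivideByZero because the exception passes straight through the (<|>).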

chepner
  • I like to think of Haskell as a purely functional DSL (with no exceptions) for composing IO programs, which are written in a language you have little direct access to. Li-yao puts it nicely: `throw` lets you raise an exception inside the IO interpreter directly from the pure Haskell code. – chepner Aug 07 '19 at 20:32