Should the state of the ST not change in order to be executed by the runST?

Question

newtype ST s a = ST (s -> (s, a))

runST :: (forall s. ST s a) -> a
runST (ST f) = case (f realWorld) of (_, a) -> a

This sketch has many inaccuracies if you look closely, but it captures the overall structure of runST.

So, to apply runST to a value of type ST s a, which is essentially s -> (s, a), the type s must remain fully polymorphic.

Functions that depend on a concrete type can't be run by runST. Examples of such unsuitable functions are below.

\s -> (s + s, "hello world")   -- this won't run because it depends on the Num type class.

fun [] = ([], 0)
fun (x : xs) = (xs, length (x : xs))  -- this won't run because it depends on the [] type.

So, functions to be executed by runST should be as follows.

\s -> (s, fun s)

Here fun may be some native code rather than Haskell.

The point is that the s in the (s, a) result of s -> (s, a) should always be the same as the input s, as with an identity function.

We call it parametricity.
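A minimal sketch of this reasoning in plain Haskell, assuming a pure model of the definition above. The names `ModelST` and `runModel` are hypothetical and are not GHC's real `ST` or `runST`; the sketch only shows how the rank-2 type rejects a function that needs a concrete `s`.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical pure model of the question's ST; not GHC's real definition.
newtype ModelST s a = ModelST (s -> (s, a))

-- The rank-2 type means the caller must supply an action that works for
-- every s, so the action cannot inspect or modify the state it is given.
runModel :: (forall s. ModelST s a) -> a
runModel st = case st of
  ModelST f -> snd (f ())  -- the runner picks s ~ () here; any type would do

-- Accepted: the state is passed through untouched.
ok :: String
ok = runModel (ModelST (\s -> (s, "hello world")))

-- Rejected by the type checker if uncommented: (+) needs a Num s instance,
-- but s must stay fully polymorphic.
-- bad = runModel (ModelST (\s -> (s + s, "hello world")))

main :: IO ()
main = putStrLn ok  -- prints "hello world"
```

In this model the result can never depend on the state value, which is why the pure model alone cannot explain what the real `ST` does.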

As far as I know, if s is RealWorld, even something that looks like id can perform meaningful computation (although it is not pure Haskell).

To test my guess, I planned the following experiment.

-- given
newMutVar# :: v -> State# s -> (# State# s, MutVar# s v #)

-- then (this is pseudocode)
let (# s#, var# #) = newMutVar# "hello world" (State# 0) in
    s# == State# 0

and see whether the result is True, which would imply that newMutVar# acts like id on State#.

But I can't do this because I don't know how to generate a State# s value. I only know how to do it for RealWorld, and that is meaningless because RealWorld has only one value, so the result will always be identical no matter what the mapping is.

Also, even if I succeeded in generating State# 0, there would be no way to compare it with s#, because State# s doesn't implement Eq.

You need to be clear with yourself; are you interested in an answer from the point of view of ordinary Haskell, or from the point of view of the implementation of `ST`? Remember that `ST` exists to give a pure interface to *genuine* mutation; it can't actually be implemented from the ground up in ordinary Haskell, which does not allow mutation. From the point of view of Haskell the `s` values "might" change, and that's part of what keeps the side effects executed in-order; it makes the order of side effects look like a data dependency to the compiler. — Ben, Apr 04 '23 at 05:48
But from the implementation point of view, I believe no `s` value is ever actually important (much like `RealWorld`, which also is only an implementation detail that isn't part of the interface exposed to ordinary Haskell). The fact that the `s` type variable is present (and is rank-2 polymorphic) enforces all the properties that are actually required, without an actual runtime value being important. So I don't believe there is any reason for the actual value to be changed (or to be anything other than an empty token like `RealWorld`), and so I would imagine it indeed does not. — Ben, Apr 04 '23 at 05:54
That helped me from both points of view. "`s` keeping side effects in order" seems interesting to me. — kwonryul, Apr 04 '23 at 06:10
I could write that as an answer, but I don't really know anything about how `ST` is actually implemented. But basically: from ordinary Haskell the actual values of type `s` (and indeed what types are ever chosen for `s`) are simply not part of the visible interface. So the question doesn't really have an answer. From the implementation of `ST` it has an answer, but there you're dealing with mutable memory and "unsafe" features, so ordinary reasoning based on equations and types isn't a sure guide to what happens. And the answer isn't *important* to how you're supposed to think about it anyway. — Ben, Apr 04 '23 at 06:22
In terms of implementation `ST` is equivalent to `IO`. In terms of API, it's much more limited such that it's safe to use local mutation to generate a pure result. (Assuming you're careful with the use of things like `unsafeInterleaveST`, which can be disastrous if used incorrectly.) — Carl, Apr 04 '23 at 08:55
What are you trying to do? It seems like you are trying to break Haskell's purity or something like that (considering the previous question on state). If so, there are probably some `unsafeIO-like` functions which allow you to do it more easily. — lsmor, Apr 04 '23 at 08:56

Carl · Answer 1 · 2023-04-06T05:55:37.757

I think you've conflated the State type with ST to some extent. I see that you've looked behind the curtains to see how GHC defines ST and IO, but I think you've misinterpreted the meaning of what you see there. You're trying to reason about the implementation as if it was the same thing as State. That's understandable given the use of names like State# and the general structure of definitions like this:

type STRep s a = State# s -> (# State# s, a #)

That sure looks like the idea behind the State type, but that's very misleading. To see why, you need to look at the documentation for State# which ends with a critical sentence: "It is represented by nothing at all."

There is no such thing as a value of type State# RealWorld or State# s. A function with a type like STRep s a up there is actually a 0-argument function (the State# s argument is represented by nothing at all) that returns a single value of type a (an unboxed pair of nothing at all and an a value).

This is very magical within the implementation of GHC. ~~You can't create a type like that.~~ (I've discovered that there are enough such types now that GHC actually does expose tools to create your own. But they're far from standard Haskell.) The compiler has to recognize it and support all the special cases for handling it correctly. So why bother? Because it serves to bridge the difference in representation between a Haskell function and a machine code procedure. All the optimizations which take place at the Haskell (well, GHC core) level see a regular Haskell function that's doing token passing. The token passing serves to serialize execution so that GHC can't reorder operations when it's doing optimizations. But then when the GHC core is converted into either C-- for the native code gen backend or LLVM IR for the LLVM backend, all the State# values get stripped out so that at a low level you don't have all that useless token passing that the GHC core representation suggested is happening.
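To make the token passing concrete, here is a small sketch that threads the State# token by hand through three primops. It assumes GHC with the MagicHash and UnboxedTuples extensions and the ST constructor exported from GHC.ST; the helper name bumpOnce is made up for illustration. This is not how you would normally write ST code.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (newMutVar#, readMutVar#, writeMutVar#)
import GHC.ST (ST (..), runST)

-- Hypothetical helper: allocate a mutable variable, overwrite it once,
-- and read it back. Each primop consumes the token produced by the
-- previous one (s0 -> s1 -> s2), so the three steps form a data-dependency
-- chain that the optimizer has to keep in order.
bumpOnce :: Int -> Int
bumpOnce n = runST (ST (\s0 ->
  case newMutVar# n s0 of
    (# s1, var #) ->
      case writeMutVar# var (n + 1) s1 of
        s2 -> readMutVar# var s2))

main :: IO ()
main = print (bumpOnce 41)  -- prints 42
```

Note that nothing here ever inspects s0, s1, or s2. They only exist to order the operations, and the mutation itself happens in the MutVar#.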

So the reason you don't know how to create a State# value is that they don't exist. They're never updated because ST doesn't work anything like State, despite the superficial similarities in their internals.
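For contrast, here is a sketch of how ST looks from ordinary Haskell, using only Control.Monad.ST and Data.STRef from base. The function name sumST is just an example. No s value ever appears in the code; the type variable only ties the reference to the runST call that created it.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list with a local mutable accumulator. The mutation is invisible
-- from outside, so sumST is an ordinary pure function.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- Rejected by the type checker if uncommented: the STRef's type mentions s,
-- so it cannot escape the runST that created it.
-- leaked = runST (newSTRef (0 :: Int))

main :: IO ()
main = print (sumST [1 .. 10])  -- prints 55
```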
