
Context: I need to write a mostly stateless compiler that transforms VM bytecode into machine code. Most VM commands can be translated statelessly with a pure function like the following:

compilePop = ["mov ax, @sp", "dec ax", "mov @sp, ax"]

compile :: VM_COMMAND -> [String]
compile STACK_POP = compilePop 

-- compile whole program
compileAll :: [VM_COMMAND] -> [String]
compileAll = concatMap compile

But some commands need to insert labels, which must be different for every call.

I understand how to do this with a state object "global" for entire compiler:

compileGt n = [label ++ ":", "cmp ax,bx", "jgt " ++ label]
                where label = "cmp" ++ show n

-- the state currently contains only a single integer, but it will grow larger
type COMPILER_STATE = Int

compile :: COMPILER_STATE -> VM_COMMAND -> (COMPILER_STATE, [String])
compile lcnt STACK_POP = (lcnt, compilePop)
compile lcnt CMP_GT    = (lcnt + 1, compileGt lcnt)

-- mapAccumL (from Data.List) threads the state through the commands, left to right
compileAll :: [VM_COMMAND] -> [String]
compileAll = concat . snd . mapAccumL compile 0

But I think this is bad, because each specialized compile function needs only a small piece of the state, or none at all. For example, in JavaScript, which is not so purely functional, I'd implement the specialized compile functions with local state in a closure:

// compile/gt.js
var i = 0;
const compileGt = () => {
  const label = "cmp" + i++;
  return [label + ":", "cmp ax,bx", "jgt " + label];
};
export default compileGt;
// index.js
import compileGt from './compile/gt';

function compile (cmd) {
  switch (cmd) {
  case CMP_GT: return compileGt();
  // ...
  }
}

const compileAll = (cmds) => cmds.flatMap(compile);
export default compileAll;

So the question is: how can I do the same in Haskell, or, failing that, why is it a really bad idea? Should it be something like this?

type CompileFn = State -> VM_COMMAND -> [String]
(CompileFn, State) -> VM_COMMAND -> ([String], (CompileFn, State))
kirilloid
  • Use the state monad. – Benjamin Hodgson Mar 01 '17 at 12:31
  • @BenjaminHodgson that's pretty obvious, but I don't understand how it answers my question regarding global vs local state. Am I correct with the interface proposed at the very end of my question? How do I combine specific compilers if they have different types of state? – kirilloid Mar 01 '17 at 13:19
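
For reference, the state monad suggestion from the first comment might look roughly like the sketch below. It keeps a single "global" counter, so it does not address the global vs local concern by itself. The `VM_COMMAND` definition and the `freshLabel` helper are assumptions made for illustration; the question does not show them.

```haskell
import Control.Monad.State (State, evalState, state)

-- Assumed definition; the question only shows the constructor names.
data VM_COMMAND = STACK_POP | CMP_GT

-- Hypothetical helper: hand out the next label and bump the counter.
freshLabel :: State Int String
freshLabel = state $ \n -> ("cmp" ++ show n, n + 1)

compilePop :: [String]
compilePop = ["mov ax, @sp", "dec ax", "mov @sp, ax"]

compileGt :: State Int [String]
compileGt = do
  label <- freshLabel
  return [label ++ ":", "cmp ax,bx", "jgt " ++ label]

compile :: VM_COMMAND -> State Int [String]
compile STACK_POP = return compilePop  -- stateless commands never touch the counter
compile CMP_GT    = compileGt

-- Run every command in order, starting the counter at 0.
compileAll :: [VM_COMMAND] -> [String]
compileAll cmds = concat (evalState (mapM compile cmds) 0)
```

The counter is threaded implicitly, so `compile` no longer takes or returns it by hand.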

1 Answer

If you have...

data Big = Big { little :: Little, stuff :: Whatever }

... you can define your...

littleProcessor :: State Little [String]

... and then use a function like this one...

innerState :: Monad m 
    => (s -> i) -> (i -> s -> s) -> StateT i m a -> StateT s m a
innerState getI setI (StateT m) = StateT $ \s -> do
    (a, i) <- m (getI s)
    return (a, setI i s)

... to lift it to the bigger state:

bigProcessor :: State Big [String]
bigProcessor = innerState little (\l b -> b {little = l}) littleProcessor

(Add auxiliary definitions to taste.)
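
Applied to the compiler from the question, this could look something like the sketch below. The field names `gtCounter` and `called`, the `CALL` command, and `compileCall` are all made up for illustration, just to show two specialised compilers with different state types sharing one `Big`.

```haskell
import Control.Monad.State (State, StateT (..), evalState, state)

-- Assumed command set; CALL is invented so a second kind of
-- local state has something to do.
data VM_COMMAND = STACK_POP | CMP_GT | CALL String

-- Hypothetical global state: one field per specialised compiler.
data Big = Big { gtCounter :: Int, called :: [String] }

-- Same definition as above, repeated so the sketch is self-contained.
innerState :: Monad m
    => (s -> i) -> (i -> s -> s) -> StateT i m a -> StateT s m a
innerState getI setI (StateT m) = StateT $ \s -> do
    (a, i) <- m (getI s)
    return (a, setI i s)

-- Sees only an Int counter.
compileGt :: State Int [String]
compileGt = state $ \n ->
    let label = "cmp" ++ show n
    in ([label ++ ":", "cmp ax,bx", "jgt " ++ label], n + 1)

-- Sees only the list of called names (most recent first).
compileCall :: String -> State [String] [String]
compileCall f = state $ \seen -> (["call " ++ f], f : seen)

-- The dispatcher is the only place that knows about Big.
compile :: VM_COMMAND -> State Big [String]
compile STACK_POP = return ["mov ax, @sp", "dec ax", "mov @sp, ax"]
compile CMP_GT    = innerState gtCounter (\n b -> b { gtCounter = n }) compileGt
compile (CALL f)  = innerState called (\c b -> b { called = c }) (compileCall f)

compileAll :: [VM_COMMAND] -> [String]
compileAll cmds = concat (evalState (mapM compile cmds) (Big 0 []))
```

Neither `compileGt` nor `compileCall` mentions `Big`, so each can be written and tested against its own small state.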

The use of the getter/setter pair in innerState makes it look like it should be possible to phrase it in terms of lenses. Indeed, zoom from lens is basically innerState with minimised boilerplate:

{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Monad.State (State)

data Big = Big { _little :: Little, _stuff :: Whatever }
makeLenses ''Big -- little is now a lens.
bigProcessor :: State Big [String]
bigProcessor = zoom little littleProcessor
duplode
  • Your answer is pretty deep, but I'd like to clarify how it answers my specific question. Do I understand you correctly that you suggest having a "global" state for the compiler and making local states for the specialized compiler functions via lenses focusing on this global state? – kirilloid Mar 06 '17 at 14:44
  • Yes, that's the idea. The key point is that a `State Little` computation has no access to the rest of the `Big` state, even though `zoom` allows you to use it as a `State Big` computation. – duplode Mar 06 '17 at 16:12
  • Here we achieve isolation, but if I wanted to add a new VM command and a new compiler, I'd need to recompile the `Big` source. Is it possible, and not too crazy, to hold the scope together with a function, like `(State Scope, Scope -> In -> (Scope, Out))`? – kirilloid Mar 06 '17 at 16:33
  • @kirilloid [1/2] (1) Do note that e.g. `State Scope Int` doesn't actually store a `Scope`: it is a computation of an `Int` that uses (and transforms) a `Scope` state that will be supplied at the point of use. To put it in a different way, `Scope -> In -> (Scope, Out)` is equivalent to `In -> State Scope Out`. (2) "if I'd like to add new VM command and new compiler I'd need to re-compile Big source" -- not necessarily. If your new command can be defined in terms of the substates already defined in/for `Big` you don't need to change `Big`. – duplode Mar 06 '17 at 17:19
  • @kirilloid [2/2] (3) You might find [this question](http://stackoverflow.com/q/40698396/2751851) to be an interesting read, though I strongly suspect you don't actually need the trickery discussed there. – duplode Mar 06 '17 at 17:19
  • I've yet to comprehend "state doesn't actually store _smth_, but is a computation". As for recompilation, yes, I'd need something like "data à la carte" for `VM_COMMAND`s to make the code extensible without recompilation anyway. – kirilloid Mar 06 '17 at 21:30
  • @kirilloid `newtype State s a = State { runState :: s -> (a, s) }` is a simple way of defining `State` -- it is a *function* that takes a state and returns a result and an updated state. (This isn't literally what you'll find if you go to the sources of *transformers*, but the definition there -- `type State s = StateT s Identity` -- is equivalent to this one.) – duplode Mar 06 '17 at 21:48
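
To make the last two points in the comments concrete, here is a small hand-rolled sketch using only base. The type is named `State'` to avoid clashing with the library one, and `toState`, `fromState` and `nextLabel` are hypothetical names, not library functions. It shows that a `State'` value is just a wrapped function, and that `Scope -> In -> (Scope, Out)` and `In -> State Scope Out` convert into each other.

```haskell
-- A state computation is a function from a state to a result and a new state.
newtype State' s a = State' { runState' :: s -> (a, s) }

instance Functor (State' s) where
  fmap f (State' g) = State' $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State' s) where
  pure a = State' $ \s -> (a, s)
  State' f <*> State' g = State' $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State' s) where
  State' g >>= f = State' $ \s ->
    let (a, s') = g s in runState' (f a) s'

-- Wrap an explicit state-passing function (only the tuple order changes).
toState :: (scope -> i -> (scope, o)) -> i -> State' scope o
toState f i = State' $ \s -> let (s', o) = f s i in (o, s')

-- Unwrap it again.
fromState :: (i -> State' scope o) -> scope -> i -> (scope, o)
fromState k s i = let (o, s') = runState' (k i) s in (s', o)

-- Hypothetical example in the explicit style: numbered labels.
nextLabel :: Int -> String -> (Int, String)
nextLabel n prefix = (n + 1, prefix ++ show n)

-- Two labels in a row; the counter is supplied only when this is run.
twoLabels :: State' Int [String]
twoLabels = do
  a <- toState nextLabel "cmp"
  b <- toState nextLabel "cmp"
  return [a, b]
```

Nothing is stored inside `twoLabels`; the starting counter is passed in at the point of use, e.g. `runState' twoLabels 5`.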