Typesafe StablePtrs

Question

I spent a lot of time encoding invariants in my data types, and now I am working on exposing my library to C via the FFI. Rather than marshal data structures across the language barrier, I use opaque pointers so that C can build up an AST; upon eval, Haskell only needs to marshal a string over to C.

Here is some code to illustrate.

-- excerpt from Query.hs
data Sz = Selection | Reduction deriving Show

-- Column Datatype

data Column (a :: Sz) where
    Column  :: String -> Column Selection
    BinExpr :: BinOp  -> Column a -> Column b -> Column (OpSz a b)
    AggExpr :: AggOp  -> Column Selection -> Column Reduction

type family OpSz (a :: Sz) (b :: Sz) where
    OpSz Selection Selection = Selection
    OpSz Selection Reduction = Selection
    OpSz Reduction Selection = Selection
    OpSz Reduction Reduction = Reduction

data Query (a :: Sz) where
    ... etc


-- excerpt from Export.hs

foreign export ccall "selection"
    select :: StablePtr [Column a] -> StablePtr (Query b) -> IO (StablePtr (Query Selection))

foreign export ccall 
    add :: StablePtr (Column a) -> StablePtr (Column b) -> IO (StablePtr (Column (OpSz a b)))

foreign export ccall 
    mul :: StablePtr (Column a) -> StablePtr (Column b) -> IO (StablePtr (Column (OpSz a b)))

foreign export ccall
    eval :: StablePtr (Query Selection) -> IO CString

As far as I can tell, however, this seems to throw type safety out the window. Essentially, whatever C hands off to Haskell is assumed to have the declared type, which completely negates the reason I wrote the DSL in Haskell. Is there some way I can get the benefits of using StablePtrs and retain type safety? The last thing I want is to reimplement the invariants in C.

C probably doesn't have a strong enough type system to check any part of your DSL. Maybe C++ templates could do it, but that's a nightmare. I think you have to accept that the function *could* be called incorrectly on the C side, and check at runtime on the Haskell side that it was called correctly. Of course, you can still get type information at runtime (e.g. with `Typeable`), so you can still use your well-typed Haskell functions (after ensuring the input from C is well typed). — user2407038, Apr 08 '16 at 20:39

chi · Answer 1 · 2016-04-08T18:08:00.043

The C counterpart to StablePtr a is a typedef for void * -- losing type safety at the FFI boundary.

The problem is that there are infinitely many possibilities for a :: *, hence for StablePtr a. Encoding these types in C, which has a limited type system (no parametric types!), cannot be done without resorting to very unidiomatic C types (see below).

In your specific case, a, b :: Sz so we only have finitely many cases, and some FFI tool could help in encoding those cases. Still, this can cause a combinatorial explosion of cases:

typedef struct HsStablePtr_Selection_ { void *p; } HsStablePtr_Selection;
typedef struct HsStablePtr_Reduction_ { void *p; } HsStablePtr_Reduction;

HsStablePtr_Selection
add_Selection_Selection(HsStablePtr_Selection a, HsStablePtr_Selection b);
HsStablePtr_Selection
add_Selection_Reduction(HsStablePtr_Selection a, HsStablePtr_Reduction b);
HsStablePtr_Selection
add_Reduction_Selection(HsStablePtr_Reduction a, HsStablePtr_Selection b);
HsStablePtr_Reduction
add_Reduction_Reduction(HsStablePtr_Reduction a, HsStablePtr_Reduction b);
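One way such monomorphic wrappers could be realized is as thin inline functions over a single untyped entry point. This is only a sketch: `add_void` is a hypothetical stand-in for the untyped Haskell export (here a stub that returns its first argument, so the example links on its own), and only two of the four wrappers are shown.

```c
typedef struct HsStablePtr_Selection_ { void *p; } HsStablePtr_Selection;
typedef struct HsStablePtr_Reduction_ { void *p; } HsStablePtr_Reduction;

/* Hypothetical stub standing in for the untyped Haskell export.
   A real build would link against the foreign-exported function instead. */
static void *add_void(void *x, void *y)
{
    (void)y;
    return x;
}

/* Each wrapper fixes one row of the OpSz table in its C signature. */
static inline HsStablePtr_Selection
add_Selection_Reduction(HsStablePtr_Selection a, HsStablePtr_Reduction b)
{
    return (HsStablePtr_Selection){ add_void(a.p, b.p) };
}

static inline HsStablePtr_Reduction
add_Reduction_Reduction(HsStablePtr_Reduction a, HsStablePtr_Reduction b)
{
    return (HsStablePtr_Reduction){ add_void(a.p, b.p) };
}
```

Because the two struct types are incompatible, passing an HsStablePtr_Reduction where an HsStablePtr_Selection is expected has to be diagnosed by the C compiler. Nothing stops C code from building such a struct around an arbitrary pointer, though, so this only guards against accidental mix-ups.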

In C11 one could reduce this mess using type-generic expressions, which could add the "right" type casts without combinatorial explosion. Still, no one has written an FFI tool exploiting that. For instance:

void *add_void(void *x, void *y);
#define add(x,y) \
   _Generic((x) , \
   HsStablePtr_Selection: _Generic((y) , \
      HsStablePtr_Selection: (HsStablePtr_Selection) add_void(x,y), \
      HsStablePtr_Reduction: (HsStablePtr_Selection) add_void(x,y)  \
      ), \
   HsStablePtr_Reduction: _Generic((y) , \
      HsStablePtr_Selection: (HsStablePtr_Selection) add_void(x,y), \
      HsStablePtr_Reduction: (HsStablePtr_Reduction) add_void(x,y)  \
      ) \
   )

(The casts above are from pointer to struct, so they don't work and we should use struct literals instead, but let's ignore that.)
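For reference, a variant that should compile under C11 replaces the casts with compound literals wrapping the untyped result. This is a sketch under assumptions: `add_void` is a hypothetical stub (it returns its first argument) standing in for the untyped Haskell export, and `WRAP` is a helper macro introduced only for this example.

```c
typedef struct HsStablePtr_Selection_ { void *p; } HsStablePtr_Selection;
typedef struct HsStablePtr_Reduction_ { void *p; } HsStablePtr_Reduction;

/* Hypothetical stub standing in for the untyped Haskell export. */
static void *add_void(void *x, void *y)
{
    (void)y;
    return x;
}

/* Build a struct of type T around the untyped result (compound literal).
   Both wrapper structs have a member p, so every branch below is well
   formed regardless of which one is selected. */
#define WRAP(T, x, y) ((T){ add_void((x).p, (y).p) })

/* The result type follows the OpSz table: Reduction only when both
   operands are Reduction, Selection otherwise. */
#define add(x, y)                                                      \
    _Generic((x),                                                      \
        HsStablePtr_Selection: _Generic((y),                           \
            HsStablePtr_Selection: WRAP(HsStablePtr_Selection, x, y),  \
            HsStablePtr_Reduction: WRAP(HsStablePtr_Selection, x, y)), \
        HsStablePtr_Reduction: _Generic((y),                           \
            HsStablePtr_Selection: WRAP(HsStablePtr_Selection, x, y),  \
            HsStablePtr_Reduction: WRAP(HsStablePtr_Reduction, x, y)))
```

With this, `add(s, r)` has type HsStablePtr_Selection and `add(r, r)` has type HsStablePtr_Reduction, and an argument of any other type is rejected at compile time because no association matches it.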

In C++ we would have richer types to exploit, but the FFI is meant to use C as a common lingua franca for binding to other languages.

A possible encoding of Haskell (monomorphic!) parametric types could be achieved, theoretically, by exploiting the only type constructors C has: pointers, arrays, function pointers, const, volatile, ....

For instance, the stable pointer to type T = Either Char (Int, Bool) could be represented as follows:

typedef struct HsBool_ { void *p; } HsBool;
typedef struct HsInt_ { void *p; } HsInt;
typedef struct HsChar_ { void *p; } HsChar;
typedef struct HsEither_ HsEither;  // incomplete type
typedef struct HsPair_ HsPair;      // incomplete type

typedef void (**T)(HsEither x1, HsChar x2,
                  void (**)(HsPair x3, HsInt x4, HsBool x5));

Of course, from the C point of view, the type T is a blatant lie!! A value of type T would actually be a void * pointing to some Haskell-side representation of type StablePtr T, and surely not a pointer-to-pointer to a C function! Still, passing T around would preserve type safety.

Note that the above can only be called an "encoding" in a very weak sense: it is an injective mapping from monomorphic Haskell types to C types that totally disregards the semantics of C types. This is only done to ensure that, if such stable pointers are passed back to Haskell, there is some type checking on the C side.

I used C incomplete types so that one can never call these functions in C. I used pointers-to-pointers since (IIRC) pointers to functions cannot be cast to void * safely.

Note that such a sophisticated encoding could be used in C, but could be hard to integrate with other languages. For instance, Java and Haskell could be made to interact using JNI + FFI, but I'm not sure the JNI part can cope with such a complex encoding. Perhaps, void * is more practical, albeit unsafe.

Safely encoding polymorphic functions, GADTs, type classes ... is left for future work :-P

TL;DR: the FFI could try harder to encode static types to C, but this is tricky and there is no large demand for that at this moment. Maybe in the future this could change.

I was hoping for a solution from the Haskell side, as it seems infeasible to replicate the types in C. For instance, could I perhaps do something like `newtype QuerySelection = QS (StablePtr (Query Selection))`, make `QuerySelection` et al. instances of `Storable`, and then marshal that type over to C? — Brandon Ogle, Apr 08 '16 at 18:34
@BrandonOgle That could work, but you still need many variants of `add` etc. on the C side I think. — chi, Apr 08 '16 at 19:12
