2

Possible Duplicate:
Can the FFI deal with arrays? If so, how?

I have a tiny assembler written in Haskell that takes a string of assembly code and returns a string of binary machine code. I want to call this function from C by building the Haskell library as a shared library. The binary machine code can contain null bytes, so I can't use CString as the return type, since that is a regular null-terminated string. I also can't use CStringLen as a return value in the FFI.

What type should I use to accomplish this?

The type signature of the internal assembly function:

assembly :: String -> ByteString 

Here is an example of input and output of this function:

Input:

decl r0 0x02
decl r1 0x10
add r0 r1 
mov rr rs

Output (Binary data represented as hexadecimal with 3 bytes per row):

01 00 02
01 01 10
03 00 01
02 05 04
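To make the problem with null-terminated strings concrete, here is a minimal C sketch using the twelve bytes from the sample output. The names `machine_code`, `length_as_c_string`, `actual_length` and `check_truncation` are made up for illustration and are not part of the assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 12 bytes from the sample output, one instruction per row. */
static const unsigned char machine_code[] = {
    0x01, 0x00, 0x02,
    0x01, 0x01, 0x10,
    0x03, 0x00, 0x01,
    0x02, 0x05, 0x04,
};

/* Length a C caller would compute if it were handed only a char pointer:
   strlen stops at the first 0x00 byte, which here is the second byte. */
size_t length_as_c_string(void) {
    return strlen((const char *)machine_code);
}

/* Number of bytes the assembler actually produced. */
size_t actual_length(void) {
    return sizeof machine_code;
}

/* The C-string view loses data as soon as a 0x00 byte appears before the end. */
void check_truncation(void) {
    assert(length_as_c_string() < actual_length());
}
```

With this data the C-string view reports 1 byte while the real length is 12, so the length has to travel separately from the pointer.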
rzetterberg
  • I'm not strong on GHC FFI, but can you do manual memory manipulation and return a pointer to a `CStringLen`? (I.e. have a function `convert :: ByteString -> IO (Ptr CStringLen)`? Or something along those lines.) – huon Sep 30 '12 at 11:54
  • @dbaupp Yes, but I believe I have to create a custom structure and implement marshaling using `Storable`. I'm reading up on the subject, but haven't found a straight-forward solution. – rzetterberg Sep 30 '12 at 11:59

3 Answers

3

If I were writing it in C, I might give it a prototype like this:

void assemble(char **out, size_t *outlen, const char *in);

This translates to something like this (untested):

import qualified Assemble -- your module with the "assemble" function

import Foreign.Ptr (Ptr)
import Foreign.Storable (poke)
import Foreign.Marshal.Utils (copyBytes)
import Foreign.Marshal.Alloc (mallocBytes)
import Foreign.C.Types (CSize, CChar)
import Foreign.C.String (CString, peekCString)
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)

foreign export ccall assemble :: Ptr (Ptr CChar) -> Ptr CSize -> CString -> IO ()

assemble :: Ptr (Ptr CChar) -> Ptr CSize -> CString -> IO ()
assemble out outlen instrptr = do
  -- Marshal the NUL-terminated C input into a Haskell String.
  instr <- peekCString instrptr
  unsafeUseAsCStringLen (Assemble.assemble instr) $ \(p, n) -> do
    -- Copy into a malloc'd buffer that the C caller owns and can free().
    outval <- mallocBytes n
    copyBytes outval p n
    poke out outval
    poke outlen (fromIntegral n)

This copies the data into a malloc'd region, which is nice because it is *safe* (the pointer handed out by unsafeUseAsCStringLen is only valid inside the callback), and the C code doesn't need to do anything special to free it (other than call free()).
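For the C side, a caller of this prototype might look roughly like the sketch below. The `assemble` defined here is only a hypothetical stand-in that returns three fixed bytes so the example compiles without the Haskell library, and `count_zero_bytes` is an invented helper. In a real program you would drop the stand-in, link against the shared library, and initialise the GHC runtime with `hs_init()` before the first call (and `hs_exit()` at shutdown).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL stand-in for the function exported from Haskell. It has the
   same prototype, ignores its input and returns fixed bytes. Remove it when
   linking against the real library. */
void assemble(char **out, size_t *outlen, const char *in) {
    static const char fake[] = { 0x01, 0x00, 0x02 };
    (void)in;
    *out = malloc(sizeof fake);
    assert(*out != NULL);
    memcpy(*out, fake, sizeof fake);
    *outlen = sizeof fake;
}

/* Caller side: receive pointer and length through the out parameters, walk
   the buffer by length (never by strlen), then release it with free().
   Returns the number of 0x00 bytes to show that embedded zeros survive. */
size_t count_zero_bytes(const char *source) {
    char *code = NULL;
    size_t len = 0;
    size_t zeros = 0;

    assemble(&code, &len, source);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0) {
            zeros++;
        }
    }
    free(code); /* the buffer was malloc'd, so plain free() is enough */
    return zeros;
}
```

The caller never needs to know how the buffer was produced: it only relies on the contract that `*out` points to `*outlen` bytes owned by the caller.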

Dietrich Epp
  • That is exactly the kind of straightforwardness I was looking for in an example! There is one thing I'm finding hard to grasp, and that is why `out` is `void **` and not `char **`. It seems the compiler is at a loss too. It is expecting `Ptr ()` as the type of the second argument of poke, but it receives `Ptr Foreign.C.Types.CChar`. – rzetterberg Sep 30 '12 at 15:24
  • @rzetterberg: I wrote the code off the top of my head. Using `void **` is a habit from C when I work with binary data, I think of it as more of a hint to the debugger than anything else. Using `char **` is also fine. – Dietrich Epp Sep 30 '12 at 15:51
  • Thank you for an excellent answer. I just had to adapt the types with hints from the compiler, then your code worked perfectly :) This is what I changed: `Ptr (Ptr ())` to `Ptr (Ptr CChar)` and `String` to `CString`, plus adding code to "convert" the input `CString` to a regular `String` to pass into my `assemble` function. – rzetterberg Sep 30 '12 at 15:57
1

Can you do something with raw pointers and manual memory allocation? (See Foreign.Marshal.Alloc.) It sounds like you could just malloc a chunk of memory and write your binary data there...

MathematicalOrchid
  • Yes, that seems to be the way to go. I couldn't find any good examples, though. The ones I found were using hsc2hs and marshaling structures, which became too much to grasp and seemed a bit over the top for this problem. – rzetterberg Sep 30 '12 at 15:30
  • @rzetterberg Yeah, this isn't really my specialty. I was just trying to offer some useful hints on where to get started. – MathematicalOrchid Sep 30 '12 at 16:32
  • Well, I appreciate the input nonetheless! As you can see in the accepted answer it was what you suggested that worked out for me. – rzetterberg Sep 30 '12 at 16:44
0

I don't know Haskell well enough to be certain, but can't you pass an additional out parameter, length, to the Haskell function? When the function returns, length would tell the C program the size of the returned string. I believe I have done similar things between C and Python.

Alternatively, can't you return a custom object, like a C++ string, that has a length field? Even if you are using pure C, if there is a way to share types between C and Haskell (which I believe should exist), you could write a small string struct with char array and length fields and return that object from Haskell.
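As a rough illustration of the struct idea, the C side could look something like this. The type `byte_buffer` and the helpers `byte_buffer_new` and `byte_buffer_free` are hypothetical names for this sketch only. The Haskell side would still have to fill in the same layout by hand (for example through a Storable instance), which is not shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL length-prefixed buffer: the length travels with the bytes,
   so embedded 0x00 values are harmless. */
typedef struct {
    size_t len;
    unsigned char data[]; /* flexible array member holding len bytes */
} byte_buffer;

/* Allocate a buffer and copy len bytes from src into it.
   Returns NULL if the allocation fails. */
byte_buffer *byte_buffer_new(const unsigned char *src, size_t len) {
    assert(src != NULL || len == 0);
    byte_buffer *buf = malloc(sizeof *buf + len);
    if (buf == NULL) {
        return NULL;
    }
    buf->len = len;
    if (len > 0) {
        memcpy(buf->data, src, len);
    }
    return buf;
}

/* Header and payload live in one allocation, so one free() releases both. */
void byte_buffer_free(byte_buffer *buf) {
    free(buf);
}
```

Compared with separate out parameters, this keeps pointer and length together in a single return value, at the cost of having to agree on the struct layout on both sides of the FFI.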

fkl
  • Yes, something along those lines. But I would like to know more specifically what alternatives there are in Haskell for doing this. What are the best practices, can we solve this with pointer manipulation in Haskell, and so on. – rzetterberg Sep 30 '12 at 10:04
  • This might help http://stackoverflow.com/questions/6140348/convert-haskell-bytestrings-to-c-stdstring – fkl Sep 30 '12 at 10:14
  • Thanks, but I've seen that already. And it deals with input, not output. As I stated in my question `CStringLen` is not allowed to be returned via the FFI since it's a tuple. Only scalar types can be used. – rzetterberg Sep 30 '12 at 10:15