I'm trying to parse a binary format (PES) using Haskell:

import qualified Data.ByteString.Lazy as BL
import Data.Word
import Data.Word.Word24
import qualified Data.ByteString.Lazy.Char8 as L8

data Stitch = MyCoord Int Int deriving (Eq, Show)

data PESFile = PESFile {
      pecstart :: Word24
    , width :: Int
    , height :: Int
    , numColors :: Int
    , header :: String
    , stitches :: [Stitch]
    } deriving (Eq, Show)


readPES :: BL.ByteString -> Maybe PESFile
readPES bs =
        let s = L8.drop 7 bs
            pecstart = L8.readInt s in
            case pecstart of
        Nothing -> Nothing
        Just (offset,rest) ->   Just (PESFile offset 1 1 1 "#PES" [])

main = do
  input <- BL.getContents
  print $ readPES input

I need to read pecstart to get the offset of the other data (width, height and stitches). This isn't working for me because I need to read a 24-bit value, and the ByteString package doesn't seem to have a 24-bit reader.

Should I be using a different approach? The Data.Binary package seems good for simple formats, but I'm not sure how it would work for something like this, since you have to read a value to find the offset of the other data in the file. Something I'm missing?

nont
  • Wouldn't you just add an instance of `Binary` for `PESFile`? The binary package looks like it would be fine, because the put/get functions are a sequence of actions (e.g. you can read pecstart to get to the next bit). – Jeff Foster Jun 04 '11 at 20:09
  • I'd love to try that approach Jeff. I've been working from the Real World Haskell chapter on binary input. If there's a tutorial on creating new Binary instances, I'd love to give it a shot. – nont Jun 04 '11 at 20:14
  • You should probably keep the header as a bytestring, for efficiency reasons. And `MyCoord` should use strict `Int` fields (e.g. `!Int`). – Don Stewart Jun 04 '11 at 20:26
  • If you'll be doing a lot of work with 24-bit ints, I'd recommend you look into the `word24` package, http://hackage.haskell.org/package/word24. It provides 24-bit signed and unsigned integers with proper bounds, bit shifts, etc. There's a storable instance also, but for just reading one value from a bytestring I'd probably use Don Stewart's solution. – John L Jun 05 '11 at 10:42
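
The sequential approach Jeff Foster describes can be sketched with `Data.Binary.Get` from the `binary` package: read the 24-bit offset, then skip forward to it before parsing the rest. This is a minimal sketch under stated assumptions, not a PES parser. The helper names `getWord24le`, `seekTo`, `getAtOffset` and `parseAt` are hypothetical, the little-endian byte order is assumed, and the header length is a parameter (the question drops 7 bytes) rather than a verified fact about the format.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (Get, bytesRead, getWord8, runGetOrFail, skip)
import Data.Bits (shiftL, (.|.))
import Data.Int (Int64)
import Data.Word (Word32, Word8)

-- Hypothetical helper: read an unsigned 24-bit value, least significant
-- byte first (little-endian is an assumption here).
getWord24le :: Get Word32
getWord24le = do
  a <- getWord8
  b <- getWord8
  c <- getWord8
  return (fromIntegral a .|. (fromIntegral b `shiftL` 8) .|. (fromIntegral c `shiftL` 16))

-- Hypothetical helper: move forward to an absolute offset from the start
-- of the input. Get only reads forwards, so a backwards offset fails.
seekTo :: Int64 -> Get ()
seekTo target = do
  pos <- bytesRead
  if target < pos
    then fail "offset points before the current position"
    else skip (fromIntegral (target - pos))

-- Skip the header, read the 24-bit offset, jump there, then run `body`.
getAtOffset :: Int -> Get a -> Get (Word32, a)
getAtOffset headerLen body = do
  skip headerLen
  offset <- getWord24le
  seekTo (fromIntegral offset)
  payload <- body
  return (offset, payload)

-- Example driver: return the offset and the single byte stored at it,
-- or Nothing if the input is too short or the offset is invalid.
parseAt :: Int -> BL.ByteString -> Maybe (Word32, Word8)
parseAt headerLen bs =
  case runGetOrFail (getAtOffset headerLen getWord8) bs of
    Left _               -> Nothing
    Right (_, _, result) -> Just result
```

In a real parser, `getWord8` in `parseAt` would be replaced by a `Get` action for whatever actually lives at that offset (width, height, stitches).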

1 Answer

Well, you can parse a 24-bit value out by indexing 3 bytes (here least significant byte first, i.e. little-endian, because `roll` folds from the right):

import qualified Data.ByteString as B
import Data.ByteString (ByteString, index)
import Data.Bits
import Data.Int
import Data.Word

type Int24 = Int32

readInt24 :: ByteString -> (Int24, ByteString)
readInt24 bs = (roll [a,b,c], B.drop 3 bs)
   where a = bs `index` 0
         b = bs `index` 1
         c = bs `index` 2

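-- Combine bytes, first byte least significant:
-- roll [a,b,c] == c `shiftL` 16 .|. b `shiftL` 8 .|. a
roll :: [Word8] -> Int24
roll   = foldr unstep 0
  where
    unstep b a = a `shiftL` 8 .|. fromIntegral b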
Don Stewart
  • Thanks Don! At first glance, this looked like sourcery, but then I was able to piece it together except the bitwise or, which hoogle helped me out with. – nont Jun 04 '11 at 20:30
  • sourcery is the best unintentional pun ever for "magic code". – rampion Jun 04 '11 at 22:27
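
The `readInt24` in the answer relies on `index`, which throws an exception when the input has fewer than three bytes. A total variant that returns `Maybe`, matching the shape of `readPES` in the question, might look like the sketch below. The name `readInt24Safe` is hypothetical; the byte order (first byte least significant) and the unsigned interpretation are assumed to match the answer's `roll`.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, (.|.))
import Data.Int (Int32)

-- Hypothetical total variant of readInt24: returns Nothing when fewer
-- than three bytes are available instead of throwing.
readInt24Safe :: B.ByteString -> Maybe (Int32, B.ByteString)
readInt24Safe bs
  | B.length bs < 3 = Nothing
  | otherwise       = Just (value, rest)
  where
    (front, rest) = B.splitAt 3 bs
    -- Fold from the right so the first byte ends up least significant.
    value = B.foldr (\byte acc -> acc `shiftL` 8 .|. fromIntegral byte) 0 front
```

The result slots into a `case` expression the same way `L8.readInt` does in the question, with the remaining bytes handed back for further parsing.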