5

I'm porting some imperative code to Haskell. My goal is to analyze an executable: each byte of the text section gets assigned a set of flags, all of which fit in one byte (6 bits, to be precise).

In a language like C, I would just allocate an array of bytes, zero them and update them as I go. How would I do this efficiently in Haskell?

In other words: I'm looking for a ByteString with bit-wise access and constant time updates as I disassemble more of the text section.

Edit: Of course, any kind of other data structure would do if it was similarly efficient.

Sebastian Graf

2 Answers

9

The `array` package implements unboxed arrays of `Bool` as packed bit arrays. You can update such arrays in place in the `ST` monad, which gives essentially the same runtime behaviour as in C.
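
As a rough illustration of this approach, here is a minimal sketch that stores six `Bool` flags per text-section byte in one flat packed array. The names `flagsPerByte`, `bitIndex`, `markFlags` and `hasFlag`, as well as the layout (flag `f` of byte `o` lives at index `o * 6 + f`), are assumptions made for the example, not part of the `array` API.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, (!))

-- Hypothetical layout: six flags per byte of the text section.
flagsPerByte :: Int
flagsPerByte = 6

-- Position of a given flag of a given byte in the flat Bool array.
bitIndex :: Int -> Int -> Int
bitIndex offset flag = offset * flagsPerByte + flag

-- Build the flag table for n bytes, setting each (offset, flag) pair.
-- The array is mutated in place inside ST and frozen at the end.
markFlags :: Int -> [(Int, Int)] -> UArray Int Bool
markFlags n updates = runSTUArray $ do
  flags <- newArray (0, n * flagsPerByte - 1) False
  forM_ updates $ \(offset, flag) ->
    writeArray flags (bitIndex offset flag) True
  return flags

-- Query a single flag of a single byte.
hasFlag :: UArray Int Bool -> Int -> Int -> Bool
hasFlag flags offset flag = flags ! bitIndex offset flag
```

In a real disassembler the updates would probably happen incrementally inside one long `ST` computation rather than from a prepared list; the list only keeps the sketch short.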

András Kovács
  • These are both excellent answers, but I think I go with gspr's because it seems more idiomatic to me. – Sebastian Graf Oct 11 '14 at 09:49
  • @Sebastian: `vector` doesn't pack Bool-s into bits, therefore you have to write your own wrapper if you want to read/write specific bits. Most of the time `vector` is a better choice because of the richer API, but here `array` is probably simpler. – András Kovács Oct 11 '14 at 10:28
  • I think I will allocate one byte for each element anyway, because I *think* that would be more performant from an element access perspective. Also it feels like implementing my flags on top of `Bits` is somehow more convenient than 6 consecutive `Bool`s, however that may be subjective. – Sebastian Graf Oct 11 '14 at 10:43
  • @Sebastian valid points, I retract my opinion about array being preferable here. – András Kovács Oct 11 '14 at 12:51
  • Although the votes beg to differ. Maybe I'm missing something? I will be able to tell in a few days. – Sebastian Graf Oct 11 '14 at 13:00
6

You can use a vector with any data type that's an instance of the Bits typeclass, for example Word64 for 64 bits per element in the vector. Use an unboxed vector if you want the array to be contiguous in memory.
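
A minimal sketch of what this could look like for the one-byte-per-element case, assuming the `vector` package and a `Word8` per text-section byte. The names `markFlagBytes` and `byteHasFlag` and the flag numbering (bit positions 0 to 5) are made up for the example.

```haskell
import Control.Monad (forM_)
import Data.Bits (setBit, testBit)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU
import Data.Word (Word8)

-- Build a flag byte for each of the n text-section bytes.
-- Each (offset, flag) pair sets bit `flag` in the byte at `offset`.
-- U.create runs the ST action and freezes the result without copying.
markFlagBytes :: Int -> [(Int, Int)] -> U.Vector Word8
markFlagBytes n updates = U.create $ do
  flags <- MU.replicate n 0
  forM_ updates $ \(offset, flag) ->
    MU.modify flags (`setBit` flag) offset
  return flags

-- Test a single flag bit of the byte at the given offset.
byteHasFlag :: U.Vector Word8 -> Int -> Int -> Bool
byteHasFlag flags offset flag = testBit (flags U.! offset) flag
```

Each update is a bounds-checked read-modify-write of one byte, which is close to what the C version with a zeroed byte array would do.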

gspr