Updating a value in Data.ByteString

Question

The C language provides a very handy way of updating the nth element of an array: array[n] = new_value. My understanding of the Data.ByteString type is that it provides functionality very similar to a C array of uint8_t, with element access via index :: ByteString -> Int -> Word8. The opposite operation, updating a value, does not appear to be that easy.

My initial approach was to use the take, drop and singleton functions, concatenated in the following way:

updateValue :: ByteString -> Int -> Word8 -> ByteString
updateValue bs n value = concat [take n bs, singleton value, drop (n+1) bs]

(this is a very naive implementation as it does not handle edge cases)

Coming from a C background, it feels a bit too heavyweight to call 4 functions to update one value. Theoretically, the operation complexity is not that bad:

take is O(1)
drop is O(1)
singleton is O(1)
concat is O(n), but here I am not sure whether n is the total length of the concatenated ByteStrings or just the number of list elements, which is 3 in our case.
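
For reference, take and drop are O(1) because they return slices that share the original buffer, while concat allocates a fresh buffer and copies every byte into it, so the n is the total length of the result, not 3. Below is a minimal sketch of the same idea with the edge cases handled, assuming the strict Data.ByteString; the name updateValueSafe is hypothetical and not part of the library.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString (ByteString)
import Data.Word (Word8)

-- | Hypothetical helper: replace the byte at index n, or return Nothing
-- when n is outside the ByteString.
updateValueSafe :: ByteString -> Int -> Word8 -> Maybe ByteString
updateValueSafe bs n value
  | n < 0 || n >= BS.length bs = Nothing
  | otherwise =
      -- take and drop only build O(1) slices; concat copies all bytes
      -- of the three pieces into one newly allocated buffer.
      Just (BS.concat [BS.take n bs, BS.singleton value, BS.drop (n + 1) bs])
```

Returning Maybe is just one possible design choice; leaving the input unchanged for an out-of-range index would work equally well for a small virtual machine.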

My second approach was to ask Hoogle for a function with a similar type signature: ByteString -> Int -> a -> ByteString, but nothing appropriate appeared.

Am I missing something very obvious, or is it really that complex to update a value?

I would like to note that I understand that a ByteString is immutable and that changing any of its elements will result in a new ByteString instance.

EDIT: A possible solution that I found while reading about the Control.Lens library uses the ix traversal together with the .~ (set) operator. The following is an excerpt from GHCi with module names omitted:

> import Data.ByteString
> import Control.Lens
> let clock = pack [116, 105, 99, 107]
> clock
"tick"
> let clock2 = clock & ix 1 .~ 111
> clock2
"tock"

Depends on what your end goal is. If you really want to read a file in, modify one byte, and write it out again, this approach seems reasonable. If what you're actually trying to do is modify every single byte, but one byte at a time, you probably want `map`. Can you explain a bit more about what you're ultimately after? — MathematicalOrchid, Nov 18 '15 at 22:28
@MathematicalOrchid I get your point. I am trying to build a *very* small virtual machine and I am using `ByteString` as the backing memory storage. The size in question is in order of tens of bytes. An instruction comes in and executing it modifies a specific memory byte, e.g. `mov 4, mem7` (and that is where our `updateValue` function comes in). Therefore, `map`ping is not an option here. What might be an important addition is that performance is a valid concern in this case. — Daniel Lovasko, Nov 18 '15 at 23:16

score 3 · Answer 1 · answered Nov 18 '15 at 23:01


One solution is to convert the ByteString to a Storable Vector, then modify that:

import Data.ByteString (ByteString)
import Data.Vector.Storable (modify)
import Data.Vector.Storable.ByteString  -- provided by the "spool" package
import Data.Vector.Storable.Mutable (write)
import Data.Word (Word8)

updateAt :: Int -> Word8 -> ByteString -> ByteString
updateAt n x = vectorToByteString . modify inner . byteStringToVector
  where
    inner v = write v n x

See the documentation for spool and vector.
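
If pulling in an extra package for the conversion is undesirable, a similar single-pass copy can be sketched with the bytestring package alone. This is not the approach above but a swapped-in alternative built on Data.ByteString.mapAccumL; the name updateAtPure is hypothetical. It is still O(n), since every byte is copied into a new ByteString.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString (ByteString)
import Data.Word (Word8)

-- | Hypothetical helper: copy the ByteString once, replacing the byte at
-- index n with x. An out-of-range n yields an unchanged copy.
updateAtPure :: Int -> Word8 -> ByteString -> ByteString
updateAtPure n x = snd . BS.mapAccumL step 0
  where
    -- The accumulator is the current index; only position n is replaced.
    step :: Int -> Word8 -> (Int, Word8)
    step i w = (i + 1, if i == n then x else w)
```

The argument order matches updateAt, so it could be dropped in at the same call sites.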

answered Nov 18 '15 at 23:01

Lambda Fairy


If I understand it correctly, this uses 4 separate functions, two of which are just the conversion. Do you think that `Data.Vector` could be a feasible replacement for `Data.ByteString` in this case? (In order to get rid of the two conversions) – Daniel Lovasko Nov 19 '15 at 00:16

@DanielLovasko I absolutely think replacing `ByteString` with a mutable structure is appropriate here. I would go even farther than using `Vector` to use e.g. `STArray` and lift your virtual machine stepping operation into `ST`. Moving from repeated O(n) update to O(1) is a no-brainer when performance matters. – Daniel Wagner Nov 19 '15 at 00:18
@DanielWagner I never heard of the `ST` monad before, but it totally makes sense to use it here. For anyone reading this, I highly recommend the IRC logs in the beginning of [this page](https://wiki.haskell.org/Monad/ST). – Daniel Lovasko Nov 19 '15 at 02:20
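
To make the `ST` suggestion from the comments concrete, here is a rough sketch of what the virtual machine memory could look like with an unboxed mutable array from the array package. It uses `STUArray` rather than the `STArray` mentioned above, since the elements are plain `Word8`s; the `Instr` type and the names `step` and `runProgram` are hypothetical and only loosely modelled on the `mov 4, mem7` example.

```haskell
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, newArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)
import Data.Word (Word8)

-- | Hypothetical instruction set: store a constant at a memory address.
data Instr = Mov Word8 Int

-- | Execute one instruction; writeArray updates the cell in place in O(1)
-- and raises an error if the address is outside the array bounds.
step :: STUArray s Int Word8 -> Instr -> ST s ()
step mem (Mov value addr) = writeArray mem addr value

-- | Run a program on zero-initialised memory of the given size and
-- freeze the final state into an immutable array.
runProgram :: Int -> [Instr] -> UArray Int Word8
runProgram size prog = runSTUArray $ do
  mem <- newArray (0, size - 1) 0
  mapM_ (step mem) prog
  return mem
```

With this shape, something like `elems (runProgram 8 [Mov 4 7])` gives the final memory as a list, and the mutation never escapes `runProgram`, so callers still see a pure function.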
