Given a Builder, what's the most efficient way to determine whether the serialized/reified data is greater than, say, 1kB? My best plan currently is to use toLazyByteStringWith with a 1kB initial chunk size and inspect just the first chunk to see if it's full.
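Roughly, that plan could be sketched as below. This is only a sketch: the name exceedsByRunning is made up for illustration, and instead of checking whether the first chunk is full (a chunk can be flushed before it is completely filled) it asks whether anything remains after dropping the first `limit` bytes. Because the result is lazy, this should only need to fill the first buffer in the common case, but it does still write data.

```haskell
import Data.ByteString.Builder (Builder)
import qualified Data.ByteString.Builder as Builder
import Data.ByteString.Builder.Extra (toLazyByteStringWith, untrimmedStrategy)
import qualified Data.ByteString.Lazy as L

-- | Hypothetical helper: True if the builder produces more than `limit` bytes.
-- It runs the builder lazily and stops forcing chunks once the answer is known.
exceedsByRunning :: Int -> Builder -> Bool
exceedsByRunning limit b =
  not (L.null (L.drop (fromIntegral limit) out))
  where
    -- A first buffer of limit + 1 bytes is enough to decide in one chunk
    -- when the builder fills it completely.
    out = toLazyByteStringWith (untrimmedStrategy (limit + 1) (limit + 1)) L.empty b

-- Example: 2000 bytes is over a 1024-byte limit.
overLimit :: Bool
overLimit = exceedsByRunning 1024 (Builder.string8 (replicate 2000 'x'))
```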

But is there some way to do this without writing any data at all, preferably in a pure function?

I got a bit lost trying to understand how running Builder directly on a socket works.

jberryman

1 Answer

If efficiency is super important, you may want to write a small wrapper around a monoid that tracks the length explicitly:

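import Control.Applicative (liftA2)
import qualified Data.ByteString as BS
import Data.ByteString.Builder (Builder)
import qualified Data.ByteString.Builder as Builder
import Data.Monoid (Sum (..))

type SizedBuilder = (Sum Int, Builder)

byteString = liftA2 (,) (Sum . BS.length) Builder.byteString
word8 = (,) (1 :: Sum Int) . Builder.word8       -- annotation stops the literal defaulting to Integer
word32LE = (,) (4 :: Sum Int) . Builder.word32LE
string8 = liftA2 (,) (Sum . length) Builder.string8  -- string8 writes one byte per Char
-- etc.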

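There's already a suitable Monoid instance for this type (a pair of monoids combines componentwise), but of course if you choose to use newtype instead of type you will need to provide Semigroup and Monoid instances yourself, for example with a deriving clause.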
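To make the size check concrete, here is a minimal self-contained sketch of the same idea with explicit type signatures. The helper names size, exceeds, run and example are not part of any library; they are made up for illustration.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString.Builder (Builder)
import qualified Data.ByteString.Builder as Builder
import qualified Data.ByteString.Lazy as L
import Data.Monoid (Sum (..))
import Data.Word (Word32, Word8)

-- | A Builder paired with the number of bytes it will produce.
type SizedBuilder = (Sum Int, Builder)

byteString :: BS.ByteString -> SizedBuilder
byteString bs = (Sum (BS.length bs), Builder.byteString bs)

word8 :: Word8 -> SizedBuilder
word8 w = (Sum 1, Builder.word8 w)

word32LE :: Word32 -> SizedBuilder
word32LE w = (Sum 4, Builder.word32LE w)

-- | Hypothetical helper: the tracked size; the Builder itself is never run.
size :: SizedBuilder -> Int
size = getSum . fst

-- | Hypothetical helper: is the output larger than `limit` bytes?
exceeds :: Int -> SizedBuilder -> Bool
exceeds limit sb = size sb > limit

-- | Hypothetical helper: actually serialize, once the size has been checked.
run :: SizedBuilder -> L.ByteString
run = Builder.toLazyByteString . snd

-- 1 + 4 + 1020 = 1025 bytes, combined with the pair's Monoid instance.
example :: SizedBuilder
example = word8 0x01 <> word32LE 42 <> byteString (BS.replicate 1020 0)
```

Here exceeds 1024 example is decided purely from the tracked count. One caveat: the pair is lazy, so a long chain of appends may build up thunks in the count; a strict pair type could be worth considering if that matters.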

Daniel Wagner
  • It would be a nice optimization if Builder accumulated size, even if it was just a lower bound. That would allow it to be smarter about allocating buffers as well – jberryman Aug 24 '22 at 15:28
  • Of course the builders that do variable length encoding are more of a pain with this scheme. If you need the length to be accurate, you essentially have to re-implement the encoding logic of every builder (and make sure it produces perfectly identical results) to measure the length, only to throw it away and have the builder encode it again. I suppose Builder doesn't provide length because it only wants to execute its encoders outputting directly into a buffer of the final ByteString (or Handle), but without running encoders it can't easily know how many bytes they'd write. – Ben Aug 25 '22 at 02:45
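To illustrate the variable-length point from the comment above, here is a sketch of a sized wrapper for charUtf8. It assumes the usual UTF-8 width thresholds match what Builder.charUtf8 emits, which is exactly the kind of duplicated logic being described; utf8Length and actualLength are hypothetical helpers, the latter only there to cross-check the count against a real run.

```haskell
import Data.ByteString.Builder (Builder)
import qualified Data.ByteString.Builder as Builder
import qualified Data.ByteString.Lazy as L
import Data.Char (ord)
import Data.Monoid (Sum (..))

type SizedBuilder = (Sum Int, Builder)

-- | Hypothetical helper: byte width of a code point under UTF-8.
-- Assumed to agree with the encoder used by Builder.charUtf8.
utf8Length :: Char -> Int
utf8Length c
  | n < 0x80 = 1
  | n < 0x800 = 2
  | n < 0x10000 = 3
  | otherwise = 4
  where
    n = ord c

charUtf8 :: Char -> SizedBuilder
charUtf8 c = (Sum (utf8Length c), Builder.charUtf8 c)

stringUtf8 :: String -> SizedBuilder
stringUtf8 = foldMap charUtf8

-- | Hypothetical helper: run the builder and count the bytes,
-- useful in tests to confirm the tracked size is accurate.
actualLength :: SizedBuilder -> Int
actualLength = fromIntegral . L.length . Builder.toLazyByteString . snd
```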