6

The core datatypes of Data.ByteString.Builder are

newtype Builder = Builder (forall r. BuildStep r -> BuildStep r)

type BuildStep a = BufferRange -> IO (BuildSignal a)

data BuildSignal a =
    Done {-# UNPACK #-} !(Ptr Word8) a
  | BufferFull
      {-# UNPACK #-} !Int 
      {-# UNPACK #-} !(Ptr Word8)
                     (BuildStep a)
  | InsertChunk
      {-# UNPACK #-} !(Ptr Word8)
                     S.ByteString
                     (BuildStep a)

What purpose does the type parameter (r or a) serve?

sjakobi
  • 3,546
  • 1
  • 25
  • 43
  • 1
    I suspect it's similar to the phantom parameter in `ST` – it prevents you from using pointers in a context where the data doesn't exist. – leftaroundabout Apr 04 '18 at 12:12
  • 2
    @leftaroundabout Doesn't look like phantom type parameter since there's a field with type `a` in `Done` constructor. – Shersh Apr 04 '18 at 12:41
  • @leftaroundabout Can you expand on that functionality with `ST` or do you maybe have a reference? – sjakobi Apr 04 '18 at 21:03
  • 1
    @sjakobi the [`ST` monad](http://hackage.haskell.org/package/base-4.11.0.0/docs/Control-Monad-ST-Lazy-Safe.html#t:ST) has an extra `s` parameter that's not actually used for anything. The type system ensures this parameter is “synchronised” along any given mutable computation. For actually getting any result out of such a computation, you need to use `runST`, whose Rank-2 type prevents you from fixing the `s` to anything particular. So if you smuggle out a reference to e.g. a mutable array out of such a computation, there is you can do with it elsewhere, because the `s` type doesn't match. – leftaroundabout Apr 04 '18 at 21:15

2 Answers2

5

It is not needed. As proof, I have created a fork which does not change any of the public APIs -- only the API of modules named Internal -- but removes this type argument.

Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • 1
    Do you happen to know what `bytestring`'s compat policy is like for `Internal` modules? I know that eg `attoparsec` actually does rummage around in `bytestring`'s internals ([example](https://github.com/bos/attoparsec/blob/master/Data/Attoparsec/ByteString/Buffer.hs#L57)), so, like it or not, changing internal APIs might break downstream dependencies. I imagine the `bytestring` maintainers might be willing to accept such a PR with a major rev bump... – Benjamin Hodgson Apr 04 '18 at 19:59
  • Thanks for your investigation, @daniel-wagner! I've noticed that Simon Meier's [original design](http://lambda-view.blogspot.de/2010/11/on-design-of-efficient-builder-monoid.html) doesn't have the type arguments, but I'm still wondering why they were added. – sjakobi Apr 04 '18 at 20:42
  • @sjakobi I suspect it was put there as a "Just In Case": just in case somebody wanted to compute some interesting, potentially structured value *about* the `ByteString` as they were building it. Then it turned out not to be the case that people wanted to do that -- at least, not in the code we have today in the library. Whether there's somebody out there relying on that capability is a question we may never be able to reasonably answer in the negative. – Daniel Wagner Apr 04 '18 at 21:06
  • @Shersh Perhaps. Somebody interested in turning it into a PR would probably want to be more careful about ripping out only what wasn't possible any more without the extra type parameter. I went for the changes that were easiest to throw together and verify type-checked rather than the minimal change -- and I'm not super interested in doing the work that's likely needed to take the patch from "proof of concept" to "acceptable to the bytestring maintainers". Of course, if you care about this, you're welcome to do that yourself. – Daniel Wagner Apr 04 '18 at 21:09
  • Well, as I already commented, the purpose of this parameter might actually just be to _prevent_ some (incorrect) code from compiling, not to do anything for correct code. – leftaroundabout Apr 04 '18 at 21:18
  • @leftaroundabout It is insufficiently polymorphic to serve that role. (I verified this mentally as I was refactoring.) – Daniel Wagner Apr 04 '18 at 21:23
  • 1
    @Shersh: I have created https://github.com/haskell/bytestring/issues/154 for now. – sjakobi Apr 06 '18 at 12:26
0

Simon Meier, the author of this Builder design, has responded to my question on the bytestring issue tracker:

The type parameter was there to support the Put monad (a Writer spezialized to fill a buffer during the computation of its value).

Put is defined as

newtype Put a = Put { unPut :: forall r. (a -> BuildStep r) -> BuildStep r }

and exported from Data.ByteString.Builder.Internal which is hidden in the current bytestring release.

sjakobi
  • 3,546
  • 1
  • 25
  • 43