I've come across an interesting problem with scodec. I have a peculiar encoding scheme that requires the stream to be realigned to a byte boundary whenever the current bit pointer mod 8 is nonzero (i.e. not aligned to the nearest byte). Normally this would be handled by the `byteAligned` codec, but this situation appears to need global context about the entire decoding operation.
Here's a minimal example of the problem:

```scala
case class Baz(abc: Int, qux: Int)
case class Foo(bar: Int, items: Vector[Baz])

object Foo {
  implicit val codec: Codec[Foo] = (
    ("bar" | uint8L) ::
    ("items" | vectorOfN(uint8L, (
      ("abc" | uint8L) ::
      ("qux" | uint2L)
    ).as[Baz]))
  ).as[Foo]
}
```
`items` will be encoded and decoded as a `Vector` of `Baz` items. Notice that the `qux` member of `Baz` is represented as a 2-bit unsigned int. We want the `abc` member to be byte aligned, meaning that during its decoding it decodes its bits and then, if the byte stream is misaligned, padding must be added after the decoded bits (Case C).

This means that for a vector of size 1 (Case A), the output BitVector ends 6 bits short of a byte boundary, which is fine because there are no more items (`abc` is never reached again in the vector decoding loop). A vector of size 2 without byte aligning is shown in Case B. A vector of size 2 with byte aligning is shown in Case C.

Below each case is a comment breaking down the bitstream. `0x0a` is hex for 10 decimal and `0b01` is binary for 1 decimal. I switch between the two notations when more detail is needed and when the stream isn't byte aligned.
Case A

```scala
Codec.encode(Foo(10, Vector(Baz(4,1))))
res43: Attempt[BitVector] = Successful(BitVector(26 bits, 0x0a01044))
// bar   N items   abc(0)   qux(0)
// 0x0a  0x01      0x04     0b01
```
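(For reference, decoding those bits round-trips as expected. This is a sketch, assuming `Codec.decode` summons the implicit codec, and modulo the exact REPL rendering:)

```scala
import scodec.bits._

// Take exactly the 26 encoded bits and decode them back into a Foo
Codec.decode[Foo](hex"0a010440".bits.take(26))
// Successful(DecodeResult(Foo(10,Vector(Baz(4,1))), BitVector(empty)))
```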
Case B

```scala
Codec.encode(Foo(10, Vector(Baz(4,1), Baz(15,3))))
res42: Attempt[BitVector] = Successful(BitVector(36 bits, 0x0a020443f))
// bar   N items   abc(0)   qux(0)   abc(1)       qux(1)
// 0x0a  0x02      0x04     0b01     0b00001111   0b11
```
Case C (the desired encoding, which the codec above does not yet produce)

```scala
Codec.encode(Foo(10, Vector(Baz(4,1), Baz(15,3))))
res42: Attempt[BitVector] = Successful(BitVector(42 bits, 0x0a020443c0c))
// bar   N items   abc(0)   qux(0)   abc(1)       padding    qux(1)
// 0x0a  0x02      0x04     0b01     0b00001111   0b000000   0b11
```
Another way of looking at Case C is unrolling the codecs:

```scala
// bar       N items   abc(0)    qux(0)    abc(1)    padding      qux(1)
(uint8L   :: uint8L :: uint8L :: uint2L :: uint8L :: ignore(6) :: uint2L)
  .dropUnits.encode(10 :: 2 :: 4 :: 1 :: 15 :: 3 :: HNil)
res47: Attempt[BitVector] = Successful(BitVector(42 bits, 0x0a020443c0c))
```
Notice that there isn't any padding after `abc(0)`, since the stream was byte aligned at that point. At `abc(1)` the stream was not aligned, due to the decoding of `qux(0)`. After the alignment, `qux(1)` decodes on an aligned boundary. We always want `qux` to decode at a byte aligned position.
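The realignment rule itself is just arithmetic on the absolute bit position (the helper name here is mine):

```scala
// Bits of padding needed to reach the next byte boundary from
// absolute bit position `pos`; 0 when already aligned.
def padBits(pos: Long): Long = (8 - pos % 8) % 8

padBits(24) // 0 -> position after abc(0): already aligned, no padding
padBits(34) // 6 -> position after abc(1) in Case C: pad 6 bits
```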
What I need is either some way to get a global context, so a codec can figure out whether the current bit pointer in the overall BitVector is aligned, or a codec that intelligently knows when it isn't decoding on an aligned boundary and corrects the stream after decoding its current value. I know codecs get the current bitstream to decode from, but they have no idea where they are in the overall bitstream.
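To make the shape of what I'm after concrete, here is a sketch of a hypothetical combinator. The name `byteAlignAfter` and the `startOffset` parameter are mine; `startOffset` is exactly the global context a codec doesn't have, which is the crux of the problem:

```scala
import scodec._
import scodec.bits._

// Hypothetical: realign to a byte boundary after `inner` runs, assuming
// we somehow knew the absolute bit offset at which this codec is
// invoked. Obtaining `startOffset` is the missing piece.
def byteAlignAfter[A](inner: Codec[A], startOffset: Long): Codec[A] =
  new Codec[A] {
    def sizeBound = SizeBound.atLeast(inner.sizeBound.lowerBound)

    def encode(a: A) = inner.encode(a).map { bits =>
      // Append zero padding up to the next byte boundary
      bits ++ BitVector.low((8 - (startOffset + bits.size) % 8) % 8)
    }

    def decode(bits: BitVector) = inner.decode(bits).map { result =>
      // Skip padding based on the absolute position after decoding
      val consumed = startOffset + (bits.size - result.remainder.size)
      DecodeResult(result.value, result.remainder.drop((8 - consumed % 8) % 8))
    }
  }
```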
Any direction or code would be most appreciated.
P.S.
If you want to test in Ammonite you'll need this to start:

```scala
load.ivy("org.scodec" %% "scodec-core" % "1.8.3")
import scodec.codecs._
import scodec._
import shapeless._
```
When copying and pasting, you'll need to start with a `{` to make sure the case classes and the companion object get defined at the same time.
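That is, paste the definitions as a single block (codec body elided here):

```scala
{
  case class Baz(abc: Int, qux: Int)
  case class Foo(bar: Int, items: Vector[Baz])
  object Foo {
    implicit val codec: Codec[Foo] = ??? // the codec definition from above
  }
}
```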