I'm trying to prove that encoding/decoding a LEB128 (well actually LEB64) varint is lossless. Here's my code:
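function decode_varint(input: seq<bv8>) : bv64
  requires |input| > 0
{
  var byte := input[0];
  // The low 7 bits of each byte carry the payload.
  var val := (byte & 0x7F) as bv64;
  // A clear high bit means more bytes follow; the final byte has 0x80 set.
  var more := byte & 0x80 == 0 && |input| > 1;
  if more then val | (decode_varint(input[1..]) << 7) else val
}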
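function encode_varint(input: bv64) : seq<bv8>
{
  // Emit the low 7 bits first (little-endian groups).
  var byte := (input & 0x7F) as bv8;
  var shifted := input >> 7;
  // Mark the last byte by setting its high bit.
  if shifted == 0 then [byte | 0x80] else [byte] + encode_varint(shifted)
}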
lemma Lossless(input: bv64) {
  var test := encode_varint(128);
  var encoded := encode_varint(input);
  var decoded := decode_varint(encoded);
  assert decoded == input;
}
Unfortunately the assertion doesn't hold. I used the VSCode plugin's counter-example feature (F7) to inspect the values in Lossless, and it picks input = 0x8000000000000000. Fine, I think that should work, but the counter-example also shows that test is test:seq<bv8> = ().
I don't understand that. Firstly, don't sequences use square brackets? Secondly, it doesn't look possible for encode_varint() to return an empty sequence in any case. In fact, Dafny proves this successfully:
lemma NeverEmpty(input: bv64) {
  var encoded := encode_varint(input);
  assert |encoded| > 0;
}
What's going on here?
Edit: Also if I add these examples...
lemma Examples()
{
  assert encode_varint(1 << 7) == [0x00, 0x81];
  assert encode_varint(1 << 14) == [0x00, 0x00, 0x81];
  assert encode_varint(1 << 28) == [0x00, 0x00, 0x00, 0x00, 0x81];
  assert encode_varint(1 << 42) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x81];
  assert encode_varint(1 << 56) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x81];
  assert encode_varint(1 << 63) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x81];
  assert encode_varint(0x8000000000000000) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x81];
}
Then they are all proven, but if I keep only the last example (the counter-example Dafny generates), it isn't!
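For reference, here is a minimal Python sketch of the same two functions, as a sanity check outside the verifier. It is only a hand transliteration: the `MASK64` constant is an assumption of mine, standing in for the 64-bit truncation that `bv64` gives for free, and nothing here comes from Dafny itself. It suggests the round trip should hold for the input the counter-example picks.

```python
from typing import List

# Assumption: model bv64 arithmetic by masking to 64 bits.
MASK64 = (1 << 64) - 1


def encode_varint(value: int) -> List[int]:
    """Mirror of the Dafny encode_varint: 7 bits per byte, low group first."""
    value &= MASK64
    byte = value & 0x7F
    shifted = value >> 7
    if shifted == 0:
        # The final byte is marked by setting its high bit.
        return [byte | 0x80]
    return [byte] + encode_varint(shifted)


def decode_varint(data: List[int]) -> int:
    """Mirror of the Dafny decode_varint; requires a non-empty list."""
    byte = data[0]
    val = byte & 0x7F
    # A clear high bit means more bytes follow.
    more = (byte & 0x80) == 0 and len(data) > 1
    if more:
        return (val | (decode_varint(data[1:]) << 7)) & MASK64
    return val


if __name__ == "__main__":
    value = 0x8000000000000000
    encoded = encode_varint(value)
    print([hex(b) for b in encoded])
    print(decode_varint(encoded) == value)
```

This only tests individual values, of course, so it says nothing about why the verifier fails to prove the general lemma.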