I'm decoding a MessagePack message from an Apache Beam pipeline in a Java project. I'm using Maven to import the MessagePack library as a dependency:

<dependency>
  <groupId>org.msgpack</groupId>
  <artifactId>msgpack-core</artifactId>
  <version>0.8.16</version>
</dependency>

I can use this to parse the MessagePack message into key/value pairs in a Map, like this:

    @ProcessElement
    public void processElement(ProcessContext c) 
    {
        try 
        {           
            Map<Value, Value> map = MessagePack.newDefaultUnpacker(c.element().getPayload()).unpackValue().asMapValue().map();

The map contains a key/value pair for a MessagePack 'Timestamp' 'extension' type that looks like this, and which represents a date/time (see the 'Note' at the bottom, for an explanation of MessagePack extension types):

UTC=(-1,0x5b-161d46)

I can get this 'timestamp' value by looking up the key UTC in the map. I retrieve it as a MessagePack ExtensionValue like this:

 Value date = map.get(ValueFactory.newString("UTC")).asExtensionValue();

date is then an object which has 2 properties:

`type` = -1
`data` = `0x5b-161d46`

How do I convert data to a meaningful representation of the date? The 'data' should translate to a 'current' date, sometime around 16th November 2018. It's not as simple as converting the hex value to decimal. Do I need to separately unpack this data somehow? I suspect that 5b-161d46 probably needs to be treated as a byte array and then converted somehow.

I can do this to get the data part of the extension type as a byte array:

byte[] date = map.get(ValueFactory.newString("UTC")).asExtensionValue().getData();

which gives me [91, -22, 29, 70]

... and I can try to unpack it like this:

MessagePack.newDefaultUnpacker(date).unpackValue()

... however that just gives me the first byte (5b) converted to a long i.e. 91

And if I try any of these I get org.msgpack.core.MessageTypeCastException, probably because unpackValue just gives me a single long number

MessagePack.newDefaultUnpacker(date).unpackValue().asIntegerValue();
MessagePack.newDefaultUnpacker(date).unpackValue().asMapValue();
MessagePack.newDefaultUnpacker(date).unpackValue().asRawValue();

I've also tried the following:

MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(date);
while (unpacker.hasNext()) {
    MessageFormat f = unpacker.getNextFormat();
    switch (f) {
        case POSFIXINT:
        case NEGFIXINT: {
            int v = unpacker.unpackInt();
            break;
        }
    }
}

The values in the array are recognised as either POSFIXINT or NEGFIXINT, so I can use this to extract decimal integer values for each byte in the array, however that only allows me to extract the elements in the date array as integers, and I still don't know how to translate that to a date.

How do I need to interpret/unpack these dates?


Note - An extension value is a special type of MessagePack value, represented as a tuple of an extension type and its data. Here -1 is the extension type, which is reserved for the MessagePack timestamp, and the remainder is the data, shown as a hex value (0x5b-161d46):

https://github.com/msgpack/msgpack/blob/master/spec.md#timestamp-extension-type

Chris Halcrow
  • I wonder if that means that `61d46` (hex) is the number of seconds since the epoch? That would be 400710 (decimal) seconds or Monday 5th January 1970 15:18:30 UTC. Does that sound right to you? It doesn’t immediately to me, but you should know better. I cannot understand `0x5b-161d46` as one hex value, they don’t have a minus in the middle (at least not where I come from). – Ole V.V. Nov 16 '18 at 08:54
  • Thanks @Ole V.V. - the date should be a current date (i.e. sometime around our current date). This is what's confusing me. It's like 5b-161d46 should be a hex value that represents the number of seconds, however this format is strange. If I simply ignore the '-' sign (not a good assumption to make), then it gives me 1528175942 in decimal, which is a date of 5th June 2018, which is still wrong, it should be something like 17th November or close to that. – Chris Halcrow Nov 18 '18 at 23:19

1 Answer

I figured it out! First, the short version (how to convert a MessagePack timestamp value to a meaningful number in Java):

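import java.nio.ByteBuffer;

// getData() returns the raw extension payload: 4 bytes for a 32-bit timestamp
byte[] timestampValues = myTimestampExtensionValue.asExtensionValue().getData();
ByteBuffer wrapped = ByteBuffer.wrap(timestampValues);
// getInt() reads those 4 bytes as a big-endian int; getLong() would need 8 bytes
// and throws BufferUnderflowException on a 4-byte payload
long dateValue = Integer.toUnsignedLong(wrapped.getInt());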

In my own case, I was receiving the date as a timestamp extension value as part of a key/value pair in a map, like this:

UTC=(-1,0x5b-e28-35)

This could be in various formats, which was very confusing, e.g.:

(-1,0x5b-1b6f-24)
(-1,0x5b-1b7056)
(-1,0x5b-1b58-4)
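
These strings look less random under one assumption about how they are produced. The sketch below is a guess at the formatting, not something taken from the msgpack-java source: it prints each payload byte as a signed hex number with no zero padding. `SignedHexDump` and `signedHex` are hypothetical names. For the bytes [91, -14, 40, -53] it reproduces the 0x5b-e28-35 string from my example, so the stray minus signs appear to be negative bytes rather than separators.

```java
class SignedHexDump {
    // Renders each byte as a signed, unpadded hex number, e.g. (byte) -14 -> "-e".
    // Assumption: this mimics how the extension value is printed; it is not the library's code.
    static String signedHex(byte[] data) {
        StringBuilder sb = new StringBuilder("0x");
        for (byte b : data) {
            sb.append(Integer.toString(b, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {91, -14, 40, -53};
        System.out.println(signedHex(data)); // prints 0x5b-e28-35
    }
}
```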

What I discovered is that if I do this:

byte[] date = map.get(ValueFactory.newString("UTC")).asExtensionValue().getData();

... it always gives me a 4-byte (32-bit) array. For my example UTC=(-1,0x5b-e28-35), I get:

[91, -14, 40, -53]

This also confused me - I couldn't see how this could be an integer. The thing to recognise is that Java's byte type is signed (-128 to 127), so any byte whose unsigned value is above 127 is displayed as a negative number. To get the unsigned value, add 256 to each negative byte.

This isn't a memory optimisation, it's just how Java represents bytes (two's complement). The example above translates to the following unsigned values in decimal:

[91, 242, 40, 203]
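
A minimal sketch of that conversion, assuming the 4-byte payload from my example. Masking a byte with 0xFF gives its unsigned value (so -14 becomes 242 and -53 becomes 203), and shifting the four unsigned bytes together in big-endian order gives the 32-bit number. `UnsignedBytes`, `toUnsigned` and `bigEndianUInt32` are hypothetical names used only for illustration.

```java
import java.util.Arrays;

class UnsignedBytes {
    // Maps each signed byte (-128..127) to its unsigned value (0..255).
    static int[] toUnsigned(byte[] data) {
        int[] out = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] & 0xFF; // same result as Byte.toUnsignedInt(data[i])
        }
        return out;
    }

    // Combines the first four bytes, most significant first, into an
    // unsigned 32-bit value held in a long.
    static long bigEndianUInt32(byte[] data) {
        long value = 0;
        for (int i = 0; i < 4; i++) {
            value = (value << 8) | (data[i] & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] date = {91, -14, 40, -53};
        System.out.println(Arrays.toString(toUnsigned(date))); // prints [91, 242, 40, 203]
        System.out.println(bigEndianUInt32(date));             // prints 1542596811
    }
}
```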

There's an easy way though in Java to convert the original byte array [91, -14, 40, -53] to an Integer, by importing java.nio.ByteBuffer and using:

ByteBuffer wrapped = ByteBuffer.wrap(date);
Integer num = wrapped.getInt();

For my example, this gives us 1542596811, which turns out to be seconds since the Unix Epoch. So, if we convert that to milliseconds we now have 1542596811000, or a date of Mon 19 November 2018, 14:06:51 Melbourne time (03:06:51 UTC). Simple!
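
I only ever saw 4-byte payloads, but the timestamp spec linked in the question also defines 8-byte and 12-byte layouts. The sketch below is my reading of that spec rather than tested production code: it picks the layout from the payload length and returns a java.time.Instant. `MsgpackTimestamps` and `decode` are hypothetical names, and only the 4-byte branch has been checked against real data from my pipeline.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

class MsgpackTimestamps {
    // Decodes the data part of a MessagePack timestamp extension (type -1).
    static Instant decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // ByteBuffer is big-endian by default
        switch (data.length) {
            case 4: // timestamp 32: unsigned 32-bit seconds
                return Instant.ofEpochSecond(Integer.toUnsignedLong(buf.getInt()));
            case 8: { // timestamp 64: 30-bit nanoseconds, then 34-bit seconds
                long packed = buf.getLong();
                long nanos = packed >>> 34;
                long seconds = packed & 0x3FFFFFFFFL;
                return Instant.ofEpochSecond(seconds, nanos);
            }
            case 12: { // timestamp 96: unsigned 32-bit nanoseconds, then signed 64-bit seconds
                long nanos = Integer.toUnsignedLong(buf.getInt());
                long seconds = buf.getLong();
                return Instant.ofEpochSecond(seconds, nanos);
            }
            default:
                throw new IllegalArgumentException(
                        "Unexpected timestamp payload length: " + data.length);
        }
    }

    public static void main(String[] args) {
        byte[] date = {91, -14, 40, -53};
        System.out.println(decode(date)); // prints 2018-11-19T03:06:51Z
    }
}
```

Calling `.atZone(ZoneId.of("Australia/Melbourne"))` on the result gives the local date and time without any manual conversion to milliseconds.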

Chris Halcrow
  • Thanks for sharing your solution. java.time (the modern Java date and time API) accepts *seconds* since the epoch directly, and it’s nicer to avoid doing your own conversion to milliseconds: `Instant.ofEpochSecond(1_542_596_811).atZone(ZoneId.of("Australia/Melbourne"))`. – Ole V.V. Nov 19 '18 at 05:20