Does anyone have a clue why Perl's MessagePack gives different results from the one in Node.js?

I am trying to unpack a msgpack string that was created using Perl's MessagePack, and it doesn't work.

Example: Array ["a","b","c","d","f"]

Packing it in Perl gives: ��a�b�c�d�e�f

Packing it in Node.js (using various modules) gives: ¡a¡b¡c¡d¡f

Does anyone have a clue?

    What does your perl script look like? Did you use utf8? Also your array from Perl is clearly different as it has an 'e' .. – David K-J Feb 26 '15 at 15:42

2 Answers

I have read the spec for msgpack, and so have revised this answer.

The difference you are seeing between data encoded in Node and Perl is simply down to the representation of the data when printed. As msgpack is a binary format, you can't just print it to a terminal - it's not representative because the data structure bytes aren't printable or cause the next byte to be represented as something completely different.
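To see the actual bytes rather than a terminal's rendering of them, it helps to dump them as hex. The sketch below is in Node.js and uses only the built-in Buffer. packShortStringArray is a hypothetical helper written for this illustration (it is not part of any msgpack module); it hand-encodes the array following the fixarray and fixstr rules of the msgpack spec, and assumes fewer than 16 elements and ASCII strings shorter than 32 bytes.

```javascript
// Hypothetical helper, for illustration only: hand-encodes an array of short
// ASCII strings using the msgpack fixarray (0x90 | count) and
// fixstr (0xa0 | byte length) formats.
function packShortStringArray(items) {
  const parts = [Buffer.from([0x90 | items.length])];
  for (const s of items) {
    const body = Buffer.from(s, 'ascii');
    parts.push(Buffer.from([0xa0 | body.length]), body);
  }
  return Buffer.concat(parts);
}

const packed = packShortStringArray(['a', 'b', 'c', 'd', 'f']);

// The hex dump is unambiguous, whatever the terminal does with the raw bytes.
console.log(packed.toString('hex')); // 95a161a162a163a164a166

// The same 11 bytes rendered as text under two different decodings.
console.log(JSON.stringify(packed.toString('latin1')));
console.log(JSON.stringify(packed.toString('utf8')));
```

Decoded as Latin-1, the bytes give the ¡a¡b¡c¡d¡f form. Decoded as UTF-8, the bytes 0x95 and 0xa1 are not valid on their own and are typically shown as replacement characters, which would be consistent with the two outputs in the question being different renderings of one payload.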

This Perl script produces the same output as Node; nudging Perl to represent the data slightly differently is enough to make the two look the same. The special part is use open qw/:std :utf8/;, which applies the :utf8 layer to the standard streams, so each byte of the packed string is printed as the UTF-8 encoding of the character with that code point. I haven't tested Node, as the OP hasn't said which packages are used.

#!/usr/bin/env perl
use strict;
use warnings;

use open qw/:std :utf8/;

use Data::Dumper;
use Data::MessagePack;

my $mp = Data::MessagePack->new();

my $packed = $mp->pack([qw(a b c d f)]);

printf("packed: %s\n", $packed);
print Dumper $mp->unpack($packed);

The output looks like this:

packed: ¡a¡b¡c¡d¡f
$VAR1 = [
          'a',
          'b',
          'c',
          'd',
          'f'
        ];

In my terminal, there is a zero-width character at the beginning of the packed string. That character doesn't paste. I initially thought it was a UTF-8 BOM, but after checking the msgpack spec, I found that it is part of the binary message: it is the header byte that introduces the array.
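As a small check of that reading, the first byte can be decoded by hand. This is only a sketch: it assumes the first byte of the packed array is 0x95, which is what the msgpack spec's fixarray format gives for a five-element array.

```javascript
// Assumed first byte of the packed five-element array.
const header = 0x95;

// In the msgpack spec, fixarray headers have the bit pattern 1001xxxx,
// and the low four bits hold the element count.
const isFixArray = (header & 0xf0) === 0x90;
const elementCount = header & 0x0f;
console.log(isFixArray, elementCount); // true 5

// Printed as a character, 0x95 becomes U+0095, which falls in the C1 control
// range (U+0080 to U+009F), so terminals usually show nothing for it.
const isC1Control = header >= 0x80 && header <= 0x9f;
console.log(isC1Control); // true
```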

David K-J
  • It worked, but can I actually encode the ��a�b�c�d�e�f to UTF-8 in Node and then pass it to msgpack.unpack? – user3683370 Feb 27 '15 at 09:39
  • Both are equal UTF-8 strings, so it should be all good. However, if you are having trouble, then we will need to know what you're doing on the JS side. – David K-J Feb 27 '15 at 10:27
  • Also, how can I unpack the encode_utf8 message in Perl, as in the example above? – user3683370 Feb 27 '15 at 12:37
  • This example my $packed = $mp->pack([qw(a b c d f)]); print "packed:".$packed."\n"; my $encoded = Encode::encode_utf8($packed); print "packed encoded using encode_utf8 :".$encoded."\n"; my $decoded = Encode::decode_utf8($encoded); print "packed decoded using decode_utf8:".$decoded."\n"; my $unpacked = $mp->unpack($encoded); print $unpacked."\n"; – user3683370 Feb 27 '15 at 13:10
  • produces : packed decoded using decode_utf8:��a�b�c�d�f Data::MessagePack->unpack: extra bytes at /home/s.charitakis/workspace/minimob_trunk/test_msgpack.pl line 29. – user3683370 Feb 27 '15 at 13:11
What I am trying to achieve is the following: a Perl script writes the msgpack data to a Redis DB without any UTF-8 encoding. I then need to get the value using Node.js and unpack it. The Perl script also needs to be able to get the value from the DB and unpack it.

If I use this in Perl:

use strict;
use warnings;

use Data::MessagePack;
use Encode;

my $mp = Data::MessagePack->new();

my $packed = $mp->pack([qw(a b c d f)]);
print "packed:".$packed."\n";

my $encoded =  Encode::encode_utf8($packed);
print "packed encoded using encode_utf8 :".$encoded."\n";

my $decoded = Encode::decode_utf8($encoded);
print "packed decoded using decode_utf8:".$decoded."\n";

my $unpacked = $mp->unpack($decoded);
print $unpacked."\n";

The output is:

packed:��a�b�c�d�f

packed encoded using encode_utf8 :¡a¡b¡c¡d¡f

packed decoded using decode_utf8:��a�b�c�d�f

Data::MessagePack->unpack: extra bytes at /home/myname/workspace/test/test_msgpack.pl line 29.

So I see two options. Either I don't convert anything to UTF-8 in Perl and just send the raw data to the DB, leaving Node.js to do the rest, in which case Node.js also has to write its data back in a format that Perl can unpack,

or

I don't do any conversion in Node.js and simply use one of the existing msgpack modules to unpack the message for processing, then pack it again and save it to the DB for Perl to fetch and unpack.

With the second option I hit the problem shown above:

Data::MessagePack->unpack: extra bytes at /home/myname/workspace/test/test_msgpack.pl line 29.

With the first option, Node.js does not understand the msgpack format that Perl saved to the DB.
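One way to see why a UTF-8 round trip can disturb the payload is to compare the bytes before and after. The Node.js sketch below uses only the built-in Buffer and a hand-written copy of the 11 msgpack bytes for the example array (an assumption based on the msgpack spec, not output captured from either program). Treating each byte as a Latin-1 character and then encoding as UTF-8 is meant to mirror what encoding the packed string as UTF-8 does; whether that is exactly what happens inside any particular msgpack or Redis module is not something this sketch establishes.

```javascript
// Assumed msgpack bytes for ["a","b","c","d","f"], written out by hand.
const packed = Buffer.from('95a161a162a163a164a166', 'hex');

// Treat every byte as one Latin-1 character, then encode those characters
// as UTF-8. Bytes above 0x7f turn into two bytes each.
const asText = packed.toString('latin1');
const reencoded = Buffer.from(asText, 'utf8');

console.log(packed.length, reencoded.length); // 11 17
console.log(reencoded.toString('hex')); // c295c2a161c2a162c2a163c2a164c2a166

// The payload now starts with 0xc2, which the msgpack spec defines as `false`,
// so a decoder would read one value and then find leftover bytes.
console.log(reencoded[0] === 0xc2); // true

// Undoing the two steps in reverse order restores the original bytes.
const restored = Buffer.from(reencoded.toString('utf8'), 'latin1');
console.log(restored.equals(packed)); // true
```

If this is what is happening, it suggests keeping the payload as raw bytes from end to end: writing the packed value from Perl without any encoding step, and reading it in Node.js as a Buffer rather than as a string before handing it to the msgpack module.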