It seems that $dat[4] is invalid data. At least the first field should contain a second byte, because D5 indicates that at least one more byte follows.
$dat[2] is also invalid data, because the length field for 0x09 is 0x03 but the field itself contains four characters.
$dat[5] contains an invalid hex escape. Instead of \xEO (with the letter O), I use \xE0 (with the digit zero).
With these corrections, you can parse your input messages using the unpack function:
my( $number, $name ) = unpack 'xwxC/ax', $d;
The template for unpack means:
x - throw away this byte (0x08)
w - read a BER-encoded number
x - throw away this byte (0x09)
C - read this byte and use it as the length for the following string
a - read the next bytes and use them as string characters
x - throw away this byte (0x1A)
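To make the w step concrete, here is a minimal sketch of decoding a BER compressed integer by hand. The helper name ber_decode is made up for illustration and is not part of Perl; the result is compared with unpack 'w'. It also shows what goes wrong with $dat[4]: D5 has its high bit set, so the following 0x09 byte is consumed as part of the number instead of being seen as a field tag.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a BER compressed integer by hand.
# Each byte carries 7 bits of payload; a set high bit (0x80) means
# that another byte follows. Returns (value, bytes consumed), or an
# empty list if the data ends while a continuation is still announced.
sub ber_decode {
    my ($bytes) = @_;
    my ($value, $used) = (0, 0);
    for my $byte (unpack 'C*', $bytes) {
        $used++;
        $value = ($value << 7) | ($byte & 0x7F);
        return ($value, $used) unless $byte & 0x80;
    }
    return;    # truncated: the last byte still had the high bit set
}

# Number field of $dat[1], followed by the 0x09 tag of the next field
my ($n, $used) = ber_decode("\xB3\xE3\x0C\x09");
print "by hand: $n ($used bytes)\n";

# The same bytes decoded by unpack
my ($w) = unpack 'w', "\xB3\xE3\x0C";
print "unpack:  $w\n";

# $dat[4]: D5 announces a continuation, so 0x09 is swallowed
my ($bad, $bad_used) = ber_decode("\xD5\x09\x03");
print "dat[4]:  $bad ($bad_used bytes)\n";
```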
If you want to keep the field numbers as well, use
unpack 'CwCC/aC', $d
At least for the data as shown, the unpack template works under the assumptions I've stated. If this is actual ASN.1 data, far more validation is needed, and if the field separators might be missing, a regexp-based approach as shown by @ikegami is certainly more robust.
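As a rough idea of what such validation could look like, the sketch below checks the structure of a message before extracting the values. The sub name parse_message and the exact rules (one 0x08 field, one 0x09 field, a trailing 0x1A, nothing else) are my assumptions based on the sample data, not a description of the real format.

```perl
use strict;
use warnings;

# Hypothetical validating parser. Returns (number, name) for a
# well-formed message and an empty list otherwise.
sub parse_message {
    my ($msg) = @_;

    # 0x08, then a BER number (any bytes with the high bit set, followed
    # by one byte without it), then 0x09 and a single length byte.
    return unless $msg =~ /\A\x08([\x80-\xFF]*[\x00-\x7F])\x09(.)/s;
    my ($ber, $len) = ($1, ord $2);

    # Everything after the length byte must be the name plus 0x1A.
    my $rest = substr $msg, $+[0];
    return unless length($rest) == $len + 1;
    return unless substr($rest, -1) eq "\x1A";

    my ($number) = unpack 'w', $ber;
    return ($number, substr($rest, 0, $len));
}

for my $msg (
    "\x08\xB3\xE3\x0C\x09\x07\x4D\x6F\x68\x61\x6D\x65\x64\x1A",  # valid
    "\x08\x84\x03\x09\x03\x53\x6F\x6C\x6C\x1A",  # length 3, four characters
    "\x08\xD5\x09\x03\x55\x6F\x6C\x1A",          # number swallows the 0x09 tag
) {
    my @fields = parse_message($msg);
    print @fields ? "ok: @fields\n" : "malformed\n";
}
```

With the three sample messages, only the first one should pass; the other two are the $dat[2] and $dat[4] cases discussed at the top.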
Fixed/dynamic field order
The template relies on a fixed order of the fields. If the field order is not certain to be fixed, you will need to determine the unpack template(s) based on the type of each field in a loop. This brings the unpack approach close to the approach by ikegami.
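# 'a' keeps the type as a one-byte string so it can be compared with eq;
# 'a*' returns the remaining bytes unchanged ('A*' would strip trailing spaces)
(my ($message_type), $d) = unpack 'aa*', $d;
if( $message_type eq "\x08" ) {
    (my ($number), $d) = unpack 'wa*', $d;
    print "Field 0x08: $number\n";
} # elsif ... handle the other field types in the same way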
The following complete program shows both the parser for the fixed field order and the dynamic parser:
#!perl
use strict;
use warnings;

my @dat;
$dat[1] = "\x08\xB3\xE3\x0C\x09\x07\x4D\x6F\x68\x61\x6D\x65\x64\x1A";
#$dat[2] = "\x08\x84\x03\x09\x03\x53\x6F\x6C\x6C\x1A";
$dat[3] = "\x08\xD4\xEA\x0E\x09\x03\x54\x6F\x6C\x1A";
#$dat[4] = "\x08\xD5\x09\x03\x55\x6F\x6C\x1A";
$dat[5] = "\x08\xD4\xEA\x09\x09\x03\x54\x6F\x6C\x1A";
$dat[6] = "\x08\xD4\xEA\x0E\x09\x09\x54\x6F\x6C\x61\x6D\x65\x64\x61\x61\x1A";
$dat[7] = "\x08\xD4\xEA\x09\x09\x09\x54\x6F\x6C\x61\x6D\x65\x64\x61\x61\x1A";

@dat = grep { defined } @dat;

use Data::Dumper;

for my $d (@dat) {
    # Hardcoded message parser
    print Dumper [
        unpack 'CwCC/aC', $d
    ];

    # Dynamic message parser
    while( length $d ) {
        (my ($message_type), $d) = unpack 'aa*', $d;
        if( $message_type eq "\x08" ) {
            (my ($number), $d) = unpack 'wa*', $d;
            print "Field 0x08: $number\n";
        } elsif ( $message_type eq "\x09" ) {
            (my ($len)) = unpack 'C', $d;
            (my ($name), $d) = unpack 'C/aa*', $d;
            print "Field 0x09: $name\n";
        } elsif ( $message_type eq "\x1A" ) {
            # finished
            print "Field 0x1A\n";
        } else {
            die sprintf "Unknown message type %08x", ord($message_type);
        };
    };
};
Output
$VAR1 = [
8,
848268,
9,
'Mohamed',
26
];
Field 0x08: 848268
Field 0x09: Mohamed
Field 0x1A
$VAR1 = [
8,
515,
9,
'Sol',
26
];
Field 0x08: 515
Field 0x09: Sol
Field 0x1A
$VAR1 = [
8,
1389838,
9,
'Tol',
26
];
Field 0x08: 1389838
Field 0x09: Tol
Field 0x1A
$VAR1 = [
8,
1389833,
9,
'Tol',
26
];
Field 0x08: 1389833
Field 0x09: Tol
Field 0x1A
$VAR1 = [
8,
1389838,
9,
'Tolamedaa',
26
];
Field 0x08: 1389838
Field 0x09: Tolamedaa
Field 0x1A
$VAR1 = [
8,
1389833,
9,
'Tolamedaa',
26
];
Field 0x08: 1389833
Field 0x09: Tolamedaa
Field 0x1A
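As a variation on the dynamic parser, the per-field templates can be kept in a lookup table, so that supporting another field type means adding one entry instead of another elsif branch. This is only a sketch: the names %template_for and parse_fields are made up for illustration, and the table contains just the three field types that appear in the sample data.

```perl
use strict;
use warnings;

# Hypothetical lookup table: field type byte => unpack template for its value.
# An empty template means the field has no payload.
my %template_for = (
    "\x08" => 'w',      # BER-encoded number
    "\x09" => 'C/a',    # length byte followed by that many characters
    "\x1A" => '',       # end marker
);

# Split a message into [ type, value ] pairs, in whatever order they occur.
sub parse_fields {
    my ($d) = @_;
    my @fields;
    while ( length $d ) {
        (my ($type), $d) = unpack 'aa*', $d;
        die sprintf 'Unknown field type 0x%02x', ord $type
            unless exists $template_for{$type};

        my @value;
        if ( length $template_for{$type} ) {
            # Read the value, keep the remaining bytes for the next round
            (my ($v), $d) = unpack "$template_for{$type}a*", $d;
            @value = ($v);
        }
        push @fields, [ ord $type, @value ];
    }
    return @fields;
}

for my $field ( parse_fields("\x08\xD4\xEA\x0E\x09\x03\x54\x6F\x6C\x1A") ) {
    my ($type, @value) = @$field;
    printf "Field 0x%02X%s\n", $type, @value ? ": $value[0]" : '';
}
```

Like the dynamic parser in the program above, this dies on an unknown type byte and does no further validation of the field contents.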
See also
unpack function
(un)pack parameters