1

Okay, so I think I'm close here, but I'm hitting a limitation in my understanding of binary data.

I'm parsing some MySQL data inserted as geometry types, using PHP's unpack() as my parser, and everything was going swimmingly until I started trying to unpack complex geometry types (e.g. MULTIPOINT).

For a POINT data type, I've had good success using an unpack pattern that simply disregards the first chunk and then gives me an associative array of type, order, lat and lon:

$coords = unpack('x/x/x/x/corder/Ltype/dlat/dlon', $point);
// >>> [
//       'order' => 1,
//       'type' => 1,
//       'lat' => (expected value),
//       'lon' => (expected value)
//     ];

Naturally, applying the exact same pattern to MULTIPOINT geometries doesn't work the same. It gets the order and the type becomes 4, but the values of lat and lon are not at all what I'd expect. So, curious to see what the whole thing looked like, I changed the pattern to just spew it all out as "double (machine dependent size and representation)" types:

$coords = unpack('x/x/x/x/corder/Ltype/d*', $multipoint);
// >>> mayhem

What that unpacks actually includes five additional array items, instead of the 4 I'd expect for a multipoint with two points (2x2), and the values are totally wack. For instance, a value I'd expect to be somewhere in the 40-something-point-whatever range seems to read something like -1.0977282851114052e-218.

What is the proper way to unpack a MULTIPOINT? My instinct says I'm slicing the bytes where I shouldn't or casting them to an inappropriate type, but I'm not sure what those should be.

miken32
  • 42,008
  • 16
  • 111
  • 154
wosevision
  • 123
  • 11
  • 1
    I assume you're already familiar with `SELECT ST_AsText(points) FROM mytable`? – miken32 Oct 10 '18 at 14:48
  • I am, thank you. Manually parsing the points as text is my last resort, since the breadth of geometry types available is going to create a whole lot of regex-y boilerplate code using text parsing. A library feels like overkill, but that's also a second-last resort. I'm choosing to try unpacking from binary because it's clean and performant and gets me exactly what I want with one line of code. On `POINT` types at least. – wosevision Oct 10 '18 at 15:08

1 Answers1

1

Took a bit of looking, but I found a reference for the WKB formats of various geometries. As I guessed by looking at the output of

SELECT HEX(ST_GeomFromText('MULTIPOINT(1 1, 2 2, 3 3)'))

there is a count of points after the order and type. The challenge is that the order and type are re-declared for every point and unpack doesn't have a concept of repeating a group of bytes. So you'll need to pull the bytes for each point and run unpack on them again. Obviously this will start to get more involved once you try accounting for more spatial types.

<?php
$multipoint_wkb = hex2bin("000000000104000000030000000101000000000000000000F03F000000000000F03F010100000000000000000000400000000000000040010100000000000000000008400000000000000840");

function unpack_multipoint($multipoint)
{
    $data = unpack("x4/corder/Ltype/Lcount", $multipoint);
    for ($i = 0; $i < $data["count"]; $i++) {
        // the header is 1+4+8 bytes and each point record is 1+4+8+8 bytes
        $offset = ($i * 21) + 13;
        $return[] = unpack("corder/Ltype/dlat/dlon", $multipoint, $offset);
    }
    return $return;
}

print_r(unpack_multipoint($multipoint_wkb));

It's worth noting that those 4 NUL bytes are inserted by MySQL but are not part of the actual geometry object. I left them in place per your question, but if you retrieve the data using the ST_AsWKB function:

SELECT HEX(ST_AsWKB(ST_GeomFromText('MULTIPOINT(1 1, 2 2, 3 3)')))

the extra bytes are not prepended.

miken32
  • 42,008
  • 16
  • 111
  • 154
  • Tested and confirmed, this definitely works, on both counts! Thanks for that extra tip about the extra 4 bytes, that felt really smelly. Now I think the challenge is going to be deciding whether building out those utility functions _is_ in fact going to be less work than string parsing. A lot to chew on. Question beautifully answered though, + reference to boot. – wosevision Oct 10 '18 at 16:21
  • Glad to help, always nice to see a PHP question from a low-rep user that isn't just "give teh codez" ;) – miken32 Oct 10 '18 at 16:24
  • Simplified a bit, assuming you're using PHP >= 7.1 which was released 2 years ago. – miken32 Oct 10 '18 at 20:03