5

I have a database where one field gives spatial coordinates. I have learned the field is a serialised MSDN geometry Data Type (http://msdn.microsoft.com/en-us/library/bb933973.aspx).

I want to access this database from Python and was wandering if anyone knew the format of the Geometry Data Type, or any libraries capable of parsing it out into a set of Geo Coordinates in Python.

The link states that Microsoft used the "Open Geospatial Consortium (OGC) standard" in designing this data type, does this mean the spatial coordinates are defined by this standard?

Does any one else have any experience with this?

Any help would be much appreciated!

Nick Cartwright
  • 8,334
  • 15
  • 45
  • 56

2 Answers2

3

As figured out in the comments below (thanks MarkJ!):

  • geometry is a .NET datatype but uses its own custom serialization format; you could select the column whole and then reimplement this by opening Microsoft.SqlServer.Types.dll in Reflector and starting from
  • or you can use the SQL server support for the type to read out the properties of the geometry data from the database, e.g. select geocolumn.STX, geocolumn.STY from myTable;
  • or you can export the whole value as GML with e.g. select geocolumn.AsGml() from myTable; which can be processed by Python geometry libraries such as http://gispython.org/ http://mapnik.org/ http://www.qgis.org/wiki/Python_Bindings

I had originally thought SQL Server stored CLR data types as serialized .NET objects directly in the table but this turned out to be wrong.

Rup
  • 33,765
  • 9
  • 83
  • 112
  • I understand - thank you. I will have a play, but why on earth would the guys at Microsoft think it was a good idea to serialise an object into a Database table? It breaks the rule of separation! – Nick Cartwright Aug 12 '10 at 07:23
  • @nickcartwright @Rup The contents are not a serialized .NET object but a representation of geometry. This is a very good thing because it means you can do spatial queries in SQL, for instance "list all customers within 10 miles of this office" or "join customers to offices so that each customer is paired with their nearest office" [More here](http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx) I don't know anything about the Python geometry support or SQL server support, but I'd have hoped the Python SQL server connection would translate SQL Server geometries into Python ones. – MarkJ Aug 12 '10 at 07:36
  • @MarkJ OK, I'm not 100% sure how it's stored but it is a .NET object http://msdn.microsoft.com/en-us/library/cc280487.aspx "The planar spatial data type, geometry, is implemented as a common language runtime (CLR) data type in SQL Server." If you select the column whole in management studio you get a binary blob. I was going to try selecting back a column as-is then feeding it into BinarySerializer but the SqlClient class automatically returns you a SqlGeometry object instead so I can't conclude anything from that. – Rup Aug 12 '10 at 08:35
  • 2
    OK, I'm wrong - it is binary serialised but not as a whole object. Open Microsoft.SqlServer.Types.dll in Reflector and look at SqlGeometry's Read() method. You might do better working from the GML format which might already have support in your classes `select geocolumn.AsGml() from myTable` else it would be possible to replicate the SqlGeometry parse logic from Reflector. – Rup Aug 12 '10 at 08:54
  • 2
    OK, I was wrong too, it is a .Net object! I suppose my point is that the underlying representation isn't really relevant. "The guys at Microsoft" have provided a higher level abstraction for dealing with geometry in SQL Server, which is an excellent idea. [GML](http://en.wikipedia.org/wiki/Geography_Markup_Language) is a widely used XML standard for exchanging spatial geometry data. So GML ought to be a good option. I would hope that there are GML libraries for Python. Google is suggesting these http://gispython.org/ http://mapnik.org/ http://www.qgis.org/wiki/Python_Bindings – MarkJ Aug 12 '10 at 11:22
1

"…if anyone knew the format of the Geometry [data type]…"

The binary serialization format for SQL Server's spatial GEOMETRY and GEOGRAPHY types is specified here:

[MS-SSCLRT]: Microsoft SQL Server CLR Types Serialization Formats

"Specifies the binary format of the GEOGRAPHY, GEOMETRY, HIERARCHYID, and CLR user-defined type (UDT) structures that are managed by SQL Server."

This specification is well written and the binary format is easy to understand, so it shouldn't be much of a problem implementing a basic parser for the binary format yourself.

"…or any libraries capable of parsing it out into a set of [geo coordinates] in Python…"

Using Microsoft.SqlServer.Types via .NET interop to deserialize these types:

If you don't want to implement your own de-serializer (which should be fairly simple), but you can find a way to interact with a .NET assembly from Python — perhaps via pythonnet? —, then the following hints may be of interest:

The two T-SQL types GEOMETRY and GEOGRAPHY are implemented as a combination of a .NET assembly (Microsoft.SqlServer.Types) that performs the de-/serialization from/to the binary format just mentioned above, and an unmanaged DLL (SqlServerSpatial….dll) which contains almost everything else (i.e. the routines for spatial operations).

If you're only interested in de-serializing SQL Server spatial data, and you're careful not to invoke any spatial functions on SqlGeometry or SqlGeography, then you might be able to use Microsoft.SqlServer.Types to de-serialize spatial binary data for you, then inspecting it with an implementation of IGeometrySink110 that you have to provide to e.g. the SqlGeometry.Populate method.

Microsoft.SqlServer.Types and SqlServerSpatial….dll are available either as a .NET project-wide NuGet package, or as a system-wide MSI installation package (SQLSysClrTypes.msi). AFAIK the DLLs are also automatically installed with SQL Server.

Well-Known Text (WKT) and Well-Known Binary (WKB):

One more option would be to let SQL Server translate spatial values to Well-Known Text (WKT) or Well-Known Binary (WKB) using SELECT geometryColumn.STAsText() or SELECT geometryColumn.STAsBinary(), then look for a Python library that can parse these standard interchange formats.

(One word of caution: If you go down that route, just be careful if your data contains circular arcs. There are different versions of the WKT and WKB data format. They were first specified as part of the Simple Features Access specification of the Open Geospatial Consortium; that version doesn't understand about circular arcs. Support for circular curve segments was added in the SQL/MM Part 3: Spatial standard, which SQL Server implements.)

Community
  • 1
  • 1
stakx - no longer contributing
  • 83,039
  • 20
  • 168
  • 268