6

I have been experimenting with geography datatype lately and just love it. But I can't decide should i convert from my current schema, that stores latitude and longitude in two separate numeric(9,5) fields to geography type. I have calculated the size of both types and Lat/Long way of representing a point is 28 bytes for a single point whereas geography type is 26. Not a big gain in space but huge improvement in performing geospatial operations (intersect, distance measurement etc.) which are currently handled using awkward stored procedures and scalar functions. What I wonder is the indices. Will geography data type require more space for indexing the data? I have a feeling that it will, even though the actual data stored in columns is less, I thing the way geospatial indices work will eventually result in larger space allocation for them.

P.S. as a side note, it seems that SQL Server 2008 (not R2) does not automatically seek through geospatial indices unless explicitly told to using WITH(INDEX()) clause

Dimitri
  • 6,923
  • 4
  • 35
  • 49
  • I don't want to be "that guy" but try it out and see what happens! If it's infeasible to do it for your entire data set, pick a subset and compare that subset for both approaches. – Ben Thul May 09 '11 at 21:07
  • unfortunately trying is not an option since we're talking about tens of millions of records. And trying on a subset is not a realistic test, since throughout my experience bunch of times when something performs perfect on smaller scale is a complete disaster in production. – Dimitri May 09 '11 at 21:09
  • You asked specifically about space, not performance (unless I'm missing something). As such, taking 10000 rows (for instance) and comparing the two strategies is a good test. – Ben Thul May 10 '11 at 15:14
  • well, even if i do that, that would only answer YES/NO. But i'd really love to know WHY. – Dimitri May 10 '11 at 16:36

3 Answers3

1

I understand that you are trying to decide between keep one of the two, but you might want to consider keeping both. If you export your data into shape files, its a common practice to let lat lon field be along with the geom field.

Shaunak
  • 17,377
  • 5
  • 53
  • 84
  • i just read your comment about number of records :P. i dont think you will ever export it to different formant will you? – Shaunak May 09 '11 at 21:20
  • well, the database is very hard to back up every day as is (cause it changes dramatically each day). Duplicating lat/long will increase the size unnecessarily. – Dimitri May 09 '11 at 21:23
1

In my opinion you should definitely use the spatial types only. The spatial type are optimized for spatial queries and if spatial queries are what you need then I think it is an easy choice.

As a sideeffect you can get rid of your geographical functions and procedures since they are (probably) built-in in SQL server 2008. One caveat though, you might have to spend some time optimizing the spatial indexes, but this depends on your specific case.

steenhulthin
  • 4,553
  • 5
  • 33
  • 52
  • Yep, that's what I did. Spatial Indices do seem to take up more space, but once optimized they offer performance increase that's worth the extra disk space. – Dimitri Jun 15 '11 at 12:48
0

I would keep both. It can be useful to easily query the original coordinates of a particular feature without requiring spatial operations. You have the benefit of knowing the original points as well as the ability to create a new geometry from them in case you need it in a different coordinate system (like if you have your geometry in a particular projection that will lose a lot of precision going to another).

Jordan Parmer
  • 36,042
  • 30
  • 97
  • 119
  • Yeah but if i continue to "query the original coordinates" doesn't that defy the whole reason i want to move away from the old way of doing things? Plus we're talking millions of rows per day. And i would hate keeping duplicate data if i'm not gonna use it. – Dimitri Jun 06 '11 at 14:08
  • Well, it depends on who all would read the data. For instance, certain IT departments in various companies have testers separate from the development team. The testers do not understand spatial queries. For them, it is easy to quickly see what the internal lat/lon value is without having to extract it. If you don't have a good reason to store it, then I wouldn't keep a copy. – Jordan Parmer Jun 06 '11 at 19:16
  • Yeah, in our company the data is used only on backend, never touching the user directly, and every developer knows (or will have no problem understanding) how to work with geospatial data. But i guess i have already answered to my question. – Dimitri Jun 06 '11 at 20:37
  • Yep - no need to keep it then. – Jordan Parmer Jun 06 '11 at 21:13