I am parsing a JSON document in Python and have nearly the whole process working, but I am having trouble converting a GPS string into the form I need.

The value looks like this:

"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}

which comes from this text on the HTML page:

44°21′N 68°13′W / 44.35°N 68.21°W / 44.35; -68.21 (Acadia)

and I want the final product to be a string that looks like this:

(44.35, -68.21)

Here are a few other example JSON strings to give you more to work with:

"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}

"gsx$gps":{"$t":"38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)"}

I have the following:

GPSlocation = entry['gsx$gps']['$t']

but I don't know how to get GPSlocation into the form shown above.

Joran Beasley
clifgray

4 Answers

1

Not super elegant, but it works. Also note that you are not parsing JSON at this point, just a string.

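# GPSlocation is assumed to be a unicode string, e.g. entry['gsx$gps']['$t']
center_part = GPSlocation.split("/")[1]             # the decimal-degrees segment
N, W = center_part.replace(u"\ufeff", u"").split()  # drop the \ufeff padding first
N, W = N.split(u"\xb0")[0], W.split(u"\xb0")[0]     # keep the text before the degree sign
tpl = (N, W)
print(tpl)  # both values are unsigned strings; the hemisphere letter is discarded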

On a side note, these values are strings, not numbers; wrap them in float() if you need numeric values.
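
For reference, \xb0 is the escape for the degree sign (U+00B0), which is why splitting on it leaves just the number. If signed numbers are wanted rather than strings, the hemisphere letter can supply the sign. The following is only a sketch in Python 3, assuming the middle segment always looks like 44.35°N 68.21°W; the function name signed_degrees is made up for illustration:

```python
def signed_degrees(gps_location):
    """Return (lat, lon) as floats from the middle decimal-degrees segment."""
    # Assumes three "/"-separated segments with the decimal-degree pair in the middle.
    middle = gps_location.split("/")[1].replace("\ufeff", "")
    values = []
    for part in middle.split():
        number, hemisphere = part.split("\xb0")  # "\xb0" is the degree sign
        value = float(number)
        # South and West are negative by convention.
        if hemisphere in ("S", "W"):
            value = -value
        values.append(value)
    return tuple(values)


sample = "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"
print(signed_degrees(sample))  # (44.35, -68.21)
```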

Joran Beasley
  • Alright, great. Yes, I am just parsing a string. This gives me what I need, but what exactly does the \xb0 signify? – clifgray Oct 02 '12 at 04:41
1

Here we go:

import json
jstr = """{"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}}"""
a = json.loads(jstr)
tuple(float(x) for x in a['gsx$gps']['$t'].split('/')[-1].split(u'\ufeff')[0].split(';'))

Gives:

(-14.25, -170.68)

Or from the plain string:

GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"
tuple(float(x) for x in GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))
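
To get the string form asked for in the question, the pair can be formatted directly. This is a small sketch, assuming Python 3 and that the signed pair is always the last "/"-separated segment; format_gps is a hypothetical helper name:

```python
def format_gps(gps_location):
    """Return the trailing 'lat; lon' pair formatted as '(lat, lon)'."""
    last_segment = gps_location.split("/")[-1]      # " 44.35; -68.21\ufeff (Acadia)"
    pair = last_segment.split("\ufeff")[0]          # " 44.35; -68.21"
    lat, lon = (float(x) for x in pair.split(";"))  # float() ignores surrounding spaces
    return "({}, {})".format(lat, lon)


sample = "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"
print(format_gps(sample))  # (44.35, -68.21)
```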

Some timeit results, showing why a plain split is preferable to a fancy regexp ;)

import re
import timeit
setup='GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"; import re'
print timeit.timeit("map(float, GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))", setup=setup)
print timeit.timeit("map(float, re.findall(r'(-?\d+(?:\.\d+)?)', GPSlocation)[-2:])", setup=setup)

Output, in seconds for timeit's default of 1,000,000 runs (split first, regexp second):

5.89355301857
22.6919388771
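
The statements above rely on Python 2, where map returns a list. On Python 3, map is lazy, so they would measure almost nothing unless wrapped in list(...). The following is a sketch of a Python 3 equivalent with a reduced run count (number=10000 is an arbitrary choice); no timings are quoted here because they depend on the machine:

```python
import timeit

# Setup code runs once before timing; it defines GPSlocation and imports re.
setup = (
    "import re\n"
    "GPSlocation = '14°15′S 170°41′W\\ufeff / \\ufeff14.25°S 170.68°W\\ufeff"
    " / -14.25; -170.68\\ufeff (American Samoa)'"
)
# list(...) forces the lazy map so the conversion work is actually timed.
split_stmt = "list(map(float, GPSlocation.split('/')[-1].split('\\ufeff')[0].split(';')))"
regex_stmt = "list(map(float, re.findall(r'(-?\\d+(?:\\.\\d+)?)', GPSlocation)[-2:]))"

split_time = timeit.timeit(split_stmt, setup=setup, number=10000)
regex_time = timeit.timeit(regex_stmt, setup=setup, number=10000)
print("split:", split_time)
print("regex:", regex_time)
```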
Michael
  • with GPSlocation all I have is this string: "14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)" but I suppose if I go back a step this works – clifgray Oct 02 '12 at 04:40
  • Just ignore the first two lines and replace `a['gsx$gps']['$t']` with `GPSlocation`. – Michael Oct 02 '12 at 04:42
  • the only problem I am having which I was having originally is that it isn't doing anything about the degree symbol and it cannot encode that – clifgray Oct 02 '12 at 05:07
  • Well, you have to enable unicode. Also remember the little `u'asdf'` in front of the strings. How you import the unicode string correctly depends on your data source. `json.loads` creates unicode automatically for example. – Michael Oct 02 '12 at 05:10
0

You can extract the data with regex:

>>> import re
>>> text = '''"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}'''
>>> map(float, re.findall(r'(-?\d+(?:\.\d+)?)', text)[-2:])
[44.35, -68.21]
Blender
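0

import re
# The trailing .+ matches whatever follows the pair, whether \ufeff is a real character or a literal backslash escape
re.sub(r'.+/ (-?\d{1,3}\.\d\d); (-?\d{1,3}\.\d\d).+',
       r"(\g<1>, \g<2>)",
       "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)")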
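
If a tuple of numbers is more useful than a string, the same idea works with re.search instead of re.sub. This is a rough sketch in Python 3, assuming the string contains exactly one "lat; lon" pair; PAIR and extract_pair are illustrative names:

```python
import re

# One signed decimal number, then ";", then another signed decimal number.
PAIR = re.compile(r"(-?\d+(?:\.\d+)?)\s*;\s*(-?\d+(?:\.\d+)?)")


def extract_pair(text):
    """Return (lat, lon) as floats from the 'lat; lon' part of the string."""
    match = PAIR.search(text)
    if match is None:
        raise ValueError("no 'lat; lon' pair found in %r" % text)
    return float(match.group(1)), float(match.group(2))


sample = "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"
print(extract_pair(sample))  # (44.35, -68.21)
```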
Need4Steed
  • This seems to have some issues, when you feed in unicode strings. Besides that, I don't think that the idea was, to output the values as string, but to get a tuple, where you can actually work with. – Michael Oct 02 '12 at 09:26