
I am trying to write a Python function using the pyproj module that converts coordinates based on two factors: the ending of the file name and the names of two columns.

For example: if self.file_crs == 'IG', i.e. the file name ends in IG for Irish Grid

AND

for idx,el in enumerate(row):
  if keys[idx].capitalize() in ['Easting', 'Northing']:

i.e. the two columns are called Easting and Northing

THEN RUN

inProj = Proj(init='epsg:29903') # Irish Grid
outProj = Proj(init='epsg:4326') # WGS84
x1, y1 = row[1], row[2]  # easting, northing
x2, y2 = transform(inProj, outProj, x1, y1)
row[1], row[2] = y2, x2

How can I combine these to look something like:

if self.file_crs == 'IG' and keys[idx].capitalize() in ['Easting', 'Northing']:
  inProj = Proj(init='epsg:29903') # Irish Grid
  outProj = Proj(init='epsg:4326') # WGS84
  x1, y1 = row[1], row[2]  # easting, northing
  x2, y2 = transform(inProj, outProj, x1, y1)
  row[1], row[2] = y2, x2

I need to be able to reference idx beforehand so it is recognised in my 'if' statement

EDIT

keys are the column names in the CSV which is being parsed:

if line_count == 0:
  keys = row

The rows are as follows:

Name Easting Northing Time
Test1 169973 77712 01/01/2020 09:51:03 AM
  • What are `keys` and `row`? – quamrana Jun 22 '21 at 11:17
  • @quamrana sorry, see edited question! – justneedhelp Jun 22 '21 at 11:21
  • Do you mean that the `keys` are the column names and `row` is one row at a time? Can you give an example of one row? – quamrana Jun 22 '21 at 11:23
  • @quamrana exactly! And yes I have included an example of a row – justneedhelp Jun 22 '21 at 11:28
  • First I would say that you should check the column names when `line_count == 0` and throw an exception if they are not what you expect. That would eliminate one of the complexities. – quamrana Jun 22 '21 at 11:33
  • yes I have done that with ```line_count += 1``` thank you! – justneedhelp Jun 22 '21 at 12:14
  • `keys[idx].lower()` will never be in `['Easting', 'Northing']`, since E and N are upper case! You only need to iterate over the rows. – Balaïtous Jun 22 '21 at 18:03
  • apologies, they are changed to capitalize now. But I still need a way to run ```if self.file_crs == 'IG' and keys[idx].capitalize() in ['Easting', 'Northing']:``` – justneedhelp Jun 23 '21 at 09:03
  • Do you get any errors with: `keys[idx].capitalize()`? – quamrana Jun 23 '21 at 09:47
  • @quamrana nope, it works fine! My issue is just trying to combine the 2 IF statements into 1. Then I get that error ```invalid syntax``` – justneedhelp Jun 23 '21 at 10:18
  • Can you update your question to show all the parts in sequence? It's very difficult to see which bit follows which. I think you only need to check the keys at the time you are reading the first line. – quamrana Jun 23 '21 at 10:43
  • @quamrana that should be it now. It is at the ```# Changing of CRS based on file name``` part that I wish to include the new ```if self.file_crs == 'IG' and keys[idx].capitalize() in ['Easting', 'Northing']:``` – justneedhelp Jun 23 '21 at 10:56
  • Ok, I see what is going on now. I'll have a look to see if we can get what you need. – quamrana Jun 23 '21 at 10:57
  • Do you need the: `inProj` and `outProj` variables to be a new instance for each row? – quamrana Jun 23 '21 at 11:02
  • @quamrana I do yes! – justneedhelp Jun 23 '21 at 11:04
  • Oh, I was hoping that they could be instantiated once and reused for each row. – quamrana Jun 23 '21 at 11:10
  • So, does your `transform()` function modify the `inProj` and `outProj` variables? – quamrana Jun 23 '21 at 11:19
  • @quamrana hmm well I hadn't thought of that, but if you feel it would work well I am willing to give it a try! The ```transform()``` takes the CRS identified in the ```inProj``` and ```outProj``` and uses them to correctly convert the coordinates – justneedhelp Jun 23 '21 at 11:23

2 Answers


Ok, I tried this in a makeshift test harness and this runs:

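import csv
import uuid

from pyproj import Proj, transform
# Assumption: Point and wkb come from shapely (the question does not show its imports).
from shapely import wkb
from shapely.geometry import Point


class TestPositions: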
    def __init__(self, crs):
        self.file_crs = crs

    def process_incoming_file(self, bucket, key, event):
        if self.file_crs == 'BNG':
            inProj = Proj(init="epsg:27700")  # British National Grid
            outProj = Proj(init="epsg:4326")  # WGS84
        else:
            inProj = Proj(init='epsg:29903')  # Irish Grid
            outProj = Proj(init='epsg:4326')  # WGS84

        try:

            decoded_content = ['Name|Easting|Northing|Time', 'Test1|169973|77712|01/01/2020 09:51:03 AM']
            print('processing data')
            rows = csv.reader(decoded_content, delimiter='|')
            for line_count, row in enumerate(rows):
                if line_count == 0:
                    keys = [title.lower() for title in row]
                    print('keys', keys)
                    isEasting = ('easting' in keys)
                else:
                    json_doc = {}
                    for idx, el in enumerate(row):
                        if keys[idx] in ['time', 'date serviced', 'timestamp']:
                            timestamp = self.format_timestring(el)
                        else:
                            json_doc[keys[idx]] = el

                    if isEasting:
                        # Proj(init=...) uses x/y order, so transform() returns (lon, lat).
                        lon, lat = transform(inProj, outProj, float(json_doc.pop('easting')), float(json_doc.pop('northing')))
                        json_doc['longitude'], json_doc['latitude'] = lon, lat
                    # csv yields strings, so convert before building the Point.
                    geom = Point(float(json_doc['longitude']), float(json_doc['latitude']))
                    WKB_format = wkb.dumps(geom, hex=True, srid=4326)

                    fid = uuid.uuid4()  # assign new UUID to each row

                    print(json_doc['name'])
                    print(fid)
                    print(timestamp)
                    print(WKB_format)
                    print(json_doc)

        except Exception as e:
            print(f'Exception: {e.__class__.__name__}({e})')

You can see how I generate inProj and outProj at the beginning of the method so it is done once per call to process_incoming_file().
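One way to keep that one-off setup tidy is a lookup from the file suffix to the source EPSG code. This is only a sketch: EPSG_BY_SUFFIX, TARGET_EPSG and epsg_for are hypothetical names, not part of the code above, and unlike the if/else above (which treats anything other than 'BNG' as Irish Grid) it raises on an unknown suffix.

```python
# Hypothetical lookup table: file-name suffix -> EPSG code of the source CRS.
EPSG_BY_SUFFIX = {
    'IG': 'epsg:29903',   # Irish Grid
    'BNG': 'epsg:27700',  # British National Grid
}
TARGET_EPSG = 'epsg:4326'  # WGS84


def epsg_for(file_crs):
    """Return the source EPSG code for a suffix, or raise if it is unknown."""
    try:
        return EPSG_BY_SUFFIX[file_crs]
    except KeyError:
        raise ValueError(f'unsupported CRS suffix: {file_crs!r}') from None


print(epsg_for('IG'))
```

The returned string could then be passed to Proj(init=...) once per call, exactly as inProj is built above.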

I have hardcoded decoded_content. You will need your original:

            response = self.client.get_object(Bucket=bucket, Key=key)
            decoded_content = response['Body'].read().decode('utf-8')
            print(decoded_content)

            rows = csv.reader(decoded_content.splitlines(), delimiter=',')
            # no need to convert to a list

In the first version of this answer I checked the column names on the first line and raised an exception if they were wrong, as it would be pointless to carry on with missing information. The update below replaces that check.

Also, once I copy the cells into json_doc, there is no need to refer to row again in the loop.
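As a rough illustration of that point, the copy from row into a dict can also be written with zip(). This is a simplified sketch, not the method's actual loop: it uses the sample row from the question, the variable name records is made up, and it does not treat the time column specially.

```python
import csv

# Sample data taken from the question, comma-delimited.
lines = ['Name,Easting,Northing,Time',
         'Test1,169973,77712,01/01/2020 09:51:03 AM']

rows = csv.reader(lines, delimiter=',')
keys = [title.lower() for title in next(rows)]  # header row, lowercased

# Pair each header with the cell in the same position; row is not needed afterwards.
records = [dict(zip(keys, row)) for row in rows]

print(records[0]['easting'], records[0]['northing'])
```

Note that every value is still a string at this stage, because csv.reader does no type conversion.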

Update:

I added a check for 'easting' as a proxy for which set of column names is present. If isEasting is true the conversion happens; otherwise it is assumed that no conversion is necessary.
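If you would rather not rely on that assumption, a stricter variant could check for both complete pairs and reject anything else. This is a sketch only; needs_conversion is a hypothetical helper and is not used in the code above. The error message uses the keys={keys} form, which also works on Python 3.6.

```python
def needs_conversion(keys):
    """Return True for easting/northing headers, False for longitude/latitude.

    keys is expected to be the lowercased header row.
    """
    if 'easting' in keys and 'northing' in keys:
        return True
    if 'longitude' in keys and 'latitude' in keys:
        return False
    # Neither complete pair is present, so the file cannot be processed.
    raise ValueError(f'keys={keys} not valid')


print(needs_conversion(['name', 'easting', 'northing', 'time']))
```

The result would take the place of isEasting, computed once when line_count == 0.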

quamrana
  • I seem to be getting the error ```invalid syntax (, line 1)``` if I try this method. I am using python 3.6 for reference – justneedhelp Jun 23 '21 at 13:38
  • Ok, try this line: `raise RuntimeError(f'keys={keys} not valid')` – quamrana Jun 23 '21 at 13:45
  • yes that works thanks! Just one more question - see the following lines ```if ('easting' not in keys) or ('northing' not in keys): raise RuntimeError(f'{keys=} not valid')``` The column headings can be easting and northing or longitude and latitude - I need the function to recognise if it's ```easting``` and ```northing``` and then carry out ```inProj``` and ```outProj```; otherwise, if it is ```longitude``` and ```latitude```, leave as is. – justneedhelp Jun 23 '21 at 13:58
  • Ok, so will the input be invalid if it's the case that the headings are neither pair, or is it *so* likely that it *will* be one or the other that it isn't worth checking? – quamrana Jun 23 '21 at 14:06
  • The headings will always be either ```easting``` and ```northing``` or ```longitude``` and ```latitude```. Coordinate conversion is only required to take place if the headings are in ```easting and northing``` format – justneedhelp Jun 23 '21 at 14:37
  • Ok, see my update. I've assumed that exactly one or the other will be present. – quamrana Jun 23 '21 at 14:42
  • Okay yes I think this will work. I just need to figure out where to add ```line_count += 1```, as at present the eastings/northings can't be converted as it only recognises the string and not the numbers below. – justneedhelp Jun 23 '21 at 14:57
  • I have `line_count` incorporated in the for loop by using `enumerate()`. It gets incremented automatically. – quamrana Jun 23 '21 at 14:58
  • yes but I get the error ```Exception: TypeError(must be real number, not str)``` when the ```inProj and outProj``` are run, due to it picking up on the column names rather than the values underneath – justneedhelp Jun 23 '21 at 15:01
  • Perhaps you should paste your current code as an answer so I can see what you're doing. With my code, there is no way that `transform()` can get column names instead of lat/lon numbers. – quamrana Jun 23 '21 at 15:05
  • ok, I have done so! Apologies for all the confusion and your help is greatly appreciated – justneedhelp Jun 23 '21 at 15:16

In reply to the above answer, this is my current code:

    def process_incoming_file(self, bucket, key, event):
        if self.file_crs == 'BNG':
            inProj = Proj(init="epsg:27700")  # British National Grid
            outProj = Proj(init="epsg:4326")  # WGS84
        else:
            inProj = Proj(init='epsg:29903')  # Irish Grid
            outProj = Proj(init='epsg:4326')  # WGS84

        try:
            response = self.client.get_object(Bucket=bucket, Key=key)
            decoded_content = response['Body'].read().decode('utf-8')
            print(decoded_content)
            print('processing data')
            rows = csv.reader(decoded_content.splitlines(), delimiter=',')
            for line_count, row in enumerate(rows):
                if line_count == 0:
                    keys = [title.lower() for title in row]
                    print('keys', keys)
                    isEasting = ('easting' in keys)
                else:
                    json_doc = {}
                    for idx, el in enumerate(row):
                        if keys[idx] in ['time', 'date serviced', 'timestamp']:
                            timestamp = self.format_timestring(el)
                        else:
                            json_doc[keys[idx]] = el

                    if isEasting:
                        json_doc['easting'], json_doc['northing'] = transform(inProj, outProj, json_doc['easting'], json_doc['northing'])
                        json_doc['latitude'] = json_doc.pop('easting')
                        json_doc['longitude'] = json_doc.pop('northing')
                    geom = Point(json_doc['longitude'], json_doc['latitude'])
                    WKB_format = wkb.dumps(geom, hex=True, srid=4326)

                    fid = uuid.uuid4()  # assign new UUID to each row

                    print(json_doc['name'])
                    print(fid)
                    print(timestamp)
                    print(WKB_format)
                    print(json_doc)

        except Exception as e:
            print(f'Exception: {e.__class__.__name__}({e})')
  • You may need to temporarily remove the `try:except:` in order to get an Error Traceback to be able to tell exactly where the error about numbers not strings comes from. – quamrana Jun 23 '21 at 15:25
  • I've gone through the code and it is ```geom = Point(json_doc['longitude'], json_doc['latitude']) WKB_format = wkb.dumps(geom, hex=True, srid=4326)``` which throws it off with the error ```TypeError(must be real number, not str)```. This happened to me in my original code until I added ```line_count += 1``` – justneedhelp Jun 23 '21 at 15:31
  • What happens if you have either: `geom = Point(int(json_doc['longitude']), int(json_doc['latitude']))` or `geom = Point(float(json_doc['longitude']), float(json_doc['latitude']))`? – quamrana Jun 23 '21 at 15:36
  • The first gives me this error ```ValueError(invalid literal for int() with base 10: '-8.6077881')``` – justneedhelp Jun 23 '21 at 15:38
  • the second works perfectly!!!! Finally - thank you!!!! – justneedhelp Jun 23 '21 at 15:38
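For anyone landing here later, the difference between the two suggestions in the comments can be reproduced without pyproj or a geometry library. This is a small sketch; parse_coordinate is a made-up helper name, and the sample value is the one from the error message above.

```python
def parse_coordinate(text):
    """Convert a CSV cell such as '-8.6077881' or '169973' to a float."""
    return float(text)


value = '-8.6077881'

# int() rejects strings that contain a decimal point ...
try:
    int(value)
    int_ok = True
except ValueError:
    int_ok = False

# ... while float() accepts both decimal and whole-number strings.
lon = parse_coordinate(value)
easting = parse_coordinate('169973')

print(int_ok, lon, easting)
```

So the cells read by csv.reader need float() before they are handed to Point(), whichever pair of column names the file uses.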