I've gathered information from many separate Excel files into a dataframe that contains East and North data given in 10 different coordinate systems (EPSG codes). I can use pyproj
to convert a single data point given as (epsg_code, east, north) into (lon, lat), for example for plotting in Plotly Express.
To add latitude and longitude columns to my dataframe, I created a function convert_to_lonlat(epsg, east, north),
and I am attempting to apply it to the dataframe to get the new columns, as shown in the code below.
The function takes the EPSG code, east and north, and returns the lon and lat values:
import pyproj

def convert_to_lonlat(epsg, east, north):
    # Define the source projection from its EPSG code
    epsg_proj = pyproj.CRS.from_epsg(int(epsg))
    # Create a transformer from the source projection to WGS 84 (lon/lat)
    transform = pyproj.Transformer.from_crs(epsg_proj, 'epsg:4326', always_xy=True)
    # Transform the coordinates; with always_xy=True the input order is (x, y), i.e. (east, north)
    lon, lat = transform.transform(east, north)
    return lon, lat
A simple test dataframe would be:
geo = pd.DataFrame({'East':[1e6], 'North':[1e6], 'EPSG':[3116]})
I'm trying to use the .apply() method to create two new columns in a dataframe that contains data in many EPSG codes, as follows:
geo[['lon','lat']] = geo.apply(
    lambda row: convert_to_lonlat(
        row['EPSG'],
        row['East'],
        row['North'],
        axis=1,
        result_type='expand',
    )
)
which, incidentally, was suggested by ChatGPT.
However, when I run the code on my dataframe I get a KeyError: 'EPSG'
message. Any suggestions as to how this could be fixed? Thank you in advance!
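A likely cause of the KeyError described above is the position of the closing parenthesis: when axis=1 and result_type='expand' sit inside the call to the converter, .apply() runs with its default axis=0 and passes the lambda one column at a time, so row['EPSG'] looks for an index label named 'EPSG' and fails. The sketch below reproduces this with pandas only. Note that fake_convert is a hypothetical stand-in for convert_to_lonlat (same three inputs, two outputs) so the example runs without pyproj, and the second data row is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for convert_to_lonlat: same call shape, no pyproj needed.
def fake_convert(epsg, east, north):
    return east / 1e6, north / 1e6

geo = pd.DataFrame({'East': [1e6, 2e6], 'North': [3e6, 4e6], 'EPSG': [3116, 3116]})

# Misplaced parenthesis: axis and result_type are passed to the converter,
# so apply() uses axis=0 and the lambda receives whole columns, not rows.
try:
    geo.apply(lambda row: fake_convert(row['EPSG'], row['East'], row['North'],
                                       axis=1, result_type='expand'))
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__

# Corrected call: the lambda closes first, then axis and result_type go to apply().
# The column order matches the (first, second) order returned by the function.
geo[['lon', 'lat']] = geo.apply(
    lambda row: fake_convert(row['EPSG'], row['East'], row['North']),
    axis=1,
    result_type='expand',
)
print(error_name)
print(geo)
```

With the real function, the same corrected call shape should apply, with convert_to_lonlat in place of fake_convert.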
For reference, the full answer from ChatGPT to the query: "use .apply() in pandas using a function that takes 3 arguments and returns two values to create two new columns" was:
import pandas as pd
# Define the function that takes three arguments and returns two values
def my_function(arg1, arg2, arg3):
    result1 = arg1 + arg2 + arg3
    result2 = arg1 * arg2 * arg3
    return result1, result2
# Create a sample dataframe
df = pd.DataFrame({'arg1': [1, 2, 3, 4, 5],
                   'arg2': [10, 20, 30, 40, 50],
                   'arg3': [100, 200, 300, 400, 500]})
# Use the apply() function to create two new columns using the my_function
df[['result1', 'result2']] = df.apply(lambda row: my_function(row['arg1'], row['arg2'], row['arg3']), axis=1, result_type='expand')
# Print the resulting dataframe
print(df)
which works, printing out:
   arg1  arg2  arg3  result1  result2
0     1    10   100      111     1000
1     2    20   200      222     8000
2     3    30   300      333    27000
3     4    40   400      444    64000
4     5    50   500      555   125000
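Since the dataframe in the question mixes about ten EPSG codes, an alternative to a row-wise .apply() might be to build one pyproj Transformer per distinct code and transform each group's coordinates as arrays. The following is a minimal sketch under that assumption, not a definitive implementation: add_lonlat and its parameter names are hypothetical, and the column names East, North and EPSG are taken from the test dataframe above.

```python
import pandas as pd
from pyproj import Transformer

# Hypothetical helper: adds 'lon' and 'lat' columns (EPSG:4326) to a copy of df.
def add_lonlat(df, epsg_col='EPSG', east_col='East', north_col='North'):
    out = df.copy()
    out['lon'] = float('nan')
    out['lat'] = float('nan')
    # One Transformer per distinct EPSG code, applied to whole arrays at once.
    for epsg, idx in out.groupby(epsg_col).groups.items():
        transformer = Transformer.from_crs(
            f'EPSG:{int(epsg)}', 'EPSG:4326', always_xy=True
        )
        # always_xy=True means inputs are (east, north) and outputs are (lon, lat).
        lon, lat = transformer.transform(
            out.loc[idx, east_col].to_numpy(),
            out.loc[idx, north_col].to_numpy(),
        )
        out.loc[idx, 'lon'] = lon
        out.loc[idx, 'lat'] = lat
    return out

geo = pd.DataFrame({'East': [1e6], 'North': [1e6], 'EPSG': [3116]})
geo = add_lonlat(geo)
print(geo)
```

Creating the Transformer once per group avoids rebuilding it for every row, which is what the row-wise version does.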