
Hello, I am trying to convert a list of X and Y coordinates to lines. I want to map this data by grouping by ID and also by time. My code executes successfully as long as I group by one column, but with two columns I run into errors. I referenced this question.

Here's some sample data:

ID  X           Y           Hour
1   -87.78976   41.97658    16
1   -87.66991   41.92355    16
1   -87.59887   41.708447   17
2   -87.73956   41.876827   16
2   -87.68161   41.79886    16
2   -87.5999    41.7083     16
3   -87.59918   41.708485   17
3   -87.59857   41.708393   17
3   -87.64391   41.675133   17

Here's my code:

import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point, LineString

df = pd.read_csv("snow_gps.csv", sep=';')

# zip the coordinates into Point objects and convert to a GeoDataFrame
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = GeoDataFrame(df, geometry=geometry)

# aggregate these points with groupby
geo_df = geo_df.groupby(['track_seg_point_id', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df = GeoDataFrame(geo_df, geometry='geometry')

Here is the error: ValueError: LineStrings must have at least 2 coordinate tuples

This is the final result I am trying to get:

ID          Hour     geometry
1           16       LINESTRING (-87.78976 41.97658, -87.66991 41.9... 
1           17       LINESTRING (-87.78964000000001 41.976634999999... 
1           18       LINESTRING (-87.78958 41.97663499999999, -87.6... 
2           16       LINESTRING (-87.78958 41.976612, -87.669785 41... 
2           17       LINESTRING (-87.78958 41.976624, -87.66978 41.... 
3           16       LINESTRING (-87.78958 41.97666, -87.6695199999... 
3           17       LINESTRING (-87.78954 41.976665, -87.66927 41.... 

Any suggestions or ideas on how to group by multiple columns would be great.

mm_nieder
  • The error message you get seems to suggest that a LineString is being created from a single Point, which fails. Is it possible that you have a group consisting of only a single row? – joris Jun 27 '18 at 21:13
  • What if you tried x, y instead of xy? – ted Jun 27 '18 at 21:20
  • @joris I don't believe that's the issue. @ted, are you suggesting joining x and y instead of having them in separate columns? – mm_nieder Jun 28 '18 at 15:47
  • If I just take `df` and run `grouped = df.groupby(['ID', 'Hour']); grouped.groups`, this is the result: `{(0, 16): Int64Index([5, 14, 16, 55, 130], dtype='int64'), (0, 17): Int64Index([7, 27, 126, 141, 185, 235], dtype='int64'),...` Except now I don't have the coordinates, and I don't know how to convert the points to lines. Maybe I could index each position to get its xy and then convert? – mm_nieder Jun 28 '18 at 16:00
  • Can you check with `(df.groupby(['ID', 'Hour']).size() < 2).sum()` whether it really gives 0? – joris Jun 28 '18 at 21:54

1 Answer


Your code is fine; the problem is your data.

If you group by ID and Hour, only one point falls in the group with an ID of 1 and an Hour of 17. A LineString has to consist of 2 or more Points (it must have at least 2 coordinate tuples), so that group raises the error. I added another point to your sample data:

ID   X          Y           Hour
1   -87.78976   41.97658    16
1   -87.66991   41.92355    16
1   -87.59887   41.708447   17
1   -87.48234   41.677342   17
2   -87.73956   41.876827   16
2   -87.68161   41.79886    16
2   -87.5999    41.7083     16
3   -87.59918   41.708485   17
3   -87.59857   41.708393   17
3   -87.64391   41.675133   17

As you can see, the code below is almost identical to yours:

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point, LineString

# a regex separator needs a raw string and the python parsing engine
df = pd.read_csv("snow_gps.csv", sep=r'\s*,\s*', engine='python')

# zip the coordinates into Point objects and convert to a GeoDataFrame
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = gpd.GeoDataFrame(df, geometry=geometry)

# build one LineString per (ID, Hour) group; every group needs at least 2 points
geo_df2 = geo_df.groupby(['ID', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df2 = gpd.GeoDataFrame(geo_df2, geometry='geometry')
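
If adding points is not an option for the real data, one possible approach is to find the (ID, Hour) groups that have fewer than 2 rows and drop them before building the lines. The following is only a sketch using pandas: it inlines the question's original sample data as comma-separated text (an assumption, so that it runs without snow_gps.csv), and the names `SAMPLE`, `sizes`, `too_small` and `usable` are made up for illustration.

```python
import io

import pandas as pd

# The question's original sample data, inlined as CSV (assumed layout)
# so the sketch runs without snow_gps.csv.
SAMPLE = """ID,X,Y,Hour
1,-87.78976,41.97658,16
1,-87.66991,41.92355,16
1,-87.59887,41.708447,17
2,-87.73956,41.876827,16
2,-87.68161,41.79886,16
2,-87.5999,41.7083,16
3,-87.59918,41.708485,17
3,-87.59857,41.708393,17
3,-87.64391,41.675133,17
"""
df = pd.read_csv(io.StringIO(SAMPLE))

# Count the rows in every (ID, Hour) group.
sizes = df.groupby(["ID", "Hour"]).size()

# Groups with fewer than 2 points cannot be turned into a LineString.
too_small = sizes[sizes < 2]
print(too_small)

# Keep only the rows that belong to a group with at least 2 points.
usable = df.groupby(["ID", "Hour"]).filter(lambda g: len(g) >= 2)
print(usable)
```

The filtered frame `usable` can then go through the same Point and LineString steps as the code above. Whether silently dropping single-point groups is acceptable depends on the data, so it is worth inspecting `too_small` first.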
Martin Valgur
Tom G.
  • Some good tutorials on creating geospatial data can be found at https://pygis.io/docs/e_new_vectors.html – mmann1123 Dec 14 '21 at 16:34