I am trying to join a few GeoDataFrames using the Dask Python package. While implementing my data processing algorithm, I ran into the following exception: AttributeError: 'DataFrame' object has no attribute '_example'

Here is my code:

import dask.dataframe as dd
import dask_geopandas as dg
import geopandas as gpd
import pandas as pd
import dask


df1= gpd.read_file("shapefile1.shp")
df2= gpd.read_file("shapefile2.shp")

df1= dd.from_pandas(df1, npartitions=1)
df2= dd.from_pandas(df2, npartitions=1)

gf2 = dg.sjoin(df1, df2, how='inner', op='intersects')

Here is my stacktrace:

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    gf2 = dg.sjoin(df1, df2, how='inner', op='intersects')
  File "/usr/local/lib/python3.6/dist-packages/dask_geopandas-0.0.1-py3.6.egg/dask_geopandas/core.py", line 413, in sjoin
    example = gpd.tools.sjoin(left._example, right._example, how=how, op=op)
  File "/home/mapseeuser/.local/lib/python3.6/site-packages/dask/dataframe/core.py", line 2414, in __getattr__
    raise AttributeError("'DataFrame' object has no attribute %r" % key)
AttributeError: 'DataFrame' object has no attribute '_example'

So, could anybody tell me what I am doing wrong, and how to join two datasets using the Dask library?

Tequila
  • As mentioned in the README of the dask-geopandas project: *This project is not in a functional state and should not be relied upon. No guarantee of support is provided.* – MRocklin May 25 '18 at 13:37

1 Answer

Install the required Python packages:

sudo pip install dask[dataframe]
sudo pip install geopandas

Try this code:

import dask.dataframe as dd
import geopandas as gpd
import pandas as pd
import dask

df1= gpd.read_file("shapefile1.shp")
df2= gpd.read_file("shapefile2.shp")

df1= dd.from_pandas(df1, npartitions=1)
df2= dd.from_pandas(df2, npartitions=1)

gf2 = gpd.sjoin(df1, df2, how='inner', op='intersects')
Sankar guru
  • Thanks for the reply, but if I use your code it raises an error: "AttributeError: 'DataFrame' object has no attribute 'crs'". Secondly, if we change the join function from dg.sjoin to gpd.sjoin, we are using the GeoPandas package instead of dask_geopandas – Tequila May 25 '18 at 10:16
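
Both errors in this thread appear to have the same shape: the join function reads an attribute that only a geo-aware frame carries (`_example` in the traceback, `crs` in the comment above), but it was handed a plain Dask `DataFrame`, whose `__getattr__` fallback raises `AttributeError`. The following is a minimal, stdlib-only sketch of that mechanism. The names `PlainFrame`, `GeoFrame`, `sjoin_sketch` and `try_join` are hypothetical stand-ins for illustration, not the real dask, geopandas or dask_geopandas APIs.

```python
class PlainFrame:
    """Hypothetical stand-in for a generic (non-geo) lazy DataFrame."""

    def __init__(self, columns):
        self._columns = list(columns)

    def __getattr__(self, key):
        # Python calls __getattr__ only after normal attribute lookup fails.
        # Column names resolve; anything else raises, as in the traceback.
        if key in self.__dict__.get("_columns", ()):
            return "<column %s>" % key
        raise AttributeError("'DataFrame' object has no attribute %r" % key)


class GeoFrame(PlainFrame):
    """Hypothetical geo-aware frame that carries the extra metadata."""

    def __init__(self, columns, crs):
        super().__init__(columns)
        self.crs = crs


def sjoin_sketch(left, right):
    # Hypothetical spatial join: it reads `crs` before doing any work,
    # so it fails early on frames that do not have that attribute.
    if left.crs != right.crs:
        raise ValueError("CRS mismatch")
    return "joined in %s" % left.crs


def try_join(left, right):
    # Report the failure as text instead of letting the exception propagate.
    try:
        return sjoin_sketch(left, right)
    except AttributeError as exc:
        return "failed: %s" % exc


plain = PlainFrame(["geometry"])
geo = GeoFrame(["geometry"], crs="EPSG:4326")

print(try_join(geo, geo))      # both inputs carry `crs`, so the join proceeds
print(try_join(plain, plain))  # plain frames lack `crs`, so the lookup fails
```

If this reading is right, the fix is not to swap one `sjoin` for another but to make sure the objects passed in are the frame type that the chosen `sjoin` expects.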