import numpy as np
import pandas as pd
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
Take this DataFrame as an example. With pandas, it is easy to select the values of one column based on the value of another column, like this:
df.loc[df["B"] == "three", "A"]
With dask, however, the output I receive from the same kind of code doesn't really help me:
df.loc[df["ActionGeo_Lat"] == "42#.5", "SQLDATE"]
After executing this line, I receive the following output:
The problem I'm having is that every time I try to execute df.compute(), I receive:
ValueError: could not convert string to float: '42#.5'
After cutting out some columns, I found that the error is caused somewhere in the column ActionGeo_Lat. I would now like to edit the CSV file manually to resolve the error, but I cannot find out on which date the error occurs.
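To make it concrete, here is a rough sketch of the kind of lookup I am after, written with Python's standard csv module instead of dask. It assumes the file is comma-separated and has a header row containing SQLDATE and ActionGeo_Lat. The function name find_bad_latitudes and the sample rows are made up for illustration.

```python
import csv
import io


def find_bad_latitudes(lines, lat_col="ActionGeo_Lat", date_col="SQLDATE"):
    """Return (line_number, date, raw_value) for each latitude float() rejects.

    Hypothetical helper; assumes a comma-separated file with a header row.
    """
    bad = []
    reader = csv.DictReader(lines)
    for row in reader:
        # Missing or empty cells are treated as missing data, not as errors.
        raw = row.get(lat_col) or ""
        if raw == "":
            continue
        try:
            float(raw)
        except ValueError:
            # reader.line_num is the number of source lines read so far,
            # i.e. the line of the file this row ends on (header is line 1).
            bad.append((reader.line_num, row[date_col], raw))
    return bad


# Tiny in-memory stand-in for the real CSV file (made-up rows).
sample = io.StringIO(
    "SQLDATE,ActionGeo_Lat\n"
    "20150101,42.5\n"
    "20150102,42#.5\n"
    "20150103,\n"
)
print(find_bad_latitudes(sample))  # [(3, '20150102', '42#.5')]
```

For the real file, the same function could be given open(path, newline="") in place of the StringIO object. Something equivalent that works on the dask DataFrame directly is what I am looking for.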
Thanks for the help in advance!