Can anyone tell me how to select a single column with 'loc' in a Dask dataframe?
As a side note, when I load the dataframe using dd.read_csv with header=None, the column names run from 0 to 131094. I am trying to select the last column, whose name is 131094, and I get an error.
code:
> import dask.dataframe as dd
> df = dd.read_csv('filename.csv', header=None)
> y = df.loc['131094']
error:
  File "/usr/local/dask-2018-08-22/lib/python2.7/site-packages/dask-0.5.0-py2.7.egg/dask/dataframe/core.py", line 180, in _loc
    "Can not use loc on DataFrame without known divisions")
ValueError: Can not use loc on DataFrame without known divisions
Based on the guide at http://dask.pydata.org/en/latest/dataframe-indexing.html#positional-indexing, my code should work, but I don't know what is causing the problem.