I'm trying to get the value of 'n' in the last row of a dask dataframe.
If I understand correctly, positional indexing isn't an option, and I don't know the index label of the last row. I thought tail() would be the solution, but it returns an empty dataframe.
print(df.compute())  # df has 47 rows
returns
file str n
11027 /Users/... XXX... 901
11028 /Users/... XXX... 902
...
11099 /Users/... XXX... 946
11100 /Users/... XXX... 947
Then I do
tail = df.tail(n=10, compute=True)
print(tail)
This takes a minute and fifteen seconds, which is unacceptably slow since I need to do this several thousand times, and it returns
Empty DataFrame
Columns: [file, str, n]
Index: []
What am I missing here?
Note: I found a solution for head() returning an empty dataframe (the question "dask dataframe head() returns empty df"), but that solution doesn't apply to tail().