I have read a CSV file using dask this way:

import dask.dataframe as dd
train = dd.read_csv('act_train.csv')

Then I would like to apply some simple per-row logic, which works fine in pandas:

columns = list(train.columns)

for col in columns[1:]:
    train[col] = train[col].apply(lambda x: x if x == -1 else x.split(' ')[1])

Unfortunately, the last line of code generates the following error: Length of values does not match length of index

What am I doing wrong?

Rocketq
  • Hi @Rocketq, can you provide an example that can be easily run by someone without your dataset? http://stackoverflow.com/help/mcve – MRocklin Aug 08 '16 at 12:58

1 Answer

If x doesn't contain a space character, then x.split(' ') returns a list containing the single element x.

So when you try to access the second element with x.split(' ')[1], the lookup fails, because there is no element at index 1 in x.split(' ').

This is the likely cause of the error "Length of values does not match length of index".
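One way to guard against values without a space is to check the length of the split result before indexing. The sketch below is only an illustration: the helper name second_token and the sample values are hypothetical and do not come from the original dataset. It also assumes that values with no second token should be returned unchanged.

```python
def second_token(x, sep=" "):
    """Return the second token of x, or x unchanged if it cannot be split."""
    # Leave the -1 sentinel (and any other non-string value) untouched.
    if not isinstance(x, str):
        return x
    parts = x.split(sep)
    # Fall back to the original value when there is no second token,
    # instead of indexing past the end of the list.
    return parts[1] if len(parts) > 1 else x


# Hypothetical sample values standing in for one column of the data.
values = ["type 1", "char 12", "nospace", -1]
cleaned = [second_token(v) for v in values]
print(cleaned)
```

With dask, the same helper could probably be passed to apply in place of the lambda, for example train[col].apply(second_token, meta=(col, 'object')). Supplying meta here is an assumption about the desired output dtype; it tells dask what the result looks like instead of letting it infer that by running the function on placeholder data.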

surru