'Unalignable boolean Series key provided' while doing selection

Question

I am just trying to categorize some data with pandas. Basically, my data is a string and I want to modify it depending on the value of its first X characters.

I tried this :

data['BO In Code'].loc[data['BO In Code'][:2]=='XU']=1

Unalignable boolean Series key provided

This :

data['BO In Code'].loc[str(data['BO In Code'])[:2]=='XU']=1

and this :

data['BO In Code'].loc[data['BO In Code'].index[:2]=='XU']=1

gave me :

'Cannot use a single bool to index into setitem'

This will help you : http://stackoverflow.com/questions/33817842/keyerror-when-using-boolean-filter-on-pandas-data-frame — Harsha W, Apr 18 '17 at 04:40

piRSquared · Accepted Answer · 2017-04-18T05:35:47.877

You need to use the .str string accessor:

data.loc[data['BO In Code'].str[:2]=='XU', 'BO In Code'] = 1

explanation

.loc for dataframes can take two indexers. Each indexer can be a single index value, a list of index values, or an array of booleans of the same length as the dimension being sliced.
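As a rough illustration of those three kinds of indexer, here is a minimal sketch on a made-up frame. The frame `df`, its labels, and its values are invented for the example and are not from the question.

```python
import pandas as pd

# Hypothetical toy frame, used only to illustrate the indexer kinds.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

# A single index value on each axis returns one cell.
single = df.loc["y", "a"]

# Lists of index values on each axis return a sub-frame.
several = df.loc[["x", "z"], ["a"]]

# A boolean list with one entry per row keeps the rows marked True.
masked = df.loc[[True, False, True], "b"]
```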

In this case, the first indexer is a boolean array in which each value says whether the first 2 characters of the column 'BO In Code' are equal to 'XU'. We use this to filter the rows of the dataframe. We still need to specify which column we want, and it happens that we want 'BO In Code'.

So the first reference to 'BO In Code' builds the boolean mask, and the second reference specifies the column we want to assign to. They do not have to be the same column.
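To make the two roles concrete, here is a small self-contained sketch. Only the column name 'BO In Code' comes from the question; the sample values and the extra 'Other' column are assumptions made up for the demonstration.

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
data = pd.DataFrame({
    "BO In Code": ["XU123", "AB456", "XU789", "CD012"],
    "Other": [10, 20, 30, 40],
})

# Plain slicing on the Series takes the first two ROWS, not the first
# two characters, so comparing it to 'XU' gives a mask of the wrong length.
first_two_rows = data["BO In Code"][:2]

# The .str accessor slices each string, giving one boolean per row.
mask = data["BO In Code"].str[:2] == "XU"

# Row indexer: the mask. Column indexer: the column to overwrite.
data.loc[mask, "BO In Code"] = 1
```

Only the rows whose code starts with 'XU' are changed, and the 'Other' column is left alone because the column indexer names 'BO In Code' explicitly.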

edited Apr 18 '17 at 05:35

answered Apr 18 '17 at 04:52

Why do I have to put 'BO In Code' again at the end? Isn't it working already without it? – Mayeul sgc Apr 18 '17 at 05:28
Actually, when I put it I get an index error, and when I don't it works fine, so I'm going to get rid of the second one. – Mayeul sgc Apr 18 '17 at 05:50
