I have this dataframe:

1/1/1990,1.9
1/2/1990,1.9
1/29/1990,1.9
1/4/1990,1.7775
1/5/1990,1.76
1/6/1990,1.76
1/7/1990,1.76
1/8/1990,1.76
1/1/1991,1.9
1/2/1991,1.9
1/29/1991,1.9
1/4/1991,1.7775
2/5/1991,1.76
2/6/1991,1.76
1/7/1991,1.76
3/29/1991,1.76
4/30/1991,1.76

It is a proxy for a bigger database.

I would like to drop all the data referring to the 29th of February.

This is how I read the dataframe:

dfr = pd.read_csv('test.csv', sep=',', index_col=0, parse_dates=True)

This is the best solution I have found so far:

dfr = dfr.loc[~(dfr.index.month==2 & dfr.index.day==29)]

However, I get the following error:

TypeError: unsupported operand type(s) for &: 'int' and 'Int64Index'

This is strange, because dfr.index.month==2 and dfr.index.day==29 each work on their own. I suspect they need to be converted to a pandas date type, but I do not know how.

mozway
diedro

2 Answers

Your parentheses are incorrect: & has higher precedence than ==, so each comparison needs its own pair of parentheses.

Your expression is equivalent to ~(dfr.index.month == (2 & dfr.index.day) == 29), which triggers the error unsupported operand type(s) for &: 'int' and 'Int64Index'.
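To see the precedence rule in isolation, here is a minimal sketch that uses plain integers instead of the index. The names month and day are illustrative only and are not part of the question's data:

```python
month, day = 2, 29

# Without parentheses, & binds tighter than ==, so Python reads this as the
# chained comparison: month == (2 & day) == 29
unparenthesized = month == 2 & day == 29

# With parentheses, each comparison is evaluated first, then combined with &.
parenthesized = (month == 2) & (day == 29)

print(unparenthesized)  # False, because 2 & 29 is 0 and month != 0
print(parenthesized)    # True
```

With integers the unparenthesized form silently gives the wrong answer; with a pandas index on the right-hand side of &, the same parse raises the TypeError from the question instead.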

You need to use:

dfr = dfr.loc[~((dfr.index.month==2) & (dfr.index.day==29))]
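Since the sample in the question contains no 29th of February, here is a small self-contained sketch of the same mask on made-up data that does include a leap day. The dates, values, and the column name value are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex that includes a leap day (1992-02-29).
idx = pd.to_datetime(["1992-02-28", "1992-02-29", "1992-03-01", "1992-03-29"])
dfr = pd.DataFrame({"value": [1.9, 1.9, 1.76, 1.76]}, index=idx)

# Each comparison is parenthesized so that & combines two boolean arrays.
is_leap_day = (dfr.index.month == 2) & (dfr.index.day == 29)

# Keep every row that is not February 29th.
filtered = dfr.loc[~is_leap_day]
print(filtered)
```

Naming the mask first is optional, but it can make the condition easier to read and reuse than a single nested expression.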
mozway

You may also use strftime for a solution that avoids the hassle of parentheses:

dfr[dfr.index.strftime('%m-%d') != '02-29']
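As a rough sketch of how this could look end to end, the example below applies the strftime comparison to made-up data and assigns the result back to dfr. The dates, values, and the column name value are assumptions, not taken from the question:

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex that includes a leap day (1992-02-29).
idx = pd.to_datetime(["1992-02-28", "1992-02-29", "1992-03-01", "1992-03-29"])
dfr = pd.DataFrame({"value": [1.9, 1.9, 1.76, 1.76]}, index=idx)

# Format every index entry as "MM-DD" and keep the rows that are not "02-29".
dfr = dfr[dfr.index.strftime("%m-%d") != "02-29"]
print(dfr)
```

This formats every date as a string before comparing, so on a large index it is likely to be slower than comparing the integer month and day attributes.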
Nuri Taş