I am reading a csv in python with multiple columns.

The first column is the date, and I have to delete the rows that correspond to years before 2018.

           time      high        low  Volume      Plot     Rango
0    2017-12-22  25.17984  24.280560     970  0.329943  0.899280
1    2017-12-26  25.17984  23.381280    2579  1.057921  1.798560
2    2017-12-27  25.17984  23.381280    2499  0.998083  1.798560
3    2017-12-28  25.17984  24.280560    1991  0.919885  0.899280
4    2017-12-29  25.17984  24.100704    2703  1.237694  1.079136
..          ...       ...        ...     ...       ...       ...
580  2020-04-16   5.45000   4.450000  117884  3.168380  1.000000
581  2020-04-17   5.35000   4.255200   58531  1.370538  1.094800
582  2020-04-20   4.66500   4.100100   25770  0.582999  0.564900
583  2020-04-21   4.42000   3.800000   20914  0.476605  0.620000
584  2020-04-22   4.22000   3.710100   23212  0.519275  0.509900

I want to delete the rows corresponding to years prior to 2018, so rows from 2017, 2016, 2015, ... should be deleted.

I tried this, but it does not work:

if 2017 in datos['time']: datos['time'].remove()  #check if number 2017 is in each of the items of the column 'time'

The dates are recognized as numbers, not as datetime, but I think I do not need to convert them to datetime.

Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158

1 Answer

In pandas

  • Given your data
  • Use Boolean indexing
  • The time column must have the datetime64[ns] dtype
    • df.info() will show the dtypes
    • df['time'] = pd.to_datetime(df['time'])
df[df['time'].dt.year >= 2018]
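
As a minimal end-to-end sketch of the steps above: the sample rows and the names csv_text, mask and filtered are made up for illustration, and io.StringIO stands in for the real CSV file. Passing parse_dates to read_csv is one alternative to calling pd.to_datetime afterwards.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the asker's CSV file
csv_text = """time,high,low,Volume
2016-05-10,30.10,29.50,1200
2017-12-22,25.17984,24.280560,970
2017-12-29,25.17984,24.100704,2703
2018-01-02,25.00,24.00,1500
2020-04-22,4.22,3.7101,23212
"""

# parse_dates converts the 'time' column to a datetime dtype while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["time"])

# Boolean mask: True for rows dated 2018 or later
mask = df["time"].dt.year >= 2018

# Keep only the matching rows and renumber the index from 0
filtered = df[mask].reset_index(drop=True)
print(filtered)
```

The original df is left unchanged here; assign the result back to df if the older rows are no longer needed.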
Trenton McKinney