
I have the following dataset, where the index is a DatetimeIndex and the column "value" contains floats. The index, creationDate, only contains timestamps from the day 2020-06-13.

df

df[df.index.duplicated()]

The day is the same but the hour and minute are not, so I'm confused about what is happening here. Is it only filtering by day? If so, why does it return those exact rows from the dataset? Is there any way to filter by hour, or to create a DatetimeIndex from only the hour/minute/second? Thanks!

I know I can convert the index to strings and keep only the hour/minute/second information, but would that remove the possibility of creating a DatetimeIndex as well?

a.civit

1 Answer


pandas.Index.duplicated() returns a boolean array. With the default keep='first', for each set of duplicated values the first occurrence is marked False and all later occurrences are marked True. In short, it flags the index values that are repeats; it does not compare only the day.
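As a minimal sketch of that behaviour, here is duplicated() on a small made-up index (the labels "a" and "b" are placeholders, not values from the question), together with the other two settings of the keep parameter:

```python
import pandas as pd

# Made-up index for illustration: "a" appears three times, "b" once.
idx = pd.Index(["a", "b", "a", "a"])

# Default keep="first": the first "a" is False, later repeats are True.
first = idx.duplicated()

# keep="last": the last "a" is False, earlier repeats are True.
last = idx.duplicated(keep="last")

# keep=False: every member of a duplicated group is True.
every = idx.duplicated(keep=False)

print(first.tolist())  # [False, False, True, True]
print(last.tolist())   # [True, False, True, False]
print(every.tolist())  # [True, False, True, True]
```

If you want to see all rows that share a timestamp, including the first one, keep=False is the variant to use.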

In your case, the two 2020-06-13 timestamps in your output each appear more than once in the original index, so df[df.index.duplicated()] selects those two rows.

That means your original index probably looks something like this:

2020-06-13 20:38:53
2020-06-13 20:38:53
2020-06-13 21:39:01
2020-06-13 21:39:01

So df.index.duplicated() returns [False, True, False, True], and boolean indexing with that mask keeps only the rows whose index value is a repeat.
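A small reproduction may make this concrete. The timestamps are the ones listed above; the "value" numbers are invented for illustration, since the original data is not shown. The last two lines sketch one way to filter by hour without converting to strings, using the hour and time attributes of DatetimeIndex:

```python
import pandas as pd

# Guessed index from above; the "value" floats are made up.
index = pd.DatetimeIndex(
    [
        "2020-06-13 20:38:53",
        "2020-06-13 20:38:53",
        "2020-06-13 21:39:01",
        "2020-06-13 21:39:01",
    ],
    name="creationDate",
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Boolean mask: True where the full timestamp was already seen earlier.
mask = df.index.duplicated()
dupes = df[mask]
print(mask.tolist())  # [False, True, False, True]
print(dupes)

# Filtering by hour keeps the DatetimeIndex intact.
by_hour = df[df.index.hour == 21]
print(by_hour)

# Time-of-day only, as an array of datetime.time objects.
times = df.index.time
```

Because the comparison is on the full timestamp, two rows on the same day with different times are not treated as duplicates.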

Ynjxsjmh