Pandas: find the first couple of timestamp indices that are the same given two dataframes

Question

I have the following df1:

                      Value
2020-06-02 00:00:00   17
2020-06-03 00:00:00   10
2020-06-05 00:00:00   86

and the following df2:

                      Value
2020-06-01 00:00:00   16
2020-06-04 00:00:00   9
2020-06-05 00:00:00   86

and I know that the two dataframes have different length. How can I find the day of the first and the last couple of indices where df1.index == df2.index?

In my example, the day of the first matching couple would be 2020-06-05 00:00:00.

score 3 · Accepted Answer · answered Jul 10 '20 at 10:12

3

Use Index.intersection and then use minimal and maximal values:

idx = df1.index.intersection(df2.index)
min1 = idx.min()
max1 = idx.max()

answered Jul 10 '20 at 10:12

jezrael

822,522
95
1,334
1,252

Uhm I noticed that this is fairly slow given two dataframes of less than 100,000 rows. Is there anything I can do to increase the speed? – Zizzipupp Jul 10 '20 at 10:29
@Zizzipupp - Are index sorted? – jezrael Jul 10 '20 at 10:30
1

@Zizzipupp - Is possible use [this](https://numpy.org/doc/stable/reference/generated/numpy.intersect1d.html) ? like `idx = numpy.intersect1d(df1.index, df2.index)` ? – jezrael Jul 10 '20 at 10:32
Yes, but it requires to turn `min1` and `max1` into strings before using them as slicers in `df1.loc[min1:max1]`. If you don't do this, you may get `KeyError: "Cannot get left slice bound for non-unique label:...` – Zizzipupp Jul 10 '20 at 10:47
1

@Zizzipupp - Last should be used `min1 = pd.Timestamp(idx.min())` ? – jezrael Jul 10 '20 at 10:48
1

Correct. The exact sequence is `str(pd.Timestamp(idx.min()))`. – Zizzipupp Jul 10 '20 at 10:50

Pandas: find the first couple of timestamp indices that are the same given two dataframes

1 Answers1