-1

I am trying to perform an outer join (union) of two time series in python. The left time series is larger than the first.

An example:

Right time series - df_1

Time Series 1
3 1
4 2
5 3

Left time series - df_2

Time Series 2
0 10
1 11
2 12
3 13
4 14
5 15
6 16

I simply tried:

pd.merge([df_1, df_2], how = "outer", on = "Time")

I expected to get something like this:

Time Series 1 Series 2
0 nan 10
1 nan 11
2 nan 12
3 1 13
4 2 14
5 3 15
6 nan 16

Instead I got this:

Time Series 1 Series 2
3 1 13
4 2 14
5 3 15
6 nan 16

It is crucial for the exercise to keep the time series in the correct sequence, I do not want to switch the order of the merge.

TeoTi
  • 11
  • 5
  • I think I have solved it using the `pd.reindex()` function I was not aware of. basically you can provide a list like mapping and the dataframe columns are rearranged automatically! I simply have to understand how to structure the merging procedure but I believe it would only take few conditional statements and the game should be done :) – TeoTi Aug 06 '23 at 14:34

1 Answers1

0

instead of using 'outer' in how argument you can use 'right'. Below is the sample code:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Time': ['time 3', 'time 4', 'time 5'],
     'Series 1': [1, 2, 3]})
df2 = pd.DataFrame({'Time': ['time 0', 'time 1', 'time 2', 'time 3', 'time 4', 'time 5', 'time 6'],'Series 2': [10, 11, 12, 13, 14, 15, 16]})

merged_df = pd.merge(df1, df2, on='Time', how='right')
merged_df = merged_df.sort_values('Time')
merged_df = merged_df.reset_index(drop=True)
merged_df['Series 1'] = merged_df['Series 1'].fillna(np.nan)

I hope this will solve your problem.

  • I ended up using `pd.merge(df1, df2, on='Time', how='outer')` using conditional statements and `pd.reindex()` to get the columns in the correct order I wanted. The problem was a bit more complicated then the one I have presented. Anyway your answer is super useful and correct indeed. – TeoTi Aug 06 '23 at 15:31