I am just learning Python and have run into a problem; any help is appreciated. I am trying to interpolate one time series onto another in pandas. To do this, I need to reindex a concatenated DataFrame:
df4=pd.concat([df1['mid_price'].copy(),df3]).sort_index().fillna(method='ffill')
df4 = pd.DataFrame(df4, index=df1.index)
but I get the following error, which says it cannot reindex from a duplicate axis:
> Traceback (most recent call last):
File "<ipython-input-622-27eed25ece8c>", line 1, in <module>
df4 = pd.DataFrame(df4, index=df1.index)
File "/home/student/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py", line 222, in __init__
dtype=dtype, copy=copy)
File "/home/student/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 130, in _init_mgr
copy=False)
File "/home/student/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 3558, in reindex_axis
fill_value=fill_value, copy=copy)
File "/home/student/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 3586, in reindex_indexer
self.axes[axis]._can_reindex(indexer)
File "/home/student/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py", line 2293, in _can_reindex
raise ValueError("cannot reindex from a duplicate axis"
ValueError: cannot reindex from a duplicate axis
I have checked that the index of df1 has no duplicates:
In [623]: df1.index.is_unique
Out[623]: True
I have also checked that both indexes are of type DatetimeIndex:
In [624]: type(df1.index)
Out[624]: pandas.tseries.index.DatetimeIndex
In [625]: type(df4.index)
Out[625]: pandas.tseries.index.DatetimeIndex
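The checks above cover the uniqueness of df1.index and the type of both indexes, but not whether df4.index itself is unique. Below is a minimal sketch with made-up data, assuming df1 and df3 share at least one timestamp (this has not been confirmed for the real data). The column name `signal` and all values are hypothetical, and `ffill()` and `reindex()` stand in for the `fillna(method='ffill')` and `pd.DataFrame(df4, index=...)` calls in the original code. It shows how the concatenated frame can end up with duplicate labels even though df1.index is unique, and that reindexing it then raises a ValueError:

```python
import pandas as pd

# Hypothetical data: both frames contain the timestamp 10:00:01.
df1 = pd.DataFrame(
    {"mid_price": [100.0, 101.0, 102.0]},
    index=pd.to_datetime(
        ["2016-01-01 10:00:00", "2016-01-01 10:00:01", "2016-01-01 10:00:03"]
    ),
)
df3 = pd.DataFrame(
    {"signal": [1.0, 2.0]},  # "signal" is an assumed column name
    index=pd.to_datetime(["2016-01-01 10:00:01", "2016-01-01 10:00:02"]),
)

# Same shape of operation as in the question: stack rows, sort, forward-fill.
df4 = pd.concat([df1[["mid_price"]], df3]).sort_index().ffill()

df1_unique = df1.index.is_unique  # index of the target frame
df4_unique = df4.index.is_unique  # index of the frame being reindexed

# Labels that occur more than once in the concatenated frame.
dup_labels = df4.index[df4.index.duplicated(keep=False)].unique()

# Reindexing a frame whose own index has duplicates raises ValueError.
try:
    df4.reindex(df1.index)
    reindex_failed = False
except ValueError:
    reindex_failed = True

print(df1_unique, df4_unique, list(dup_labels), reindex_failed)
```

If the real df4 behaves the same way, then `df4.index.is_unique` would be the check to run, since the error concerns the index being reindexed from rather than the target index.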
Is there anything I am doing wrong? Any suggestions? Thank you very much.
Update: I found a workaround, but I would still like to understand what went wrong with my original code. The workaround is: 1. fill NaN values only in the columns that come from df3; 2. use a column of df1 being non-null to extract the relevant rows from df4:
df4 = df4[df4.ix[:,0].notnull()]
Not elegant, but it works for now.
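For reference, the two workaround steps can be sketched end to end as follows. This is only an illustration on made-up data: the column name `signal` and all values are hypothetical, it assumes mid_price has no NaN values of its own in df1, and it selects the column by name rather than by position with `.ix`, which newer pandas versions no longer provide.

```python
import pandas as pd

# Hypothetical data; both frames share the timestamp 10:00:01.
df1 = pd.DataFrame(
    {"mid_price": [100.0, 101.0, 102.0]},
    index=pd.to_datetime(
        ["2016-01-01 10:00:00", "2016-01-01 10:00:01", "2016-01-01 10:00:03"]
    ),
)
df3 = pd.DataFrame(
    {"signal": [1.0, 2.0]},  # "signal" is an assumed column name
    index=pd.to_datetime(["2016-01-01 10:00:01", "2016-01-01 10:00:02"]),
)

# Stack the rows and sort by time. A stable sort keeps df1's row ahead of
# df3's row when two rows share a timestamp, so the result is deterministic.
df4 = pd.concat([df1[["mid_price"]], df3]).sort_index(kind="mergesort")

# Step 1: forward-fill only the columns that come from df3,
# leaving mid_price as NaN on the rows that came from df3.
for col in df3.columns:
    df4[col] = df4[col].ffill()

# Step 2: keep only the rows that originated in df1,
# identified by a non-null mid_price.
result = df4[df4["mid_price"].notnull()]

print(result)
```

Because no reindexing happens here, duplicate timestamps in df4 do not trigger the error; the rows from df3 are simply filtered out in step 2.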