2
  • I use pandas and get my data from a SQL database
  • I have two tickers. One is a U.S stock, the other an European stock. Dates are not necessarily the same for both stocks (holidays, etc).
  • All my data is stored in a multi-index DataFrame.
  • Looking to fill missing values based on levels

Running the below code:

import pandas as pd
import datetime
ticker_date = [('US',datetime.date.today()-datetime.timedelta(3)),
('US',datetime.date.today()-datetime.timedelta(2)),
('US',datetime.date.today()-datetime.timedelta(1)),
('EU',datetime.date.today()-datetime.timedelta(3)),
('EU',datetime.date.today()-datetime.timedelta(1))]
index_df = pd.MultiIndex.from_tuples(ticker_date)
example = pd.DataFrame([12.2,12.5,12.6,15.1,15],index_df,['value'])

Output:

Output from code above

enter image description here

I am looking for a method to reshape my output filling the missing data with the previous value:

Objective: add a dec 11th line and fill with previous value

enter image description here

MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
ruh
  • 139
  • 3
  • 9
  • 1
    I'm not sure what's going on with your [previous question](https://stackoverflow.com/questions/47793240/vlookup-method-using-pandas) where you posted this code as an "answer" to what is essentially the same question? – roganjosh Dec 13 '17 at 14:11
  • The objective was to formulate the question in a more clear and readable way – ruh Dec 13 '17 at 14:13
  • 1
    Except it leaves the other question in some limbo state. You should have edited the code into that existing question and not posted code as an answer if it doesn't answer the issue. – roganjosh Dec 13 '17 at 14:14
  • Didn't know I could edit the code into an existing question. New in stackoverflow. I will keep that in mind for my next question. Thxs – ruh Dec 13 '17 at 14:17

2 Answers2

1

I'd do it this way:

In [24]: idx = pd.MultiIndex.from_product((
                   example.index.get_level_values(0).unique(), 
                   example.index.get_level_values(1).unique()))

In [25]: example = example.reindex(idx).ffill()

In [26]: example
Out[26]:
               value
US 2017-12-10   12.2
   2017-12-11   12.5
   2017-12-12   12.6
EU 2017-12-10   15.1
   2017-12-11   15.1
   2017-12-12   15.0
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
0

You need to be careful with this solution. It does not account for crossing the primary index. For instance,

import pandas as pd
import datetime
ticker_date = [ ('US',datetime.date.today()-datetime.timedelta(3)),
                ('US',datetime.date.today()-datetime.timedelta(2)),
                ('US',datetime.date.today()-datetime.timedelta(1)),
                ('EU',datetime.date.today()-datetime.timedelta(2)),
                ('EU',datetime.date.today()-datetime.timedelta(1))]
index_df = pd.MultiIndex.from_tuples(ticker_date)
example = pd.DataFrame([12.2,12.5,12.6,15.1,15],index_df,['value'])

idx = pd.MultiIndex.from_product((
        example.index.get_level_values(0).unique(), 
        example.index.get_level_values(1).unique()))

example = example.reindex(idx).ffill()
print(example)

produces:

               value
US 2019-11-23   12.2
   2019-11-24   12.5
   2019-11-25   12.6
EU 2019-11-23   12.6  <== 
   2019-11-24   15.1
   2019-11-25   15.0
Lloyd
  • 1
  • 1
  • 1
    Welcome to Stack Overflow! Please [edit your answer](https://stackoverflow.com/posts/59057841/edit) and reword it so it addresses the question directly. As it's written, it appears to address the other answer. This is generally considered bad practice, as those kinds of clarifications should be done through comments. But since you don't currently have enough reputation to comment, rephrasing your answer will do :) – Das_Geek Nov 26 '19 at 19:14