2

I have two dataframes with Date as index,
df1 as follows:

ticker          AAPL       AMD      BIDU     GOOGL      IXIC      MSFT   
Date                                                                     
2011-06-29  0.017664  0.024379  0.029592  0.016947  0.010632  0.014888   
2011-07-14  0.011054  0.023751  0.020682  0.018319  0.011720  0.008974   
2011-07-28  0.015654  0.061570  0.027052  0.040214  0.013136  0.018023   
2011-08-11  0.033560  0.051619  0.050874  0.033624  0.039744  0.033243   
2011-08-25  0.023524  0.037700  0.039888  0.026866  0.025491  0.017059   
...              ...       ...       ...       ...       ...       ...   
2021-03-25  0.019135  0.022755  0.058591  0.013458  0.013935  0.014096   
2021-04-09  0.011087  0.015793  0.027506  0.014799  0.008590  0.014066   
2021-04-23  0.015193  0.035461  0.021222  0.011206  0.010541  0.010495   
2021-05-07  0.013870  0.019531  0.009501  0.014049  0.008334  0.012325   
2021-05-21  0.017664  0.022013  0.028316  0.017536  0.016039  0.016377   

And df2 as follows:

ticker               AAPL       AMD      BIDU     GOOGL      IXIC      MSFT   
Date                                                                          
2011-06-29       0.005953  0.010331  0.008564  0.010089  0.000288  0.004542   
2011-07-14       0.006609  0.000628  0.008910  0.001372  0.001088  0.005914   
2011-07-28       0.004600  0.037819  0.006369  0.021895  0.001416  0.009049   
2011-08-11       0.017906  0.009951  0.023822  0.006591  0.026608  0.015220   
2011-08-25       0.010036  0.013919  0.010986  0.006758  0.014252  0.016183   
...                   ...       ...       ...       ...       ...       ...   
2021-03-25       0.010546  0.017421  0.015649  0.010738  0.009917  0.005328   
2021-04-09       0.008049  0.006963  0.031085  0.001341  0.005345  0.000031   
2021-04-23       0.004107  0.019669  0.006285  0.003594  0.001951  0.003571   
2021-05-07       0.001323  0.015930  0.011720  0.002843  0.002206  0.001831   
2021-05-21       0.003793  0.002482  0.018815  0.003487  0.007704  0.004052   

I checked that both dataframes have the same number of rows and columns, and no NaN. ie, for both dataframes, df.isnull().values.any() returns False.

And I would like to calculate the element-wise division between those two division, ie, for the first element, I want the value of 0.017664/0.005953.
I've seen the post here: What does .div do in Pandas (Python)
So I simply try df1.div(df2) which returns all NaN values, but doubles the number of columns.

Thanks for @Rafa's comment, I checked with the columns, but

df1.columns

returns

MultiIndex([('Close',  'AAPL'),
            ('Close',   'AMD'),
            ('Close',  'BIDU'),
            ('Close', 'GOOGL'),
            ('Close',  'IXIC'),
            ('Close',  'MSFT'),
           names=[None, 'ticker'])

So I checked with the index.

My

df1.index

returns

Index(['2011-06-29', '2011-07-14', '2011-07-28', '2011-08-11', '2011-08-25',
       '2011-09-09', '2011-09-23', '2011-10-07', '2011-10-21', '2011-11-04',
       ...
       '2021-01-12', '2021-01-27', '2021-02-10', '2021-02-25', '2021-03-11',
       '2021-03-25', '2021-04-09', '2021-04-23', '2021-05-07', '2021-05-21'],
      dtype='object', name='Date', length=250)

but if tried

df1.index.get_level_values(1)

it says

IndexError: Too many levels: Index has only 1 level, not 2

Now I am confused with whether my df1 is a multiindex dataframe or not?

And how I should proceed? Thanks for help.

Lazer
  • 45
  • 6
  • Did you check out [Divide two dataframes with python](https://stackoverflow.com/questions/39454542/divide-two-dataframes-with-python)? – DarrylG Jul 19 '21 at 13:02
  • 2
    It looks like the columns in each of the dataframes are different, despite looking like they are the same. Can you print df1.columns and df2.columns and compare them? If they are different, just change the columns by e.g. df2.columns = df1.columns and it should be ready to go. – Rafa Jul 19 '21 at 13:03
  • @Rafa Thank you. It looks like you are right. I checked `df1.columns` and it says `'MultiIndex' object is not callable`. Maybe it is due to the `ticker`? But my `df1.index` are the Dates and if I tried `df1.index.get_level_values(1)`, it says `Too many levels: Index has only 1 level, not 2`. Any help with how I should deal with this problem? – Lazer Jul 19 '21 at 13:13
  • If you have a multi index, you should get a list of tuples back when accessing `.columns` or `.index`. – ifly6 Jul 19 '21 at 13:42
  • @ifly6 Thank you, and yes there was some mistake, I have edited the question. Any help with how to proceed? – Lazer Jul 19 '21 at 14:11
  • Somewhat obviously, if the index has only one level, it isn't a multi-index. In your columns, collapse by reassigning level 1 to `df.columns`. – ifly6 Jul 19 '21 at 14:15
  • @ifly6 I'm still a bit confused, why it says `Index has only one level, not 2` but `df.columns` returns `MultiIndex([...])`. And how I should collapse by reassigning level 1 to `df.columns`? I tried `df.index = df.columns` but `df.index` is `Date` which contains 250 elements, while `df.columns` has only 12 elements. – Lazer Jul 19 '21 at 14:40
  • Columns are not row labels. When you say index has one level, that's talking about rows, not columns... you queried `df1.index.get_level_values(1)` not `df1.COLUMNS.get_level_values(1)`. Your multi-index is on the columns axis; flatten it by reassigning `df1.columns = df1.columns.get_level_values(1)`. – ifly6 Jul 19 '21 at 15:01
  • If dividing one dataframe with another doubles only your columns, it means that the indices are the same, but columns are different. What happens when you print df2.columns? – Rafa Jul 20 '21 at 07:23
  • @Rafa Yes you are right. `df2.columns` gives the same as my `df1.columns` except that `'Close'` is replaced with `'Logret'`. I don't need the `'Close'` nor the `'Logret'` in my `df1/df2`. How can I collapse the column tuples? – Lazer Jul 20 '21 at 19:40

1 Answers1

0

The problem you're having is that the columns are different - that's why the number of columns doubled, but indices stayed the same. As both df1 and df2 are multiindex and you need second level of multiindex, the only thing needed to be done is:

df1.columns = df1.columns.get_level_values(1)
df2.columns = df2.columns.get_level_values(1)

in case it won't work (for any reason), just make sure that both dataframes have the same list of columns:

df1.columns = df2.columns

and then perform the dividing.

Rafa
  • 564
  • 4
  • 12