0

I have a quite silly task but haven't found a way to do it, I have a huge df, here is the head

    Deal Date Period Name  Price Quarter Start                   Quarter End
0  2011-11-01     2011-Q4  30.76    2011-10-01 2011-12-31 23:59:59.999999999
1  2011-11-01     2012-Q1  30.95    2012-01-01 2012-03-31 23:59:59.999999999
2  2011-11-01     2012-Q2  30.67    2012-04-01 2012-06-30 23:59:59.999999999
3  2011-11-01     2012-Q3  29.87    2012-07-01 2012-09-30 23:59:59.999999999
4  2011-11-01     2012-Q4  29.49    2012-10-01 2012-12-31 23:59:59.999999999

I wish to have an additional column which shows "month", the above 5 rows will become 15 rows, for example the initial row 0 will repeat twice

    Deal Date Period Name  Price Quarter Start                   Quarter End  Month
0  2011-11-01     2011-Q4  30.76    2011-10-01 2011-12-31 23:59:59.999999999  10
1  2011-11-01     2011-Q4  30.76    2011-10-01 2011-12-31 23:59:59.999999999  11 
2  2011-11-01     2011-Q4  30.76    2011-10-01 2011-12-31 23:59:59.999999999  12

as there are these 3 months included in Q4... similar for the rest of rows.

Is there an easy way to achieve this? Thanks

neutralname
  • 383
  • 2
  • 4
  • 11

1 Answers1

0

You can extract the quarter value from period, then perform pandas.merge with a dataframe with only 12 rows containing quarter -> month mapping.

Simplified example code:

import pandas as pd

df_test = pd.DataFrame({'quart':[1,2,3,4,1,2], 'val': ['a','b','c','d','e','f']})

df_quart_to_month = pd.DataFrame({'quart':[1,1,1,2,2,2,3,3,3,4,4,4], 'month': [1,2,3,4,5,6,7,8,9,10,11,12]})

df_with_months = df_test.merge(df_quart_to_month ,on='quart', how='outer')

If you want to keep the original order:

df_with_months = df_test.reset_index().merge(df_quart_to_month ,on='quart', how='outer').set_index('index')

df_sorted = df_with_months.sort_values(['index', 'month'], ascending=[True, True])

Alternatively you could split your dataset into 4 DataFrames based on their quarter, copy each sub-dataframe twice and add the corresponding month. Then concatenate the resulting 12 sub-dataframes together.

Chris
  • 168
  • 8
  • Thanks Chris! I need to keep the original order of rows, with your code the Q1 value will appear before Q2 etc, FYI the data are from several years – neutralname Aug 19 '20 at 14:34
  • @neutralname You can order the resulting dataframe by 2 columns, i.e. first by year, second by quarter. `df1.sort_values(['a', 'b'], ascending=[True, False])` [Source](https://stackoverflow.com/questions/17141558/how-to-sort-a-dataframe-in-python-pandas-by-two-or-more-columns) – Chris Aug 20 '20 at 07:52
  • Thanks again Chris, is there a way to just keep its original order? instead of ordering by new criteria – neutralname Aug 24 '20 at 08:14
  • You can use the orignal index of your DataFrame for that, i.e. keep the index as a column when merging and then sort by the original index (and by month if you want) [Source](https://stackoverflow.com/questions/20277358/sort-pandas-dataframe-both-on-values-of-a-column-and-index). I updated the answer accordingly. Please accept the answer if it answers the original question. – Chris Aug 24 '20 at 13:56