I have sales by year:

import pandas as pd

df = pd.DataFrame({'year': [2015, 2016, 2017], 'value': ['12', '24', '36']})
print(df)
    year    value
0   2015    12
1   2016    24
2   2017    36

I want to extrapolate to months:

yyyymm value
201501 1 (ie 12/12, etc)
201502 1
...
201512 1
201601 2
...
201712 3

any suggestions?

frank

1 Answer

One idea is to use a cross join with a helper DataFrame, then convert both columns to strings and zero-pad the month with Series.str.zfill:

df1 = pd.DataFrame({'m': range(1, 13), 'a' : 1})

# drop(columns=...) replaces the positional axis argument, which pandas 2.0 removed
df = df.assign(a=1).merge(df1).drop(columns='a')
df['year'] = df['year'].astype(str) + df.pop('m').astype(str).str.zfill(2)
df = df.rename(columns={'year':'yyyymm'})
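
On pandas 1.2 or later the dummy key column should not be needed, since merge accepts how='cross'. A minimal sketch of the same cross join, assuming the question's data (with 36 for 2017, as in the printed table); the names months and out are only illustrative:

```python
import pandas as pd

# Sample data from the question; values are kept as strings, as posted.
df = pd.DataFrame({'year': [2015, 2016, 2017], 'value': ['12', '24', '36']})

# Illustrative helper frame with one row per month.
months = pd.DataFrame({'m': range(1, 13)})

# how='cross' (pandas >= 1.2) pairs every year row with every month row.
out = df.merge(months, how='cross')

# Build the yyyymm key: year as text plus the zero-padded month.
out['yyyymm'] = out['year'].astype(str) + out['m'].astype(str).str.zfill(2)
out = out[['yyyymm', 'value']]

print(out.head(3))
```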

Another solution is to create a MultiIndex and use DataFrame.reindex:

mux = pd.MultiIndex.from_product([df['year'], range(1, 13)], names=['yyyymm','m'])
df = df.set_index('year').reindex(mux, level=0).reset_index()
df['yyyymm'] = df['yyyymm'].astype(str) + df.pop('m').astype(str).str.zfill(2)

print (df.head(15))
    yyyymm value
0   201501    12
1   201502    12
2   201503    12
3   201504    12
4   201505    12
5   201506    12
6   201507    12
7   201508    12
8   201509    12
9   201510    12
10  201511    12
11  201512    12
12  201601    24
13  201602    24
14  201603    24
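
Both approaches repeat the yearly total on every month. To get the per-month figure the question asks for (12/12 = 1, and so on), the value still has to be divided by 12. A rough sketch, assuming an even split across months and that the string values can be converted with pd.to_numeric. It uses Index.repeat rather than either approach above, only to keep the example short; the name monthly is illustrative:

```python
import pandas as pd

# Sample data from the question (36 for 2017, as in the printed table).
df = pd.DataFrame({'year': [2015, 2016, 2017], 'value': ['12', '24', '36']})

# Repeat each yearly row 12 times, then number the months 1..12 within each year.
monthly = df.loc[df.index.repeat(12)].reset_index(drop=True)
monthly['m'] = monthly.groupby('year').cumcount() + 1

# Assumption: the yearly total is split evenly over the 12 months.
monthly['value'] = pd.to_numeric(monthly['value']) / 12

# Build the yyyymm key and keep only the two requested columns.
monthly['yyyymm'] = monthly['year'].astype(str) + monthly.pop('m').astype(str).str.zfill(2)
monthly = monthly[['yyyymm', 'value']]

print(monthly.head(3))
```

The result holds floats (1.0, 2.0, 3.0); add .astype(int) if whole numbers are guaranteed.
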
jezrael
  • awesome. If there was a specific value that fell in the middle of a year, and not the beginning, how would that work? I will show with another question – frank Oct 15 '19 at 07:43
  • @alex - do you think there is also a column for the month in the original? Or do you need to start from e.g. April for each year? – jezrael Oct 15 '19 at 07:49
  • see updated question at https://stackoverflow.com/questions/58389758/expand-mid-year-values-to-month-in-pandas – frank Oct 15 '19 at 07:53