Here is the problem I pull a csv into a pandas data frame that looks like so:
Identity Date value1 value2 Random
Apple 1/1/2005 10 10 Orange
Apple 12/1/2005 1 1 Orange
I need to then call the Identity Apple, find its min and max dates and insert rows ie months in order to interpolate between the two points so the end result becomes
Identity Date value1 value2 Random
Apple 1/1/2005 10 10 Orange
Apple 2/1/2005 0 0 Orange
Apple 3/1/2005 0 0 Orange
. . . . .
. . . . .
. . . . .
Apple 12/1/2005 1 1 Orange
The problem is that although I can loop through a list of identities and get all rows associated I cant seem to find a way to then insert extra rows, especially without a nasty for loop. essentially I need to bridge the date gap and fill the associated Identity values with zeros.
list = ['Apple','Banana','Orange']
for i in range(0,len(list))
data.loc[data['Identity'].isin(list[i])]
EDIT:
Working Code Below:
import pandas as pd
df = pd.DataFrame([['Apple',pd.to_datetime('1/1/2005'),10,10,'Orange'], ['Orange', pd.to_datetime('8/1/2005'),1, 1 ,'Apple'],['Apple', pd.to_datetime('12/1/2005'),1, 1 ,'Orange']])
df.columns = ['Identity','Date', 'value1' , 'value2','Random']
full_df = pd.DataFrame()
dummydata = []
indentity = ['Apple','Orange']
random = ['Orange','Apple']
years = ['2005','2005']
for i in range(0,2):
full_df = pd.DataFrame()
full_df['Date'] = [pd.to_datetime(str(x)+'/1/'+str(years[i])) for x in range(1,13)]
full_df['Identity'] = indentity[i]
full_df['Random'] = random[i]
dummydata.append(full_df)
full_df = pd.concat(dummydata)
result = full_df.merge(df,how='left').fillna(0)
#print(dummydata)
#print(full_df)
print(result )