Pandas: dynamically append dataframe column into a running total dataframe within a loop

Question

I am writing a simulation and I am trying to append the result of each iteration to a dataframe that keeps track of all the iterations.

While collecting the results works fine, I cannot find a way to append them as a new column each time. I have been banging my head against this issue for a while now and cannot get past it.

I have built a simplified version of what I am doing to best explain my issue:

import simpy
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import pandas as pd

###dataframe for the simulation
df = pd.DataFrame({'Id' : ['1183', '1187']})
df['average_demand'] = [7426,989]
df['lead_time'] = [1.5, 1.5]
df['sale_price'] = [1.98, 2.01]
df['buy_price'] = [0.11, 0.23]
df['beg_inventory'] = [1544,674]
df['margin'] = df['sale_price'] - df['buy_price']
df['holding_cost'] = 0.2/12
df['aggregate_order_placement_cost'] = 1000
df['review_time'] = 0
df['periods'] = 30
#df['cap_ts'] = 1.5
df['min_ts'] = 1
df['low_demand'] = [300, 30]#,3000,350,220,40,42,40,10,25,240]
df['high_demand'] = [1000, 130]#,12000,700,500,100,90,210,135,200,800]
df['low_sd'] = [160,30]#,3400,100,90,10,5,50,26,45,170]
df['high_sd'] = [400,90]#,5500,200,160,60,50,100,78,113,300]
cap_ts = 0

big_df = pd.DataFrame(df)

for i in df.index:
    for cap_ts in range(1,12, 1):
        def warehouse_run(env, df):

            df['inventory'] = df['beg_inventory']
            df['balance'] = 0.0
            df['quantity_on_order'] = 0
            df['count_order_placed'] = 0
            df['commands_on_order'] = 0
            df['demand'] = 0
            df['safety_stock'] = 0
            df['stockout_occurence'] = 0
            df['inventory_position'] = 0

            while True:
                interarrival = generate_interarrival()
                yield env.timeout(interarrival)
                df['balance'] -= df['inventory'] * df['holding_cost'] * interarrival
                df['demand'] = generate_demand()
                if df['demand'].loc[i] < df['inventory'].loc[i]:
                    df['balance'] += df['sale_price'] * df['demand']
                    df['inventory'] -= df['demand']

                    print('{:.2f} sold {}'.format(env.now, df['demand'].loc[i]))

                else:
                    df['balance'] += df['sale_price'] * df['inventory']
                    df['inventory'] = 0
                    df['stockout_occurence'] += 1
                    print('{:.2f} demand {} but inventory{}'.format(env.now, df['demand'].loc[i], df['inventory'].loc[i]))
                    print('{:.2f} sold {} ( nb stockout)'.format(env.now, df['stockout_occurence'].loc[i]))
                if df['demand'].loc[i] > df['inventory'].loc[i]:

                    env.process(handle_order(env,
                                             df))
                    df['count_order_placed'] += 1
                    print("inventory", df['inventory'].loc[i])
                    print("number of orders placed", df['count_order_placed'].loc[i])

        def handle_order(env, df):

            df['quantity_ordered'] = cap_ts *df['average_demand']
            df['quantity_on_order'] += df['quantity_ordered']
            df['commands_on_order'] += 1
            print("{:.2f} placed order for {}".format(env.now, df['quantity_ordered'].loc[i]))
            df['balance'] -= df['buy_price'] * df['quantity_ordered'] + df['aggregate_order_placement_cost']

            yield env.timeout(df['lead_time'].loc[i], 0)
            df['inventory'] += df['quantity_ordered']
            df['quantity_on_order'] -= df['quantity_ordered']
            df['commands_on_order'] -= 1
            print('{:.2f} receive order,{} in inventory'.format(env.now, df['inventory'].loc[i]))


        # number of orders per month
        def generate_interarrival():
            return np.random.exponential(1. / 1)


        # quantity of demand per months
        def generate_demand():
            return np.random.randint(df['low_demand'].loc[i], df['high_demand'].loc[i])


        def generate_standard_deviation():
            return np.random.randint(df['low_sd'].loc[i], df['high_sd'].loc[i])


        obs_time = []
        inventory_level = []
        demand_level = []
        safety_stock_level = []
        inventory_position_level = []


        def observe(env, df):
            while True:
                obs_time.append(env.now)
                inventory_level.append(df['inventory'].loc[i])
                demand_level.append(df['demand'].loc[i])
                safety_stock_level.append(df['safety_stock'].loc[i])
                inventory_position_level.append(df['inventory_position'].loc[i])
                yield env.timeout(0.1)


        np.random.seed(0)

        env = simpy.Environment()
        env.process(warehouse_run(env, df))
        env.process(observe(env, df))

        # #RUN FOR 12 MONTHS
        env.run(until=36.0)
        recap = pd.DataFrame(df.loc[i])
        recap = recap.transpose()

        #big_df.append(recap)
        big_df['Iteration {}'.format(i)] = recap
        print(recap)

In this code, the issue is appending the results contained in recap to big_df. Ideally, at the end of the simulation, big_df should contain 24 columns, one column of results for each iteration of the simulation. Any help would be greatly appreciated, thank you.

UPDATE: thanks to twnsfan40 I have been able to get a df that concatenates the results for each iteration, but big_df resets at each iteration and does not keep appending each new df.

The expected output looks something like this:

      Id result_columns
0   11198              x
1   11198              x
2   11198              x
3   11198              x
4   11198              x
5   11198              x
6   11198              x
7   11198              x
8   11198              x
9   11198              x
10  11198              x
11  11198              x
12  11187              y
13  11187              y
14  11187              y
15  11187              y
16  11187              y
17  11187              y
18  11187              y
19  11187              y
20  11187              y
21  11187              y
22  11187              y
23  11187              y

where result_columns is shorthand for all the columns containing results for each row.
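
A minimal sketch of one way to get the long layout described here, under stated assumptions: the rows are collected in a plain list created once before both loops, and the DataFrame is built a single time at the end. `run_simulation` and the result fields it returns are hypothetical stand-ins for the SimPy run, not code from the question. Note that `range(1, 12)` yields 11 values, so 12 iterations per Id would need `range(1, 13)`.

```python
import numpy as np
import pandas as pd

# Stand-in for the inputs; only the columns this sketch needs.
df = pd.DataFrame({'Id': ['1183', '1187'],
                   'average_demand': [7426, 989]})


def run_simulation(row, cap_ts, rng):
    """Hypothetical placeholder for one SimPy run; returns a dict of results."""
    demand = int(rng.integers(1, 100))
    return {'Id': row['Id'],
            'cap_ts': cap_ts,
            'quantity_ordered': cap_ts * row['average_demand'],
            'demand': demand}


rng = np.random.default_rng(0)
rows = []                            # created once, before both loops
for i in df.index:
    for cap_ts in range(1, 13):      # 12 values; range(1, 12) gives only 11
        rows.append(run_simulation(df.loc[i], cap_ts, rng))

big_df = pd.DataFrame(rows)          # one row per (Id, cap_ts) iteration
print(big_df.shape)
```

Because the list lives outside the loops, nothing is reset between iterations, and building the frame once at the end avoids repeatedly copying a growing DataFrame.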

twnsfan40 · Answer 1 · 2020-07-06T22:57:32.817

Assign df's columns as the index of big_df when it is initialized, using:

big_df = pd.DataFrame(index = df.index)

Then try assigning a column value instead of calling append, such as:

big_df['Iteration {}'.format(i)] = recap
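
A rough sketch of the column-assignment idea, with some assumptions that are not in the answer itself: big_df is indexed by `df.columns` (a single result row, `df.loc[i]`, is a Series labelled by column names), it is created once before the loops, and the column name includes both the Id and `cap_ts`. With `'Iteration {}'.format(i)` alone, every `cap_ts` value for the same `i` writes to the same column and overwrites the previous result. The multiplication below merely stands in for the simulation.

```python
import pandas as pd

df = pd.DataFrame({'Id': ['1183', '1187'],
                   'beg_inventory': [1544, 674]})

# Created once, before the loops, and indexed by df's column names so that
# each result row (a Series indexed by those names) aligns on assignment.
big_df = pd.DataFrame(index=df.columns)

for i in df.index:
    for cap_ts in range(1, 13):
        # Stand-in for the simulation: work on a copy and change one value.
        result = df.loc[i].copy()
        result['beg_inventory'] = result['beg_inventory'] * cap_ts
        # Including both Id and cap_ts keeps every column name unique.
        big_df['Id {} cap_ts {}'.format(result['Id'], cap_ts)] = result

print(big_df.shape)
```

If long rows are preferred over wide columns, `big_df.T` transposes the result so that each iteration becomes one row.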

edited Jul 06 '20 at 22:57

answered Jul 06 '20 at 22:36


With a little tweak, this somewhat works, but big_df is reset at each iteration. How could I keep it accumulating all the way through? – Murcielago Jul 06 '20 at 22:53
