
I have data that I need to extrapolate ('Y' vs 'X') separately for every 'id' (group), until 'Y' drops to zero.

Here, 'X' = 'cycle' and 'Y' = 'Covariate'.

However, the last observed value of 'Covariate' is different for every 'id'.

This is my code:

## Import the required Python libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Create a dataset
data = {'id': [1, 1, 1, 1, 1, 1, 1, 1,
               2, 2, 2, 2, 2, 2,
               3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
               4, 4, 4, 4],
        'cycle': [1, 2, 3, 4, 5, 6, 7, 8,
                  1, 2, 3, 4, 5, 6,
                  1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                  1,2,3,4],
        'Covariate': [1,  0.99,  0.97,  0.95,  0.9,  0.87,  0.86,  0.81,
                      1,  0.97,  0.94,  0.93,  0.90,  0.89,
                      1,  0.99,  0.96,  0.93,  0.89,  0.88,  0.85,   0.83,  0.82,  0.8,
                      1,  0.96,  0.94,  0.9],
        }



## Convert to dataframe
df = pd.DataFrame(data)
print("df = \n", df)

The above data frame, plotted as 'Covariate' vs 'cycle' for each 'id', looks like this:

[image: plot of 'Covariate' vs 'cycle' for each 'id']

For the extrapolation, this is my code:

## 1st order fitting
Order_fit = 1

## Collect the extrapolated rows of every id here
Extrapolated_rows = []

## Iterate over all ids
for id_Number, group in df.groupby('id'):

    ## Fit a polynomial to the data of the specific id
    fit = np.polyfit(group['cycle'],        ## X-axis data
                     group['Covariate'],    ## Y-axis data
                     Order_fit)             ## Order of fit
    Extrapolate = np.poly1d(fit)

    ## Get the last cycle of this id
    Last_cycle_ith_id = group['cycle'].max()

    ## Create new X-axis data points for this id (5 cycles, starting at the last one)
    X_axis_data_new_ith_id = np.arange(5) + Last_cycle_ith_id

    ## Create new Y-axis data points for this id
    Y_axis_data_new_ith_id = Extrapolate(X_axis_data_new_ith_id)

    ## Store the extrapolated data of this id
    Extrapolated_rows.append(pd.DataFrame({'id': id_Number,
                                           'cycle_extrapol': X_axis_data_new_ith_id,
                                           'Covariate_extrapol': Y_axis_data_new_ith_id}))

## Extrapolated data for all ids
df_Extrapolated = pd.concat(Extrapolated_rows, ignore_index=True)
print("\n df_Extrapolated = \n", df_Extrapolated)
  

## Plot the data (one subplot per id)
plt.figure(figsize=(10,10))
for k, id_Number in enumerate([1, 2, 3, 4]):
    actual = df[df['id'] == id_Number]
    extrapolated = df_Extrapolated[df_Extrapolated['id'] == id_Number]
    plt.subplot(2, 2, k + 1)
    plt.plot(actual['cycle'], actual['Covariate'], 'b', label='Actual Data')
    plt.plot(extrapolated['cycle_extrapol'], extrapolated['Covariate_extrapol'], 'r', label='Extrapolated Data')
    plt.xlabel('cycle')
    plt.ylabel('Covariate')
    plt.legend()
    plt.title('id ' + str(id_Number))

plt.show()

The extrapolated data then looks like this:

[image: actual and extrapolated 'Covariate' vs 'cycle' for each 'id']

The limitation here is that I generate a fixed number of new X-axis data points (X_axis_data_new_ith_id) and then pass them through the 'Extrapolate' function to get the new Y-axis data points (Y_axis_data_new_ith_id) for every id, so the extrapolation stops after those few cycles.

What I need instead is, for every id, to run the extrapolation until the 'Y' data, i.e. 'Covariate', drops to zero, like this:

[image: extrapolation continued until 'Covariate' reaches zero]

Can anyone please let me know how to achieve this task in Python?
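
One direction that might work, given here only as a minimal sketch: instead of a fixed number of new cycles, find where the fitted polynomial crosses zero and generate cycles up to that point. The helper name `extrapolate_to_zero` and its `order` and `step` parameters are hypothetical (they are not part of the code above), and the sketch assumes the fit actually reaches zero at some cycle after the last observed one, which holds for a first-order fit with a negative slope.

```python
import numpy as np
import pandas as pd

data = {'id': [1]*8 + [2]*6 + [3]*10 + [4]*4,
        'cycle': list(range(1, 9)) + list(range(1, 7)) + list(range(1, 11)) + list(range(1, 5)),
        'Covariate': [1, 0.99, 0.97, 0.95, 0.9, 0.87, 0.86, 0.81,
                      1, 0.97, 0.94, 0.93, 0.90, 0.89,
                      1, 0.99, 0.96, 0.93, 0.89, 0.88, 0.85, 0.83, 0.82, 0.8,
                      1, 0.96, 0.94, 0.9]}
df = pd.DataFrame(data)


def extrapolate_to_zero(group, order=1, step=1):
    """Hypothetical helper: extend one id's fit from its last cycle to the zero crossing."""
    fit = np.poly1d(np.polyfit(group['cycle'], group['Covariate'], order))
    last_cycle = group['cycle'].max()

    # Real roots of the fitted polynomial that lie beyond the last observed cycle
    roots = fit.roots
    real_roots = roots[np.isreal(roots)].real
    future_roots = real_roots[real_roots > last_cycle]
    if future_roots.size == 0:
        # Assumption violated: the fitted curve never reaches zero after the data ends
        raise ValueError("the fit never reaches zero after the last cycle")
    zero_cycle = future_roots.min()

    # Cycles from the last observed one up to the crossing, then the crossing itself
    x_new = np.arange(last_cycle, zero_cycle, step, dtype=float)
    x_new = np.append(x_new, zero_cycle)

    return pd.DataFrame({'id': group['id'].iloc[0],
                         'cycle_extrapol': x_new,
                         'Covariate_extrapol': fit(x_new)})


# Same column names as df_Extrapolated above, so the plotting code can stay unchanged
df_Extrapolated = pd.concat([extrapolate_to_zero(g) for _, g in df.groupby('id')],
                            ignore_index=True)
print(df_Extrapolated.groupby('id').tail(1))
```

The last row of each id is the (generally non-integer) cycle at which the fitted line hits zero. If only whole cycles are wanted, that final point could be dropped or rounded up, at the cost of the last value not being exactly zero.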

NN_Developer