I inherited some VBA code that I want to convert to Python.
Think of a survival matrix where:
- each row is a different product
- each column represents the age of the product
I want to create a survival matrix of zeros and then fill each cell from the normal distribution evaluated at (age - life_exp) / sd, where age is the column number and life_exp and sd come from that row's product.
Results: the numbers themselves in my DataFrame lifeleft_2 are good, but they are not in the right place, the dimensions of the result are wrong, and lifeleft_2's column indexes are broken.
Question: How do I make SciPy return a single result for each "observation" (one value per cell) instead of the whole array in each observation?
import pandas as pd
import numpy as np
from scipy.stats import norm
df = pd.DataFrame({'qty': [20, 30, 40],
                   'price': [100, 50, 20],
                   'life_exp': [5, 4, 3]})
df['sd'] = df['life_exp'] / 4
nrows = df.shape[0]
ncols = df['life_exp'].max()*2 + 1 # "+1" because 0 = equals the past
# Survival matrix of zeros where column index = age, used for the normal distribution --> (age-life_exp) / sd
lifeleft = pd.DataFrame(np.zeros((nrows, ncols)))
l_cols = lifeleft.columns
# ---> PROBLEM IS HERE <---
lifeleft_2 = lifeleft.apply(lambda x: 1 - norm.cdf((col-df['life_exp']) / df['sd']) for col in l_cols)
display(lifeleft_2)
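
For reference, here is a minimal sketch of the result I am after, assuming NumPy broadcasting is an acceptable replacement for `apply`. The names `ages`, `life_exp`, `sd` and `z` are only illustrative, and this is not necessarily how the original VBA did it. The idea is to turn the per-product values into column vectors so that subtracting them from the row of ages gives one value per (product, age) cell:

```python
import numpy as np
import pandas as pd
from scipy.stats import norm

df = pd.DataFrame({'qty': [20, 30, 40],
                   'price': [100, 50, 20],
                   'life_exp': [5, 4, 3]})
df['sd'] = df['life_exp'] / 4

ncols = df['life_exp'].max() * 2 + 1
ages = np.arange(ncols)                        # shape (ncols,): 0, 1, ..., ncols - 1

# Column vectors of shape (nrows, 1), so broadcasting against `ages`
# produces an (nrows, ncols) array: one row per product, one column per age.
life_exp = df['life_exp'].to_numpy()[:, None]
sd = df['sd'].to_numpy()[:, None]

z = (ages - life_exp) / sd                     # standardized age for every cell
lifeleft_2 = pd.DataFrame(1 - norm.cdf(z), index=df.index, columns=ages)
```

With this layout, `lifeleft_2.loc[i, age]` is the value for product `i` at that age, and the column index is simply the age.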