As is known, all probabilities need to sum to 1. I have a pandas DataFrame where the probability of one event is sometimes missing.
Since I know that all elements of a row need to sum to one, I want to replace NaN with a calculated value,
using something like the following for each row in my pandas DataFrame:
for item, row in df:
    df.replace(NaN, 1 - (sum of row))
As an example, here's the array I am using as test data at the moment:
matrixsum
     e    f    g
a  0.3  0.2  NaN
b  0.2  0.2  0.6
c  0.7  0.1  NaN
By using df.fillna(0) I get this:
matrixsum
     e    f    g
a  0.3  0.2  0.0
b  0.2  0.2  0.6
c  0.7  0.1  0.0
An additional problem is that only rows in float or int format can be summed to 1, but the NaN entries appear to be string-formatted. At the moment I just use df.fillna(0), but this is a bad thing to do, because the rows then no longer sum to 1.
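If the missing entries really are stored as text rather than as float NaN, one possible way to get summable columns is to convert everything to numeric first. The following is only a sketch under that assumption: the frame `raw` and its string values are hypothetical stand-ins for the real data, and `pd.to_numeric` with `errors="coerce"` is used so that any text that cannot be parsed becomes a real NaN.

```python
import pandas as pd

# Hypothetical raw data in which every cell, including the missing ones,
# arrived as text. This is an assumption, not the actual data.
raw = pd.DataFrame(
    {
        "e": ["0.3", "0.2", "0.7"],
        "f": ["0.2", "0.2", "0.1"],
        "g": ["Nan", "0.6", "Nan"],
    },
    index=["a", "b", "c"],
)

# Convert column by column; errors="coerce" maps unparseable text to NaN.
numeric = raw.apply(pd.to_numeric, errors="coerce")

print(numeric.dtypes)   # dtype of each column after conversion
print(numeric.isna())   # True where a probability is still missing
```

After this step the missing cells are ordinary float NaN values, so row sums and fillna behave as they do for any numeric frame.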
Expected output:
matrixsum
     e    f    g
a  0.3  0.2  0.5
b  0.2  0.2  0.6
c  0.7  0.1  0.2
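A minimal sketch of one way to reach this output, assuming the values are already numeric and each row has at most one NaN (with two or more, every missing cell in that row would receive the full remainder). The names `remainder` and `filled` are illustrative only.

```python
import numpy as np
import pandas as pd

# Test data from the question; NaN marks the missing probability.
matrixsum = pd.DataFrame(
    {
        "e": [0.3, 0.2, 0.7],
        "f": [0.2, 0.2, 0.1],
        "g": [np.nan, 0.6, np.nan],
    },
    index=["a", "b", "c"],
)

# Probability mass not yet assigned in each row (sum skips NaN by default).
remainder = 1 - matrixsum.sum(axis=1)

# Fill NaN in every column with the remainder of the matching row.
# Series.fillna aligns on the index, so existing values are left untouched.
filled = matrixsum.apply(lambda col: col.fillna(remainder))

print(filled.round(2))
```

Because of floating-point arithmetic the filled values can differ from 0.5 and 0.2 in the last decimal places, which is why the sketch rounds before printing.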