We have the following dataframe (df):
print(df)
#Gene GSM772 GSM773 GSM774 GSM775 GSM776
0610007P14Rik 0.003485 0.003415 0.005431 0.003667 0.007146
0610009B22Rik 0.001220 0.001351 0.001762 0.001404 0.002177
0610009L18Rik 0.000055 0.000009 0.000152 0.000082 0.000179
0610009O20Rik 0.000000 0.006830 00000000 0.006653 0.006907
0610010F05Rik 0.008310 0.008329 0.007091 0.006919 0.006915
We want to calculate the geometric mean for every row and append the result as the last column, named GeometricMean.
Some rows contain zero values. These should not cause an error; the geometric mean for such a row should simply be reported as zero.
We wrote the following Python script:
import numpy as np
import scipy.stats

np.seterr(divide='ignore')  # silence the divide-by-zero warning raised by log(0)
# calculates gmean rowwise, axis=1 for rowwise
gmean = scipy.stats.gmean(df.iloc[:,1:5], axis=1)
results = df.assign(GeometricMean=gmean)
print(results)
The following error is encountered:
AttributeError: 'str' object has no attribute 'log'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "calculate_gmean.py", line 99, in <module>
    scipy.stats.gmean(df.iloc[:,1:5],axis=1) #calculates gmean rowwise, axis=1 for rowwise
  File "/home/.local/lib/python3.6/site-packages/scipy/stats/stats.py", line 402, in gmean
    log_a = np.log(np.array(a, dtype=dtype))
TypeError: loop of ufunc does not support argument 0 of type str which has no callable log method
Can anyone please suggest the best way to resolve this issue?
Thanks !!
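
One possible approach, offered as a sketch rather than a definitive fix: the "'str' object has no attribute 'log'" message suggests that the sample columns hold strings (object dtype) rather than floats, so the example below converts them with pd.to_numeric before averaging the logs. The toy dataframe is rebuilt from the table in the question with the values deliberately stored as strings, which is only an assumption about how the real data was loaded. The names sample_values and geo_mean are illustrative, and the geometric mean is computed directly with NumPy as exp(mean(log(x))) instead of scipy.stats.gmean.

```python
import numpy as np
import pandas as pd

# Toy data copied from the question. Values are stored as strings ON PURPOSE
# to mimic the assumed cause of the error (object-dtype columns).
df = pd.DataFrame({
    "#Gene": ["0610007P14Rik", "0610009B22Rik", "0610009L18Rik",
              "0610009O20Rik", "0610010F05Rik"],
    "GSM772": ["0.003485", "0.001220", "0.000055", "0.000000", "0.008310"],
    "GSM773": ["0.003415", "0.001351", "0.000009", "0.006830", "0.008329"],
    "GSM774": ["0.005431", "0.001762", "0.000152", "00000000", "0.007091"],
    "GSM775": ["0.003667", "0.001404", "0.000082", "0.006653", "0.006919"],
    "GSM776": ["0.007146", "0.002177", "0.000179", "0.006907", "0.006915"],
})

# Convert every column after "#Gene" to floats; anything that cannot be
# parsed becomes NaN instead of raising.
sample_values = (
    df.iloc[:, 1:]
    .apply(pd.to_numeric, errors="coerce")
    .to_numpy(dtype=float)
)

# Geometric mean as exp(mean(log(x))), computed row-wise (axis=1).
# log(0) is -inf, so a row containing a zero ends up as exp(-inf) == 0.0.
with np.errstate(divide="ignore"):
    geo_mean = np.exp(np.log(sample_values).mean(axis=1))

# Append the result as the last column.
results = df.assign(GeometricMean=geo_mean)
print(results)
```

Rows containing a zero come out as 0.0 because log(0) is -inf and exp(-inf) is 0. Also note that, if #Gene is an ordinary column, df.iloc[:,1:5] covers only GSM772 to GSM775, so the sketch uses iloc[:, 1:] to include GSM776 as well.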