Find minimum of the entire dataframe?

Question

I have a dataFrame like:

I need to calculate min, mean, std, and sum over the whole DataFrame, treating it as a single list of numbers (e.g. the minimum here is 1).

EDIT: The data may contain NaNs, and the columns may have different lengths.

df.to_numpy().mean()

This produces NaN, because the arrays contain NaNs and have different lengths.

How can I calculate the usual statistics over all of these numbers?

Did you try something like `df['a'].to_numpy().concat(df['b'].to_numpy())`? I am not sure if it works, but I think you get the spirit. — Eloi, Nov 13 '22 at 21:52
Yes, but the lists are different sizes and have NaNs inside, so the results are completely wrong. — gotiredofcoding, Nov 14 '22 at 06:49
I also tried profits.to_numpy().mean(), which produces NaN. — gotiredofcoding, Nov 14 '22 at 06:54
This question is similar to this one and so the answers there may help: [What's the best way to sum all values in a Pandas dataframe](https://stackoverflow.com/q/38733477/1609514) — Bill, Nov 14 '22 at 07:01

jezrael · Accepted Answer · 2022-11-14T07:06:49.793

The pandas solution is to reshape with DataFrame.stack and then aggregate with Series.agg:

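# pandas' std defaults to ddof=1 (sample std); ddof=0 matches NumPy's default
def std_ddof0(x):
    return x.std(ddof=0)

# stack() reshapes all columns into one Series, which agg() then summarizes
out = df.stack().agg(['mean','sum',std_ddof0, 'min'])
print (out)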
mean          3.888889
sum          35.000000
std_ddof0     2.424158
min           1.000000
dtype: float64
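
Since the question's DataFrame is not shown, here is a minimal self-contained sketch of the stack approach. The column names `a`, `b`, `c` and their values are assumptions, chosen only so that the columns have different lengths and the totals match the output above. The standard deviation is computed separately so that `ddof=0` is explicit.

```python
import pandas as pd

# Hypothetical data (the original frame is not shown): columns of different
# lengths, which pandas pads with NaN when building the DataFrame.
df = pd.DataFrame({
    "a": pd.Series([4, 2, 3, 2]),
    "b": pd.Series([7, 1, 4]),
    "c": pd.Series([3, 9]),
})

# Flatten every column into one Series; dropna() makes the NaN removal explicit.
stacked = df.stack().dropna()

stats = stacked.agg(["mean", "sum", "min"])
stats["std_ddof0"] = stacked.std(ddof=0)  # population std, like NumPy's default
print(stats)
```

Note that `Series.std` defaults to `ddof=1`, which is why the answer defines a separate `std_ddof0` helper to match the NumPy result.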

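The NumPy solution uses the NaN-aware functions np.nanmean, np.nansum, np.nanstd, and np.nanmin:

import numpy as np

# Flatten the 2-D array of values into one dimension
totalp = df.to_numpy().reshape(-1)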

out = np.nanmean(totalp), np.nansum(totalp), np.nanstd(totalp), np.nanmin(totalp)
print (out)
(3.888888888888889, 35.0, 2.4241582476968255, 1.0)
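
As a rough illustration of why the NaN-aware functions are needed, the sketch below contrasts `np.mean` with `np.nanmean` on a flattened array. The sample frame is again an assumption (the question's data is not shown), built so that the shorter columns are padded with NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the question's frame
df = pd.DataFrame({
    "a": pd.Series([4, 2, 3, 2]),
    "b": pd.Series([7, 1, 4]),
    "c": pd.Series([3, 9]),
})

# 2-D float array flattened to 1-D; the NaN padding is still present
flat = df.to_numpy(dtype=float).ravel()

plain_mean = np.mean(flat)    # NaN, because a single NaN poisons the result
nan_mean = np.nanmean(flat)   # ignores NaN entries
nan_sum = np.nansum(flat)
nan_std = np.nanstd(flat)     # ddof=0 by default
nan_min = np.nanmin(flat)
print(plain_mean, nan_mean, nan_sum, nan_std, nan_min)
```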

Another idea is to remove the missing values first:

totalp = df.to_numpy().reshape(-1)
totalp = totalp[~np.isnan(totalp)]
print (totalp)
[4. 7. 3. 2. 1. 9. 3. 4. 2.]

out = np.mean(totalp), np.sum(totalp), np.std(totalp), np.min(totalp)
print (out)
(3.888888888888889, 35.0, 2.4241582476968255, 1.0)

Thanks (again)! You always have the easiest solutions. — gotiredofcoding, Nov 14 '22 at 07:09
