Efficient NumPy array computation in Python

Question

I need to calculate a special variance estimate (see the formula below). I have a feature matrix X of shape d × l (d is the number of features, l is the number of objects). It is simple to do this with nested for loops:


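    import numpy as np

    # Squared Euclidean distance between every pair of rows i < j of X
    var_list = []
    for i in range(X.shape[0]):
        for j in range(i + 1, X.shape[0]):
            var_list.append(((X[i, :] - X[j, :]) ** 2).sum())

    # The estimate is the median of those squared distances
    variance = np.median(var_list)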

But this is inefficient because of the Python loops. Is there a faster way to do it with NumPy?

Formula for variance, as computed by the loops above: `variance = median over i < j of ||X[i, :] - X[j, :]||^2`

Question has actually nothing to do with `machine-learning` - kindly do not spam irrelevant tags (removed). As for the rest, please post a [mcve]. — desertnaut, Feb 18 '21 at 12:54

score 0 · Answer 1 · answered Feb 18 '21 at 12:54


You can use `numpy.var()` to compute the variance faster. See the documentation: https://numpy.org/doc/stable/reference/generated/numpy.var.html
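For illustration, here is a minimal sketch of what `np.var` computes, using a made-up array `x` (not data from the question). The `ddof` argument switches between dividing by n and dividing by n - 1. Note that this is the ordinary variance, not the pairwise-median estimate from the question.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0])

# Default ddof=0: sum of squared deviations divided by n
population_var = np.var(x)

# ddof=1: sum of squared deviations divided by n - 1
sample_var = np.var(x, ddof=1)

print(population_var, sample_var)
```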


Also note that if you want the empirical (or sample) variance, you need to use `np.var(x, ddof=1)`. See: https://stackoverflow.com/questions/41204400/what-is-the-difference-between-numpy-var-and-statistics-variance-in-python – Loic RW Feb 18 '21 at 12:57
But I said that I need a special variance estimate, which is not implemented in numpy. – Domino Fortune Feb 18 '21 at 13:20
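One possible way to vectorize the estimate from the question is sketched below. This is an illustration, not a method given in the thread. The function name `pairwise_median_sq_dist` is hypothetical. It pairs rows of `X`, exactly as the loops in the question do, so if the objects are stored in columns (as the d × l description suggests) you would pass `X.T` instead. It relies on the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b and builds the full n × n distance matrix, so memory use grows quadratically with the number of rows.

```python
import numpy as np


def pairwise_median_sq_dist(X):
    """Median of squared Euclidean distances over all row pairs i < j of X."""
    X = np.asarray(X, dtype=float)

    # Squared norm of every row: sum over features of X[i, k] ** 2
    sq_norms = np.einsum("ij,ij->i", X, X)

    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, for all row pairs at once
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)

    # Floating-point rounding can leave tiny negative values; clip them to zero
    np.maximum(sq_dists, 0.0, out=sq_dists)

    # Keep only the strict upper triangle (i < j), matching the double loop
    upper = np.triu_indices(X.shape[0], k=1)
    return np.median(sq_dists[upper])
```

With the question's matrix this would be called as `variance = pairwise_median_sq_dist(X)`. If SciPy is available, `np.median(scipy.spatial.distance.pdist(X, "sqeuclidean"))` should give the same value without materializing the full square matrix.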
