I have a dataset consisting of some 17,000 images, each of size 3000x1700.
I have two methods for computing the mean and standard deviation of my dataset: a pixel-wise approach and an image-wise approach. The pixel-wise approach simply computes the mean and std of all the pixels in the dataset, (using Welford's online algorithm since there are too many pixels in total to store them all in memory at once). The image-wise approach computes the means and variances of all the images in the dataset, then averages them (and applies sqrt to the average variance to get the standard deviation).
These two approaches will, in general, obviously not return the same results, since the image-wise approach will give equal weight to images of different sizes. However, all the images in my dataset are of the exact same size, so in my case, the two algorithms should return the same mean and std values.
import numpy as np
import os
from PIL import Image
class RunningStatisticsVar:
def __init__(self, ddof=0):
self.mean = 0
self.var = 0
self.std = 0
self._n = 0
self._s = 0
self._ddof = ddof
def update(self, values):
values = np.array(values, ndmin=1)
n = len(values)
self._n += n
old_mean = self.mean
delta = values - self.mean
self.mean += (delta / self._n).sum()
self._s += (delta * (values - self.mean)).sum()
self.var = self._s / (self._n - self._ddof) if self._n > self._ddof else 0
self.std = np.sqrt(self.var)
def __str__(self):
if self.std:
return f"{self.name} (\u03BC \u00B1 \u03C3): {self.mean} \u00B1 {self.std}"
else:
return f"{self.name}: {self.mean}"
def calculate_mean_and_std(source_dir, pixelwise=True):
source_fs = [os.path.join(source_dir, f) for f in os.listdir(source_dir)]
if pixelwise:
stats = {colour: RunningStatisticsVar() for colour in 'rgb'}
for source in source_fs:
_process_image_pixelwise(source, stats)
print(f"\u03BC: {[stats[colour].mean for colour in 'rgb']}")
print(f"\u03C3: {[stats[colour].std for colour in 'rgb']}")
else:
means, vars = zip(*(_process_image(source) for source in source_fs))
means = np.array(means)
vars = np.array(vars)
print(f"\u03BC: {[[means[:i+1, c].mean() for c in range(3)] for i in range(len(means))]}")
print(f"\u03C3: {[[np.sqrt(vars[:i+1, c].mean()) for c in range(3)] for i in range(len(vars))]}")
def _process_image(source_f):
img = np.array(Image.open(source_f))
return img.mean(axis=(0,1)), img.var(axis=(0,1))
def _process_image_pixelwise(source_f, stats):
img = np.array(Image.open(source_f))
for c, colour in enumerate('rgb'):
stats[colour].update(img[:, :, c].flatten())
calculate_mean_and_std('/path/to/dataset', True)
calculate_mean_and_std('/path/to/dataset', False)
Now, when running this code, it starts off fine, after the first image it reports:
Pixelwise
μ: [106.0911049019608, 67.80728647058824, 45.90995117647062]
σ: [41.59660208236723, 34.5791272546266, 26.781512448936052]
Imagewise
μ: [106.09110490196079, 67.80728647058824, 45.909951176470585]
σ: [41.596602082409774, 34.579127254617084, 26.78151244893938]
Only a slight aberration in the σ values. However this error gets bigger and bigger as more images are used.
Pixelwise
μ: [101.21394647058824, 65.62210166666667, 46.48841911764708]
σ: [40.41893639932673, 33.3795015706431, 26.946594699203892]
μ: [101.21668875816994, 65.61455176470588, 45.89977104575165]
σ: [39.97502569826382, 33.022993336061994, 27.14311280813053]
μ: [102.36255480392157, 65.52340710784313, 45.7435418137255]
σ: [39.24577507415308, 32.14645170321761, 26.46386419015868]
μ: [103.53420298039215, 66.6776919607843, 47.33793435294118]
σ: [39.50097527065126, 32.707463761715545, 27.354952039420294]
μ: [101.47019483660131, 64.78871330065358, 45.96562905228758]
σ: [39.004169349180536, 32.23864799529698, 26.801524930594407]
μ: [101.37475316526611, 64.89486834733893, 45.781395238095236]
σ: [39.01205310849703, 31.98696402185754, 26.461034312514027]
μ: [100.67556311274511, 64.35973938725489, 45.18712225490196]
σ: [39.4077244026275, 32.09956875358911, 26.236057166690983]
μ: [99.85794368191722, 63.74056908496731, 44.48019496732026]
σ: [39.711590893165045, 32.16450280203762, 26.02186219040715]
μ: [99.51202813725492, 63.552036745098036, 44.69082580392157]
σ: [39.67742805536442, 32.360665575523704, 26.47325733239239]
μ: [99.68222427807488, 63.67247614973262, 44.77697739750446]
σ: [39.704289274013554, 32.212222956614625, 26.25452243516258]
Imagewise
μ: [101.21394647058824, 65.62210166666667, 46.488419117647055]
σ: [40.12360583609732, 33.30789834971034, 26.940384940174805]
μ: [101.21668875816994, 65.61455176470588, 45.899771045751635]
σ: [39.77618485504031, 32.97475731200615, 27.126232255358936]
μ [102.36255480392157, 65.52340710784313, 45.74354181372549]
σ: [39.04354601878276, 32.108905824505584, 26.44949551028786]
μ: [103.53420298039217, 66.6776919607843, 47.33793435294118]
σ: [39.27047375795457, 32.596298448583816, 27.157260809967887]
μ: [101.47019483660131, 64.78871330065358, 45.96562905228759]
σ: [38.53431972571801, 31.865963477681383, 26.4560983662707]
μ: [101.37475316526611, 64.89486834733893, 45.78139523809524]
σ: [38.60904936868449, 31.66418210935895, 26.157487873733803]
μ: [100.6755631127451, 64.3597393872549, 45.18712225490196]
σ: [39.01506478558764, 31.78679779866631, 25.920704631785057]
μ: [99.85794368191722, 63.74056908496732, 44.48019496732026]
σ: [39.29746139951266, 31.839074510873296, 25.66162709110527]
μ: [99.5120281372549, 63.55203674509804, 44.69082580392156]
σ: [39.290881869376754, 32.064732326778504, 26.14723081778954]
μ: [99.68222427807487, 63.672476149732624, 44.77697739750445]
σ: [39.34959988736086, 31.939785095367306, 25.95437650452569]
and finally, after running the procedure on the entire dataset, the error becomes quite significant:
Pixelwise
μ: [101.8700454273955, 74.8429931459155, 64.76057496565133]
σ: [48.662522310391594, 41.730824001340146, 39.50453890881377]
Imagewise
μ: [101.87004542739516, 74.84299314591567, 64.76057496565105]
σ: [36.19880052359236, 32.38430878761532, 31.063098948011394]
As you can see, the means remain pretty much identical with both approaches even when the entire dataset is used. However, the standard deviation results are very far apart, even though they should be the same for both approaches. Which of my two approaches is at fault here and why?