How do PNG encoders pick which filter to use?

Question

The PNG specification states that there are five possible filters, which can be chosen individually for each row of the image. It gives some general suggestions first:

Indexed color and < 8-bit color images should use no filter
Truecolor and greyscale images: "any of the five filters may prove the most effective"
If you can only use one filter for the whole image, Paeth will likely be best

Then it suggests the following, which sounds like it was written in the early days of PNG's development and isn't very specific:

The following simple heuristic has performed well in early tests: compute the output scanline using all five filters, and select the filter that gives the smallest sum of absolute values of outputs. (Consider the output bytes as signed differences for this test.) This method usually outperforms any single fixed filter choice. However, it is likely that better heuristics will be found as more experience is gained with PNG.
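To make the quoted heuristic concrete, here is a minimal sketch of it in C, assuming 8-bit samples and a single scanline at a time. The function names (`filter_row`, `msad_cost`, `choose_filter`) are hypothetical and not taken from libpng or any other encoder; the filter definitions follow the five filter types in the PNG specification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Paeth predictor as defined in the PNG specification. */
uint8_t paeth(uint8_t a, uint8_t b, uint8_t c)
{
    int p = (int)a + (int)b - (int)c;
    int pa = abs(p - a), pb = abs(p - b), pc = abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc) return b;
    return c;
}

/* Apply filter type 0..4 (None, Sub, Up, Average, Paeth) to one scanline.
 * prev may be NULL for the first row, in which case it is treated as zeros.
 * bpp is the number of bytes per complete pixel (at least 1). */
void filter_row(int type, const uint8_t *cur, const uint8_t *prev,
                size_t len, size_t bpp, uint8_t *out)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t a = i >= bpp ? cur[i - bpp] : 0;             /* left       */
        uint8_t b = prev ? prev[i] : 0;                      /* above      */
        uint8_t c = (prev && i >= bpp) ? prev[i - bpp] : 0;  /* upper left */
        uint8_t pred;
        switch (type) {
        case 1:  pred = a; break;
        case 2:  pred = b; break;
        case 3:  pred = (uint8_t)((a + b) / 2); break;
        case 4:  pred = paeth(a, b, c); break;
        default: pred = 0; break;
        }
        out[i] = (uint8_t)(cur[i] - pred);  /* difference modulo 256 */
    }
}

/* Sum of absolute values, reading each output byte as a signed difference
 * (bytes >= 128 count as negative numbers). */
unsigned long msad_cost(const uint8_t *row, size_t len)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += row[i] < 128 ? row[i] : 256u - row[i];
    return sum;
}

/* Try all five filters, copy the cheapest result into out, and return its
 * filter type. scratch must hold len bytes. Ties go to the lower type. */
int choose_filter(const uint8_t *cur, const uint8_t *prev, size_t len,
                  size_t bpp, uint8_t *out, uint8_t *scratch)
{
    assert(bpp > 0);
    int best = 0;
    unsigned long best_cost = 0;
    for (int t = 0; t < 5; t++) {
        filter_row(t, cur, prev, len, bpp, scratch);
        unsigned long cost = msad_cost(scratch, len);
        if (t == 0 || cost < best_cost) {
            best = t;
            best_cost = cost;
            memcpy(out, scratch, len);
        }
    }
    return best;
}
```

For a row such as `{10, 20, 30, 40}` with no previous row, Sub turns every byte into 10, which gives a much smaller sum than the unfiltered bytes, so this sketch would pick Sub. The idea behind the heuristic is that outputs clustered near zero tend to compress better with deflate, though the sum is only a proxy for the compressed size.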

How do encoders typically pick the filters used? Is the method above typical or are there more modern methods?

EDIT:
@Mark-Setchell points to the libpng source code, which contains this long comment starting on line 2416:

/* The prediction method we use is to find which method provides the
 * smallest value when summing the absolute values of the distances
 * from zero, using anything >= 128 as negative numbers.  This is known
 * as the "minimum sum of absolute differences" heuristic.  Other
 * heuristics are the "weighted minimum sum of absolute differences"
 * (experimental and can in theory improve compression), and the "zlib
 * predictive" method (not implemented yet), which does test compressions
 * of lines using different filter methods, and then chooses the
 * (series of) filter(s) that give minimum compressed data size (VERY
 * computationally expensive).
 *
 * GRR 980525:  consider also
 *
 *   (1) minimum sum of absolute differences from running average (i.e.,
 *       keep running sum of non-absolute differences & count of bytes)
 *       [track dispersion, too?  restart average if dispersion too large?]
 *
 *  (1b) minimum sum of absolute differences from sliding average, probably
 *       with window size <= deflate window (usually 32K)
 *
 *   (2) minimum sum of squared differences from zero or running average
 *       (i.e., ~ root-mean-square approach)
 */
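As a rough illustration of two ideas in that comment, the sketch below shows what "using anything >= 128 as negative numbers" means for a filtered byte, and how the sum-of-squares variant from item (2) would differ from the sum of absolute values. The function names are hypothetical and this is not libpng's code, only one reading of the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a filtered byte as a signed difference:
 * 0..127 stay as they are, 128..255 map to -128..-1. */
int signed_diff(uint8_t v)
{
    return v < 128 ? (int)v : (int)v - 256;
}

/* "Minimum sum of absolute differences": cost of one filtered row. */
unsigned long cost_abs(const uint8_t *row, size_t len)
{
    assert(row != NULL);
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++) {
        int d = signed_diff(row[i]);
        sum += (unsigned long)(d < 0 ? -d : d);
    }
    return sum;
}

/* Item (2) in the comment: sum of squared differences from zero,
 * which penalises a few large outputs more than many small ones. */
unsigned long cost_squared(const uint8_t *row, size_t len)
{
    assert(row != NULL);
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++) {
        int d = signed_diff(row[i]);
        sum += (unsigned long)(d * d);
    }
    return sum;
}
```

With either cost function, the encoder would compute the cost of each candidate filter's output for the row and keep the filter with the lowest value.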

My `C` is super bad and I have no idea what's going on there – something about summing the values? Can you add an answer with some explanation? — JeffThompson, Dec 27 '19 at 00:02
I don't find it the easiest code in the world to read myself, but my reading is indeed that, unless the caller has specifically identified the filter he wishes to use, all filters are tried and the one resulting in the smallest output length is selected. I'm happy to be corrected if found wrong :-) — Mark Setchell, Dec 27 '19 at 09:21
On line 2416 there is an extended comment – maybe that explains the choice? — JeffThompson, Dec 28 '19 at 19:08
More info... https://www.oreilly.com/library/view/png-the-definitive/9781565925427/18_chapter-09.html — Mark Setchell, Dec 23 '20 at 11:17

