This might not completely answer your questions, but I found the intricacies of FFmpeg, libswscale and the various filters to be poorly documented, so here goes my understanding of the thing.
`-pix_fmt`, `-colorspace`, `-color_primaries`, `-color_trc` and `-color_range` are for setting values inside the `AVFrame` structure described here. They don't do any conversion on their own; they just tag the input or output stream (depending on whether they're placed before or after `-i`). As far as I know this doesn't modify the pixel values themselves.
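As a sketch (file names and values are placeholders), tagging an output stream's metadata without touching the pixels could look like this:

```shell
# Sketch: placed after -i, these flags only tag the output stream's metadata
# (BT.2020 matrix/primaries, HLG transfer, limited range); no conversion happens.
ffmpeg -i input.mp4 \
  -colorspace bt2020nc \
  -color_primaries bt2020 \
  -color_trc arib-std-b67 \
  -color_range tv \
  -c:v libx264 tagged_output.mp4
```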
Some codecs or formats also need to know what colorspace they're operating in because they need to insert that information in the encoded stream, and sometimes FFmpeg doesn't pass along what it knows. `-x264-params` and `-x265-params` let you do that manually. As far as I know this alone doesn't modify the pixel values either.
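A sketch of doing that manually for x265 (the values here are just an example, matching the BT.2020/HLG tags used elsewhere in this answer):

```shell
# Sketch: tell the x265 encoder directly which colorspace signalling to
# write into the bitstream, in case FFmpeg doesn't pass it along itself.
ffmpeg -i input.mp4 -c:v libx265 \
  -x265-params "colorprim=bt2020:transfer=arib-std-b67:colormatrix=bt2020nc:range=limited" \
  output.mp4
```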
The various `-vf` options are there to convert between colorspaces. They actively change the pixel values and their layout in memory.
The output of FFmpeg reports what it knows for each stream. For example, when you see something like `gbrp10le(tv, bt2020nc/bt2020/arib-std-b67)` for the input or output stream, this means:
- Pixel format is gbrp10le (`-pix_fmt`)
- Range is tv/limited (`-color_range`)
- The YUV <-> RGB color matrix is bt2020nc (`-colorspace`)
- The primaries are bt2020 (`-color_primaries`)
- The EOTF/tone curve is arib-std-b67 (HLG) (`-color_trc`)
Now, this is where it gets fun: some filters are automatically applied depending on what FFmpeg knows of the streams, usually using the libswscale component (`-vf scale`). This is obviously needed if you're converting between pixel formats, but you can also use that filter to do some conversions manually or to change the image size, for example.
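For example, a manual invocation of the scale filter might look like this (a sketch; the size, matrices and pixel format are placeholders):

```shell
# Sketch: resize and convert the YUV<->RGB matrix explicitly via libswscale,
# instead of relying on whatever it picks automatically.
ffmpeg -i input.mp4 \
  -vf "scale=1280:720:in_color_matrix=bt601:out_color_matrix=bt709" \
  -pix_fmt yuv420p output.mp4
```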
The big issue is that `libswscale` doesn't handle all cases very well, especially on the command line where you can't set all parameters, and especially for wide-gamut / HDR spaces like BT.2020. There are additional filters (`colorspace`, `colormatrix` and `zscale`) which don't cause as much trouble and which replace parts of `libswscale`. They do appear to set the correct `AVFrame` pixel formats as far as I could tell, so `libswscale` doesn't try to apply the conversion twice, but I'm a bit hazy on which is automatic and which is not.
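As a hedged example of those dedicated filters, the `colorspace` filter can do a full matrix/primaries/transfer conversion in one step (a sketch; it assumes the input stream is correctly tagged, otherwise you'd have to override the input side too):

```shell
# Sketch: convert matrix, primaries and transfer to BT.709 in one go using
# the colorspace filter instead of libswscale's defaults.
ffmpeg -i input.mp4 -vf "colorspace=all=bt709" -c:v libx264 output.mp4
```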
EDIT: For a concrete example explaining the full command-line in detail, see How can I encode RGB images into HDR10 videos in ffmpeg command-line?.