Traditional videos need to have the following sets of metadata:
- The transfer function determines how you should map encoded RGB or YUV values into display luminance (`-color_trc` in ffmpeg options, `transfer` in x265 options). Common options for HDR videos are HLG (`arib-std-b67`) or PQ (`smpte2084`), and a proper application of HDR10 requires PQ.
- The colorspace primaries define how encoded RGB values map into real colors (`-color_primaries` for ffmpeg, `colorprim` for x265). For common HDR formats (including HDR10) you need `bt2020`.
- The color matrix is used to convert between RGB values (which are used to display data on the screen) and YUV values (which make video encoders more efficient, because they roughly separate the luma (Y) and chroma (UV) channels); it corresponds to `-colorspace` in ffmpeg and `colormatrix` in x265. For common HDR formats (including HDR10) you need `bt2020nc`.
- The signal range determines whether RGB values between 0-100% are mapped to the full 0-255 (in 8-bit) or 0-1023 (in 10-bit) range, or whether a margin is reserved for internal use (`-color_range` in ffmpeg, `range` in x265). It is traditionally "limited" (or "tv", "narrow") in videos and "full" (or "pc") in RGB images.
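If you already have an encoded file, you can check which of these four fields are set at the stream level with ffprobe (`input.mov` below is a placeholder for your own file):

```shell
# Print the four "traditional" color metadata fields of the first
# video stream; fields that were never set show up as "unknown".
ffprobe -v error -select_streams v:0 \
  -show_entries stream=color_range,color_space,color_transfer,color_primaries \
  -of default=noprint_wrappers=1 input.mov
```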
HDR10 defines additional metadata on top of that, intended to be used by the OS/player to tune the video output on non-HDR-capable screens so that the video is displayed as well as possible under those constraints:
- MaxFALL and MaxCLL describe, respectively, the maximum frame-average and the maximum per-pixel luminance of the video (`max-cll` x265 option, which carries both values)
- Mastering display characteristics describe the "perfect" display the video is intended to be displayed on (`master-display` x265 option)
Now, a few things need some special care:
- Traditional metadata can be set in both the video container and the x265 stream (so they should be set in both ffmpeg and x265 so as to avoid mismatch when playing).
- Traditional metadata should be set for both the input (before `-i`) and output (after) formats, so that ffmpeg can do proper conversions when encoding.
- HDR10 metadata are only present in the x265 options; ffmpeg knows nothing about them and they don't affect the video pixel values.
- Unfortunately the default filter used by ffmpeg for format conversion (swscale, or `-vf scale` on the command-line) has a lot of issues when applied automatically, so it is tricky to get right. I found that it is best to always specify at least the signal range explicitly, or even to use `zscale`, which replaces it and is much better behaved.
- YUV to/from RGB conversions (which is what we're doing here since we use RGB images as inputs!) are different between traditional SDR (BT709) videos and HDR10 (BT2020). This is a very common source of errors, because a lot of ffmpeg internals assume the default (potentially wrong) conversion if not explicitly specified in both input and output formats!
- The `-pix_fmt` for the input images should be determined automatically, but I included it here for completeness; it is also required if you want to stream from raw RGB data.
- The `-pix_fmt` for the output video specifies that we want to pass 10-bit YUV data to the encoder and is required. HDR10 requires 10 bits with 4:2:0 subsampling, i.e. `yuv420p10le`.
- This command-line assumes your TIFF images already contain the correct color profile (they are already Rec2020 PQ images). If you want to convert from e.g. sRGB or other spaces, you need to set the specific formats for the input file and the `zscale` filter properly.
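For the sRGB case, a hypothetical filter chain could look like the following. This is only a sketch (untested): it assumes an ffmpeg built with a reasonably recent zimg, and it anchors sRGB white at 100 nits via `npl` when linearizing.

```shell
# Sketch: sRGB (BT.709 primaries, full-range RGB) -> Rec2020 PQ.
# 1. undo the sRGB transfer into linear light (npl=100),
# 2. switch to a float intermediate for precision,
# 3. convert primaries from BT.709 to BT.2020 in linear light,
# 4. apply the PQ transfer, BT.2020 matrix and limited range.
-vf "zscale=transferin=iec61966-2-1:primariesin=709:rangein=full:transfer=linear:npl=100,\
format=gbrpf32le,\
zscale=primaries=2020,\
zscale=transfer=smpte2084:matrix=2020_ncl:range=limited"
```

If you go this route, the input-side metadata flags before `-i` would need to change accordingly (e.g. `-color_trc iec61966-2-1` and `-color_primaries bt709` instead of the PQ/BT.2020 values shown below).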
Bearing that in mind, we can use the following command-line to convert Rec2020 PQ RGB images into an HDR10 Rec2020 PQ x265/HEVC video:
```
ffmpeg \
# Some basic options
-hide_banner \
-loglevel verbose \
# Scaling flags for the input
# The print_info flag is very useful to debug cases when swscale is applied automatically!
# The other flags ensure ffmpeg favors accuracy over (very limited) performance gains.
-sws_flags print_info+accurate_rnd+bitexact+full_chroma_int \
# Traditional metadata for input images (note the "pc" or "full" range)
-color_range pc \
-color_trc smpte2084 \
-color_primaries bt2020 \
-colorspace bt2020nc \
-pix_fmt rgb48be \
# Input frames as 16-bit big-endian RGB TIFF images with pixel values in Rec2020 PQ
-framerate 30 -start_number 1 -i input-%03d.tif \
# Explicitly configure the range conversion (so as to avoid using swscale)
-vf zscale=rangein=full:range=limited \
# Specify output codec (x265/HEVC)
-c:v libx265 \
# Traditional metadata for output video (note the "tv" or "limited" range)
-color_range tv \
-color_trc smpte2084 \
-color_primaries bt2020 \
-colorspace bt2020nc \
-pix_fmt yuv420p10le \
# Scaling flags for the output
-sws_flags print_info+accurate_rnd+bitexact+full_chroma_int \
# x265 HDR10 metadata
-x265-params colorprim=bt2020:transfer=smpte2084:colormatrix=bt2020nc:range=limited:master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1):max-cll=1000,400 \
# Output file
output.mov
```
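After encoding, it is worth checking that the HDR10 static metadata actually made it into the stream. One way (assuming the `output.mov` produced above) is to dump the first decoded frame with ffprobe and look for the "Mastering display metadata" and "Content light level metadata" side-data entries:

```shell
# Decode only the first frame and print its metadata, including the
# HDR10 static side data inserted by x265.
ffprobe -hide_banner -select_streams v:0 \
  -read_intervals "%+#1" -show_frames output.mov
```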
EDIT 2022/08/29: Changed the proposed command-line and clarified the use of the proper transfer function; technically, a correct application of the HDR10 format requires PQ, but the previous answer used HLG. HLG-encoded videos are still HDR, but cannot really be called "HDR10".