I found openRTSP (livemedia-utils on Debian, live-media on Arch), which fetches the stream's SDP description, including its o= (origin) field ...
... (using an arbitrary RTSP example source)
$ openRTSP -r rtsp://109.98.78.106
Created new TCP socket 3 for connection
Connecting to 109.98.78.106, port 554 on socket 3...
...remote connection opened
[...]
o=- 1604163122724055 1 IN IP4 109.98.78.106
[...]
... where the number after o=- seems to be the camera's UTC system time in microseconds, extracted from the underlying RTSP stream's description.
E.g.:
$ date -d@$( echo $(openRTSP -r rtsp://109.98.78.106 2>&1 | grep -Po '(?<=o=-\s)\d+' | head -n1 ) / 1000000 | bc )
Sat Oct 31 05:55:45 PM CET 2020
The cameras we use support NTP. I therefore set up a local NTP server on the recording computer to serve as the time source for the cameras.
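As a minimal sketch, such a server could look as follows with chrony (the choice of daemon and the 192.168.1.0/24 camera subnet are my assumptions, not part of the original setup):

# /etc/chrony.conf (excerpt)
# Sync the recording computer itself against public servers
pool pool.ntp.org iburst
# Allow the (hypothetical) camera subnet to query this machine
allow 192.168.1.0/24
# Keep serving time to the cameras even if upstream is unreachable
local stratum 10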
From the delay ...
time_camera () {
    # Returns the camera's system time, as embedded in the RTSP stream's
    # description, in nanoseconds since the UNIX epoch (the camera's
    # microsecond value is padded with three zeros to get nanoseconds)
    echo $(($(openRTSP -r ${STREAM_CAMERA} 2>&1 | grep -Po '(?<=o=-\s)\d+' | head -n1)000))
}
time_local () {
    # Returns local system time in nanoseconds since the UNIX epoch
    date +%s%N
}
vdelay=$(($(time_local) - $(time_camera)))
... I can estimate how long the frame took to arrive.
You might fine-tune this to your needs, for example as sketched below.
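A minimal averaging sketch to smooth out network jitter (the repetition count of 5 is my arbitrary choice):

# Average the delay over n measurements to reduce jitter
total=0; n=5
for i in $(seq $n); do
    total=$((total + $(time_local) - $(time_camera)))
done
vdelay=$((total / n))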
For me, it is around (900 ± 200) ms and matches the audio-video offset really well.
As mentioned above, I use PulseAudio and can hence set the input latency offset directly (and update it regularly) without having to mess with ffmpeg's -itsoffset,
via:
# See: pacmd list-cards for the <card> and <port> names
# The offset is given in microseconds, hence vdelay (ns) / 1000
pacmd set-port-latency-offset <card> <port> $((vdelay/1000))
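For example, with hypothetical card and port names (take the real ones from the pacmd list-cards output):

pacmd set-port-latency-offset alsa_card.usb-0d8c_USB_Audio-00 analog-input-mic $((vdelay/1000))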