
When I stack two videos using vstack, the result has audio sync issues for the bottom video.

My starting point: four separate RTP tracks captured from a two-person video chat:

 Actor1Video.webm,
 Actor1Audio.webm,
 Actor2Video.webm,
 Actor2Audio.webm

I use vstack to put Actor1 on top and Actor2 on bottom:

ffmpeg -i Actor1Video.webm -i Actor2Video.webm -i Actor1Audio.webm -i Actor2Audio.webm  -filter_complex "[1][0]scale2ref=oh*mdar:ih[2nd][ref];[ref][2nd]vstack=inputs=2[v];[2:a][3:a]join=inputs=2:channel_layout=stereo:map=0.0-FL|1.0-FR[a]" -c:a libfdk_aac -map "[v]" -map "[a]"  -vsync 2 ActorsCombined.mp4

Here's the log:

ffmpeg version git-2021-02-08-89f78dd Copyright (c) 2000-2021 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-89f78dd_6 --enable-shared --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libaom --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-libsnappy --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-demuxer=dash --disable-libjack --disable-indev=jack --enable-opencl --enable-videotoolbox --disable-htmlpages --enable-libfdk-aac --enable-nonfree
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.121.100 / 58.121.100
  libavformat    58. 67.100 / 58. 67.100
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.103.100 /  7.103.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Input #0, matroska,webm, from 'Actor1Video.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.41, start: 1611273978.135000, bitrate: N/A
  Stream #0:0: Video: vp8, yuv420p(tv, bt470bg/unknown/unknown, progressive), 1280x720, SAR 1:1 DAR 16:9, 29.97 fps, 29.97 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 447576:28:17.408999
Input #1, matroska,webm, from 'Actor2Video.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.45, start: 1611273978.257000, bitrate: N/A
  Stream #1:0: Video: vp8, yuv420p(tv, bt470bg/unknown/unknown, progressive), 320x180, SAR 1:1 DAR 16:9, 29.97 fps, 29.97 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 447576:28:17.453999
Input #2, matroska,webm, from 'Actor1Audio.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.49, start: 1611273978.112000, bitrate: N/A
  Stream #2:0: Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 447576:28:17.492000
Input #3, matroska,webm, from 'Actor2Audio.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.45, start: 1611273978.208000, bitrate: N/A
  Stream #3:0: Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 447576:28:17.447999
File 'ActorsCombined.mp4' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:0 (vp8) -> scale2ref:ref
  Stream #1:0 (vp8) -> scale2ref:default
  Stream #2:0 (opus) -> join:input0
  Stream #3:0 (opus) -> join:input1
  vstack -> Stream #0:0 (libx264)
  join -> Stream #0:1 (libfdk_aac)
Press [q] to stop, [?] for help
[libx264 @ 0x7ff0c1831a00] using SAR=1/1
[libx264 @ 0x7ff0c1831a00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7ff0c1831a00] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 0x7ff0c1831a00] 264 - core 161 r3043 59c0609 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'ActorsCombined.mp4':
  Metadata:
    title           : FFmpeg
    encoder         : Lavf58.67.100
  Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(progressive), 1280x1440 [SAR 1:1 DAR 8:9], q=2-31, 29.97 fps, 30k tbn (default)
    Metadata:
      encoder         : Lavc58.121.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1: Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, s16, 139 kb/s (default)
    Metadata:
      encoder         : Lavc58.121.100 libfdk_aac
frame=36626 fps= 15 q=-1.0 Lsize=  389420kB time=00:21:59.38 bitrate=2417.9kbits/s dup=0 drop=34791 speed=0.535x    
video:365641kB audio:22446kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.343645%
[libx264 @ 0x7ff0c1831a00] frame I:158   Avg QP:15.51  size:107833
[libx264 @ 0x7ff0c1831a00] frame P:9670  Avg QP:18.71  size: 25824
[libx264 @ 0x7ff0c1831a00] frame B:26798 Avg QP:24.90  size:  4018
[libx264 @ 0x7ff0c1831a00] consecutive B-frames:  0.6%  5.2%  0.6% 93.5%
[libx264 @ 0x7ff0c1831a00] mb I  I16..4: 13.2% 75.5% 11.3%
[libx264 @ 0x7ff0c1831a00] mb P  I16..4:  1.2%  3.6%  0.2%  P16..4: 43.1% 10.4%  5.9%  0.0%  0.0%    skip:35.6%
[libx264 @ 0x7ff0c1831a00] mb B  I16..4:  0.1%  0.1%  0.0%  B16..8: 28.3%  0.7%  0.1%  direct: 2.3%  skip:68.5%  L0:45.1% L1:53.6% BI: 1.3%
[libx264 @ 0x7ff0c1831a00] 8x8 transform intra:71.6% inter:85.4%
[libx264 @ 0x7ff0c1831a00] coded y,uvDC,uvAC intra: 50.4% 77.2% 47.8% inter: 6.9% 17.0% 3.8%
[libx264 @ 0x7ff0c1831a00] i16 v,h,dc,p: 37% 28% 14% 22%
[libx264 @ 0x7ff0c1831a00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 17% 25%  4%  6%  7%  5%  6%  5%
[libx264 @ 0x7ff0c1831a00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 35% 24% 16%  4%  6%  5%  4%  4%  2%
[libx264 @ 0x7ff0c1831a00] i8c dc,h,v,p: 60% 16% 17%  6%
[libx264 @ 0x7ff0c1831a00] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7ff0c1831a00] ref P L0: 63.1%  9.9% 20.5%  6.6%
[libx264 @ 0x7ff0c1831a00] ref B L0: 90.0%  8.9%  1.1%
[libx264 @ 0x7ff0c1831a00] ref B L1: 94.7%  5.3%
[libx264 @ 0x7ff0c1831a00] kb/s:2270.36

The resulting file begins in sync, but after a few minutes the bottom video is suddenly out of sync with its audio.
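
One thing I notice in the log is that the four inputs report slightly different `start:` values. Below is a minimal sketch that works out those offsets, with the values hard-coded from the log above; `start_offsets_ms` is just a helper name made up for this illustration. I don't know whether these offsets have anything to do with the drift.

```shell
# Start times copied from the "start:" fields in the log above (seconds).
starts="Actor1Video.webm 1611273978.135
Actor2Video.webm 1611273978.257
Actor1Audio.webm 1611273978.112
Actor2Audio.webm 1611273978.208"

# Hypothetical helper: print each input's start offset in milliseconds,
# relative to the earliest start among the four inputs.
start_offsets_ms() {
  printf '%s\n' "$starts" | awk '
    {
      name[NR] = $1
      t[NR] = $2 + 0                      # force numeric comparison
      if (NR == 1 || t[NR] < min) min = t[NR]
    }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %.0f\n", name[i], (t[i] - min) * 1000
    }'
}

start_offsets_ms
```

For the values in the log, this gives offsets of 23 ms (Actor1Video), 145 ms (Actor2Video), 0 ms (Actor1Audio) and 96 ms (Actor2Audio), so the inputs start within about 150 ms of each other.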

The strange thing is, if I merge these videos with their audio separately, without using vstack, there's no sync issue:

ffmpeg -i Actor1Video.webm -i Actor1Audio.webm -vsync 2 Actor1.mp4 &&
ffmpeg -i Actor2Video.webm -i Actor2Audio.webm -vsync 2 Actor2.mp4

When I do the above, the two videos are perfectly in sync. But if I take these two mp4s and stack them, I have the same issue where the bottom video goes out of sync.

Any suggestions?


UPDATE

This question does not appear to be a duplicate of anything on this site (though, as @llogan noted, other users have had issues with WebRTC timestamps). Surely WebRTC recordings are not simply impossible to sync?

Arlo
  • Show the complete log from the vstack command but using the webm files as inputs instead of the mp4. – llogan Feb 03 '21 at 21:54
  • Thank you, @llogan! I just updated the question per your suggestion. – Arlo Feb 05 '21 at 22:30
  • @llogan I've done as you suggested, and just rewrote the question to simplify it—not sure if there's anything else I should do to make my issue clearer? – Arlo Feb 08 '21 at 14:29
  • WebRTC videos have sloppy timestamps. I'm sure this has been answered a few times on this site. Didn't find the one I had in mind but keep searching (maybe it was on [su] or [video.se]). – llogan Feb 08 '21 at 18:39
  • Thanks, @llogan! I promise I have spent weeks searching for and reading any relevant questions—I've had this issue for a couple months, and have been unable to find a solution, hence my posting here; if there are search terms I'm missing, I'm happy to be enlightened! Also: when I convert the videos to mp4 separately, there are no sync issues, so my thought was that the timestamp issues should be gone by that point? But maybe that's not the case. – Arlo Feb 08 '21 at 20:20
