Trimming and concatenating 2 videos, then overlaying frame-number images from a pipe, does not work as expected


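I have a task:

As input I have 2 video files (videoFilePath1, videoFilePath2) and a pipe, through which I have to send images with frame numbers that are created programmatically in memory.

I need to

  1. Cut both videos using the input parameters: start/duration for each video (startSeconds1, durationSeconds1, startSeconds2, durationSeconds2)
  2. Concatenate the 2 videos into 1
  3. Overlay the result of step 2 with the images from the pipe, so that every frame carries its own number (frame 256 should show the number 256),

all in a single ffmpeg call.

My solution

a. To generate the right number of frame-number images, I set a frame rate and use it to calculate the required image count: framesCount = (durationSeconds1 + durationSeconds2) * FRAME_RATE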

b. I call ffmpeg with the following parameters:

-y -loop 1 -thread_queue_size {framesCount} -f image2pipe -framerate {FRAME_RATE} -i pipe:0 -i {videoFilePath1} -i {videofilePath2} -filter_complex 
"[1:v]trim=start={startSeconds1}:duration={durationSeconds1},fifo,setpts=PTS-STARTPTS[av];    
[1:a]atrim=start={startSeconds1}:duration={durationSeconds1},afifo,asetpts=PTS-STARTPTS[aa]; 
[2:v]trim=start={startSeconds2}:duration={durationSeconds2},fifo,setpts=PTS-STARTPTS[bv]; 
[2:a]atrim=start={startSeconds2}:duration={durationSeconds2},afifo,asetpts=PTS-STARTPTS[ba];    
[av][aa][bv][ba]concat=n=2:v=1:a=1[outv][outa];
[outv][0:v]overlay=shortest=1[outvv] " 
-r {FRAME_RATE} -map [outvv] -map [outa] -vcodec libx264 -pix_fmt yuv420p -crf 27 -level 3.1 -preset slow -b:v 1200000 -acodec aac -subq 7 -me_range 16 -threads 2 result.mp4

c. In C# I start the ffmpeg process and send it the dynamically generated frame-number images
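The image count from step a can be sketched as below. This is only an illustration in Python (the question itself uses C#); the function name is hypothetical, and rounding up for fractional durations is an assumption, since the question only shows whole seconds.

```python
import math


def frames_count(duration_seconds_1, duration_seconds_2, frame_rate):
    """Number of frame-number images needed for the concatenated output.

    Mirrors the formula from step a:
    framesCount = (durationSeconds1 + durationSeconds2) * FRAME_RATE
    Rounding up is an assumption for durations that are not whole seconds.
    """
    total = (duration_seconds_1 + duration_seconds_2) * frame_rate
    return math.ceil(total)


# Two 5-second cuts at 24 fps need 240 images.
print(frames_count(5, 5, 24))
```

With these inputs the output video should contain exactly as many frames as images were sent, which is the figure the observed frame count below is compared against.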

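It works, but the frame numbering in the resulting video is wrong: it is not exactly in sync with the underlying video. For example, the first 2 frames both show the number 1, while the 9th frame has no overlaid frame number at all.

Interesting observations:

  1. In a 10-second, 24 fps video I get 243 frames (it should be 240)
  2. When I generate extra frames and cut them off with shortest=1 in the overlay, the frame numbering stops at 241
  3. I get countless "Non-monotonous DTS in output stream 0:1" warnings

Also, when I do it in two steps:

a. Cut and concatenate

b. Overlay the resulting video from a with the images from the pipe

it works as expected and without any warnings.

But when I try to do it in 1 operation, it does not work properly. What causes the wrong overlay? Or does the problem arise even earlier?

Edit: full ffmpeg log: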

    ffmpeg version N-94421-gb3b7523feb Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 9.1.1 (GCC) 20190716
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil      56. 32.100 / 56. 32.100
  libavcodec     58. 55.100 / 58. 55.100
  libavformat    58. 30.100 / 58. 30.100
  libavdevice    58.  9.100 / 58.  9.100
  libavfilter     7. 58.100 /  7. 58.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Input #0, image2pipe, from 'pipe:0':
  Duration: N/A, bitrate: N/A
    Stream #0:0: Video: bmp, bgra, 13x18, 24 fps, 24 tbr, 24 tbn, 24 tbc
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '20170625_124223.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 0
    compatible_brands: isom3gp4
    creation_time   : 2017-06-25T09:43:00.000000Z
  Duration: 00:00:29.90, start: 0.000000, bitrate: 11822 kb/s
    Stream #1:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 11692 kb/s, 29.19 fps, 30 tbr, 90k tbn, 180k tbc (default)
    Metadata:
      creation_time   : 2017-06-25T09:43:00.000000Z
      handler_name    : VideoHandle
    Stream #1:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time   : 2017-06-25T09:43:00.000000Z
      handler_name    : SoundHandle
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '20170805_202152.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 0
    compatible_brands: isom3gp4
    creation_time   : 2017-08-05T17:22:05.000000Z
  Duration: 00:00:12.57, start: 0.000000, bitrate: 11849 kb/s
    Stream #2:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 11945 kb/s, 29.96 fps, 30 tbr, 90k tbn, 180k tbc (default)
    Metadata:
      creation_time   : 2017-08-05T17:22:05.000000Z
      handler_name    : VideoHandle
    Stream #2:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 124 kb/s (default)
    Metadata:
      creation_time   : 2017-08-05T17:22:05.000000Z
      handler_name    : SoundHandle
Stream mapping:
  Stream #0:0 (bmp) -> overlay:overlay
  Stream #1:0 (h264) -> trim
  Stream #1:1 (aac) -> atrim
  Stream #2:0 (h264) -> trim
  Stream #2:1 (aac) -> atrim
  overlay -> Stream #0:0 (libx264)
  concat:out:a0 -> Stream #0:1 (aac)
[libx264 @ 000001fe2d1b0c80] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000001fe2d1b0c80] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 000001fe2d1b0c80] 264 - core 158 r2984 3759fcb - H.264/MPEG-4 AVC codec - Copyleft 2003-2019 - http://www.videolan.org/x264.html - options: cabac=1 ref=5 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=2 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=50 rc=crf mbtree=1 crf=27.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'result.mp4':
  Metadata:
    encoder         : Lavf58.30.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1280x720, q=-1--1, 1200 kb/s, 24 fps, 12288 tbn, 24 tbc (default)
    Metadata:
      encoder         : Lavc58.55.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/1200000 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      encoder         : Lavc58.55.100 aac
[aac @ 000001fe2d1b0800] Queue input is backward in time
[mp4 @ 000001fe2f36ddc0] Non-monotonous DTS in output stream 0:1; previous: 16384, current: 0; changing to 16385. This may result in incorrect timestamps in the output file.

[SKIPPED MANY ROWS WITH SIMILAR WARNINGS]

[mp4 @ 000001fe2f36ddc0] Non-monotonous DTS in output stream 0:1; previous: 239850, current: 239616; changing to 239851. This may result in incorrect timestamps in the output file.
frame=  243 fps= 23 q=-1.0 Lsize=    1459kB time=00:00:10.04 bitrate=1189.7kbits/s dup=0 drop=189 speed=0.961x
video:1304kB audio:146kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.637435%
[libx264 @ 000001fe2d1b0c80] frame I:9     Avg QP:22.90  size: 17528
[libx264 @ 000001fe2d1b0c80] frame P:61    Avg QP:27.07  size: 10207
[libx264 @ 000001fe2d1b0c80] frame B:173   Avg QP:29.43  size:  3206
[libx264 @ 000001fe2d1b0c80] consecutive B-frames:  4.1%  0.0%  8.6% 87.2%
[libx264 @ 000001fe2d1b0c80] mb I  I16..4: 14.3% 81.5%  4.3%
[libx264 @ 000001fe2d1b0c80] mb P  I16..4:  3.7%  5.7%  0.5%  P16..4: 42.5%  8.6%  4.1%  0.0%  0.0%    skip:35.1%
[libx264 @ 000001fe2d1b0c80] mb B  I16..4:  0.2%  0.2%  0.0%  B16..8: 38.0%  2.5%  0.3%  direct: 1.1%  skip:57.7%  L0:48.9% L1:48.7% BI: 2.4%
[libx264 @ 000001fe2d1b0c80] 8x8 transform intra:70.6% inter:72.7%
[libx264 @ 000001fe2d1b0c80] direct mvs  spatial:98.8% temporal:1.2%
[libx264 @ 000001fe2d1b0c80] coded y,uvDC,uvAC intra: 27.2% 40.7% 3.5% inter: 5.1% 12.4% 0.1%
[libx264 @ 000001fe2d1b0c80] i16 v,h,dc,p: 19% 27%  8% 47%
[libx264 @ 000001fe2d1b0c80] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 39% 16% 22%  3%  4%  4%  5%  3%  4%
[libx264 @ 000001fe2d1b0c80] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 33% 13%  4%  5%  6%  6%  4%  6%
[libx264 @ 000001fe2d1b0c80] i8c dc,h,v,p: 63% 19% 11%  7%
[libx264 @ 000001fe2d1b0c80] Weighted P-Frames: Y:26.2% UV:9.8%
[libx264 @ 000001fe2d1b0c80] ref P L0: 54.2% 18.1% 15.7%  6.8%  4.5%  0.8%
[libx264 @ 000001fe2d1b0c80] ref B L0: 80.4% 14.6%  4.4%  0.7%
[libx264 @ 000001fe2d1b0c80] ref B L1: 94.5%  5.5%
[libx264 @ 000001fe2d1b0c80] kb/s:1054.81
[aac @ 000001fe2d1b0800] Qavg: 27157.621

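Answer 1

There were 2 problems.

  1. Related to ffmpeg.

The fps of the input files differs from our output fps, and the time base differs as well; both have to be corrected before the overlay. This can be resolved with the fps and setpts filters: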

-y -thread_queue_size {framesCount} -f image2pipe -framerate {FRAME_RATE} -i \\.\pipe\ffpipe -i {filename1} -i {filename2} -filter_complex
"[1:v]trim=start={startSeconds1}.00:duration={durationSeconds1}.00,fps={FRAME_RATE},setpts=PTS-STARTPTS[av];
[1:a]atrim=start={startSeconds1}.00:duration={durationSeconds1}.00,asetpts=PTS-STARTPTS[aa];
[2:v]trim=start={startSeconds2}.00:duration={durationSeconds2}.00,fps={FRAME_RATE},setpts=PTS-STARTPTS[bv];
[2:a]atrim=start={startSeconds2}.00:duration={durationSeconds2}.00,asetpts=PTS-STARTPTS[ba];
[av][aa][bv][ba]concat=n=2:v=1:a=1[coutv][outa];
[coutv][0:v]overlay=shortest=1[outv]"
-r {FRAME_RATE} -map [outv] -map [outa] -vcodec libx264 -pix_fmt yuv420p -crf 27 -level 3.1 -preset slow -b:v 1200000 -acodec aac -subq 7 -me_range 16 -threads 2 {RESULT_FILENAME}
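Since the filter graph is assembled from parameters, it can help to build the string in one place. The following is a minimal sketch in Python (the original code is C#); the function and helper names are hypothetical, and it only reproduces the graph shown above, with fps placed before setpts in each video chain.

```python
def build_filter_complex(start1, duration1, start2, duration2, frame_rate):
    """Assemble the -filter_complex string used in the command above."""

    def video_chain(index, start, duration, label):
        # trim the cut, resample to the output frame rate, then reset timestamps
        return (f"[{index}:v]trim=start={start}:duration={duration},"
                f"fps={frame_rate},setpts=PTS-STARTPTS[{label}]")

    def audio_chain(index, start, duration, label):
        # trim the matching audio range and reset its timestamps
        return (f"[{index}:a]atrim=start={start}:duration={duration},"
                f"asetpts=PTS-STARTPTS[{label}]")

    chains = [
        video_chain(1, start1, duration1, "av"),
        audio_chain(1, start1, duration1, "aa"),
        video_chain(2, start2, duration2, "bv"),
        audio_chain(2, start2, duration2, "ba"),
        "[av][aa][bv][ba]concat=n=2:v=1:a=1[coutv][outa]",
        # input 0 is the image pipe carrying the frame-number images
        "[coutv][0:v]overlay=shortest=1[outv]",
    ]
    return ";".join(chains)


print(build_filter_complex(0, 5, 2, 5, 24))
```

The returned string is passed as the single argument after -filter_complex; when ffmpeg is started without a shell, the surrounding double quotes from the command line are not needed.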

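  2. Not directly related to ffmpeg.

After fixing the fps and time base, I found that there were always 2 frames without an overlay between 9 and 10. The cause was that the images for 9 and 10 had different sizes, which caused problems during the overlay. The solution is simple: make all images sent through the pipe the same size (width and height) :)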
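One way to keep every image the same size is to pick the canvas from the widest label up front and right-align shorter numbers inside it. The sketch below is in Python and only computes the geometry; the function names are hypothetical, and the 13x18 pixels per digit is an assumption taken from the size of the first image in the log above.

```python
def canvas_size(frames_count, digit_width=13, digit_height=18):
    """Fixed canvas that fits the widest label, i.e. the last frame number."""
    max_digits = len(str(frames_count))
    return digit_width * max_digits, digit_height


def label_offset_x(frame_number, frames_count, digit_width=13):
    """Left offset that right-aligns a label inside the fixed canvas."""
    width, _ = canvas_size(frames_count, digit_width)
    return width - digit_width * len(str(frame_number))


# For 240 frames every image uses the same canvas, whether it shows 9 or 10.
print(canvas_size(240))
print(label_offset_x(9, 240), label_offset_x(10, 240))
```

Drawing each number onto a canvas of this size means the image dimensions no longer change when the label grows from one digit to two or three.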
