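I have a task:
As input I have 2 video files (videoFilePath1, videoFilePath2) and a pipe through which I programmatically send images, created in memory, that show frame numbers.
I need to
- cut both videos using the input parameters: a start and duration for each video (startSeconds1, durationSeconds1, startSeconds2, durationSeconds2)
- concatenate the 2 videos into 1
- overlay the result of step 2 with the images from the pipe, so that every frame carries its number (frame 256 should show the number 256),
all with a single ffmpeg call.
My solution
a. To generate the right number of frame images, I set a frame rate and use it to calculate the required number of images: framesCount = (durationSeconds1 + durationSeconds2) * FRAME_RATE
b. I call ffmpeg with the following parameters: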
-y -loop 1 -thread_queue_size {framesCount} -f image2pipe -framerate {FRAME_RATE} -i pipe:0 -i {videoFilePath1} -i {videofilePath2} -filter_complex
"[1:v]trim=start={startSeconds1}:duration={durationSeconds1},fifo,setpts=PTS-STARTPTS[av];
[1:a]atrim=start={startSeconds1}:duration={durationSeconds1},afifo,asetpts=PTS-STARTPTS[aa];
[2:v]trim=start={startSeconds2}:duration={durationSeconds2},fifo,setpts=PTS-STARTPTS[bv];
[2:a]atrim=start={startSeconds2}:duration={durationSeconds2},afifo,asetpts=PTS-STARTPTS[ba];
[av][aa][bv][ba]concat=n=2:v=1:a=1[outv][outa];
[outv][0:v]overlay=shortest=1[outvv] "
-r {FRAME_RATE} -map [outvv] -map [outa] -vcodec libx264 -pix_fmt yuv420p -crf 27 -level 3.1 -preset slow -b:v 1200000 -acodec aac -subq 7 -me_range 16 -threads 2 result.mp4
c. In C# I start the ffmpeg process and send it the dynamically generated frame-number images.
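As a rough illustration of step a, here is the frame count arithmetic as a small Python sketch (the real code is C#; the function name and the 4 s + 6 s split are made up for the example, only the formula comes from the post):

```python
FRAME_RATE = 24  # output frame rate used in the ffmpeg call


def frames_count(duration_seconds_1, duration_seconds_2, frame_rate=FRAME_RATE):
    """Number of overlay images to send through the pipe.

    Mirrors framesCount = (durationSeconds1 + durationSeconds2) * FRAME_RATE.
    Rounding is an assumption for the case of fractional durations.
    """
    return int(round((duration_seconds_1 + duration_seconds_2) * frame_rate))


if __name__ == "__main__":
    # Hypothetical split of a 10-second result into 4 s + 6 s.
    print(frames_count(4, 6))
```

For a 10-second result at 24 fps this gives 240 images, which is the number of output frames I expect.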
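It works, but the frame numbering in the resulting video is wrong: it is not exactly in sync with the video underneath. For example, the number 1 appears on the first 2 frames, while frame 9 has no frame number overlaid at all.
Interesting points:
- for a 10-second 24 fps video I get 243 frames (it should be 240)
- when I generate extra frames and cut them off with shortest=1 in overlay, the frame numbering stops at 241
- I get countless "Non-monotonous DTS in output stream 0:1" warnings
Also, when I do it in 2 steps:
a. cut and concatenate
b. overlay the resulting video from a with the images from the pipe
it works as expected, without any warnings.
But when I try to do it in 1 operation, it does not work correctly. What causes the wrong overlay? Or does the problem start earlier?
EDIT: full ffmpeg log: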
ffmpeg version N-94421-gb3b7523feb Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 9.1.1 (GCC) 20190716
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 32.100 / 56. 32.100
libavcodec 58. 55.100 / 58. 55.100
libavformat 58. 30.100 / 58. 30.100
libavdevice 58. 9.100 / 58. 9.100
libavfilter 7. 58.100 / 7. 58.100
libswscale 5. 6.100 / 5. 6.100
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
Input #0, image2pipe, from 'pipe:0':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: bmp, bgra, 13x18, 24 fps, 24 tbr, 24 tbn, 24 tbc
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '20170625_124223.mp4':
Metadata:
major_brand : isom
minor_version : 0
compatible_brands: isom3gp4
creation_time : 2017-06-25T09:43:00.000000Z
Duration: 00:00:29.90, start: 0.000000, bitrate: 11822 kb/s
Stream #1:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 11692 kb/s, 29.19 fps, 30 tbr, 90k tbn, 180k tbc (default)
Metadata:
creation_time : 2017-06-25T09:43:00.000000Z
handler_name : VideoHandle
Stream #1:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 125 kb/s (default)
Metadata:
creation_time : 2017-06-25T09:43:00.000000Z
handler_name : SoundHandle
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '20170805_202152.mp4':
Metadata:
major_brand : isom
minor_version : 0
compatible_brands: isom3gp4
creation_time : 2017-08-05T17:22:05.000000Z
Duration: 00:00:12.57, start: 0.000000, bitrate: 11849 kb/s
Stream #2:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 11945 kb/s, 29.96 fps, 30 tbr, 90k tbn, 180k tbc (default)
Metadata:
creation_time : 2017-08-05T17:22:05.000000Z
handler_name : VideoHandle
Stream #2:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 124 kb/s (default)
Metadata:
creation_time : 2017-08-05T17:22:05.000000Z
handler_name : SoundHandle
Stream mapping:
Stream #0:0 (bmp) -> overlay:overlay
Stream #1:0 (h264) -> trim
Stream #1:1 (aac) -> atrim
Stream #2:0 (h264) -> trim
Stream #2:1 (aac) -> atrim
overlay -> Stream #0:0 (libx264)
concat:out:a0 -> Stream #0:1 (aac)
[libx264 @ 000001fe2d1b0c80] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000001fe2d1b0c80] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 000001fe2d1b0c80] 264 - core 158 r2984 3759fcb - H.264/MPEG-4 AVC codec - Copyleft 2003-2019 - http://www.videolan.org/x264.html - options: cabac=1 ref=5 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=2 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=50 rc=crf mbtree=1 crf=27.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'result.mp4':
Metadata:
encoder : Lavf58.30.100
Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1280x720, q=-1--1, 1200 kb/s, 24 fps, 12288 tbn, 24 tbc (default)
Metadata:
encoder : Lavc58.55.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/1200000 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
encoder : Lavc58.55.100 aac
[aac @ 000001fe2d1b0800] Queue input is backward in time
[mp4 @ 000001fe2f36ddc0] Non-monotonous DTS in output stream 0:1; previous: 16384, current: 0; changing to 16385. This may result in incorrect timestamps in the output file.
[SKIPPED MANY ROWS WITH SIMILAR WARNINGS]
[mp4 @ 000001fe2f36ddc0] Non-monotonous DTS in output stream 0:1; previous: 239850, current: 239616; changing to 239851. This may result in incorrect timestamps in the output file.
frame= 243 fps= 23 q=-1.0 Lsize= 1459kB time=00:00:10.04 bitrate=1189.7kbits/s dup=0 drop=189 speed=0.961x
video:1304kB audio:146kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.637435%
[libx264 @ 000001fe2d1b0c80] frame I:9 Avg QP:22.90 size: 17528
[libx264 @ 000001fe2d1b0c80] frame P:61 Avg QP:27.07 size: 10207
[libx264 @ 000001fe2d1b0c80] frame B:173 Avg QP:29.43 size: 3206
[libx264 @ 000001fe2d1b0c80] consecutive B-frames: 4.1% 0.0% 8.6% 87.2%
[libx264 @ 000001fe2d1b0c80] mb I I16..4: 14.3% 81.5% 4.3%
[libx264 @ 000001fe2d1b0c80] mb P I16..4: 3.7% 5.7% 0.5% P16..4: 42.5% 8.6% 4.1% 0.0% 0.0% skip:35.1%
[libx264 @ 000001fe2d1b0c80] mb B I16..4: 0.2% 0.2% 0.0% B16..8: 38.0% 2.5% 0.3% direct: 1.1% skip:57.7% L0:48.9% L1:48.7% BI: 2.4%
[libx264 @ 000001fe2d1b0c80] 8x8 transform intra:70.6% inter:72.7%
[libx264 @ 000001fe2d1b0c80] direct mvs spatial:98.8% temporal:1.2%
[libx264 @ 000001fe2d1b0c80] coded y,uvDC,uvAC intra: 27.2% 40.7% 3.5% inter: 5.1% 12.4% 0.1%
[libx264 @ 000001fe2d1b0c80] i16 v,h,dc,p: 19% 27% 8% 47%
[libx264 @ 000001fe2d1b0c80] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 39% 16% 22% 3% 4% 4% 5% 3% 4%
[libx264 @ 000001fe2d1b0c80] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 33% 13% 4% 5% 6% 6% 4% 6%
[libx264 @ 000001fe2d1b0c80] i8c dc,h,v,p: 63% 19% 11% 7%
[libx264 @ 000001fe2d1b0c80] Weighted P-Frames: Y:26.2% UV:9.8%
[libx264 @ 000001fe2d1b0c80] ref P L0: 54.2% 18.1% 15.7% 6.8% 4.5% 0.8%
[libx264 @ 000001fe2d1b0c80] ref B L0: 80.4% 14.6% 4.4% 0.7%
[libx264 @ 000001fe2d1b0c80] ref B L1: 94.5% 5.5%
[libx264 @ 000001fe2d1b0c80] kb/s:1054.81
[aac @ 000001fe2d1b0800] Qavg: 27157.621
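Answer 1
There were 2 problems.
- Related to ffmpeg.
The input files' fps differs from our output fps (29.19 and 29.96 fps in the log above versus 24 fps), and their time base differs too; both should be corrected before the overlay. So the fps and setpts filters can be used: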
-y -thread_queue_size {framesCount} -f image2pipe -framerate {FRAME_RATE} -i \\.\pipe\ffpipe -i {filename1} -i {filename2} -filter_complex
"[1:v]trim=start={startSeconds1}.00:duration={durationSeconds1}.00,fps={FRAME_RATE},setpts=PTS-STARTPTS[av];
[1:a]atrim=start={startSeconds1}.00:duration={durationSeconds1}.00,asetpts=PTS-STARTPTS[aa];
[2:v]trim=start={startSeconds2}.00:duration={durationSeconds2}.00,fps={FRAME_RATE},setpts=PTS-STARTPTS[bv];
[2:a]atrim=start={startSeconds2}.00:duration={durationSeconds2}.00,asetpts=PTS-STARTPTS[ba];
[av][aa][bv][ba]concat=n=2:v=1:a=1[coutv][outa];
[coutv][0:v]overlay=shortest=1[outv] "
-r {FRAME_RATE} -map [outv] -map [outa] -vcodec libx264 -pix_fmt yuv420p -crf 27 -level 3.1 -preset slow -b:v 1200000 -acodec aac -subq 7 -me_range 16 -threads 2 {RESULT_FILENAME}
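To make the placeholder substitution concrete, here is a minimal Python sketch that assembles the same filter graph as a string (my real code is C#; the function and helper names are hypothetical, and the graph itself is the one shown above). The point to notice is that fps={FRAME_RATE} sits in each video branch before concat and overlay, so both clips already run at the output rate when the numbered images are laid over them.

```python
def build_filter_complex(start1, duration1, start2, duration2, frame_rate):
    """Return the -filter_complex string: trim both inputs, force the
    output frame rate on each video branch, concat, then overlay input 0."""

    def video(index, start, duration, label):
        # fps comes before setpts so timestamps are reset on the resampled frames
        return (f"[{index}:v]trim=start={start}:duration={duration},"
                f"fps={frame_rate},setpts=PTS-STARTPTS[{label}]")

    def audio(index, start, duration, label):
        return (f"[{index}:a]atrim=start={start}:duration={duration},"
                f"asetpts=PTS-STARTPTS[{label}]")

    chains = [
        video(1, start1, duration1, "av"),
        audio(1, start1, duration1, "aa"),
        video(2, start2, duration2, "bv"),
        audio(2, start2, duration2, "ba"),
        "[av][aa][bv][ba]concat=n=2:v=1:a=1[coutv][outa]",
        "[coutv][0:v]overlay=shortest=1[outv]",
    ]
    return ";".join(chains)


if __name__ == "__main__":
    # Example values only: 5 s from each file at 24 fps.
    print(build_filter_complex(3, 5, 1, 5, 24))
```

The returned string is what goes after -filter_complex; the rest of the arguments stay as in the command above.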
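- Not directly related to ffmpeg.
After fixing the fps and time base, I found there were always 2 frames without an overlay between 9 and 10. The cause was that the images for 9 and 10 had different sizes, which breaks the overlay. The solution is simple: make all the images sent through the pipe the same size (width and height) :)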
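One way to guarantee equal sizes is to compute the canvas once from the largest frame number and draw every number onto that canvas. The sketch below does this in Python with only the standard library; it is an illustration, not my C# code. The 3x5 pixel font, the function names and the 32-bit uncompressed BMP layout are all assumptions chosen to keep the example self-contained.

```python
import struct

# Hypothetical 3x5 pixel font: one string per row, "1" marks a lit pixel.
FONT = {
    "0": ("111", "101", "101", "101", "111"),
    "1": ("010", "110", "010", "010", "111"),
    "2": ("111", "001", "111", "100", "111"),
    "3": ("111", "001", "111", "001", "111"),
    "4": ("101", "101", "111", "001", "001"),
    "5": ("111", "100", "111", "001", "111"),
    "6": ("111", "100", "111", "101", "111"),
    "7": ("111", "001", "001", "001", "001"),
    "8": ("111", "101", "111", "101", "111"),
    "9": ("111", "101", "111", "001", "111"),
}
GLYPH_W, GLYPH_H, GAP = 3, 5, 1


def canvas_size(max_frame_number):
    """Width and height that fit the widest number, shared by every image."""
    digits = len(str(max_frame_number))
    return digits * (GLYPH_W + GAP) + GAP, GLYPH_H + 2 * GAP


def render_frame_number(number, width, height):
    """Return a 32-bit BMP of exactly width x height showing the number."""
    rows = [[0] * width for _ in range(height)]
    for i, ch in enumerate(str(number)):
        x0 = GAP + i * (GLYPH_W + GAP)
        for y, line in enumerate(FONT[ch]):
            for x, bit in enumerate(line):
                if bit == "1":
                    rows[GAP + y][x0 + x] = 1
    white, clear = b"\xff\xff\xff\xff", b"\x00\x00\x00\x00"  # B, G, R, A
    # BMP stores pixel rows bottom-up.
    pixels = b"".join(
        b"".join(white if p else clear for p in row) for row in reversed(rows)
    )
    offset = 14 + 40  # file header + BITMAPINFOHEADER
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(pixels), 0, 0, offset)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 32, 0,
                              len(pixels), 2835, 2835, 0, 0)
    return file_header + info_header + pixels


if __name__ == "__main__":
    w, h = canvas_size(240)
    images = [render_frame_number(n, w, h) for n in range(1, 241)]
    print(w, h, len({len(img) for img in images}))
```

Because the canvas is sized for the last frame number up front, the images for 9 and 10 (and for 99 and 100) come out with identical dimensions.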