Concatenating a video and images with ffmpeg and adding audio: the output video skips several images and the console reports audio errors

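I need to concatenate one video with several images and add audio. I don't need a particular file format, but I would prefer a single-step method that avoids unnecessary re-encoding.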

I tried the following command line:

ffmpeg \
-i video.webm \
-framerate 1/4 \
-pattern_type glob \
-i "*.jpg" \
-i audio1.webm \
-i audio2.webm \
-i audio3.webm \
-filter_complex "[1]scale=width=1920:height=800:force_original_aspect_ratio=decrease, \
                    pad=width=1920:height=800:x=(out_w-in_w)/2:y=(out_h-in_h)/2, \
                    setsar=sar=1[1a]; \
                 [0][1a]concat; \
                 [2][3][4]concat=n=3:v=0:a=1" \
output.mp4

But I have two problems:

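  • Playback of output.mp4 skips several images. Because the video and image inputs have different frame rates (Stream #0:0: 23.98 fps, 23.98 tbr, 1k tbn, 1k tbc; Stream #1:0: 0.25 fps, 0.25 tbr, 0.25 tbn, 0.25 tbc), I added fps=fps=ntsc-film to the filter, but playback of output.mp4 still skips several images.
  • Although I did not hear any obvious sound problem during playback, the console output contains many Non-monotonous DTS in output stream 0:1 errors. Because the audio inputs have incorrect timestamps (see @Gyan's comment), I changed the audio part of the filter from [2][3][4]concat=n=3:v=0:a=1 to [2]asetpts=PTS-STARTPTS[2a];[3]asetpts=PTS-STARTPTS[3a];[4]asetpts=PTS-STARTPTS[4a];[2a][3a][4a]concat=n=3:v=0:a=1, but the console output still shows the same errors.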

How can I solve these two problems? Is there a better approach?

The console output is as follows:

ffmpeg version 4.0.4 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (GCC)
  configuration: --prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg --docdir=/usr/share/doc/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64 --mandir=/usr/share/man --arch=x86_64 --optflags='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' --extra-ldflags='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld ' --extra-cflags=' ' --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-version3 --enable-bzlib --disable-crystalhd --enable-fontconfig --enable-frei0r --enable-gcrypt --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libcdio --enable-libdrm --enable-indev=jack --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libmp3lame --enable-nvenc --enable-openal --enable-opencl --enable-opengl --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libvorbis --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-libzvbi --enable-avfilter --enable-avresample --enable-postproc --enable-pthreads --disable-static --enable-shared --enable-gpl --disable-debug --disable-stripping --shlibdir=/usr/lib64 --enable-libmfx --enable-runtime-cpudetect
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, matroska,webm, from 'video.webm':
  Metadata:
    encoder         : google/video-file
  Duration: 00:00:20.65, start: 0.000000, bitrate: 1285 kb/s
    Stream #0:0(eng): Video: vp9 (Profile 0), yuv420p(tv, bt709/unknown/unknown), 1920x800, SAR 1:1 DAR 12:5, 23.98 fps, 23.98 tbr, 1k tbn, 1k tbc (default)
Input #1, image2, from '*.jpg':
  Duration: 00:06:52.00, start: 0.000000, bitrate: N/A
    Stream #1:0: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), 1920x800 [SAR 72:72 DAR 12:5], 0.25 fps, 0.25 tbr, 0.25 tbn, 0.25 tbc
Input #2, matroska,webm, from 'audio1.webm':
  Metadata:
    encoder         : google
  Duration: 00:00:21.06, start: -0.007000, bitrate: 126 kb/s
    Stream #2:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Input #3, matroska,webm, from 'audio2.webm':
  Metadata:
    encoder         : google
  Duration: 00:03:51.50, start: -0.007000, bitrate: 139 kb/s
    Stream #3:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Input #4, matroska,webm, from 'audio3.webm':
  Metadata:
    encoder         : google/video-file
  Duration: 00:05:30.02, start: -0.007000, bitrate: 154 kb/s
    Stream #4:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Stream mapping:
  Stream #0:0 (vp9) -> concat:in0:v0
  Stream #1:0 (mjpeg) -> scale
  Stream #2:0 (opus) -> concat:in0:a0
  Stream #3:0 (opus) -> concat:in1:a0
  Stream #4:0 (opus) -> concat:in2:a0
  concat -> Stream #0:0 (libx264)
  concat -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[swscaler @ 0x55be91baee80] deprecated pixel format used, make sure you did set range correctly
[libx264 @ 0x55be913f9c00] using SAR=1/1
[libx264 @ 0x55be913f9c00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x55be913f9c00] profile High, level 4.0
[libx264 @ 0x55be913f9c00] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 1920x800 [SAR 1:1 DAR 12:5], q=-1--1, 23.98 fps, 24k tbn, 23.98 tbc (default)
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      encoder         : Lavc58.18.100 aac
[image2 @ 0x55be913dda40] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
[swscaler @ 0x55be91715d00] deprecated pixel format used, make sure you did set range correctly speed=1.05x    
[swscaler @ 0x55be91715d00] Warning: data is not aligned! This can lead to a speed loss
[swscaler @ 0x55be91715d00] deprecated pixel format used, make sure you did set range correctly
    Last message repeated 2 times
[swscaler @ 0x55be91b816c0] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55be91a0fe00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55be91715d00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55be91a0d680] deprecated pixel format used, make sure you did set range correctly4 speed=1.34x    
[swscaler @ 0x55be9188f6c0] deprecated pixel format used, make sure you did set range correctly4 speed=1.46x    
[aac @ 0x55be913f7740] Queue input is backward in time41.83 bitrate=1052.8kbits/s dup=553 drop=4 speed=1.59x    
[mp4 @ 0x55be913f9180] Non-monotonous DTS in output stream 0:1; previous: 1183792, current: 173784; changing to 1183793. This may result in incorrect timestamps in the output file.
[mp4 @ 0x55be913f9180] Non-monotonous DTS in output stream 0:1; previous: 1183793, current: 174808; changing to 1183794. This may result in incorrect timestamps in the output file.
[mp4 @ 0x55be913f9180] Non-monotonous DTS in output stream 0:1; previous: 1183794, current: 175832; changing to 1183795. This may result in incorrect timestamps in the output file.

# repeated many times

[mp4 @ 0x562a844d6180] Non-monotonous DTS in output stream 0:1; previous: 13643739, current: 13641560; changing to 13643740. This may result in incorrect timestamps in the output file.
[mp4 @ 0x562a844d6180] Non-monotonous DTS in output stream 0:1; previous: 13643740, current: 13642584; changing to 13643741. This may result in incorrect timestamps in the output file.
[mp4 @ 0x562a844d6180] Non-monotonous DTS in output stream 0:1; previous: 13643741, current: 13643608; changing to 13643742. This may result in incorrect timestamps in the output file.
[swscaler @ 0x562a84c607c0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.84x    
[swscaler @ 0x562a84c607c0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.85x    
[swscaler @ 0x562a84c5fd00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.89x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed= 2.9x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.91x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.92x    
[swscaler @ 0x562a84620a40] deprecated pixel format used, make sure you did set range correctly=4 speed=2.94x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.95x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.96x    
[swscaler @ 0x562a84620a40] deprecated pixel format used, make sure you did set range correctly=4 speed=2.98x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=2.99x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.01x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.02x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.02x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.02x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.02x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.03x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.03x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.04x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.04x    
[swscaler @ 0x562a85289dc0] deprecated pixel format used, make sure you did set range correctly=4 speed=3.05x    
[swscaler @ 0x562a8460dc80] deprecated pixel format used, make sure you did set range correctly=4 speed=3.05x    
frame= 9878 fps= 73 q=-1.0 Lsize=   30899kB time=00:06:51.86 bitrate= 614.6kbits/s dup=9284 drop=4 speed=3.06x    
video:21524kB audio:9129kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.804084%
[libx264 @ 0x562a844d6c00] frame I:102   Avg QP:12.25  size:163820
[libx264 @ 0x562a844d6c00] frame P:2492  Avg QP:13.96  size:  1188
[libx264 @ 0x562a844d6c00] frame B:7284  Avg QP:13.19  size:   325
[libx264 @ 0x562a844d6c00] consecutive B-frames:  1.2%  0.5%  2.9% 95.4%
[libx264 @ 0x562a844d6c00] mb I  I16..4: 38.6% 44.7% 16.7%
[libx264 @ 0x562a844d6c00] mb P  I16..4:  0.8%  1.1%  0.0%  P16..4:  2.0%  0.5%  0.2%  0.0%  0.0%    skip:95.4%
[libx264 @ 0x562a844d6c00] mb B  I16..4:  0.1%  0.1%  0.0%  B16..8:  1.6%  0.1%  0.0%  direct: 0.1%  skip:98.0%  L0:43.9% L1:53.3% BI: 2.9%
[libx264 @ 0x562a844d6c00] 8x8 transform intra:49.0% inter:88.7%
[libx264 @ 0x562a844d6c00] coded y,uvDC,uvAC intra: 33.3% 43.4% 27.3% inter: 0.3% 0.8% 0.0%
[libx264 @ 0x562a844d6c00] i16 v,h,dc,p: 65% 12%  5% 18%
[libx264 @ 0x562a844d6c00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 15% 30%  4%  4%  4%  4%  4%  5%
[libx264 @ 0x562a844d6c00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 20% 16% 11%  7% 10%  9% 10%  8% 10%
[libx264 @ 0x562a844d6c00] i8c dc,h,v,p: 67% 14% 13%  6%
[libx264 @ 0x562a844d6c00] Weighted P-Frames: Y:0.7% UV:0.7%
[libx264 @ 0x562a844d6c00] ref P L0: 66.6% 11.4% 17.5%  4.3%  0.2%
[libx264 @ 0x562a844d6c00] ref B L0: 91.1%  7.7%  1.2%
[libx264 @ 0x562a844d6c00] ref B L1: 97.4%  2.6%
[libx264 @ 0x562a844d6c00] kb/s:427.96
[aac @ 0x562a844d4740] Qavg: 547.581
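
As a rough sanity check on the log above, the durations ffmpeg reports for the two video inputs can be compared with the final time= value of the progress line. This is only a sketch: the numbers are copied by hand from the log (in centiseconds, to stay within integer shell arithmetic), the variable names are made up for illustration, and container durations are approximate.

```shell
#!/bin/sh
# All values are copied from the console output above, in centiseconds.
video_cs=2065         # Input #0 video.webm: Duration 00:00:20.65
images_cs=41200       # Input #1 *.jpg:      Duration 00:06:52.00
output_cs=41186       # last progress line:  time=00:06:51.86
seconds_per_image=4   # from -framerate 1/4

# Number of images implied by the image2 input duration.
image_count=$((images_cs / (seconds_per_image * 100)))

# Duration a plain concatenation of both inputs would be expected to have.
expected_cs=$((video_cs + images_cs))

# How much shorter the encoded output is than that expectation.
missing_cs=$((expected_cs - output_cs))

echo "images in input:   $image_count"
echo "expected duration: $((expected_cs / 100)) s"
echo "reported duration: $((output_cs / 100)) s"
echo "shortfall:         $((missing_cs / 100)) s"
```

With these numbers the output is about 20.8 seconds shorter than the two inputs put together, which is consistent with several 4-second images not getting their full display time. It does not by itself identify the cause.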

Answer 1

I solved my problems (no more skipped images during playback, and no more Non-monotonous DTS in output stream 0:1 errors during encoding) with a different, four-step approach:

First, encode a video from the pictures:

ffmpeg \
-framerate 1/4 \
-i pictures/%03d.jpg \
-filter:v "scale=width=1920:height=800:force_original_aspect_ratio=decrease, \
           pad=width=1920:height=800:x=(out_w-in_w)/2:y=(out_h-in_h)/2" \
pictures.webm

Second, concatenate the existing video with the video encoded from the pictures, without re-encoding:

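# Write to a temporary name first (joined.webm is an arbitrary placeholder):
# the output must not be video.webm while the concat list still reads from it.
# Note that the mv then replaces the original video.webm.
ffmpeg \
-f concat \
-safe 0 \
-i <(printf "file '$PWD/video.webm'\nfile '$PWD/pictures.webm'") \
-c copy \
joined.webm && mv joined.webm video.webm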

Third, concatenate the audio files without re-encoding:

ffmpeg \
-f concat \
-safe 0 \
-i <(printf "file '$PWD/audio1.webm'\nfile '$PWD/audio2.webm'\nfile '$PWD/audio3.webm'") \
-c copy \
audio.webm
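
The two concatenation steps above pass the file list through bash process substitution, <(printf ...), which is not available in a plain POSIX sh. A possible alternative is to write the list to a regular file first. The sketch below assumes that no path contains a single quote (that would need extra escaping in the list); the helper name write_concat_list and the list file location are made up for illustration.

```shell
#!/bin/sh
# write_concat_list LIST FILE...
# Writes one "file '<absolute path>'" line per FILE into LIST, in the
# format read by ffmpeg's concat demuxer.
# Assumption: no path contains a single quote.
write_concat_list() {
  list=$1
  shift
  for f in "$@"; do
    printf "file '%s/%s'\n" "$PWD" "$f"
  done > "$list"
}

# Hypothetical location for the list file.
list_file="${TMPDIR:-/tmp}/audio-list.txt"

write_concat_list "$list_file" audio1.webm audio2.webm audio3.webm
cat "$list_file"
```

The list can then replace the process substitution, for example: ffmpeg -f concat -safe 0 -i "$list_file" -c copy audio.webm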

Fourth, mux the video and audio without re-encoding:

ffmpeg \
-i video.webm \
-i audio.webm \
-c copy \
final.webm
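
One thing worth checking before this last step is how the total audio length compares with the total video length, because the durations listed in the question's log do not match. The sketch below only adds up those logged durations (in centiseconds); the variable names are illustrative, and whether the extra audio is wanted is not stated in the question.

```shell
#!/bin/sh
# Durations copied from the "Duration:" lines in the question's log,
# in centiseconds.
audio1_cs=2106    # audio1.webm: 00:00:21.06
audio2_cs=23150   # audio2.webm: 00:03:51.50
audio3_cs=33002   # audio3.webm: 00:05:30.02
video_in_cs=2065  # video.webm:  00:00:20.65
images_cs=41200   # images at 4 s each: 00:06:52.00

audio_cs=$((audio1_cs + audio2_cs + audio3_cs))
video_cs=$((video_in_cs + images_cs))
extra_cs=$((audio_cs - video_cs))

echo "audio total: $((audio_cs / 100)) s"
echo "video total: $((video_cs / 100)) s"
if [ "$extra_cs" -gt 0 ]; then
  echo "audio runs about $((extra_cs / 100)) s past the end of the video"
fi
```

If the file should end with the video, adding ffmpeg's -shortest option to the muxing command is one possibility (ffmpeg -i video.webm -i audio.webm -c copy -shortest final.webm). With stream copy the cut point may not be exact.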

Comments are welcome! In particular, is there a single-step way to do this without re-encoding?
