ALSA buffer xrun when ffmpeg streams from certain sources

2024-11-4

I am broadcasting the audio of my desktop PC (the client from here on) using the PulseAudio RTP module:

load-module module-null-sink sink_name=rtp_2_camilla channels=2 format=S16BE rate=96000 sink_properties="device.description='RTP to CamillaDSP'"
load-module module-rtp-send source=rtp_2_camilla.monitor destination_ip=192.168.0.22 port=46908

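The receiving end is an RPi4b (the server from here on), where ffmpeg runs from the command line with no PulseAudio present. I generated an SDP file from the PulseAudio debug output and use it on the server, where ffmpeg runs, to identify the stream: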

ffmpeg -protocol_whitelist file,udp,rtp -i stream_96k_s16be.sdp -f alsa hw:0,0
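For reference, the fields of the SDP file can be read back from the "sdp:" lines in the trace log further down. The following is a minimal sketch that recreates such a file. The addresses and the host name are masked in the log, so ORIGIN_IP, CONN_IP and CLIENT_HOST here are placeholders (assumptions), not the real values.

```shell
#!/bin/sh
# Recreate the SDP file from the "sdp:" lines shown in the ffmpeg trace log.
# ORIGIN_IP, CONN_IP and CLIENT_HOST are placeholders: the real values are
# masked in the log (192.168.0.### and ke#######), so substitute your own.
ORIGIN_IP="${ORIGIN_IP:-192.0.2.10}"
CONN_IP="${CONN_IP:-192.0.2.20}"
CLIENT_HOST="${CLIENT_HOST:-desktop}"
SDP_FILE="${SDP_FILE:-stream_96k_s16be.sdp}"

# Payload type 127 is mapped to L16 (16-bit big-endian PCM), 96 kHz, 2 channels,
# which matches format=S16BE rate=96000 channels=2 on the null sink.
cat > "$SDP_FILE" <<EOF
v=0
o=kerem 3885738296 0 IN IP4 $ORIGIN_IP
s=PulseAudio RTP Stream on $CLIENT_HOST
c=IN IP4 $CONN_IP
t=3885738296 0
a=recvonly
m=audio 46908 RTP/AVP 127
a=rtpmap:127 L16/96000/2
a=type:broadcast
EOF

echo "wrote $SDP_FILE"
```

The port in the m= line has to match the port= argument given to module-rtp-send, and the rtpmap line has to match the sink's sample format, rate and channel count.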

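So far so good: with many applications the setup runs perfectly, apart from a steadily growing latency, which I will come back to later. The main problem is that with some applications on the client, ffmpeg suffers from severe buffer underruns. It sounds as if about half of the audio in each packet is missing, which is almost as bad as listening to static noise. This is the log when it happens: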

ffmpeg -loglevel trace -protocol_whitelist file,udp,rtp -i stream_96k_s16be.sdp -f alsa hw:0,0 -report
ffmpeg started on 2023-02-22 at 10:12:51
Report written to "ffmpeg-20230222-101251.log"
Log level: 56
ffmpeg version 4.3.5-0+deb11u1+rpt3 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 10 (Debian 10.2.1-6)
  configuration: --prefix=/usr --extra-version=0+deb11u1+rpt3 --toolchain=hardened --incdir=/usr/include/aarch64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-mmal --enable-neon --enable-v4l2-request --enable-libudev --enable-epoxy --enable-sand --libdir=/usr/lib/aarch64-linux-gnu --arch=arm64 --enable-pocketsphinx --enable-libdc1394 --enable-libdrm --enable-vout-drm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Splitting the commandline.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'trace'.
Reading option '-protocol_whitelist' ... matched as AVOption 'protocol_whitelist' with argument 'file,udp,rtp'.
Reading option '-i' ... matched as input url with argument 'stream_96k_s16be.sdp'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'alsa'.
Reading option 'hw:0,0' ... matched as output url.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option loglevel (set logging level) with argument trace.
Applying option report (generate a report) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url stream_96k_s16be.sdp.
Successfully parsed a group of options.
Opening an input file: stream_96k_s16be.sdp.
[NULL @ 0x557fb57720] Opening 'stream_96k_s16be.sdp' for reading
Probing sdp score:50 size:205
[sdp @ 0x557fb57720] Format sdp probed with size=2048 and score=50
[sdp @ 0x557fb57720] sdp: v='0'
[sdp @ 0x557fb57720] sdp: o='kerem 3885738296 0 IN IP4 192.168.0.###'
[sdp @ 0x557fb57720] sdp: s='PulseAudio RTP Stream on ke#######'
[sdp @ 0x557fb57720] sdp: c='IN IP4 192.168.0.###'
[sdp @ 0x557fb57720] sdp: t='3885738296 0'
[sdp @ 0x557fb57720] sdp: a='recvonly'
[sdp @ 0x557fb57720] sdp: m='audio 46908 RTP/AVP 127'
[sdp @ 0x557fb57720] sdp: a='rtpmap:127 L16/96000/2'
[sdp @ 0x557fb57720] audio codec set to: pcm_s16be
[sdp @ 0x557fb57720] audio samplerate set to: 96000
[sdp @ 0x557fb57720] audio channels set to: 2
[sdp @ 0x557fb57720] sdp: a='type:broadcast'
[udp @ 0x557fb0fb60] end receive buffer size reported is 425984
[udp @ 0x557fb5ea90] end receive buffer size reported is 425984
[sdp @ 0x557fb57720] setting jitter buffer size to 500
[sdp @ 0x557fb57720] Before avformat_find_stream_info() pos: 205 bytes read:205 seeks:0 nb_streams:1
[sdp @ 0x557fb57720] All info found
[sdp @ 0x557fb57720] stream 0: start_time: 0 duration: NOPTS
[sdp @ 0x557fb57720] format: start_time: 0 duration: NOPTS (estimate from bit rate) bitrate=3072 kb/s
[sdp @ 0x557fb57720] After avformat_find_stream_info() pos: 205 bytes read:205 seeks:0 frames:1
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, sdp, from 'stream_96k_s16be.sdp':
  Metadata:
    title           : PulseAudio RTP Stream on ke#####
  Duration: N/A, start: 0.000000, bitrate: 3072 kb/s
    Stream #0:0, 1, 1/96000: Audio: pcm_s16be, 96000 Hz, stereo, s16, 3072 kb/s
Successfully opened the file.
Parsing a group of options: output url hw:0,0.
Applying option f (force format) with argument alsa.
Successfully parsed a group of options.
Opening an output file: hw:0,0.
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16be (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
detected 4 logical cores
[graph_0_in_0_0 @ 0x557fb5d1f0] Setting 'time_base' to value '1/96000'
[graph_0_in_0_0 @ 0x557fb5d1f0] Setting 'sample_rate' to value '96000'
[graph_0_in_0_0 @ 0x557fb5d1f0] Setting 'sample_fmt' to value 's16'
[graph_0_in_0_0 @ 0x557fb5d1f0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x557fb5d1f0] tb:1/96000 samplefmt:s16 samplerate:96000 chlayout:0x3
[format_out_0_0 @ 0x557fb9c6e0] Setting 'sample_fmts' to value 's16'
[AVFilterGraph @ 0x557fb5ff30] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
Output #0, alsa, to 'hw:0,0':
  Metadata:
    title           : PulseAudio RTP Stream on ke######
    encoder         : Lavf58.45.100
    Stream #0:0, 0, 1/96000: Audio: pcm_s16le, 96000 Hz, stereo, s16, 3072 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
[alsa @ 0x557fb59ee0] ALSA buffer xrun.
    Last message repeated 68 times
[alsa @ 0x557fb59ee0] ALSA buffer xrun.peed=0.473x    
    Last message repeated 69 times
[alsa @ 0x557fb59ee0] ALSA buffer xrun.peed=0.466x    
    Last message repeated 69 times
[alsa @ 0x557fb59ee0] ALSA buffer xrun.peed=0.463x    
    Last message repeated 5 times
[sdp @ 0x557fb57720] sending 52 bytes of RR
[sdp @ 0x557fb57720] result from ffurl_write: -5
[alsa @ 0x557fb59ee0] ALSA buffer xrun.
    Last message repeated 63 times
[alsa @ 0x557fb59ee0] ALSA buffer xrun.peed=0.463x    
    Last message repeated 68 times
[alsa @ 0x557fb59ee0] ALSA buffer xrun.peed=0.462x    
    Last message repeated 68 times
[alsa @ 0x557fb59ee0] ALSA buffer xrun.peed=0.462x    
    Last message repeated 14 times
[sdp @ 0x557fb57720] sending 52 bytes of RR
[sdp @ 0x557fb57720] result from ffurl_write: -5
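Two numbers in the log above can be cross-checked with simple arithmetic: the reported bitrate of 3072 kb/s, and the speed of roughly 0.463x in the status line fragments. The sketch below assumes that the speed value can be read as the fraction of real-time audio that ffmpeg is actually processing; that reading is an assumption, not something the log states.

```shell
#!/bin/sh
# Stream parameters taken from the log: 96000 Hz, stereo, 16-bit PCM.
RATE=96000
CHANNELS=2
BITS=16

# Expected raw PCM bitrate in kb/s: 96000 * 2 * 16 / 1000 = 3072
BITRATE_KBPS=$(( RATE * CHANNELS * BITS / 1000 ))

# speed=0.463x from the log, expressed in thousandths to stay in integer math.
SPEED_PERMILLE=463

# Assumption: speed is the fraction of real-time audio being processed.
DELIVERED_KBPS=$(( BITRATE_KBPS * SPEED_PERMILLE / 1000 ))
MISSING_PERCENT=$(( (1000 - SPEED_PERMILLE) / 10 ))

echo "expected bitrate : ${BITRATE_KBPS} kb/s"
echo "processed at 0.463x: about ${DELIVERED_KBPS} kb/s"
echo "shortfall        : about ${MISSING_PERCENT}%"
```

Under that assumption, a little over half of the audio never reaches the ALSA device in time, which lines up with the impression that about half of each packet is missing.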

For example, I have no problems with the Vivaldi browser, but I always have them with Rhythmbox. I tried combinations of the following settings in ffmpeg:

-flags low_delay
-fflags nobuffer
-avioflags direct
-buffer_size 4096
-re
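To make the list concrete, here is a sketch of one such combination as a full command line. The post does not show the exact invocations, so placing all five options before -i (as input options) is an assumption, and the script only prints the command instead of running it.

```shell
#!/bin/sh
# Assemble one candidate ffmpeg command from the options listed above.
# Assumption: the options are given as input options, i.e. before -i.
SDP_FILE="stream_96k_s16be.sdp"
ALSA_DEV="hw:0,0"

set -- ffmpeg -protocol_whitelist file,udp,rtp \
    -flags low_delay \
    -fflags nobuffer \
    -avioflags direct \
    -buffer_size 4096 \
    -re \
    -i "$SDP_FILE" -f alsa "$ALSA_DEV"

# Print the command for inspection; replace the echo with "$@" to run it.
CMD="$*"
echo "$CMD"
```

Removing individual option lines from the set -- call gives the other combinations to test one at a time.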

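Some combinations did fix the growing latency mentioned earlier, but the choppy audio with Rhythmbox and some other applications persists. I checked the output latency in PulseAudio with pactl list sinks; it moves between 0.1 ms and 2 ms, and that seems to be the same regardless of which application is playing.

To clarify further, my input device in ALSA is a snd-aloop loopback module, which is captured by CamillaDSP. CamillaDSP processes the capture and plays it back on a USB sound card. It supports rate adjustment, so I really do not think it is the culprit. If I run speaker-test against ALSA on the server or on the client, clean audio comes out of CamillaDSP.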

Thanks for reading; I look forward to your help.
