RealTek RTL8125 Ethernet controller: is IEEE-1588/PTP supported in Ubuntu 22.04.3?

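I have a desktop with a Gigabyte B650 AORUS ELITE AX motherboard, on which I have installed Ubuntu 22.04.3. I am trying to use this desktop as an IEEE-1588 (Precision Time Protocol) master on a small local network that contains only an unmanaged switch and one other device that I want to synchronize time with.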

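I noticed that when I start ptp4l, I do not have any /dev/ptpX nodes. After investigating with ethtool -T, I found that my interface appears to support neither hardware nor software timestamping (unless this page is incorrect, since it suggests that my output indicates no support; in any case, I tried software mode and the results were not stable enough to use):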

$ ethtool -T enp8s0
Time stamping parameters for enp8s0:
Capabilities:
    software-transmit
    software-receive
    software-system-clock
PTP Hardware Clock: none
Hardware Transmit Timestamp Modes: none
Hardware Receive Filter Modes: none

Looking at the controller on the motherboard, I found that it is a RealTek RTL8125:

$ lspci | grep Ethernet
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)

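According to RealTek, I believe PTP hardware timestamping should be supported. RealTek does provide Linux drivers for this device, but the one I need (2.5G/5G Ethernet LINUX driver r8125) is listed as being for "kernel up to 6.4" (I am on 6.5.0). I tried building and installing it anyway, and after enabling PTP in the makefile, ethtool now shows support: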

$ ethtool -T enp8s0
Time stamping parameters for enp8s0:
Capabilities:
    hardware-transmit
    software-transmit
    hardware-receive
    software-receive
    software-system-clock
    hardware-raw-clock
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
    off
    on
Hardware Receive Filter Modes:
    none
    ptpv2-l4-event
    ptpv2-l4-sync
    ptpv2-l4-delay-req
    ptpv2-event
    ptpv2-sync
    ptpv2-delay-req

However, when I try to start ptp4l, it is killed, and I see the following in the kernel log:

$ sudo ptp4l -i enp8s0 -p /dev/ptp0 -m
ptp4l[2665.771]: selected /dev/ptp0 as PTP clock
Killed
$ sudo dmesg
...
[ 2891.569055] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2891.569059] #PF: supervisor instruction fetch in kernel mode
[ 2891.569061] #PF: error_code(0x0010) - not-present page
[ 2891.569063] PGD 0 P4D 0 
[ 2891.569066] Oops: 0010 [#4] PREEMPT SMP NOPTI
[ 2891.569068] CPU: 19 PID: 7502 Comm: ptp4l Tainted: P      D    OE      6.5.0-15-generic #15~22.04.1-Ubuntu
[ 2891.569070] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS ELITE AX/B650 AORUS ELITE AX, BIOS FB 07/10/2023
[ 2891.569072] RIP: 0010:0x0
[ 2891.569092] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 2891.569094] RSP: 0018:ffffc115866d3d10 EFLAGS: 00010286
[ 2891.569096] RAX: 0000000000000000 RBX: ffff9b97039de000 RCX: 00000000071c71c7
[ 2891.569097] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b9722b20768
[ 2891.569098] RBP: ffffc115866d3d40 R08: 0000000000000000 R09: 0000000000000000
[ 2891.569099] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc115866d3d98
[ 2891.569101] R13: ffff9b97039df518 R14: 0000000000000000 R15: 0000000000000000
[ 2891.569102] FS:  00007fdfdaa96740(0000) GS:ffff9ba6186c0000(0000) knlGS:0000000000000000
[ 2891.569103] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2891.569105] CR2: ffffffffffffffd6 CR3: 000000010eb8e000 CR4: 0000000000750ee0
[ 2891.569106] PKRU: 55555554
[ 2891.569107] Call Trace:
[ 2891.569109]  <TASK>
[ 2891.569111]  ? show_regs+0x6d/0x80
[ 2891.569116]  ? __die+0x24/0x80
[ 2891.569119]  ? page_fault_oops+0x99/0x1b0
[ 2891.569123]  ? do_user_addr_fault+0x31d/0x6b0
[ 2891.569126]  ? exc_page_fault+0x83/0x1b0
[ 2891.569130]  ? asm_exc_page_fault+0x27/0x30
[ 2891.569136]  ptp_clock_adjtime+0xf5/0x230
[ 2891.569140]  pc_clock_adjtime+0x70/0xc0
[ 2891.569143]  __do_sys_clock_adjtime+0x9e/0x140
[ 2891.569149]  __x64_sys_clock_adjtime+0x15/0x20
[ 2891.569151]  do_syscall_64+0x58/0x90
[ 2891.569154]  ? srso_alias_return_thunk+0x5/0x7f
[ 2891.569157]  ? exit_to_user_mode_prepare+0x30/0xb0
[ 2891.569160]  ? srso_alias_return_thunk+0x5/0x7f
[ 2891.569162]  ? syscall_exit_to_user_mode+0x37/0x60
[ 2891.569165]  ? srso_alias_return_thunk+0x5/0x7f
[ 2891.569167]  ? do_syscall_64+0x67/0x90
[ 2891.569169]  ? srso_alias_return_thunk+0x5/0x7f
[ 2891.569171]  ? do_syscall_64+0x67/0x90
[ 2891.569173]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 2891.569175] RIP: 0033:0x7fdfda92644b
[ 2891.569179] Code: 8b 15 e9 39 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa b8 31 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 b1 39 0f 00 f7 d8
[ 2891.569180] RSP: 002b:00007ffd99830ad8 EFLAGS: 00000286 ORIG_RAX: 0000000000000131
[ 2891.569182] RAX: ffffffffffffffda RBX: 0000556e69a442a0 RCX: 00007fdfda92644b
[ 2891.569183] RDX: 00007ffd99830d70 RSI: 00007ffd99830b00 RDI: 00000000ffffffdb
[ 2891.569185] RBP: 00000000ffffffdb R08: 0000000000000000 R09: 0000000000000000
[ 2891.569186] R10: 0000000000000000 R11: 0000000000000286 R12: 00000000071c71c7
[ 2891.569187] R13: 0000556e688c12db R14: 0000556e688ca720 R15: 0000556e688c136b
[ 2891.569190]  </TASK>
[ 2891.569191] Modules linked in: uhid rfcomm nfnetlink ccm cmac algif_hash algif_skcipher af_alg nvidia_uvm(PO) bnep intel_rapl_msr intel_rapl_common nvidia_drm(PO) snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic nvidia_modeset(PO) binfmt_misc ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi rtw89_8852ce snd_hda_codec rtw89_8852c snd_hda_core snd_hwdep rtw89_pci snd_pcm rtw89_core snd_seq_midi snd_seq_midi_event mac80211 kvm snd_rawmidi btusb btrtl btbcm btintel snd_seq irqbypass nvidia(PO) btmtk nls_iso8859_1 snd_seq_device rapl input_leds bluetooth ftdi_sio cdc_mbim snd_timer usbserial gigabyte_wmi cdc_wdm wmi_bmof cfg80211 ecdh_generic k10temp ccp ecc snd libarc4 soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore ip_tables x_tables autofs4 dm_crypt hid_generic cdc_ncm cdc_ether usbnet mii usbhid hid amdgpu amdxcp iommu_v2 drm_buddy gpu_sched i2c_algo_bit drm_suballoc_helper drm_ttm_helper ttm drm_display_helper crct10dif_pclmul cec crc32_pclmul
[ 2891.569257]  polyval_clmulni rc_core polyval_generic ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd cryptd nvme drm ahci xhci_pci i2c_piix4 libahci xhci_pci_renesas nvme_core r8125(OE) nvme_common video wmi
[ 2891.569273] CR2: 0000000000000000
[ 2891.569276] ---[ end trace 0000000000000000 ]---
[ 2891.915705] RIP: 0010:0x0
[ 2891.915712] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 2891.915713] RSP: 0018:ffffc1159d207cb0 EFLAGS: 00010286
[ 2891.915715] RAX: 0000000000000000 RBX: ffff9b97039de000 RCX: 00000000071c71c7
[ 2891.915717] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b9722b20768
[ 2891.915718] RBP: ffffc1159d207ce0 R08: 0000000000000000 R09: 0000000000000000
[ 2891.915719] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc1159d207d38
[ 2891.915720] R13: ffff9b97039df518 R14: 0000000000000000 R15: 0000000000000000
[ 2891.915721] FS:  00007fdfdaa96740(0000) GS:ffff9ba6186c0000(0000) knlGS:0000000000000000
[ 2891.915723] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2891.915724] CR2: ffffffffffffffd6 CR3: 000000010eb8e000 CR4: 0000000000750ee0
[ 2891.915726] PKRU: 55555554
[ 2891.915727] note: ptp4l[7502] exited with irqs disabled

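I did find another question that mentions a version of this driver in the apt repos, r8125-dkms, which uses dkms (something I only learned about today). However, that version is compiled without PTP support: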

$ ethtool -i enp8s0
driver: r8125
$ ethtool -T enp8s0
Time stamping parameters for enp8s0:
Capabilities:
    software-transmit
    software-receive
    software-system-clock
PTP Hardware Clock: none
Hardware Transmit Timestamp Modes: none
Hardware Receive Filter Modes: none

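I tried changing the makefile in /usr/src/ and uninstalling/reinstalling with dkms, but that did not seem to change anything, although I may have done it wrong. The version in apt (9.007.01) is considerably different from the one from Realtek (9.012.04), and it does not compile (PTP-related kernel structures have changed, among other things).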

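I am hoping someone can help me with this at any level, whether through some ptp4l configuration, by patching the driver, or something else. I would really like hardware timestamping support, but a functional PTP sync is the main requirement. (In software mode I cannot get the offset to stabilize below 1000 ns, and it can jump by as much as +-100000 ns.)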


Progress:

I am still some way from a solution, and at the risk of straying too far outside Stack Overflow territory, here is what I have found:

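  • At some point, the kernel struct ptp_clock_info dropped the adjfreq member and apparently replaced it with adjfine.
  • Driver 9.012.04 does not set adjfreq correctly, and it also fails to set a number of other member functions, including the new adjfine, which for some reason is now called here (I say "for some reason" because this was apparently not the case before kernel version 4):
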
static int ptp_clock_adjtime(struct posix_clock *pc, struct __kernel_timex *tx)
{
    struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
    struct ptp_clock_info *ops;
...
    ops = ptp->info;
...
    } else if (tx->modes & ADJ_FREQUENCY) {
        long ppb = scaled_ppm_to_ppb(tx->freq);
        if (ppb > ops->max_adj || ppb < -ops->max_adj)
            return -ERANGE;
        err = ops->adjfine(ops, tx->freq);
        ptp->dialed_frequency = tx->freq;
    }
...
    return err;
}
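
To make the failure mode concrete: "RIP: 0010:0x0" together with "supervisor instruction fetch" in the oops is what a call through a NULL function pointer looks like, which is consistent with ops->adjfine never having been assigned by the driver. Below is a small userspace model of that dispatch, not kernel code. The names mock_clock_info, dispatch_adj_frequency and accept_adjfine are hypothetical, the range check from the excerpt is left out for brevity, and the NULL guard is something I added for illustration (it is not in the excerpt above).

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the one callback of struct ptp_clock_info used here. */
struct mock_clock_info {
    int (*adjfine)(struct mock_clock_info *info, long scaled_ppm);
};

/* Models the ADJ_FREQUENCY branch: without the guard, a NULL adjfine
 * would make the call jump to address 0, as seen in the oops. */
int dispatch_adj_frequency(struct mock_clock_info *ops, long scaled_ppm)
{
    if (ops->adjfine == NULL)   /* guard added for illustration only */
        return -EOPNOTSUPP;
    return ops->adjfine(ops, scaled_ppm);
}

/* Trivial callback used only to exercise the dispatch path. */
int accept_adjfine(struct mock_clock_info *info, long scaled_ppm)
{
    (void)info;
    (void)scaled_ppm;
    return 0;
}
```

With adjfine left unset, dispatch_adj_frequency returns -EOPNOTSUPP instead of faulting; with accept_adjfine assigned, the call goes through and returns 0.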

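Unfortunately, "spoofing" these functions with empty stubs causes ptp4l to hang at the step "port 1: assuming the grand master role", so it looks like I either have to replace the functions that tx->modes is selecting (I am not sure how I would do that) or implement the driver functions properly. I will look into this further, and I have already contacted the team that maintains the driver.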
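
One possible direction for implementing the driver functions properly, offered as a sketch and not as the Realtek driver's actual code: where a driver still has an adjfreq-style routine that takes parts per billion, an adjfine-style wrapper can convert the scaled-ppm argument and delegate to it. Everything below is a userspace mock. The names legacy_adjfreq, compat_adjfine and last_ppb are hypothetical, a real callback would take a struct ptp_clock_info * as its first argument, and the conversion mirrors the kernel's scaled_ppm_to_ppb helper as I understand it.

```c
#include <stdint.h>

/* Records the last value handed to the mock legacy routine, for inspection. */
long last_ppb;

/* Hypothetical stand-in for a driver's existing adjfreq-style routine.
 * A real driver would program the NIC's clock frequency here. */
int legacy_adjfreq(long ppb)
{
    last_ppb = ppb;
    return 0;
}

/* scaled_ppm is parts per million with a 16-bit binary fraction, so
 * ppb = scaled_ppm * 1000 / 2^16 = scaled_ppm * 125 / 2^13.
 * Assumes arithmetic right shift of negative values (true for GCC and Clang). */
long scaled_ppm_to_ppb(long scaled_ppm)
{
    int64_t ppb = 1 + (int64_t)scaled_ppm;

    ppb *= 125;
    ppb >>= 13;
    return (long)ppb;
}

/* adjfine-style entry point that delegates to the legacy ppb routine. */
int compat_adjfine(long scaled_ppm)
{
    return legacy_adjfreq(scaled_ppm_to_ppb(scaled_ppm));
}
```

For example, compat_adjfine(65536) corresponds to 1 ppm and passes 1000 ppb to the legacy routine. Whether the hardware then actually applies the adjustment is a separate question that this mock does not address.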
