What is the systemd "refresh-policy-routes" service? [AWS Linux 2023]

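I'm trying to find the cause of an instance outage. It appears to start with the scheduled systemd refresh-policy-routes service, which is then followed by a 404 error from EC2RoleProvider. After the error occurs, all network connectivity on the instance fails (including ssh). What does refresh-policy-routes do?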

Starting refresh-policy-routes@ens5.service - Refresh policy routes for ens5...
Starting sysstat-collect.service - system activity accounting tool...
sysstat-collect.service: Deactivated successfully.
Finished sysstat-collect.service - system activity accounting tool.

SERVICE_START pid=1 uid=0 auid=<sess> ses=<sess> subj=system_u:system_r:init_t:s0 
msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
SERVICE_STOP pid=1 uid=0 auid=<sess> ses=<sess> subj=system_u:system_r:init_t:s0 
msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

Starting configuration for ens5
/lib/systemd/systemd-networkd-wait-online ens5

[get_meta] Querying IMDS for mac
Got IMDSv2 token from http://169.254.169.254/latest
Using existing cfgfile /run/systemd/network/70-ens5.network
[get_meta] Querying IMDS for network/interfaces/macs/<mac-addr>/local-ipv4s
Got IMDSv2 token from http://169.254.169.254/latest
[get_meta] Querying IMDS for network/interfaces/macs/<mac-addr>/ipv4-prefix
Got IMDSv2 token from http://169.254.169.254/latest
[get_meta] Querying IMDS for network/interfaces/macs/<mac-addr>/local-ipv4s
Got IMDSv2 token from http://169.254.169.254/latest

Called trap
No networkd reload needed

refresh-policy-routes@ens5.service: Deactivated successfully.
Finished refresh-policy-routes@ens5.service - Refresh policy routes for ens5.

SERVICE_START pid=1 uid=0 auid=<sess> ses=<sess> subj=system_u:system_r:init_t:s0 
msg='unit=refresh-policy-routes@ens5 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
SERVICE_STOP pid=1 uid=0 auid=<sess> ses=<sess> subj=system_u:system_r:init_t:s0 
msg='unit=refresh-policy-routes@ens5 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

2024-02-11 08:30:46 WARN EC2RoleProvider Failed to connect
 to Systems Manager with instance profile role credentials. 
 Err: retrieved credentials failed to report to ssm. Error: EC2RoleRequestError: no EC2 instance role found
 caused by: EC2MetadataError: failed to make EC2Metadata request
    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
                     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head>
      <title>404 - Not Found</title>
     </head>
     <body>
      <h1>404 - Not Found</h1>
     </body>
    </html>
        status code: 404, request id:

Answer 1

It looks like the cleanup / stop behavior of setup-policy-routes.sh has since been removed (or fixed). Hopefully this update helps.

https://github.com/amazonlinux/amazon-ec2-net-utils/pull/107


(Too long) Postscript

What does refresh-policy-routes do?

refresh-policy-routes@<iface>.service

refresh-policy-routes@<iface>.service calls the setup-policy-routes script twice (at least in amazon-ec2-net-utils v2.4.1). The script is first called with the argument start, and then with the argument cleanup.

https://github.com/amazonlinux/amazon-ec2-net-utils/blob/b721f411c2e7ca00534cfa0b03089976c0a434ac/systemd/system/refresh-policy-routes%40.service#L10-L11

ExecStart=/usr/bin/setup-policy-routes %i start
ExecStartPost=/usr/bin/setup-policy-routes %i cleanup

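With the argument start, it runs systemd-networkd-wait-online. With the argument cleanup, it removes the lock file (if one still exists). I have not yet fully understood this behavior, though.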

https://github.com/amazonlinux/amazon-ec2-net-utils/blob/b721f411c2e7ca00534cfa0b03089976c0a434ac/bin/setup-policy-routes.sh#L41-L63

    info "Starting configuration for $iface"
    debug /lib/systemd/systemd-networkd-wait-online -i "$iface"
    /lib/systemd/systemd-networkd-wait-online -i "$iface"
        info "WARNING: Cleaning up leaked lock ${lockdir}/${iface}"
        rm -f "${lockdir}/${iface}"
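The lock handling in the excerpt above can be illustrated with a minimal sketch. This is not the real setup-policy-routes.sh: the function names acquire_lock and cleanup_lock are hypothetical, the lock directory is a throwaway temporary directory, and only the variable names lockdir and iface and the warning text are taken from the excerpt.

```shell
#!/bin/sh
# Hypothetical illustration of a per-interface lock file that a later
# "cleanup" step removes if it was left behind. Not the actual script.

lockdir=$(mktemp -d)   # assumption: a temporary directory stands in for the real lock directory
iface="ens5"

acquire_lock() {
    # With noclobber set, the redirection fails if the lock file already
    # exists, so only one caller can create it.
    ( set -o noclobber; echo "$$" > "${lockdir}/${iface}" ) 2>/dev/null
}

cleanup_lock() {
    # Mirrors the excerpt: warn and remove the lock only if it still exists.
    if [ -e "${lockdir}/${iface}" ]; then
        echo "WARNING: Cleaning up leaked lock ${lockdir}/${iface}"
        rm -f "${lockdir}/${iface}"
    fi
}
```

Calling acquire_lock twice in a row succeeds the first time and fails the second; cleanup_lock then removes the file, after which the lock can be taken again.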

By the way, the service unit itself has no [Install] section in its unit file.

https://github.com/amazonlinux/amazon-ec2-net-utils/blob/v2.4.1/systemd/system/refresh-policy-routes%40.service

So it has to be activated in some other way. Here, a unit of type timer is used.
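To see this on a running instance, something like the following sketch may help. It only uses standard systemctl and journalctl subcommands; the function name show_refresh_state is hypothetical, and the interface name ens5 is taken from the log in the question and may differ on your instance.

```shell
#!/bin/sh
# Hypothetical helper: show the timer schedule, the unit file and recent
# logs for the refresh-policy-routes units of one interface.

iface="ens5"   # assumption: interface name from the question's log

show_refresh_state() {
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not available on this host"
        return 0
    fi
    # When the timer last fired and when it is due next
    systemctl list-timers "refresh-policy-routes@${iface}.timer" --no-pager || true
    # The unit file as systemd sees it (including any drop-ins)
    systemctl cat "refresh-policy-routes@${iface}.service" --no-pager || true
    # The last 50 journal lines written by the service
    journalctl -u "refresh-policy-routes@${iface}.service" -n 50 --no-pager || true
}

show_refresh_state
```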

refresh-policy-routes@<iface>.timer

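When an interface is attached (or some other event occurs), the instantiated service unit for that interface, policy-routes@<iface>.service, is activated via udev.

policy-routes@<iface>.service also calls setup-policy-routes twice, just like refresh-policy-routes@<iface>.service.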

https://github.com/amazonlinux/amazon-ec2-net-utils/blob/b721f411c2e7ca00534cfa0b03089976c0a434ac/systemd/system/policy-routes%40.service#L14-L15

ExecStart=/usr/bin/setup-policy-routes %i start
ExecStartPost=/usr/bin/setup-policy-routes %i cleanup

At the same time, udev also activates refresh-policy-routes@<iface>.timer.

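However, the timer waits 30 seconds before it activates refresh-policy-routes@<iface>.service for the first time. After that, the timer re-activates the service every minute or two (as far as I have seen in my environment).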

https://github.com/amazonlinux/amazon-ec2-net-utils/blob/5ba0509505f60dfaa2edb3da6bca10228b17d041/systemd/system/refresh-policy-routes%40.timer#L1-L4

[Timer]
OnActiveSec=30
OnUnitInactiveSec=60
RandomizedDelaySec=5
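The "every minute or two" observation is roughly what these values predict. The following is a back-of-the-envelope sketch, assuming the timer does not set AccuracySec (the excerpt above does not show it), so systemd's default of 1min applies. The variable names are made up for this example.

```shell
#!/bin/sh
# Rough window in which the timer may fire, from the [Timer] values above.

on_active=30          # OnActiveSec
on_unit_inactive=60   # OnUnitInactiveSec
random_delay=5        # RandomizedDelaySec
accuracy=60           # assumption: AccuracySec left at systemd's default (1min)

# Earliest: the configured delay. Latest: plus jitter and accuracy window.
first_earliest=$on_active
first_latest=$((on_active + random_delay + accuracy))
repeat_earliest=$on_unit_inactive
repeat_latest=$((on_unit_inactive + random_delay + accuracy))

echo "first run:  ${first_earliest}-${first_latest} s after the timer is started"
echo "later runs: ${repeat_earliest}-${repeat_latest} s after the service last became inactive"
```

Under these assumptions, later runs fall somewhere between about 60 and 125 seconds after the previous run finished, which is consistent with the behavior described above.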

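Summary

setup-policy-routes.sh is invoked many times by several triggers. It is not clear what happens (and what is supposed to happen) at each stage. This can conflict with other services or components that depend on the network.

I think some After= or Before= option, or something similar (in the systemd unit files), is needed to resolve this situation.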
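As a sketch of what such an ordering option could look like, the script below writes a drop-in for a network-dependent service. The service name my-app.service, the drop-in file name and the DROPIN_ROOT variable are hypothetical. Note that After= only orders units that are being started together, so on its own it would not stop the timer-driven refresh from running while the service is already up.

```shell
#!/bin/sh
# Hypothetical drop-in that orders a network-dependent service after the
# policy-routes unit of one interface. Untested against the real outage.

unit="my-app.service"   # assumption: placeholder for your own service
iface="ens5"            # assumption: interface name from the question's log

# Writes to a temporary directory by default; set
# DROPIN_ROOT=/etc/systemd/system to install it for real (needs root).
root="${DROPIN_ROOT:-$(mktemp -d)}"
dropin_dir="${root}/${unit}.d"
dropin_file="${dropin_dir}/10-after-policy-routes.conf"

mkdir -p "$dropin_dir"
cat > "$dropin_file" <<EOF
[Unit]
After=policy-routes@${iface}.service
EOF

echo "Wrote ${dropin_file}"
```

After installing a drop-in under /etc/systemd/system, run systemctl daemon-reload so that systemd picks it up.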
