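We are using RedHat IdM 7 and bind our server systems to our IdM cluster for central user management. On our Ubuntu 20.04 systems we have found that sssd-pac.service
crashes every 7-8 minutes and is restarted automatically. Unfortunately, we have not found the cause so far. We are running the following SSSD version: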
(~) root@srv2-t $ sssd --version
2.2.3
/var/log/syslog contains the following entries:
Oct 7 11:01:32 srv2-t sssd_pac[537014]: dbus[537014]: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Oct 7 11:01:32 srv2-t sssd_pac[537014]: Most likely, the application called unref() too many times and removed a reference belonging to libdbus, since this is a shared connection.
Oct 7 11:01:32 srv2-t sssd_pac[537014]: D-Bus not built with -rdynamic so unable to print a backtrace
Oct 7 11:01:32 srv2-t systemd[1]: sssd-pac.service: Main process exited, code=dumped, status=6/ABRT
Oct 7 11:01:32 srv2-t systemd[1]: sssd-pac.service: Failed with result 'core-dump'.
Oct 7 11:01:32 srv2-t systemd[1]: sssd-pac.service: Scheduled restart job, restart counter is at 5.
Oct 7 11:01:32 srv2-t systemd[1]: Stopped SSSD PAC Service responder.
Oct 7 11:01:32 srv2-t systemd[1]: Starting SSSD PAC Service responder...
Oct 7 11:01:32 srv2-t systemd[1]: Started SSSD PAC Service responder.
Oct 7 11:01:33 srv2-t sssd_pac[537271]: Starting up
Oct 7 11:09:03 srv2-t sssd_pac[537271]: dbus[537271]: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Oct 7 11:09:03 srv2-t sssd_pac[537271]: Most likely, the application called unref() too many times and removed a reference belonging to libdbus, since this is a shared connection.
Oct 7 11:09:03 srv2-t sssd_pac[537271]: D-Bus not built with -rdynamic so unable to print a backtrace
Oct 7 11:09:03 srv2-t systemd[1]: sssd-pac.service: Main process exited, code=dumped, status=6/ABRT
Oct 7 11:09:03 srv2-t systemd[1]: sssd-pac.service: Failed with result 'core-dump'.
Oct 7 11:09:03 srv2-t systemd[1]: sssd-pac.service: Scheduled restart job, restart counter is at 6.
Oct 7 11:09:03 srv2-t systemd[1]: Stopped SSSD PAC Service responder.
Oct 7 11:09:03 srv2-t systemd[1]: Starting SSSD PAC Service responder...
Oct 7 11:09:03 srv2-t systemd[1]: Started SSSD PAC Service responder.
Oct 7 11:09:03 srv2-t sssd_pac[537522]: Starting up
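To put a number on the "every 7-8 minutes" observation, the crash timestamps can be pulled out of the syslog lines and subtracted. The following is only a minimal sketch: the function names are our own, the year 2021 is taken from the systemctl output (syslog timestamps carry no year), and the two sample lines are copied from the excerpt above.

```python
from datetime import datetime

# Two crash markers copied from the syslog excerpt above.
LOG = """\
Oct 7 11:01:32 srv2-t systemd[1]: sssd-pac.service: Main process exited, code=dumped, status=6/ABRT
Oct 7 11:09:03 srv2-t systemd[1]: sssd-pac.service: Main process exited, code=dumped, status=6/ABRT
"""

# Substring that marks one crash of the PAC responder unit.
CRASH_MARKER = "sssd-pac.service: Main process exited"


def crash_times(text, year=2021):
    """Return the timestamp of every crash line (year is an assumption)."""
    times = []
    for line in text.splitlines():
        if CRASH_MARKER not in line:
            continue
        month, day, clock = line.split()[:3]
        stamp = f"{year} {month} {day} {clock}"
        times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def crash_intervals(text):
    """Return the seconds between consecutive crashes."""
    times = crash_times(text)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for seconds in crash_intervals(LOG):
        print(f"{seconds / 60:.1f} minutes between crashes")
```

Feeding the whole of /var/log/syslog into crash_intervals instead of the two sample lines would show whether the interval is really constant.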
All other dependent services are socket-activated, and sssd.conf has no services entry listing services that should be started directly at startup:
[domain/lx.mycompany]
id_provider = ipa
ipa_server = _srv_, idm2.lx.mycompany
ipa_domain = lx.mycompany
ipa_hostname = srv2-t.lx.mycompany
auth_provider = ipa
chpass_provider = ipa
access_provider = ipa
cache_credentials = True
ldap_tls_cacert = /etc/ipa/ca.crt
dyndns_update = True
dyndns_iface = ens160
krb5_store_password_if_offline = True
default_shell = /bin/bash
[sssd]
domains = lx.mycompany
[nss]
homedir_substring = /home
[pam]
offline_credentials_expiration = 7
[sudo]
[autofs]
[ssh]
[ifp]
[secrets]
[session_recording]
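The claim that no responder is listed explicitly can be checked mechanically. This is a rough sketch, assuming the file parses as plain INI and that services is a comma-separated option in the [sssd] section; the helper name is hypothetical and the sample text is a trimmed copy of the configuration above.

```python
import configparser

# Trimmed copy of the configuration above; the empty sections are kept.
SSSD_CONF = """\
[domain/lx.mycompany]
id_provider = ipa
ipa_server = _srv_, idm2.lx.mycompany
[sssd]
domains = lx.mycompany
[nss]
homedir_substring = /home
[pam]
offline_credentials_expiration = 7
[ssh]
[ifp]
"""


def configured_services(conf_text):
    """Return the responders named in [sssd] services, or [] if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    raw = parser.get("sssd", "services", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    services = configured_services(SSSD_CONF)
    if services:
        print("explicitly configured responders:", ", ".join(services))
    else:
        print("no services option set in [sssd]")
```

An empty result only confirms that we did not request sssd_pac ourselves; it does not explain why sssd.service still starts one.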
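We have identified that two sssd_pac processes are running: one started directly by sssd.service and another activated by sssd-pac.socket. The socket-activated process crashes every 7-8 minutes, while the other one runs fine.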
We restarted SSSD and checked the process list:
(~) root@srv2-t $ systemctl stop sssd* --all
(~) root@srv2-t $ systemctl start sssd.service
(~) root@srv2-t $ systemctl status sssd.service
● sssd.service - System Security Services Daemon
Loaded: loaded (/lib/systemd/system/sssd.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2021-10-07 12:31:04 CEST; 9s ago
Main PID: 540692 (sssd)
Tasks: 3 (limit: 9448)
Memory: 8.2M
CGroup: /system.slice/sssd.service
├─540692 /usr/sbin/sssd -i --logger=files
├─540705 /usr/libexec/sssd/sssd_be --domain lx.mycompany --uid 0 --gid 0 --logger=files
└─540706 /usr/libexec/sssd/sssd_pac --uid 0 --gid 0 --logger=files
Oct 07 12:31:03 srv2-t systemd[1]: Starting System Security Services Daemon...
Oct 07 12:31:04 srv2-t sssd[540692]: Starting up
Oct 07 12:31:04 srv2-t sssd_be[540705]: Starting up
Oct 07 12:31:04 srv2-t sssd_pac[540706]: Starting up
Oct 07 12:31:04 srv2-t systemd[1]: Started System Security Services Daemon.
Oct 07 12:31:04 srv2-t sssd_be[540705]: GSSAPI client step 1
Oct 07 12:31:04 srv2-t sssd_be[540705]: GSSAPI client step 1
Oct 07 12:31:04 srv2-t sssd_be[540705]: GSSAPI client step 1
Oct 07 12:31:04 srv2-t sssd_be[540705]: GSSAPI client step 2
(~) root@srv2-t $ ps aux | grep sssd_
root 540705 0.1 0.2 71752 19740 ? S 12:31 0:00 /usr/libexec/sssd/sssd_be --domain lx.mycompany --uid 0 --gid 0 --logger=files
root 540706 0.0 0.1 38836 13932 ? S 12:31 0:00 /usr/libexec/sssd/sssd_pac --uid 0 --gid 0 --logger=files
root 540751 0.0 0.0 6300 736 pts/0 S+ 12:31 0:00 grep --color=auto sssd_
After a user logs in to the system, we get the following:
(~) root@srv2-t $ ps aux | grep sssd_
root 540705 0.1 0.2 72168 21720 ? S 12:31 0:00 /usr/libexec/sssd/sssd_be --domain lx.mycompany --uid 0 --gid 0 --logger=files
root 540706 0.0 0.1 38836 13932 ? S 12:31 0:00 /usr/libexec/sssd/sssd_pac --uid 0 --gid 0 --logger=files
root 540752 0.1 0.4 58416 39016 ? Ss 12:31 0:00 /usr/libexec/sssd/sssd_nss --logger=files --socket-activated
root 540787 0.0 0.1 33436 13308 ? Ss 12:32 0:00 /usr/libexec/sssd/sssd_ssh --logger=files --socket-activated
root 540806 0.0 0.1 33980 13448 ? Ss 12:32 0:00 /usr/libexec/sssd/sssd_pam --logger=files --socket-activated
root 540810 0.0 0.1 38968 14664 ? Ss 12:32 0:00 /usr/libexec/sssd/sssd_pac --logger=files --socket-activated
root 540921 0.0 0.0 6300 736 pts/0 R+ 12:32 0:00 grep --color=auto sssd_
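The duplicate is easy to miss by eye, so here is a small sketch that flags any responder appearing both with and without --socket-activated. The function name and the "monitor"/"socket" labels are our own naming, and the sample lines are the command columns from the ps output above.

```python
import os
from collections import defaultdict

# Command columns taken from the ps output above.
PS_OUTPUT = """\
root 540705 /usr/libexec/sssd/sssd_be --domain lx.mycompany --uid 0 --gid 0 --logger=files
root 540706 /usr/libexec/sssd/sssd_pac --uid 0 --gid 0 --logger=files
root 540752 /usr/libexec/sssd/sssd_nss --logger=files --socket-activated
root 540787 /usr/libexec/sssd/sssd_ssh --logger=files --socket-activated
root 540806 /usr/libexec/sssd/sssd_pam --logger=files --socket-activated
root 540810 /usr/libexec/sssd/sssd_pac --logger=files --socket-activated
root 540921 grep --color=auto sssd_
"""

SSSD_LIBEXEC = "/usr/libexec/sssd/"


def duplicated_responders(ps_text):
    """Return responders started both by sssd itself and by a socket unit."""
    modes = defaultdict(set)
    for line in ps_text.splitlines():
        fields = line.split()
        binaries = [f for f in fields if f.startswith(SSSD_LIBEXEC)]
        if not binaries:
            continue  # not an SSSD helper, e.g. the grep process itself
        name = os.path.basename(binaries[0])
        mode = "socket" if "--socket-activated" in fields else "monitor"
        modes[name].add(mode)
    return sorted(name for name, seen in modes.items() if len(seen) > 1)


if __name__ == "__main__":
    print(duplicated_responders(PS_OUTPUT))
```

On the listing above this reports only sssd_pac, which matches what we see: nss, ssh and pam each run once, socket-activated.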
About 15 minutes later, the socket-activated sssd_pac process had been restarted twice. Note the reported start times:
(~) root@nxrepo-srv2-t $ ps aux | grep sssd_
root 540705 0.0 0.2 72164 20944 ? S 12:31 0:00 /usr/libexec/sssd/sssd_be --domain lx.eurodata.de --uid 0 --gid 0 --logger=files
root 540706 0.0 0.1 38836 13932 ? S 12:31 0:00 /usr/libexec/sssd/sssd_pac --uid 0 --gid 0 --logger=files
root 541428 0.0 0.1 38836 13960 ? Ss 12:47 0:00 /usr/libexec/sssd/sssd_pac --logger=files --socket-activated
root 541546 0.0 0.0 6432 736 pts/0 S+ 12:51 0:00 grep --color=auto sssd_
These are the log entries:
Oct 7 12:32:18 srv2-t sssd_pac[540810]: Starting up
Oct 7 12:32:19 srv2-t kernel: [8020600.723391] audit: type=1400 audit(1633602739.238:35): apparmor="ALLOWED" operation="open" profile="/usr/sbin/sssd" name="/etc/selinux/semanage.conf" pid=540811 comm="selinux_child" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 7 12:39:13 srv2-t sssd_nss[540752]: Shutting down
Oct 7 12:39:13 srv2-t systemd[1]: sssd-nss.service: Succeeded.
Oct 7 12:39:46 srv2-t sssd_ssh[540787]: Shutting down
Oct 7 12:39:46 srv2-t systemd[1]: sssd-ssh.service: Succeeded.
Oct 7 12:39:48 srv2-t sssd_pam[540806]: Shutting down
Oct 7 12:39:48 srv2-t systemd[1]: sssd-pam.service: Succeeded.
Oct 7 12:39:48 srv2-t sssd_pac[540810]: dbus[540810]: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Oct 7 12:39:48 srv2-t sssd_pac[540810]: Most likely, the application called unref() too many times and removed a reference belonging to libdbus, since this is a shared connection.
Oct 7 12:39:48 srv2-t sssd_pac[540810]: D-Bus not built with -rdynamic so unable to print a backtrace
Oct 7 12:39:49 srv2-t systemd[1]: sssd-pac.service: Main process exited, code=dumped, status=6/ABRT
Oct 7 12:39:49 srv2-t systemd[1]: sssd-pac.service: Failed with result 'core-dump'.
Oct 7 12:39:49 srv2-t systemd[1]: sssd-pac.service: Scheduled restart job, restart counter is at 1.
Oct 7 12:39:49 srv2-t systemd[1]: Stopped SSSD PAC Service responder.
Oct 7 12:39:49 srv2-t systemd[1]: Starting SSSD PAC Service responder...
Oct 7 12:39:49 srv2-t systemd[1]: Started SSSD PAC Service responder.
Oct 7 12:39:49 srv2-t sssd_pac[541181]: Starting up
Oct 7 12:47:19 srv2-t sssd_pac[541181]: dbus[541181]: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Oct 7 12:47:19 srv2-t sssd_pac[541181]: Most likely, the application called unref() too many times and removed a reference belonging to libdbus, since this is a shared connection.
Oct 7 12:47:19 srv2-t sssd_pac[541181]: D-Bus not built with -rdynamic so unable to print a backtrace
Oct 7 12:47:19 srv2-t systemd[1]: sssd-pac.service: Main process exited, code=dumped, status=6/ABRT
Oct 7 12:47:19 srv2-t systemd[1]: sssd-pac.service: Failed with result 'core-dump'.
Oct 7 12:47:19 srv2-t systemd[1]: sssd-pac.service: Scheduled restart job, restart counter is at 2.
Oct 7 12:47:19 srv2-t systemd[1]: Stopped SSSD PAC Service responder.
Oct 7 12:47:19 srv2-t systemd[1]: Starting SSSD PAC Service responder...
Oct 7 12:47:19 srv2-t systemd[1]: Started SSSD PAC Service responder.
Oct 7 12:47:20 srv2-t sssd_pac[541428]: Starting up
How can we get rid of this problem?
Thanks a lot.