有没有办法知道服务重启的原因以及是谁重启的?

有没有办法知道服务重启的原因以及是谁重启的?
  • Ubuntu 14.04
  • clamav 0.98.7

该问题clamav-daemon几乎每天都会重新启动:

Sep  1 06:30:00 x-master clamd[6778]: Pid file removed.
clamd[6778]: --- Stopped at Tue Sep  1 06:30:00 2015
clamd[5979]: clamd daemon 0.98.7 (OS: linux-gnu, ARCH: x86_64, CPU: x86_64)
clamd[5979]: Running as user root (UID 0, GID 0)
clamd[5979]: Log file size limited to 4294967295 bytes.
clamd[5979]: Reading databases from /var/lib/clamav
clamd[5979]: Not loading PUA signatures.
clamd[5979]: Bytecode: Security mode set to "TrustSigned".

clamdscan如果正在运行则会导致问题:

/etc/cron.daily/clamav_scan:
ERROR: Could not connect to clamd on x.x.x.x: Connection refused

请注意,我在一开始就说了“几乎”:

/var/log/syslog:Sep  1 06:30:00 x-master clamd[6778]: Pid file removed.
/var/log/syslog.1:Aug 31 06:27:54 x-master clamd[20128]: Pid file removed.
/var/log/syslog.4.gz:Aug 28 06:28:34 x-master clamd[4475]: Pid file removed.
/var/log/syslog.5.gz:Aug 27 06:27:47 x-master clamd[21466]: Pid file removed.

如你看到的:

  • 8 月 29 日和 30 日没有发生
  • 它经常在 06:27 左右重新启动,这是cron.daily运行时间

    27 6 * * * root nice -n 19 ionice -c3 run-parts --report /etc/cron.daily
    

内容/etc/cron.daily/clamav_scan

find / $exclude_string ! \( -path "/tmp/clamav-*.tmp" -prune \) ! \( -path "/var/lib/elasticsearch" -prune \) ! \( -path "/var/lib/mongodb" -prune \) ! \( -path "/var/lib/graylog-server" -prune \) -mtime -1 -type f -print0 | xargs -0 clamdscan --quiet -l "$status_file" || retval=$?

这里有一个 clamav-daemon 的 logrotate 文件:

/var/log/clamav/clamav.log {
     rotate 12
     weekly
     compress
     delaycompress
     create 640  clamav adm
     postrotate
     /etc/init.d/clamav-daemon reload-log > /dev/null
     endscript
     }

但它只是重新加载日志:

Sep  1 02:30:24 uba-master clamd[6778]: SIGHUP caught: re-opening log file.

我知道我们可以用它auditd来监视二进制文件,这里有一个示例日志:

ausearch -f /usr/sbin/clamd                                                                                                        [2/178]
----
time->Tue Sep  1 07:56:44 2015
type=PATH msg=audit(1441094204.559:15): item=1 name=(null) inode=2756458 dev=fc:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
type=PATH msg=audit(1441094204.559:15): item=0 name="/usr/sbin/clamd" inode=3428628 dev=fc:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
type=CWD msg=audit(1441094204.559:15):  cwd="/"
type=EXECVE msg=audit(1441094204.559:15): argc=1 a0="/usr/sbin/clamd"
type=SYSCALL msg=audit(1441094204.559:15): arch=c000003e syscall=59 success=yes exit=0 a0=7ffd277e03dc a1=7ffd277dfa78 a2=7ffd277dfa88 a3=7ffd277df570 items=2
 ppid=5708 pid=5946 auid=4294967295 uid=109 gid=114 euid=109 suid=109 fsuid=109 egid=114 sgid=114 fsgid=114 tty=pts1 ses=4294967295 comm="clamd" exe="/usr/sbin/clamd" key=(null)

clamav109 是...用户的 UID :

getent passwd clamav clamav:x:109:114::/var/lib/clamav:/bin/false

在这种情况下还有其他方法可以解决问题吗?


回复@HBruijn:

更新 AV 定义后可能出现 freshclam 吗?

我想了一下。以下是日志:

Sep  1 05:31:04 x-master freshclam[16197]: Received signal: wake up
Sep  1 05:31:04 x-master freshclam[16197]: ClamAV update process started at Tue Sep  1 05:31:04 2015
Sep  1 05:31:04 x-master freshclam[16197]: main.cvd is up to date (version: 55, sigs: 2424225, f-level: 60, builder: neo)
Sep  1 05:31:05 x-master freshclam[16197]: Downloading daily-20865.cdiff [100%]
Sep  1 05:31:09 x-master freshclam[16197]: daily.cld updated (version: 20865, sigs: 1555338, f-level: 63, builder: neo)
Sep  1 05:31:10 x-master freshclam[16197]: bytecode.cvd is up to date (version: 268, sigs: 47, f-level: 63, builder: anvilleg)
Sep  1 05:31:13 x-master freshclam[16197]: Database updated (3979610 signatures) from db.local.clamav.net (IP: 168.143.19.95)
Sep  1 05:31:13 x-master freshclam[16197]: Clamd successfully notified about the update.
Sep  1 05:31:13 x-master freshclam[16197]: --------------------------------------
Sep  1 04:34:10 x-master clamd[6778]: SelfCheck: Database status OK.
Sep  1 05:31:13 x-master clamd[6778]: Reading databases from /var/lib/clamav
Sep  1 05:31:22 x-master clamd[6778]: Database correctly reloaded (3974071 signatures)

我对此不太确定,但看起来 freshclam 有一个“内部机制”来通知 clamd 有关更新的信息。之后它只需重新加载数据库,无需重新启动该过程。您能确认一下吗?

另外,从时间戳上看,clamav-daemon 在 freshclam update database 一小时后重启,这正常吗?


更新时间:2015 年 9 月 1 日星期二 22:10:49 ICT

但看起来 freshclam 有一个“内部机制”来通知 clamd 有关更新的信息。之后它只需重新加载数据库,无需重新启动进程。

我可以通过测试来确认这是正确的:

  • 编辑 freshclam.conf 文件将间隔更改为分钟(Checks 1440
  • 重新启动 clamav-freshclam
  • cd /var/lib/clamav
  • rm daily.cvd
  • 等一分钟

    Sep  1 14:49:25 p freshclam[7654]: Downloading daily.cvd [100%]
    Sep  1 14:49:28 p freshclam[7654]: daily.cvd updated (version: 19487, sigs: 1191913, f-level: 63, builder: neo)
    Sep  1 14:49:28 p freshclam[7654]: Reading CVD header (bytecode.cvd):
    Sep  1 14:49:28 p freshclam[7654]: OK
    Sep  1 14:49:28 p freshclam[7654]: bytecode.cvd is up to date (version: 245, sigs: 43, f-level: 63, builder: dgoddard)
    Sep  1 14:49:31 p freshclam[7654]: Database updated (3616181 signatures) from clamav.local (IP: 10.0.2.2)
    Sep  1 14:49:31 p freshclam[7654]: Clamd successfully notified about the update.
    Sep  1 14:49:31 p freshclam[7654]: --------------------------------------
    Sep  1 14:49:32 p clamd[6693]: Reading databases from /var/lib/clamav
    Sep  1 14:49:39 p clamd[6693]: Database correctly reloaded (3610621 signatures)
    

并且 clamav-daemon 未重新启动。

答案1

请检查您是否正在使用任何配置管理系统,例如 Puppet、Chef、CFEngine 等。它们会定期干扰服务。要采取的具体措施取决于服务在配置管理系统中的使用方式。

答案2

提醒自己。

作业缓存的输出:

----------
          ID: clamav-daemon
    Function: service.running
      Result: True
     Comment: Service restarted
     Started: 06:27:52.736890
    Duration: 12997.632 ms
     Changes:
              ----------
              clamav-daemon:
                  True

看一下clamav公式:

  clamav-daemon:
    service:
      - running
      - order: 50
      - require:
        - service: clamav-freshclam
      - watch:
        - pkg: clamav-daemon
        - file: clamav-daemon
        - user: clamav

ed 状态下没有任何watch改变:

----------
          ID: clamav-daemon
    Function: pkg.latest
      Result: True
     Comment: Package clamav-daemon is already up-to-date.
     Started: 06:27:51.531415
    Duration: 53.224 ms
     Changes:

----------
          ID: clamav-daemon
    Function: file.managed
        Name: /etc/clamav/clamd.conf
      Result: True
     Comment: File /etc/clamav/clamd.conf is in the correct state
     Started: 06:27:51.760019
    Duration: 625.075 ms
     Changes:

----------
          ID: clamav
    Function: user.present
      Result: True
     Comment: User clamav is present and up to date
     Started: 06:27:51.590214
    Duration: 2.455 ms
     Changes:

为什么服务已重新启动?

查找watch_in,我发现一个管理pid文件的状态,如果pid文件发生变化,服务将重新启动:

{%- macro manage_pid(path, user, group, watch_in_service, mode=644) -%}
    {%- if salt['file.file_exists'](path) %}
{{ path }}:
  file:
    - managed
    - user: {{ user }}
    - group: {{ group }}
    - mode: {{ mode }}
    - replace: False
        {%- if caller is defined -%}
            {%- for line in caller().split("\n") -%}
                {%- if loop.first %}
    - require:
                {%- endif %}
{{ line|trim|indent(6, indentfirst=True) }}
            {%- endfor -%}
        {%- endif %}
    - watch_in:
      - service: {{ watch_in_service }}
    {%- else %}
# {{ path }} does not exist, no need to manage
    {%- endif -%}
{%- endmacro -%}

{%- call manage_pid('/var/run/clamav/clamd.pid', 'clamav', 'clamav', 'clamav-daemon', 664) %}
- pkg: clamav-daemon
{%- endcall %}

在的输出中salt-run jobs.lookup_jid <job id number>,我看到了这一点:

----------
          ID: /var/run/clamav/clamd.pid
    Function: file.managed
      Result: True
     Comment:
     Started: 06:27:52.392555
    Duration: 2.364 ms
     Changes:
              ----------
              group:
                  clamav
              user:
                  clamav

因此,该 pid 文件的所有者/组已更改为clamav。最后,我发现原因是 clamav 守护进程以root用户身份在网络模式下运行。因此,pid 文件被创建为root。因此,管理 pid 文件的状态必须更改为类似以下内容:

{%- call manage_pid('/var/run/clamav/clamd.pid', 'root', 'root', 'clamav-daemon', 664) %}
- pkg: clamav-daemon
{%- endcall %}

相关内容