How to speed up clamd@amavisd startup / get it to "start" at all on Amazon Linux 2

2024-6-1

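I recently upgraded clamav from a version I'm not sure of (but whatever was the latest on EPEL as of February 13) to 0.102.3. With the previous version, I had to set TimeoutStartSec in the systemd conf file to 5 minutes to get it to start without timing out. With the current version, as far as I can tell, it never starts. I'm trying to figure out how to get it to actually start, and how to get it to reach a success or failure state in less than several hours.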

$ sudo systemctl -l status clamd@amavisd
● clamd@amavisd.service - clamd scanner (amavisd) daemon
   Loaded: loaded (/usr/lib/systemd/system/clamd@.service; enabled; vendor preset: disabled)
   Active: activating (start) since Wed 2020-07-15 17:33:53 UTC; 4h 18min ago

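I've been working my way up. I set the timeout to 15 minutes, then 20, then 40, then 1.5 hours, and finally 4 hours. Among the many odd things, the unit above was still "activating" long after the 4 hours had elapsed, though it did eventually fail: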

Jul 15 17:23:29 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon.
...
Jul 15 22:47:08 ip-10-0-200-85 systemd: clamd@amavisd.service failed.

Here is the systemd config:

$ cat /usr/lib/systemd/system/clamd@.service
[Unit]
Description = clamd scanner (%i) daemon
Documentation=man:clamd(8) man:clamd.conf(5) https://www.clamav.net/documents/
After = syslog.target nss-lookup.target network.target

[Service]
Type = forking
ExecStart = /usr/sbin/clamd -c /etc/clamd.d/%i.conf
# Reload the database
ExecReload=/bin/kill -USR2 $MAINPID
Restart = on-failure
TimeoutStartSec=240min

[Install]
WantedBy = multi-user.target
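
A side note on the timeout itself: a file under /usr/lib/systemd/system belongs to the package and may be overwritten on upgrade, so a drop-in under /etc/systemd/system is the usual place for a local TimeoutStartSec. The following is only a minimal sketch of that approach. The helper name `write_timeout_dropin`, the file name `timeout.conf`, and the optional third argument (a root directory, handy for a dry run) are made up for illustration and are not part of the original post.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that overrides TimeoutStartSec.
# Usage: write_timeout_dropin UNIT TIMEOUT [ROOT]
# ROOT defaults to /etc/systemd/system, where systemd reads local overrides.
write_timeout_dropin() {
    unit=$1
    timeout=$2
    root=${3:-/etc/systemd/system}
    dropin_dir="$root/$unit.d"

    mkdir -p "$dropin_dir" || return 1
    # Only the overridden setting goes here; the rest still comes
    # from the packaged unit file.
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" > "$dropin_dir/timeout.conf"
}
```

Run as root, something like `write_timeout_dropin clamd@amavisd.service 20min` followed by `systemctl daemon-reload` should make systemd pick up the new value. The 20min figure is just an example, not a recommendation.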

The last log entries before the eventual failure were:

...
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: bytecode: registered ctx variable at 0x55ceb28157b0 (+744) id 7
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: bytecode debug: startup: bytecode execution in auto mode
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: interpreter bytecode run finished in 50106us, after executing 96 opcodes
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: Bytecode: disable status is 0
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: bytecode: JIT disabled
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: Cannot prepare for JIT, LLVM is not compiled or not linked
Jul 15 20:06:29 ip-10-0-200-85 clamd: LibClamAV debug: Bytecode: 0 bytecode prepared with JIT, 95 prepared with interpreter, 95 total
...

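I'm not sure what it means that startup ran well past the 4-hour timeout without being terminated. With the earlier, shorter timeouts, it simply restarted the startup process. For example, here it is with the timer set to 20 minutes: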

Jul 15 16:12:32 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon.
...
Jul 15 16:33:13 ip-10-0-200-85 systemd: clamd@amavisd.service start operation timed out. Terminating.
Jul 15 16:33:13 ip-10-0-200-85 systemd: Failed to start clamd scanner (amavisd) daemon.
Jul 15 16:33:13 ip-10-0-200-85 systemd: Unit clamd@amavisd.service entered failed state.
Jul 15 16:33:13 ip-10-0-200-85 systemd: clamd@amavisd.service failed.
Jul 15 16:33:14 ip-10-0-200-85 systemd: clamd@amavisd.service holdoff time over, scheduling restart.
Jul 15 16:33:14 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon.

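Given that it still hadn't reached the "Failed to start clamd" point 20 minutes after the 4-hour timer expired, I figured I'd try it out and run some mail through it. The mail did show up in my inbox with the header X-Virus-Scanned: amavisd-new at example.com, and clamd appeared to be doing something (though I admit I'm not sure what these log files are supposed to say):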

Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Breaking command loop, mode is no longer MODE_COMMAND
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Consumed entire command
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Number of file descriptors polled: 1 fds
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: fds_poll_recv: timeout after 600 seconds
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (single) crossed low threshold -> signaling
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (bulk) crossed low threshold -> signaling
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Finished scanthread
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Scanthread: connection shut down (FD 10)
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (single) crossed low threshold -> signaling
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (bulk) crossed low threshold -> signaling

I just let it run, still in the "activating" state, but it eventually failed after 5 hours 15 minutes (again, against a 4-hour timer ¯\_(ツ)_/¯):

Jul 15 17:23:29 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon...
...
Jul 15 22:47:06 ip-10-0-200-85 systemd: clamd@amavisd.service start operation timed out. Terminating.
Jul 15 22:47:08 ip-10-0-200-85 systemd: Failed to start clamd scanner (amavisd) daemon.
Jul 15 22:47:08 ip-10-0-200-85 systemd: Unit clamd@amavisd.service entered failed state.
Jul 15 22:47:08 ip-10-0-200-85 systemd: clamd@amavisd.service failed.
Jul 15 22:47:08 ip-10-0-200-85 systemd: clamd@amavisd.service holdoff time over, scheduling restart.

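So, as I said at the top, I'm trying to figure out how to get it to actually start. Meanwhile, I'm also trying to figure out how to get it to fail in less than several hours. Obviously, continuing to increase the timer is an option, but if there's a better way, or something I'm obviously overlooking, I'd rather not wait 5, 6, or 7 hours between tests.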

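Also, I looked at the notes on how to run ClamAV on CentOS 7 [not actually my notes, that's just what they're called]. There are some interesting things in there, such as setting the nice level and CPU and memory limits, but those seem like they would make it even slower.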

TIA for any help/hints/suggestions.

Answer 1

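Ultimately, this appears to have been a memory problem. The service was running on an AWS EC2 t2.nano (i.e., 500MB RAM) with 2GB of swap. That worked in the past, but apparently at some point in the last six months ClamAV just... needed more. I switched the instance type to t2.micro (1GB RAM) and it just worked. It might still be worthwhile for someone to figure out how to make clamd use more swap and less real memory, but in most cases simply making sure you have at least 1GB of RAM seems to be enough.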
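
For anyone wanting to check whether their host is in the same situation, comparing installed RAM and swap against what clamd actually holds resident is a quick first look. Below is a rough sketch for Linux that reads /proc/meminfo. The function names `meminfo_mib` and `clamd_memory_report` are hypothetical, the optional file argument exists only so the parser can be tried against a saved copy, and `ps -C` assumes the procps version of ps.

```shell
#!/bin/sh
# Hypothetical helper: print one field of a meminfo-style file in MiB.
# Usage: meminfo_mib FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_mib() {
    awk -v key="$1:" '$1 == key { print int($2 / 1024) }' "${2:-/proc/meminfo}"
}

# Print total RAM and swap, then clamd's resident (RSS) and virtual (VSZ)
# size in KiB if a clamd process is running.
clamd_memory_report() {
    echo "RAM:  $(meminfo_mib MemTotal) MiB"
    echo "Swap: $(meminfo_mib SwapTotal) MiB"
    ps -o pid,rss,vsz,comm -C clamd || echo "clamd is not running"
}
```

Calling `clamd_memory_report` while the unit sits in "activating" should show whether clamd's RSS is pressing against total RAM, which is roughly what happened on the t2.nano above.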
