How to speed up clamd@amavisd launch / get it to "start" on Amazon Linux 2

Question

I recently upgraded clamav from I'm not sure which version, but whatever was current on EPEL on February 13th, to 0.102.3. Under the previous version, I had to set TimeoutStartSec in the systemd conf file to 5 minutes to get it to start without timing out. Under the current version, AFAICT, it ~ never starts. I'm trying to figure out both how to get it to actually start, and how to get it to a success or failed state in some time less than hours.

$sudo systemctl -l status clamd@amavisd
● clamd@amavisd.service - clamd scanner (amavisd) daemon
   Loaded: loaded (/usr/lib/systemd/system/clamd@.service; enabled; vendor preset: disabled)
   Active: activating (start) since Wed 2020-07-15 17:33:53 UTC; 4h 18min ago

I was working my way up. I'd set the timeout to 15 minutes, then 20 minutes, then 40 minutes, then 1.5 hours, and finally 4 hours. Amongst the many curious things, the above was still "activating" long after the 4 hours was up, but ultimately it did fail:

Jul 15 17:23:29 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon.
...
Jul 15 22:47:08 ip-10-0-200-85 systemd: clamd@amavisd.service failed.

Here's the systemd configuration:

$cat /usr/lib/systemd/system/clamd@.service
[Unit]
Description = clamd scanner (%i) daemon
Documentation=man:clamd(8) man:clamd.conf(5) https://www.clamav.net/documents/
After = syslog.target nss-lookup.target network.target

[Service]
Type = forking
ExecStart = /usr/sbin/clamd -c /etc/clamd.d/%i.conf
# Reload the database
ExecReload=/bin/kill -USR2 $MAINPID
Restart = on-failure
TimeoutStartSec=240min

[Install]
WantedBy = multi-user.target

The last log entries before it ultimately failed were

...
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: bytecode: registered ctx variable at 0x55ceb28157b0 (+744) id 7
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: bytecode debug: startup: bytecode execution in auto mode
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: interpreter bytecode run finished in 50106us, after executing 96 opcodes
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: Bytecode: disable status is 0
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: bytecode: JIT disabled
Jul 15 20:06:18 ip-10-0-200-85 clamd: LibClamAV debug: Cannot prepare for JIT, LLVM is not compiled or not linked
Jul 15 20:06:29 ip-10-0-200-85 clamd: LibClamAV debug: Bytecode: 0 bytecode prepared with JIT, 95 prepared with interpreter, 95 total
...

I wasn't sure what it meant that the start process was well past the 4-hour timeout and it hadn't yet killed itself. At previous, short intervals, it just restarted the startup process. For example, here was the timer set to 20 minutes:

Jul 15 16:12:32 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon.
...
Jul 15 16:33:13 ip-10-0-200-85 systemd: clamd@amavisd.service start operation timed out. Terminating.
Jul 15 16:33:13 ip-10-0-200-85 systemd: Failed to start clamd scanner (amavisd) daemon.
Jul 15 16:33:13 ip-10-0-200-85 systemd: Unit clamd@amavisd.service entered failed state.
Jul 15 16:33:13 ip-10-0-200-85 systemd: clamd@amavisd.service failed.
Jul 15 16:33:14 ip-10-0-200-85 systemd: clamd@amavisd.service holdoff time over, scheduling restart.
Jul 15 16:33:14 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon.

Given that, 20 minutes after the 4-hour timer had expired, it hadn't hit the Failed to start clamd point, I figured I'd give it a try and run some mail through it. Mail actually showed up in my inbox with the header X-Virus-Scanned: amavisd-new at example.com, and it appeared to be doing something (though I confess I'm not sure what these log files are supposed to say):

Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Breaking command loop, mode is no longer MODE_COMMAND
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Consumed entire command
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Number of file descriptors polled: 1 fds
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: fds_poll_recv: timeout after 600 seconds
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (single) crossed low threshold -> signaling
Jul 15 21:57:07 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (bulk) crossed low threshold -> signaling
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Finished scanthread
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: Scanthread: connection shut down (FD 10)
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (single) crossed low threshold -> signaling
Jul 15 21:57:10 ip-10-0-200-85.eu-central-1.compute.internal clamd[16152]: THRMGR: queue (bulk) crossed low threshold -> signaling

I just let it run, continuing in its "activating" state, but it did ultimately fail, after 5 hours and 15 minutes (again, against a 4 hour timer ¯_(ツ)_/¯ ):

Jul 15 17:23:29 ip-10-0-200-85 systemd: Starting clamd scanner (amavisd) daemon...
...
Jul 15 22:47:06 ip-10-0-200-85 systemd: clamd@amavisd.service start operation timed out. Terminating.
Jul 15 22:47:08 ip-10-0-200-85 systemd: Failed to start clamd scanner (amavisd) daemon.
Jul 15 22:47:08 ip-10-0-200-85 systemd: Unit clamd@amavisd.service entered failed state.
Jul 15 22:47:08 ip-10-0-200-85 systemd: clamd@amavisd.service failed.
Jul 15 22:47:08 ip-10-0-200-85 systemd: clamd@amavisd.service holdoff time over, scheduling restart.

So, like I said at the beginning, I'm trying to figure out how to get this to actually start — and whilst working on that, trying to figure out how to get it to fail in sometime less than hours. Obviously, continually upping the timer is a choice, but I don't want to wait 5, 6, 7 hours between tests if there's a better way to do this, or something obvious I'm missing.

Also, I looked at My Notes On Getting ClamAV Working on Centos 7 [not actually my notes, that's just what it's called]. There's some interesting stuff in there are setting nice level, and CPU and memory limits, but those seem like they'd slow things down even more.

TIA for any help/hints/suggestions.

score 1 · Answer 1 · answered Jul 16 '20 at 14:50

1

Ultimately, this appears to have been a memory issue. The service was running on an AWS EC2 t2.nano (i.e., 500MB RAM), with 2GB swap. That had worked in the past, but apparently at some point in the last six months ClamAV just ... needed more. I switched the instance type to t2.micro (1GB RAM) and it Just Worked. Probably for some people it would still be nice to figure out how to make clamd use more swap / less real memory, but in most cases, just ensuring you have at least 1GB RAM seems to be sufficient.

answered Jul 16 '20 at 14:50

philolegein

409
4
12

swap is inherently slow, while swap usage isn't bad in itself, frequent swapping is and will drastically slow down your service so this isn't unexpected. Unless you have ClamAV load fewer signatures (maybe, I'm not sure what uses the most memory in ClamAV), you can't fix that. – Ginnungagap Jul 17 '20 at 05:46

How to speed up clamd@amavisd launch / get it to "start" on Amazon Linux 2

1 Answers1