1

I'm using syslog-ng 3.37.1 on a VMware Photon 3.0 virtual appliance (preconfigured VM). The appliance is configured to write logs into certain files under /var/log folder as well as to remote syslog servers (optional).

Logs from facility 'auth' and 'authpriv' are configured to write to /var/log/auth.log, as well as send it over to remote syslog server when enabled. In addition, there are other messages as well from kernel, systemd services as well as other processes, configured to be processed via syslog-ng.

Issue is that, logs from a few facilities (such as auth, authpriv, cron etc) are not processed (received?) by syslog-ng initially. So, any SSH events, TTY login events are not logged into the file and remote. However, many other events from kernel, systemd and other processes are logged fine.

Below is the configuration for auth.log, that does not log in the first boot.

filter f_auth { facility(auth) or facility(authpriv)); };
destination authlog { file("/var/log/auth.log" perm(0600)); };
log { source(s_local); filter(f_auth); destination(authlog); };

I updated the filter as below without any success

filter f_auth {
    facility(auth) or facility(authpriv) or
    match('sshd' value('PROGRAM')) or match('systemd-logind' value('PROGRAM'));
};

In journal logs I can observe the relevant logs, for example, below command to view SSH logs.

journalctl -f -u sshd

Additional syslog-ng service restart or config reload during appliance startup do not fix this. The log file /var/log/auth.log (and also cron log etc) show zero size during this time. Syslog-ng log looks fine too.

However, if I generate some auth facility event (say, SSH/TTY login) and manually restart syslog-ng, all the log entries (including old events) are immediately written into filesystem log (/var/log/auth.log) and also sent to remote syslog server.

In the syslog-ng.log I find below entry when it starts working that way.

syslog-ng[481]: [date] Failed to seek journal to the saved cursor position; cursor='', error='Invalid argument (22)'

It makes me wonder if it is due to some bad cursor position. However, I can still see other systemd and kernel logs being logged fine. So, not sure.

What could be causing such behaviour? How can I ensure that syslog-ng is able to receive and process these logs without manual intervention?

Below is more detailed configuration for reference:

@version: 3.37
@include "scl.conf"

source s_local {
    system();
    internal();
    udp();
};

destination d_local {
    file("/var/log/messages");
    file("/var/log/messages-kv.log" template("$ISODATE $HOST $(format-welf --scope all-nv-pairs)\n") frac-digits(3));
};

log {
    source(s_local);
    # uncomment this line to open port 514 to receive messages
    #source(s_network);
    destination(d_local);
};


filter f_auth {
    facility(auth) or facility(authpriv)); # Also tried facility (auth, authpriv)
};
destination authlog { file("/var/log/auth.log" perm(0600)); };
log { source(s_local); filter(f_auth); destination(authlog); };

destination d_kern { file("/dev/console" perm(0600)); };
filter f_kern { facility(kern); };
log { source(s_local); filter(f_kern); destination(d_kern); };

destination d_cron { file("/var/log/cron" perm(0600)); };
filter f_cron { facility(cron); };
log { source(s_local); filter(f_cron); destination(d_cron); };

destination d_syslogng { file("/var/log/syslog-ng.log" perm(0600)); };
filter f_syslogng { program(syslog-ng); };
log { source(s_local); filter(f_syslogng); destination(d_syslogng); };

# A few more of above kind of configuration follows here.

# Add configuration files that have remote destination, filter and log configuration for remote servers
@include "remote/*.conf"

As can be seen, /var/log/auth.log should hold logs from auth facility, but the log remains blank until subsequent restart of syslog-ng after a syslog config change (via API) or manual login into the system. However, triggering automated restart of syslog-ng using cron (without additional syslog config change) does not help.

Any thoughts, suggestions?

ramtech
  • 757
  • 6
  • 15
  • This seems strange indeed. this might be caused by an empty initial value for the cursor, where syslog-ng seeks to at first startup. This has been done this way forever, so it might be a new behavior of systemd. – bazsi77 Dec 13 '22 at 06:31
  • 1
    I've now checked some more things in the code. When there's no syslog-ng.persist file (e.g. no state where we left off in the journal) we will just seek to the end or the head of the journal depending on the read-old-records() option. In any other case, we try to seek back to the last position where we left off, as stored in the persist file. It seems your persist file exists, but the cursor stored there is empty and the seek fails. In this case we revert back to the head of the journal. read-old-records() is TRUE by default. Does this reflect the behaviour you experience? – bazsi77 Dec 13 '22 at 08:11
  • Thanks for the inputs. Yes, syslog-ng.persist file was there. Do not really know what is wrong, added some additional checks in cron job to check the auth.log size and restart syslog-ng if the log file is empty. Observed that it works fine after around two or three iterations. I guess, I'll settle with this workaround for now. – ramtech Dec 14 '22 at 06:19
  • I was rereading your original post, and now as I understand it, it is the empty cursor value that actually fixes the issue for you. As when syslog-ng reports this issue, it resets its journal position and then it starts working. Maybe you are affected by this issue: https://github.com/syslog-ng/syslog-ng/issues/2836 – bazsi77 Dec 14 '22 at 09:49

1 Answers1

1

This is probably caused by your real time clock going backwards. The notification mechanism in libsystemd does not work in this case.

There's a proof-of-concept patch in this syslog-ng issue:

https://github.com/syslog-ng/syslog-ng/issues/2836

But I've increased the priority to tackle that problem and fix this, as it is causing issue more often than I anticipated.

As a workaround you should synchronize the time for your VM, preferably so that during boot it waits until a sync and then keep the time synchronized by ntp.

bazsi77
  • 521
  • 2
  • 6
  • bazsli77, I'm confused by the OP writing "However, many other events from kernel, systemd and other processes are logged fine." But you know syslog-ng better than I do. – TaborKelly Dec 15 '22 at 18:00
  • @TaborKelly, that's right. Observing issue events from certain programs/facilities. That's where I'm confused too. – ramtech Dec 21 '22 at 04:04
  • @bazsi77, as indicated in comment on the question, a few restarts of syslog-ng is resolving the issue for now. It's possible that time on the VM is syncing by then and resolving the issue..but I'm not sure right now. Will take a look at this in a few days and update my observations here. Thanks – ramtech Dec 21 '22 at 04:06
  • Just wanted to mention that there's a PR waiting for review that should have a fix for this. This is the one: https://github.com/syslog-ng/syslog-ng/pull/4245 – bazsi77 Jan 17 '23 at 18:20