1

I have a PostgreSQL server (postgresql.service) and a basic shell script (mobilizon-postgresql.service that is run to provide some assertions on the database to a third service (Mobilizon).

So naturally, mobilizon-postgresql.service is configured with an After= dependency for postgresql.service:

# systemctl show mobilizon-postgresql.service | grep After=
After=basic.target system.slice systemd-journald.socket sysinit.target postgresql.service

Though after a reconfiguration of the system (which didn't touch these dependencies), I'm seeing the following behaviour in the logs:

1677672119.103035 myserver systemd[1]: Starting Mobilizon PostgreSQL setup...
...
1677672119.153192 myserver systemd[1]: Starting PostgreSQL Server...

So, mobilizon-postgresql.service is started before the startup of postgresql.service.

1677672119.279742 myserver mobilizon-postgresql-start[329444]: psql: error: could not connect to server: No such file or directory
1677672119.279742 myserver mobilizon-postgresql-start[329444]:         Is the server running locally and accepting
1677672119.279742 myserver mobilizon-postgresql-start[329444]:         connections on Unix domain socket "/run/postgresql/.s.PGSQL.543>
1677672119.283558 myserver systemd[1]: mobilizon-postgresql.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
1677672119.283707 myserver systemd[1]: mobilizon-postgresql.service: Failed with result 'exit-code'.
1677672119.289678 myserver systemd[1]: Failed to start Mobilizon PostgreSQL setup.

So of course, mobilizon.postgresql can't setup a connection to the db.

1677672119.503881 myserver postgres[329458]: [329458] LOG:  starting PostgreSQL 13.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11>
1677672119.512541 myserver postgres[329458]: [329458] LOG:  listening on IPv4 address "0.0.0.0", port 5432
1677672119.512863 myserver postgres[329458]: [329458] LOG:  listening on IPv6 address "::", port 5432
1677672119.519498 myserver postgres[329458]: [329458] LOG:  listening on Unix socket "/run/postgresql/.s.PGSQL.5432"
...
1677672119.871989 myserver systemd[1]: Started PostgreSQL Server.

Only later the postgresql.service is reported to be fully started.

This behaviour goes completely against my understanding of After=. From systemd.unit(5):

If unit foo.service contains the setting Before=bar.service and both units are being started, bar.service's start-up is delayed until foo.service has finished starting up.

Am I having wrong assumptions here?

Systemd version is 251.12, Distro is NixOS 22.11

erictapen
  • 113
  • 4

1 Answers1

0

What does systemd mean by "finished starting up"? man systemd.unit says

for service units start-up is considered completed for the purpose of Before=/After= when all its configured start-up commands have been invoked and they either failed or reported start-up success.

So you need to look in the postgresql service file to see if it is doing something like forking and putting its pid in a file, or using sd_notify() in its code to signal "ready". It's not very useful.

Sadly, the easy way round this is to make the mobilizon-postgresql service wait for the Unix domain socket to exist, using inotifywait or similar, or add Restart=on-failure and RestartSec=1 to retry until success.

Systemd's idea of how things should work is that you create a socket unit, and whenever anyone tries to open that socket, you launch a program. In this case, by starting your script, which opens the socket, it launches postgresql.

meuh
  • 1,563
  • 10
  • 11
  • That was my first suspicion, that PostgreSQL and systemd have different opinions of what constitutes a "started" service. But in this case, systemd itself reports PostgreSQL as "started" way after it does things that should only happen after a started service? See the last log line in the question. – erictapen Mar 03 '23 at 18:19
  • 1
    @erictapen: More specifically, PostgreSQL doesn't _accurately report_ to systemd when its service should be considered "started". Which version are you using? Postgres 15.2 on Arch seems to support sd_notify() readiness notifications, as its .service unit has Type=notify (though that doesn't guarantee that it was programmed to invoke sd_notify() at the correct time). Does your postgresql.service use Type=notify, Type=forking, or Type=simple? Does it invoke postgres directly, or does it somehow go through a wrapper init.d script? – user1686 Mar 05 '23 at 13:27
  • Mh, I think my PostgreSQL *does* accurately report to systemd when it's started? See the last log snippet, it reports ports and sockets being open and then systemd states that it "Started PostgreSQL Server". My problem is that systemd just doesn't seem to care about startup state of PostgreSQL and starts the unit nonetheless? – erictapen Mar 06 '23 at 16:29