
I am trying to set up a Prometheus Podman container using Ansible.

Question

I'm trying to set up a Prometheus container using Podman and Ansible on an Ubuntu machine. The container runs fine manually, but when I attempt to automate it using Ansible, the systemd service doesn't seem to auto-start as expected. I'm looking for help to debug this issue.


I'm working on an Ubuntu machine.

me@ubuntu-usb:~/undergit/td2023-monitoring$ uname -a
Linux ubuntu-usb 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
me@ubuntu-usb:~/undergit/td2023-monitoring$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       8.1Gi       1.8Gi       1.6Gi       5.5Gi       5.2Gi
Swap:          2.0Gi          0B       2.0Gi
me@ubuntu-usb:~/undergit/td2023-monitoring$ 

Reproduce error

I'm using Terraform to launch my playbook, since my tasks and handlers are in a role. Here is the command I'm using:

me@ubuntu-usb:~/undergit/td2023-monitoring$ terraform -chdir=tf apply -target=module.lxd_machine_cernovada

Since I'm using Podman I want my tasks to be rootless, which is why I'm setting `become: false`.

- name: Installation/configuration of monitoring server
  hosts: domain_monitor
  become: false
  roles:
    - da.prometheus

And this is the playbook/role I created:

me@ubuntu-usb:~/undergit/td2023-monitoring$ cat roles/da.prometheus/tasks/main.yml 

---
- name: Ensure data directory exists
  file:
    path: "/home/admin/data"
    state: directory
    mode: '0755' # rwx for the owner, rx for the group and everyone else

- name: Copy prometheus.yml to the data directory
  copy:
    src: prometheus.yml
    dest: /home/admin/data/prometheus.yml
    mode: '0644' # rw for the owner, r for the group and everyone else
    owner: admin
    group: admin

- name: Run Prometheus container
  containers.podman.podman_container:
    name: prometheus
    image: docker.io/prom/prometheus:v2.46.0
    ports:
      - "9090:9090"
    volumes:
      - /home/admin/data:/etc/prometheus
    state: started
    detach: yes
    recreate: no

- name: Ensure systemd user directory exists
  file:
    path: "/home/admin/.config/systemd/user"
    state: directory
    mode: '0755'

- name: Copy prometheus.service into the user systemd directory
  copy:
    src: "prometheus.service"
    dest: "/home/admin/.config/systemd/user/prometheus.service"
    mode: '0644'
  notify: 
    - reload systemd

- name: Enable Prometheus user service
  systemd:
    name: prometheus.service
    enabled: yes
    state: started
    scope: user
  notify:
    - restart prometheus
me@ubuntu-usb:~/undergit/td2023-monitoring$ 

With these handlers:

me@ubuntu-usb:~/undergit/td2023-monitoring$ cat roles/da.prometheus/handlers/main.yml

---
- name: restart prometheus
  systemd:
    name: prometheus
    state: restarted
    scope: user

- name: reload systemd
  systemd:
    daemon_reload: yes
me@ubuntu-usb:~/undergit/td2023-monitoring$ 

Since I'm using rootless Podman, I want my service to run with `scope: user`.

From the Ansible systemd doc: https://docs.ansible.com/ansible/latest/collections/ansible/builtin/systemd_service_module.html#parameter-scope
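For reference, my understanding is that tasks and handlers acting on the user-level systemd instance need to reach the user's D-Bus session. A minimal sketch of what I believe that looks like (the `environment` values are my assumption based on the conventional `/run/user/<uid>` layout, not something currently in my role):

```yaml
- name: Restart prometheus in the user scope (sketch)
  ansible.builtin.systemd:
    name: prometheus
    state: restarted
    scope: user
  environment:
    # Assumed paths: the user session bus conventionally lives under /run/user/<uid>
    XDG_RUNTIME_DIR: "/run/user/{{ ansible_user_uid }}"
    DBUS_SESSION_BUS_ADDRESS: "unix:path=/run/user/{{ ansible_user_uid }}/bus"
```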

Here is the very basic service I created for this task.

me@ubuntu-usb:~/undergit/td2023-monitoring$ cat roles/da.prometheus/files/prometheus.service 
[Unit]
Description=Prometheus Service

[Service]
ExecStart=/usr/bin/podman start -a prometheus
ExecStop=/usr/bin/podman stop -t 2 prometheus
Restart=always

[Install]
WantedBy=multi-user.target

me@ubuntu-usb:~/undergit/td2023-monitoring$ 
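As an aside, I know Podman can generate such a unit itself instead of my hand-writing one. A sketch of what I believe the equivalent Ansible task looks like, using the `containers.podman.podman_generate_systemd` module (parameters per my reading of the collection docs, untested in my role):

```yaml
- name: Generate a systemd user unit for the prometheus container (sketch)
  containers.podman.podman_generate_systemd:
    name: prometheus          # existing container to wrap
    new: true                 # unit creates/removes the container on start/stop
    restart_policy: always
    dest: /home/admin/.config/systemd/user
```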

Error I'm getting

│ RUNNING HANDLER [da.prometheus : reload systemd] *******************************
│ lundi 28 août 2023  11:30:13 +0200 (0:00:01.996)       0:01:17.395 ************ 
│ fatal: [cernovada.antiterre.lan]: FAILED! => {
│     "changed": false
│ }
│ 
│ MSG:
│ 
│ failure 1 during daemon-reload: Failed to reload daemon: Method call timed out
│ 
│ 
│ NO MORE HOSTS LEFT *************************************************************
│ 
│ PLAY RECAP *********************************************************************
│ cernovada.antiterre.lan    : ok=11   changed=9    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
│ 
│ Playbook run took 0 days, 0 hours, 1 minutes, 44 seconds
│ lundi 28 août 2023  11:30:40 +0200 (0:00:26.924)       0:01:44.319 ************ 
│ =============================================================================== 
│ da.prometheus : Run Prometheus container ------------------------------- 28.92s
│ Installation des paquets ----------------------------------------------- 28.42s
│ da.prometheus : reload systemd ----------------------------------------- 26.92s
│ da.prometheus : Copy prometheus.yml to the data directory --------------- 3.05s
│ da.prometheus : Copy prometheus.service into the user systemd directory --- 2.83s
│ Gathering Facts --------------------------------------------------------- 2.78s
│ Stop service cron on debian, if running --------------------------------- 2.36s
│ Gathering Facts --------------------------------------------------------- 2.01s
│ da.prometheus : restart prometheus -------------------------------------- 2.00s
│ da.prometheus : Enable Prometheus user service -------------------------- 1.87s
│ da.prometheus : Ensure data directory exists ---------------------------- 1.58s
│ da.prometheus : Ensure systemd user directory exists -------------------- 1.56s
│ 

I tried SSHing into the machine to check the service status. If I start the service manually, it works fine?!

admin@cernovada:~$ podman ps
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES
admin@cernovada:~$ systemctl --user start prometheus.service
admin@cernovada:~$ podman ps
CONTAINER ID  IMAGE                              COMMAND               CREATED         STATUS           PORTS                   NAMES
16f96b483a18  docker.io/prom/prometheus:v2.46.0  --config.file=/et...  14 minutes ago  Up 1 second ago  0.0.0.0:9090->9090/tcp  prometheus
admin@cernovada:~$ 

So my guess is that the service itself works fine; it's only when using the Ansible playbook that it won't auto-start.

Goal

I want the prometheus server to auto-start when running the playbook. That's why I created a systemd service.

I'm open to other solutions for auto-starting the container, as long as it always restarts when the host restarts.
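On that note, my understanding is that user services only start at boot when lingering is enabled for that user. A hedged sketch of how I would enable it (not yet part of my role, and `loginctl enable-linger` itself needs root):

```yaml
- name: Enable lingering so admin's user services start at boot (sketch)
  ansible.builtin.command: loginctl enable-linger admin
  args:
    creates: /var/lib/systemd/linger/admin   # loginctl drops a flag file here
  become: true
```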

What I tried

I tried adding a `- meta: flush_handlers` after the `reload systemd` handler,

as seen here: https://lookonmyworks.co.uk/2015/06/24/ansible-systemctl-daemon-reload/

I tried removing the notify: reload systemd but I get the same error.

I performed all the steps manually without Ansible to make sure the problem wasn't in my config files, and I got no errors.

If I access the Prometheus interface while the playbook is running, I get the web UI; but once Ansible tries to reload the systemd daemon, I can't access it anymore.


meta

If I forgot to add any information, ask and I'll provide it ASAP.

If the question is too long or unreadable, thanks in advance for the edits.

Thank you for your kind help

Didix
  • Feel free to comment while downvoting, so I can improve the question. Thank you – Didix Aug 28 '23 at 10:55
  • What about adding `become: true` to the `reload systemd` task? – β.εηοιτ.βε Aug 28 '23 at 10:59
  • I'll try adding `become: true` to the systemd handler, thanks. Also, isn't using the root user against Podman's design? Or is using root for services not bad practice? – Didix Aug 28 '23 at 11:10
  • @β.εηοιτ.βε It did work, thank you; I'll use this solution for now, you can add it as an answer. I was wondering if there is a rootless solution, since I'm using Podman and want to follow best practices. I haven't seen anything regarding systemd privileges in the Podman docs. – Didix Aug 28 '23 at 11:17
  • That's not related to Podman or anything of the like; your user needs either root or the ability to run any `systemctl` command in order to act on systemd. – β.εηοιτ.βε Aug 28 '23 at 11:21
  • Sorry for the dupe; I had seen the other question but didn't realise we had the same kind of issue. – Didix Aug 28 '23 at 11:36
