
My Docker host was added to Rancher a long time ago, and everything had been working fine for months. A few days ago, the host was suddenly marked as "Disconnected" in Rancher. When I check the status of the rancher-agent container, I can see that it keeps restarting:

$ sudo docker ps -f name=rancher
CONTAINER ID   IMAGE                   COMMAND         CREATED        STATUS                          PORTS     NAMES
0a12a18ca52c   rancher/agent:v1.2.11   "/run.sh run"   21 hours ago   Restarting (1) 54 seconds ago             rancher-agent

In the log I see this:

$ sudo docker container logs 0a12a18ca52c
time="2021-06-29T09:13:27Z" level=fatal msg="Failed to find container id:\n0::/\n" 
time="2021-06-29T09:13:28Z" level=fatal msg="Failed to find container id:\n0::/\n" 
time="2021-06-29T09:13:29Z" level=fatal msg="Failed to find container id:\n0::/\n" 
time="2021-06-29T09:13:31Z" level=fatal msg="Failed to find container id:\n0::/\n" 
time="2021-06-29T09:13:32Z" level=fatal msg="Failed to find container id:\n0::/\n" 
time="2021-06-29T09:13:35Z" level=fatal msg="Failed to find container id:\n0::/\n" 
time="2021-06-29T09:13:39Z" level=fatal msg="Failed to find container id:\n0::/\n" 

I have searched the web for this but found nothing useful. I have tried recreating the container, and I have tried removing everything in /var/lib/rancher before recreating it. I have even removed my environment in Rancher, stopped Docker, deleted all the Docker data (data-root) on the host, removed the files related to the rancher-agent again, and recreated the rancher-agent. Each time, I used the command that the Rancher GUI provides for adding a new host. I always end up with the same errors in the log.

The Docker version on the host is the same as when the host was last connected to Rancher. I use Rancher 1.6 (I cannot change this) and Docker 20.10.6 (I also tried 20.10.7) on a machine running Manjaro.

3 Answers


This is caused by cgroups v2. To make it work again, boot the host with the legacy cgroup hierarchy:

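# Run both commands as root (for example in a `sudo -i` shell). With `sudo echo ... > file`, the redirection is still performed by the unprivileged shell.
# Reboot afterwards so that the kernel parameter takes effect.
echo 'GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=false' > /etc/default/grub.d/cgroup.cfg
update-grub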
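
To see whether a host is actually affected before changing boot parameters, a quick check like the following may help. It is only a sketch: it assumes GNU `stat` and the usual `/sys/fs/cgroup` mount point, and the helper name `cgroup_layout_from_fstype` is made up for illustration. For context, the `0::/` in the agent's log has the same shape as the single line that `/proc/self/cgroup` contains on a cgroups v2 host.

```shell
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a cgroup layout.
# cgroup2fs means the unified (v2) hierarchy; tmpfs means legacy or hybrid (v1).
cgroup_layout_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Ask the kernel which filesystem is mounted at /sys/fs/cgroup.
fstype=$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null || echo "unavailable")
echo "cgroup layout: $(cgroup_layout_from_fstype "$fstype") (fs type: $fstype)"

# On a v2 host this file has a single line starting with "0::".
cat /proc/self/cgroup 2>/dev/null || true
```

If this reports `v2` before the change and `v1` after the reboot, the kernel parameter was picked up.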
user20872259
  • Sounds promising. I have since given up on Manjaro and stuck to Ubuntu as it just works without any issues. – Anton Pettersson Dec 29 '22 at 17:14
  • I get `/etc/default/grub.d/cgroup.cfg: Permission denied` (even with sudo) – Ri1a Mar 24 '23 at 08:52
  • Just wanna confirm this worked! Thanks! @Ri1a you just have to edit the file another way. What worked for me: `sudo nano /etc/default/grub.d/cgroup.cfg`, then add this line: `GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=false`. To save the file, press `CTRL + X`, then `Y`, then Enter. Then reboot the host and it should work! – Boro Mar 26 '23 at 20:52

TL;DR: I reinstalled my Docker host with Ubuntu 20.04 and now everything works.

I set up a new virtual machine with the same OS as my host (Manjaro Linux) and saw exactly the same behavior as on the host: the Docker host was "Disconnected" and the log said "Failed to find container id:...". I then created another virtual machine with Ubuntu 20.04, and everything worked as expected there. What's strange is that the Docker version was the same. I am not sure about containerd, though. I did try different combinations of Docker and containerd on my host and guest, but they always had the same issue.

  • Same problem here. On a server with Ubuntu 22.04 LTS it fails, and the `rancher/agent` logs are full of `"Failed to find container id…"`. On a server with an older Ubuntu (20.04 LTS) it works (I use Docker 20.10.12 on both servers). Did you find a solution? Is Rancher Agent not compatible with Ubuntu 22? – Dam Fa Aug 31 '23 at 21:28

Also, iptables must be the old (legacy) version; otherwise it is still broken.
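
A rough way to check which iptables variant a host uses is sketched below. It assumes that "old" here means the legacy backend rather than the nftables-based one, which the answer does not state explicitly. The helper name `iptables_backend` is hypothetical, and the `update-alternatives` lines in the comments apply to Debian/Ubuntu-style systems only.

```shell
# Hypothetical helper: classify the backend from the output of `iptables --version`.
# iptables 1.8+ prints e.g. "iptables v1.8.7 (nf_tables)" or "iptables v1.8.4 (legacy)";
# releases before 1.8 print no suffix and are reported as "unknown" here.
iptables_backend() {
  case "$1" in
    *nf_tables*) echo "nft" ;;
    *legacy*)    echo "legacy" ;;
    *)           echo "unknown" ;;
  esac
}

version=$(iptables --version 2>/dev/null || echo "not installed")
echo "iptables backend: $(iptables_backend "$version") ($version)"

# On Debian/Ubuntu, the legacy backend can usually be selected with:
#   sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
#   sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
```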

user20872259
  • You might convince more people that your answer is helpful if you add an explanation. Punctuation, capitalisation, and a clearer reference for "also" are optional, but might also help. Compare [answer]. You probably also need to make it more obvious why this is a separate answer that cannot meaningfully be edited into your other answer post here. I can see that you know how to edit it... – Yunnosch Mar 14 '23 at 15:03
  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/34032067) – Robert Mar 19 '23 at 23:48