I have a problem re-enabling SELinux on a server and would like some insight.

Context

On our server, I recently switched SELinux from enforcing to permissive while chasing an odd problem (unexpected "permission denied" errors). When that did not help, I even set it to disabled. In the end SELinux was not the cause: we found the real problem and fixed it, but SELinux is still disabled.

Now I want to re-enable it. I simply set it back to enforcing, expecting the system to detect on the next boot that a relabel is needed (so without having to touch /.autorelabel), as that had worked in the past. Sadly the boot got stuck, and after 10 minutes we went back to disabled to restore services. The documentation states that relabeling can take a very long time, so I scheduled another reboot over the weekend. After the weekend it was still stuck in boot.

Strangely the server's HDDs do not show any disk activity.

When I do the above on a VM, it boots to the same point and then prints a message that SELinux relabeling needs to be performed and does it. On the server we do not see that message.

Remarks

The server has some CIFS mounts. Could SELinux be trying to relabel the CIFS filesystems? That would explain the long duration and the lack of disk activity, but it does not explain the absence of messages on the console.

Technical info

  • CentOS 7.6 (1810) x86_64
  • Baremetal, Dell server, 6x HDDs RAID6 (hardware)
  • No logs are written to disk; the last unit started is Update UTMP about System Boot/Shutdown.

Note: There is a RHEL knowledge base article that recommends going from disabled to enforcing via permissive first, forcing a relabel on each reboot by touching /.autorelabel. That is going to be my next attempt on the weekend, but any insight into what is going on is welcome.
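For reference, the first step of that procedure could be sketched roughly as follows. The `set_selinux_mode` helper and its optional file argument are my own illustration (hypothetical names, not taken from the KB article); it only rewrites the `SELINUX=` line of the config file.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the SELINUX= line in an SELinux config file.
# Usage: set_selinux_mode <enforcing|permissive|disabled> [config-file]
set_selinux_mode() {
    local mode="$1"
    local config="${2:-/etc/selinux/config}"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Only the SELINUX= line is changed; SELINUXTYPE= and comments are left alone.
    sed -i "s/^SELINUX=.*/SELINUX=${mode}/" "$config"
}

# Intended use on the real system (as root), followed by a reboot:
#   set_selinux_mode permissive
#   touch /.autorelabel
```

The same two steps would then be repeated with `enforcing` once the permissive boot has completed.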

Update 01

I have now done further testing. All of the following boot modes failed:

  1. booting with selinux=1, enforcing=0, autorelabel=0
  2. booting with selinux=1, enforcing=0, autorelabel=1
  3. booting with selinux=1, enforcing=1, autorelabel=0
  4. booting with selinux=1, enforcing=1, autorelabel=1

Only booting with selinux=0 works. By "failed" I mean that there are no error messages on the console during boot and no message about SELinux relabeling; the system hangs for many hours with no visible disk activity.
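To double-check which of these parameters a given boot actually received, something like the sketch below could be used once the system is up. The `selinux_boot_args` name is hypothetical; it just filters the kernel command line for the three parameters listed above.

```shell
#!/bin/bash
# Hypothetical helper: print the SELinux-related parameters found on a
# kernel command line. Reads /proc/cmdline unless another file is given.
selinux_boot_args() {
    local cmdline_file="${1:-/proc/cmdline}"
    # Put each parameter on its own line, then keep only the three of interest.
    tr -s ' ' '\n' < "$cmdline_file" | grep -E '^(selinux|enforcing|autorelabel)='
}

# Intended use:
#   selinux_boot_args
```

Note that the function returns a non-zero status when none of the three parameters is present, because that is what `grep` does when nothing matches.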

I have now commented out all cifs volumes in my /etc/fstab and rebooted in permissive mode with autorelabel. The boot took about 20 minutes, but I could see it was doing something: it printed on the console that relabeling had started, reported that filesystems like sysfs are read-only and would be ignored, disk activity was visible, etc. So SELinux is now enabled in permissive mode. I will try enforcing mode this weekend.

Evidently those CIFS mounts were the problem, but I do not understand why. My understanding was that CIFS mounts would be ignored by relabeling, just like sysfs. Have I misunderstood something here?
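For anyone wanting to repeat this step, commenting out the cifs entries can be done along these lines. This is a sketch rather than exactly what I ran: the `disable_cifs_mounts` name is hypothetical, it assumes GNU sed, and it assumes fstab entries start in the first column.

```shell
#!/bin/bash
# Hypothetical helper: comment out every active cifs entry in an fstab-style file.
# A backup copy is kept next to the file with a .bak suffix.
disable_cifs_mounts() {
    local fstab="${1:-/etc/fstab}"
    # Skip lines that are already comments; prefix "#" to lines whose
    # third whitespace-separated field is "cifs".
    sed -i.bak -E \
        '/^[[:space:]]*#/! s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+cifs[[:space:]].*)$/#\1/' \
        "$fstab"
}

# Intended use (as root), before rebooting with /.autorelabel in place:
#   disable_cifs_mounts
```

Restoring the mounts afterwards is a matter of removing the leading `#` again or copying the `.bak` file back.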

Huygens

1 Answer

Is there any specific reason to use /.autorelabel? My understanding is that it triggers a restorecon.

To figure out why it hangs, I'd suggest switching to permissive mode and booting the machine without a /.autorelabel file. When the system is up and running, try restorecon -rFv /. You may prefer not to start with / but with e.g. /var or another smaller portion of the system.

The -v flag of restorecon shows every change it makes on the filesystem. That should give you a good idea of where it "hangs" during a boot with /.autorelabel. I'd expect an area of the filesystem where lots of tiny files are stored, or possibly network-mounted storage.
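To decide where to start, it can help to see which mounted filesystems the kernel reports as label-capable. The sketch below lists mount points whose options include `seclabel`. The `labeled_mounts` name is hypothetical, and it assumes SELinux is enabled (permissive is enough), since otherwise no mount carries that option.

```shell
#!/bin/bash
# Hypothetical helper: list mount points whose mount options include "seclabel",
# i.e. filesystems the kernel reports as supporting SELinux labels.
labeled_mounts() {
    local mounts_file="${1:-/proc/self/mounts}"
    # Field 2 is the mount point, field 4 the comma-separated option list.
    awk '$4 ~ /(^|,)seclabel(,|$)/ { print $2 }' "$mounts_file"
}

# Intended use (as root, in permissive mode):
#   labeled_mounts              # check whether any cifs mount point shows up
#   restorecon -rFv /var        # then relabel one subtree at a time
```

If a CIFS mount point appears in that list, it would be a natural first suspect for the hang.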

Once this is done switching to enforcing should not require another /.autorelabel run.

[edit]

I've just verified the above statement regarding /.autorelabel and restorecon on a Fedora 29 machine. /lib/systemd/system/selinux-autorelabel.service starts /usr/libexec/selinux/selinux-autorelabel, which is a bash script. That script runs /sbin/fixfiles with the restore parameter. /sbin/fixfiles is another bash script, which actually runs restorecon.

hargut
  • I did not use `/.autorelabel` because my understanding is that SELinux will see the change from `disabled` to `enforcing` and start relabeling automatically. But the RHEL KB article advises otherwise. – Huygens Jan 23 '19 at 11:45
  • The RHEL KB article also advises going through `permissive` first, but I do not understand why, since AFAIK a relabel has to happen in both permissive and enforcing mode. On top of that, on boot I should see the message `Warning -- SELinux targeted policy relabel is required`, followed by the labeling progress. Sadly I do not see that very first message, so I guess the relabeling is not stuck on a remote FS but has not started at all. – Huygens Jan 23 '19 at 11:53
  • SELinux labeling information is stored in the filesystem's extended attributes, e.g. `getfattr -n security.selinux /bin/ls`. This information is not removed when you disable SELinux, so I'd be surprised if SELinux could detect that it had been switched off for a while. It could be, though, that a relabel is started when SELinux is first enabled and no such labels are present. In this case, to be sure what is going on, try modifying `fixfiles` by adding `set -x` and the `-v` parameter for `restorecon`, and uncomment the `cifs` mounts in `/etc/fstab` (or systemd mounts). – hargut Jan 23 '19 at 12:14
  • Going through `permissive` when re-enabling SELinux gives you an opportunity to verify that everything works properly after the changes made. `permissive` mode acts the same as `enforcing` except that it does not block any access. Boot the system in `permissive` and review the SELinux activity (e.g. `/var/log/audit/audit.log` or `/usr/bin/aureport`). When you are sure that all your services will work properly, switch to enforcing. This can be done live; no reboot is required. – hargut Jan 23 '19 at 12:21
  • I don't see any benefit in doing the relabeling twice, once in `permissive` and once in `enforcing`, as the labeling information would be the same. It would only change if the policy in use were modified. – hargut Jan 23 '19 at 13:05
  • I agree with you, and that is my understanding too. You were right to suggest commenting out the `cifs` mounts; that way I was able to boot in permissive mode (it wouldn't work without that). But this CIFS/SELinux issue is puzzling me... – Huygens Jan 23 '19 at 13:53
  • Are your mounts marked with the `_netdev` mount option? Since you are not seeing the SELinux message during boot, the system possibly hangs earlier, at the point where it tries to treat the CIFS storage as a local disk but is not able to reach it. – hargut Jan 23 '19 at 14:04
  • No, these are the only options: `defaults,credentials=/etc/samba/cifs-credentials,vers=3.0,x-systemd.automount,x-systemd.mount-timeout=2m,nounix` – Huygens Jan 23 '19 at 14:07
  • Further mount options to look into are mostly named something with `xattr` (e.g. `nouser_xattr`). Whether it is feasible to disable extended attributes on the CIFS mounts depends on how you use the storage. But disabling them could help, in case relabeling on CIFS is the reason for the hanging boot. – hargut Jan 23 '19 at 14:08
  • If `_netdev` was not there, try adding it, and then try to relabel the system again. I think this will be the solution for the lockup. `_netdev` causes systemd to place the mounts into a different target (after network.target), and that target will most likely be reached after the `selinux-autorelabel.target`. It is generally highly advisable to use `_netdev` with any network-mounted storage. – hargut Jan 23 '19 at 14:11
  • I cannot test this right now; I will have to wait for the next maintenance opportunity. I tried to reproduce the behaviour in a VM, but I could not... – Huygens Jan 23 '19 at 14:57
  • Thank you for all your effort! I will award the bounty to you even though this is not yet completely solved. – Huygens Jan 28 '19 at 10:40