CLONE_NEWNS and mount propagation

Question

I'm currently looking for examples to understand CLONE_NEWNS in Linux, so I ran the following experiments:

in shell1:

$ mkdir mnt
$ sudo unshare -m /bin/bash
# mount /dev/sda5 mnt/
# ls mnt
lost+found

whereas in shell2:

$ ls mnt
lost+found

I expected the output in shell2 to be empty, because CLONE_NEWNS creates a new mount namespace, as the documentation says.

At first I thought that mounts in the child's namespace propagate to the parent's, so I tried mounting in the parent, and the child also sees that mount!

Then I created two separate child namespaces from the same parent, and a mount in one child also shows up in the other.

I'm confused.

P.S. In my first experiment, in shell1:

# readlink /proc/$$/ns/mnt
mnt:[4026532353]

in shell2:

$ readlink /proc/$$/ns/mnt
mnt:[4026531840]

Apparently, they are in different mount namespaces.

score 0 · Answer 1 · answered Apr 06 '13 at 12:51

A different mount namespace just means that the [u]mount actions in the child namespace will not be visible in the parent. It specifically does not mean that mounts in the parent will not be visible in the child, nor does it mean that all mounts disappear.

To try it, you can [un]mount something in the child-namespace and see if it [still] exists in the parent namespace.
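
What that test shows depends on the propagation type of the mount point's parent mount, which /proc/self/mountinfo records as optional tags such as shared:N or master:N (a mount with no tag is private). The sketch below is a small helper for reading those tags. The function name propagation_of is made up for illustration, and it assumes a POSIX shell with awk. If the mount containing mnt reports the same shared:N in both shells, the two namespaces are in one peer group, and a mount made in one is expected to appear in the other.

```shell
#!/bin/sh
# Print the propagation tags of one mount point as listed in a mountinfo file.
# Usage: propagation_of MOUNTPOINT [MOUNTINFO_FILE]
# (propagation_of is a hypothetical helper name, not a standard tool.)
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            tags = ""
            # Optional fields start at column 7 and end at the lone "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++)
                tags = tags (tags == "" ? "" : " ") $i
            # No tag at all means the mount is private.
            result = (tags == "" ? "private" : tags)
        }
        # If a path is mounted more than once, report the last entry listed.
        END { if (result != "") print result }
    ' "${2:-/proc/self/mountinfo}"
}
```

Running propagation_of / in shell1 and again in shell2 prints the tags for the root mount in each namespace, which makes it easy to compare the two.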

score 0 · Answer 2 · answered Nov 06 '16 at 07:23

Linux seems to have got it entirely backward, and I'm not entirely sure why.

But if you mount /dev/sda5 mnt/ in shell2, then ls mnt in shell1 should not show lost+found. The child namespace in shell1 is effectively protected from any changes in its parent namespace, but the parent namespace is changed by the child. It is a sort of reverse sandboxing, where the parent namespace is the one that can be changed by the child, but not vice versa.

I don't know why this is, and there may be cases where it's different, but I don't know of them. I could be dead wrong on this, but I tested the above action, and it did seem to keep the mnt/ mount made in shell2 from showing up in shell1.

A possible solution to your problem might be to use unshare to create a sort of privileged mount namespace, in which you do all your root operations, and it's the parent namespace you use for normal unprivileged accounts and operations. So, like...

[shell1] # unshare -m bash
[shell2] # sudo -u normal-user startx
[shell1] # mount /dev/privatesecret /mnt/secretplace

...something like that. Obviously, if anyone gets root, they can ptrace your processes, but the child namespace would make the private mount in [shell1] completely hidden from any operations in [shell2] or anywhere else, provided you drop to normal user privileges before doing anything that could mess with it.
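
For the mount in [shell1] to stay hidden like this, the new namespace must not be in a shared peer group with the parent. On systems where / is mounted shared (reportedly the default under systemd), that has to be arranged explicitly. Here is a minimal sketch, assuming root privileges and the usual unshare and mount tools; run_isolated is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command in a new mount namespace whose mounts do not propagate in
# either direction. Needs root (CAP_SYS_ADMIN).
# (run_isolated is a hypothetical helper name, not a standard tool.)
run_isolated() {
    # unshare -m gives the child its own copy of the mount table.
    # mount --make-rprivate / then marks every mount in that copy private,
    # so later mounts stay local and mounts made outside stay outside.
    unshare -m sh -c 'mount --make-rprivate / && exec "$@"' sh "$@"
}
```

Calling run_isolated bash would take the place of the plain unshare -m bash step. Recent util-linux versions of unshare are documented to apply private propagation by default and to offer a --propagation option, so the explicit mount call should mainly matter on older systems.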

That reverse sandbox thing only applies to mount namespaces, I'm fairly sure. PID namespaces are properly sandboxed, such that children do not see PIDs of the parents, and memory limits (which come from cgroups, not from a namespace) leave children more restricted in memory than the parent.
