This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and
`RootlessCgroups bool`, so as to make "runc-in-userns" to be more compatible with "rootful" runc.
`RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in
the current user namespace. `RootlessEUID` is almost identical to the former `Rootless`
except cgroups stuff.
`RootlessCgroups` denotes that runc is unlikely to have the full access to cgroups.
`RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace.
Otherwise `RootlessCgroups` is set to true.
(Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well)
When runc is executed as the root (euid == 0) in an user namespace (e.g. by Docker-in-LXD, Podman, Usernetes),
`RootlessEUID` is set to false but `RootlessCgroups` is set to true.
So, "runc-in-userns" behaves almost same as "rootful" runc except that cgroups errors are ignored.
This PR does not have any impact on CLI flags and `state.json`.
Note about CLI:
* Now `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`.
* Now `runc spec --rootless` is only required when `RootlessEUID` is set to true.
For runc-in-userns, `runc spec` without `--rootless` should work, when sufficient numbers of
UID/GID are mapped.
Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`):
* `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility.
(`/run/runc` is used)
* If runc is executed as the root (euid == 0) in an user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`.
This allows unprivileged users to allow execute runc as the root in userns, without mounting writable `/run/runc`.
Note about `state.json`:
* `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
it is now allowed to bind mount /proc. This is useful for rootless
containers when the PID namespace is shared with the host.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
subgid is defined per user, not group (see subgid(5))
This commit also adds support for specifying subuid owner with a numeric UID.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This adds a new CRIU based checkpoint/restore test to check if
the restored container runs in the same network namespace as before.
Signed-off-by: Adrian Reber <areber@redhat.com>
Using CRIU to checkpoint and restore a container into an existing
network namespace is not possible.
If the network namespace is defined like
{
"type": "network",
"path": "/run/netns/test"
}
there is the expectation that the restored container is again running in
the network namespace specified with 'path'.
This adds the new CRIU 'external namespace' feature to runc, where
during checkpointing that specific namespace is referenced and during
restore CRIU tries to restore the container in exactly that
namespace.
This breaks/fixes current runc behavior. If, without this patch, runc
restores a container with such a network namespace definition, it is
ignored and CRIU recreates a network namespace without a name.
With this patch runc uses the network namespace path (if available) to
checkpoint and restore the container in just that network namespace.
Restore will now fail if a container was checkpointed with a network
namespace path set and if that network namespace path does not exist
during restore.
runc still falls back to the old behavior if CRIU older than 3.11 is
installed.
Fixes#1786
Related to https://github.com/projectatomic/libpod/pull/469
Thanks to Andrei Vagin for all the help in getting the interface between
CRIU and runc right!
Signed-off-by: Adrian Reber <areber@redhat.com>
MOVE_MOUNT will fail under certain situations.
You are not allowed to MS_MOVE if the parent directory is shared.
man mount
...
The move operation
Move a mounted tree to another place (atomically). The call is:
mount --move olddir newdir
This will cause the contents which previously appeared under olddir to
now be accessible under newdir. The physical location of the files is
not changed. Note that olddir has to be a mountpoint.
Note also that moving a mount residing under a shared mount is invalid
and unsupported. Use findmnt -o TARGET,PROPAGATION to see the current
propagation flags.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
since runc don't manage net device and their configuration, checkpoint
also don't dump net namespace by default, so set 'nsmask = unix.CLONE_NEWNET'
by default in restore. Or if user do not pass 'empty-ns network', criu will
cost extra time in restore.
Signed-off-by: Ace-Tang <aceapril@126.com>
For criu v3.10, a patch is needed for `@test "checkpoint --lazy-pages and restore"`.
Starting with v3.11, the patch will no longer be needed.
The issue had not been caught in Travis because the kernel is too old and the test
had not been executed in Travis.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
RunC doesn't manage network devices and their configuration,
so it is impossible to describe external dependencies to restore them
back.
This means that all users have to set --empty-ns network, so let's do
this by default.
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Upstream renamed the feature check for lazy migration support from
'lazy_pages' to 'uffd'. The lazy migration test case was therefore
not running at all. This enables the lazy migration test case in runc
again.
The test will, however, not run in travis as the kernel is too old.
But it works again locally.
Signed-off-by: Adrian Reber <areber@redhat.com>
This should fix the following (very legitimate) warnings on static
build:
> /tmp/go-link-818454663/000019.o: In function `mygetgrouplist':
> /usr/lib/go-1.10/src/os/user/getgrouplist_unix.go:15: warning: Using
> 'getgrouplist' in statically linked applications requires at runtime the
> shared libraries from the glibc version used for linking
>
> /tmp/go-link-818454663/000018.o: In function `mygetgrgid_r':
> /usr/lib/go-1.10/src/os/user/cgo_lookup_unix.go:38: warning: Using
> 'getgrgid_r' in statically linked applications requires at runtime the
> shared libraries from the glibc version used for linking
>
> ...
as well as segfaults in the resulting binary.
For more details, check https://github.com/golang/go/issues/23265
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This will help runc's init to not spawn many threads on large systems when
launched with max procs by the caller.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Users can get very confused by how terminals work with runc, and the
quite confusing "terminal: ..." option. Add a document which goes
through all of the important parts of terminal handling in runc, in the
hopes that we can just point people to this as an explanation.
Signed-off-by: Avi Deitcher <avi@deitcher.net>
[cyphar: quite a large rewrite to fix factual errors and structure]
Co-authored-by: Avi Deitcher <avi@deitcher.net>
Signed-off-by: Aleksa Sarai <asarai@suse.de>