According to the XDG specification[1], in order to avoid the possibility of
our container states being auto-pruned every 6 hours we need to set the
sticky bit. Rather than handling all of the users of --root, we just
create the directory and set the sticky bit during detection, as it's
not expensive.
[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Previously if oomScoreAdj was not set in config.json we would implicitly
set oom_score_adj to 0. This is not allowed according to the spec:
> If oomScoreAdj is not set, the runtime MUST NOT change the value of
> oom_score_adj.
Change this so that we do not modify oom_score_adj if oomScoreAdj is not
present in the configuration. While this modifies our internal
configuration types, the on-disk format is still compatible.
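As a rough illustration of why a pointer type keeps the on-disk format
compatible: with *int, an absent key decodes to nil while an explicit 0
decodes to a pointer to 0. The processConfig type and oomScoreAdjToWrite
helper below are hypothetical stand-ins, not the real configuration types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// processConfig is a hypothetical, cut-down stand-in for the config type.
type processConfig struct {
	// nil means "not set": the runtime must leave oom_score_adj alone.
	OomScoreAdj *int `json:"oomScoreAdj,omitempty"`
}

// oomScoreAdjToWrite returns the value to write to oom_score_adj and whether
// anything should be written at all.
func oomScoreAdjToWrite(raw []byte) (value int, write bool, err error) {
	var cfg processConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return 0, false, err
	}
	if cfg.OomScoreAdj == nil {
		return 0, false, nil
	}
	return *cfg.OomScoreAdj, true, nil
}

func main() {
	for _, raw := range []string{`{}`, `{"oomScoreAdj": 0}`, `{"oomScoreAdj": -500}`} {
		v, write, err := oomScoreAdjToWrite([]byte(raw))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-24s write=%v value=%d\n", raw, write, v)
	}
}
```

An existing config.json that sets oomScoreAdj explicitly decodes exactly as
before; only the "key missing" case changes behaviour.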
Signed-off-by: Aleksa Sarai <asarai@suse.de>
In some cases, /sys/fs/cgroup is mounted read-only. In rootless
containers we can consider this effectively identical to having cgroups
that we don't have write permission to -- because the user isn't
responsible for the read-only setup and cannot modify it. The rules are
identical to when /sys/fs/cgroup is not writable by the unprivileged
user.
An example of this is the default configuration of Docker, where cgroups
are mounted as read-only as a preventative security measure.
Reported-by: Vladimir Rutsky <rutsky@google.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Currently, if a confined container process tries to list these directories,
AVCs are generated because they are labeled with external labels. Adding
the mountlabel will remove these AVCs.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Currently Manager accepts nil cgroups when calling Apply, but it will panic when trying to call Destroy with the same config.
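A sketch of the kind of guard that avoids such a panic, assuming the problem
is a nil dereference in Destroy. The manager and cgroupConfig types here are
hypothetical and much smaller than the real ones.

```go
package main

import "fmt"

// cgroupConfig and manager are hypothetical, cut-down types for illustration.
type cgroupConfig struct {
	Path string
}

type manager struct {
	Cgroups *cgroupConfig
	applied bool
}

// Apply tolerates a nil config by doing nothing.
func (m *manager) Apply(pid int) error {
	if m.Cgroups == nil {
		return nil
	}
	m.applied = true
	return nil
}

// Destroy mirrors Apply: with a nil config there is nothing to remove, so it
// returns early instead of dereferencing m.Cgroups.
func (m *manager) Destroy() error {
	if m.Cgroups == nil {
		return nil
	}
	fmt.Println("removing", m.Cgroups.Path)
	m.applied = false
	return nil
}

func main() {
	m := &manager{} // Cgroups is nil
	if err := m.Apply(1234); err != nil {
		panic(err)
	}
	if err := m.Destroy(); err != nil {
		panic(err)
	}
	fmt.Println("Destroy with nil cgroups: ok")
}
```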
Signed-off-by: Denys Smirnov <denys@sourced.tech>
The "shell" rule in the Makefile uses docker to run a bash session,
however it was depending on the "all" rule which assumes non-docker local
development. This commit fixes it by making it depend on the "runcimage" rule.
Signed-off-by: Tibor Vass <tibor@docker.com>
The function is called even if the user namespace is not set.
This results in the wrong uid/gid being set on devices.
This fix adds a check for whether the user namespace is set before calling
setupUserNamespace.
Fixes #1742
Signed-off-by: Julien Lavesque <julien.lavesque@gmail.com>
The version of criu currently bundled in the Dockerfile fails the
checkpoint/restore tests on my system (v4.14.14). Upgrade to the latest
version, v3.7, and also change the repository name to point to the current
official repo.
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
This fixes a bug in the console package for big-endian architectures.
When creating a new pty the returned path to the new pty slave was
wrong for the second and all subsequent ptys.
In runc the exec subcommand failed with a runtime error such as
`container_linux.go:265: starting container process caused "open
/dev/pts/4294967296: no such file or directory"`.
The pty number is shifted left by 32 bits (4294967296 is 1 << 32).
Signed-off-by: Peter Morjan <peter.morjan@de.ibm.com>
gocapability has supported 0 as "the current PID" since
syndtr/gocapability@5e7cce49 (Allow to use the zero value for pid to
operate with the current task, 2015-01-15, syndtr/gocapability#2).
libcontainer was ported to that approach in 444cc298 (namespaces:
allow to use pid namespace without mount namespace, 2015-01-27,
docker/libcontainer#358), but the change was clobbered by 22df5551
(Merge branch 'master' into api, 2015-02-19, docker/libcontainer#388)
which landed via 5b73860e (Merge pull request #388 from docker/api,
2015-02-19, docker/libcontainer#388). This commit restores the
changes from 444cc298.
Signed-off-by: W. Trevor King <wking@tremily.us>
If 'go-md2man' is not installed,
running md2man-all.sh fails with an error like the one below:
$ ./man/md2man-all.sh -q
./man/md2man-all.sh: line 21: go-md2man: command not found
So fix it.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
... that prevent sending signals not mentioned in signal map.
Currently these are SIGRTMIN..SIGRTMAX.
Signed-off-by: Valentin Kulesh <valentin.kulesh@virtuozzo.com>
The helper DRYs up the transition tests and makes it easy to get
complete coverage for invalid transitions.
I'm also using t.Run() for subtests. Run() is new in Go 1.7 [1], but
runc dropped support for 1.6 back in e773f96b (update go version at
travis-ci, 2017-02-20, #1335).
[1]: https://blog.golang.org/subtests
Signed-off-by: W. Trevor King <wking@tremily.us>
Technically, this change should not be necessary, as the kernel
documentation claims that if you call clone(flags|CLONE_NEWUSER), the
new user namespace will be the owner of all other namespaces created in
@flags. Unfortunately this isn't always the case, due to various
additional semantics and kernel bugs.
One particular instance is SELinux, which acts very strangely towards
the IPC namespace and mqueue. If you unshare the IPC namespace *before*
you map a user in the user namespace, the IPC namespace's internal
kern-mount for mqueue will be labelled incorrectly and the container
won't be able to access it. The only way of solving this is to unshare
IPC *after* the user has been mapped and we have changed to that user.
I've also heard of this happening to the NET namespace while talking to
some LXC folks, though I haven't personally seen that issue.
This change matches our handling of user namespaces to be the same as
how LXC handles these problems.
Signed-off-by: Aleksa Sarai <asarai@suse.de>