Namely, use an undocumented feature of pivot_root(2) where
pivot_root(".", ".") is actually a feature and allows you to make the
old_root be tied to your /proc/self/cwd in a way that makes unmounting
easy. Thanks a lot to the LXC developers which came up with this idea
first.
This is the first step of many to allowing runC to work with a
completely read-only rootfs.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
We need support for read/only mounts in SELinux to allow a bunch of
containers to share the same read/only image. In order to do this
we need a new label which allows container processes to read/execute
all files but not write them.
Existing mount label is either shared write or private write. This
label is shared read/execute.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
At some point InitLabels was changed to look for SecuritOptions
separated by a ":" rather then an "=", but DupSecOpt was never
changed to match this default.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
If copyup is specified for a tmpfs mount, then the contents of the
underlying directory are copied into the tmpfs mounted over it.
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
In order to mount root filesystems inside the container's mount
namespace as part of the spec we need to have the ability to do a bind
mount to / as the destination.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Since Linux 4.3 ambient capabilities are available. If set these allow unprivileged child
processes to inherit capabilities, while at present there is no means to set capabilities
on non root processes, other than via filesystem capabilities which are not usually
supported in image formats.
With ambient capabilities non root processes can be given capabilities as well, and so
the main reason to use root in containers goes away, and capabilities work as expected.
The code falls back to the existing behaviour if ambient capabilities are not supported.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
grep -r "range map" showw 3 parts use map to
range enum types, use slice instead can get
better performance and less memory usage.
Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
For example, the /sys/firmware directory should be masked because it can contain some sensitive files:
- /sys/firmware/acpi/tables/{SLIC,MSDM}: Windows license information:
- /sys/firmware/ibft/target0/chap-secret: iSCSI CHAP secret
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
cgroupData.join method using `WriteCgroupProc` to place the pid into
the proc file, it can avoid attach any pid to the cgroup if -1 is
specified as a pid.
so, replace `writeFile` with `WriteCgroupProc` like `cpuset.go`'s
ApplyDir method.
Signed-off-by: Wang Long <long.wanglong@huawei.com>
if a container state is running or created, the container.Pause()
method can set the state to pausing, and then paused.
this patch update the comment, so it can be consistent with the code.
Signed-off-by: Wang Long <long.wanglong@huawei.com>
Currently if a user does a command like
docker: Error response from daemon: operation not supported.
With this fix they should see a much more informative error message.
docker run -ti -v /proc:/proc:Z fedora sh
docker: Error response from daemon: SELinux Relabeling of /proc is not allowed: operation not supported.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>