jasder/runc - runc - Junke Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Serge Hallyn	c0ad40c5e6	Do not create devices when in user namespace When we launch a container in a new user namespace, we cannot create devices, so we bind mount the host's devices into place instead. If we are running in a user namespace (i.e. nested in a container), then we need to do the same thing. Add a function to detect that and check for it before doing mknod. Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> --- Changelog - add a comment clarifying what's going on with the uidmap file.	2016-01-08 12:54:08 -08:00
Qiang Huang	9c1242ecba	Add white list for bind mount check Fixes: #400 It would be useful to use fuse to isolate proc info. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-01-06 14:48:40 +08:00
Alexander Morozov	776791463d	Merge pull request #357 from ashahab-altiscale/350-container-in-container Bind mount device nodes on EPERM	2015-11-16 14:54:02 -08:00
Qiang Huang	96f0eefa1a	Fix comment to be consistent with the code Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2015-11-16 19:16:27 +08:00
Abin Shahab	28c9d0252c	Userns container in containers Enables launching userns containers by catching EPERM errors for writing to devices cgroups, and for mknod invocations. Signed-off-by: Abin Shahab <ashahab@altiscale.com>	2015-11-15 14:42:35 -08:00
Qiang Huang	34cff6f2f3	Correct intuition for setupDev Minor fix: previously setupDev=true meant not setting up dev, which is counterintuitive, so just correct it. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2015-10-21 16:06:26 +08:00
Antonio Murdaca	c5b80bddf1	bump docker pkgs Docker pkgs were updated while golinting the whole docker code base. Now when trying to bump libcontainer/runc in docker, it fails to compile with the following error: `` vendor/src/github.com/opencontainers/runc/libcontainer/rootfs_linux.go:424: undefined: mount.MountInfo `` This is because, for instance, the mount pkg was updated here `0f5c9d301b (diff-49294d05afa48e2f7c0d2f02c6f7614c)` and now that type is only `mount.Info`. This patch bumps the docker pkgs commit and adapts the code to it. Signed-off-by: Antonio Murdaca <amurdaca@redhat.com>	2015-10-06 10:48:12 +02:00
Vivek Goyal	da8d776c08	Make pivotDir rprivate pivotDir is where the pivot_root() call puts the old root. We unmount pivotDir and delete it. Previously we always made / rslave or rprivate. That meant pivotDir could never have mounts shared with the parent mount namespace, so unmounting pivotDir was safe: none of the unmounts would propagate to the parent namespace and unmount things we did not want to. But now the user can specify private, shared, or slave propagation on /. That means some of the mounts we inherited from the parent could be shared, and if we unmount pivotDir/, those mounts will get unmounted in the parent too. That's not what we want. Instead make pivotDir rprivate so that unmounts don't propagate back to the parent. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-10-01 17:03:02 -04:00
Vivek Goyal	23ec72a426	Make parent mount of container root private if it is shared. pivot_root() imposes a number of restrictions and fails if they are not met. The parent mount of the container root cannot be shared, otherwise pivot_root() will fail. So far the parent could not be shared, as we marked everything either private or slave. But now we have introduced new propagation modes where the parent mount of the container rootfs could be shared, and pivot_root() will fail. So check whether the parent mount is shared and, if so, make it private. This makes sure pivot_root() works. It also makes sure that when we bind mount the container rootfs, it does not propagate to the parent mount namespace. Otherwise cleanup becomes a problem. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-10-01 17:03:02 -04:00
Vivek Goyal	5dd6caf6cf	Replace config.Privatefs with config.RootPropagation Right now config.Privatefs is a boolean which determines whether / is applied with the propagation flag syscall.MS_PRIVATE | syscall.MS_REC or not. Soon we want to represent other propagation states like private, [r]slave, and [r]shared. So we can either introduce more boolean variables or keep track of propagation flags in an integer variable. Keeping an integer variable is more versatile and allows various kinds of propagation flags to be specified. So replace Privatefs with RootPropagation, which is an integer. Note, this will require changes in docker. Instead of setting Privatefs to true, they will need to set config.RootPropagation = syscall.MS_PRIVATE | syscall.MS_REC Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-10-01 17:03:02 -04:00
Alexander Morozov	4d5079b9dc	Merge pull request #309 from chenchun/fix_reOpenDevNull Fix reOpenDevNull	2015-09-30 19:06:43 -07:00
Alexander Morozov	fba07bce72	Merge pull request #307 from estesp/no-remount-if-unecessary Only remount if requested flags differ from current	2015-09-30 11:40:06 -07:00
Chun Chen	06d91f546f	Fix reOpenDevNull We should open /dev/null with os.O_RDWR, otherwise it won't be possible to write to it. Signed-off-by: Chun Chen <ramichen@tencent.com>	2015-09-30 16:05:49 +08:00
Phil Estes	97f5ee4e6a	Only remount if requested flags differ from current Do not remount a bind mount to enable flags unless non-default flags are provided for the requested mount. This solves a problem with user namespaces and remount of bind mount permissions. Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)	2015-09-29 23:13:04 -04:00
Dan Walsh	cab342f0de	Check for failure on /dev/mqueue and try again without labeling Signed-off-by: Dan Walsh <dwalsh@redhat.com>	2015-09-28 12:31:52 -04:00
Dan Walsh	b4dcb75503	/proc and /sys do not support labeling This is causing docker to crash when --selinux-enforcing mode is set. Signed-off-by: Dan Walsh <dwalsh@redhat.com>	2015-09-28 12:31:52 -04:00
Michael Crosby	203d3e258e	Move mount methods out of configs pkg Do not have methods and actions that require syscalls in the configs package because it breaks cross compile. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-09-24 09:43:12 -07:00
Michael Crosby	5765dcd086	Merge pull request #296 from crosbymichael/mount-resolv-symlink Change mount dest after resolving symlinks	2015-09-23 10:21:25 -07:00
Michael Crosby	b3bb606513	Change mount dest after resolving symlinks We need to update the mount's destination after we resolve symlinks so that it properly creates and mounts the correct location. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-09-23 10:07:18 -07:00
Antonio Murdaca	d6e6462478	Cleanup unused func arguments Signed-off-by: Antonio Murdaca <runcom@linux.com>	2015-09-21 11:50:29 +02:00
Vivek Goyal	d1f4a5b8b5	libcontainer: Allow passing mount propagation flags Right now, if one passes a mount propagation flag in the spec file, it does not take effect. For example, try the following in the spec JSON file: { "type": "bind", "source": "/root/mnt-source", "destination": "/root/mnt-dest", "options": "rbind,shared" } One would expect /root/mnt-dest to be shared inside the container, but that's not the case. #findmnt -o TARGET,PROPAGATION `-/root/mnt-dest private The reason is that propagation flags can't be passed along with other regular flags. They need to be passed in a separate call to the mount syscall, and only one propagation flag at a time (from the mount man page). Hence, store propagation flags separately in a slice and apply them in that order after the mount call wherever appropriate. This allows the user to control the propagation property of a mount point inside the container. Storing them separately also solves another problem, where the recursive flag (syscall.MS_REC) can get mixed up. For example, options "rbind,private" and "bind,rprivate" would be the same, and there would be no way to differentiate between them if all the flags were stored in a single integer. This patch allows one to pass propagation flags "[r]shared,[r]slave,[r]private,[r]unbindable" in the spec file as per the mount property. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-09-16 15:53:23 -04:00
Mrunal Patel	486ac97618	Merge pull request #236 from hqhq/hq_fix_cgroup_rw Always remount for bind mount	2015-09-14 12:08:34 -07:00
Andrey Vagin	da2535f2d1	mount: don't read /proc/self/cgroup many times Signed-off-by: Andrey Vagin <avagin@openvz.org>	2015-09-10 21:00:22 +03:00
Qiang Huang	b7385e291c	Always remount for bind mount Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2015-08-31 11:10:34 +08:00
Michael Crosby	b1e7041957	Merge pull request #165 from calavera/context_labels Make label.Relabel safer.	2015-08-28 14:20:00 -07:00
rajasec	24f7a10a93	Adding securityfs mount Signed-off-by: rajasec <rajasec79@gmail.com>	2015-08-05 16:50:08 +05:30
Mrunal Patel	c9d5850629	Don't make modifications to /dev when there are no devices in the configuration Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2015-08-04 16:57:29 -04:00
David Calavera	4bd4d462af	Make label.Relabel safer. - Check if Selinux is enabled before relabeling. This is a bug. - Make exclusion detection constant time. Kinda buggy too, imo. - Do not depend on a magic string to create a new Selinux context. Signed-off-by: David Calavera <david.calavera@gmail.com>	2015-07-31 10:37:32 -07:00
Kir Kolyshkin	6f82d4b544	Simplify and fix os.MkdirAll() usage TL;DR: checking for IsExist(err) after a failed MkdirAll() is both redundant and wrong -- so two reasons to remove it. Quoting the MkdirAll documentation: > MkdirAll creates a directory named path, along with any necessary > parents, and returns nil, or else returns an error. If path > is already a directory, MkdirAll does nothing and returns nil. This means two things: 1. If a directory to be created already exists, no error is returned. 2. If the error returned is IsExist (EEXIST), it means there exists a non-directory with the same name as MkdirAll needs to use for a directory. Example: we want to MkdirAll("a/b"), but file "a" (or "a/b") already exists, so MkdirAll fails. The above is a theory, based on the quoted documentation and my UNIX knowledge. 3. In practice, though, the current MkdirAll implementation [1] returns ENOTDIR in most of the cases described in #2, with the exception of a race between MkdirAll and someone else creating the last component of the MkdirAll argument as a file. In this very case MkdirAll() will indeed return EEXIST. Because of #1, the IsExist check after MkdirAll is not needed. Because of #2 and #3, ignoring the IsExist error is just plain wrong, as the directory we require is not created. It's cleaner to report the error now. Note this error is all over the tree, I guess due to copy-paste, or trying to follow the same usage pattern as for Mkdir(), or some not quite correct examples on the Internet. [1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go Signed-off-by: Kir Kolyshkin <kir@openvz.org>	2015-07-29 18:03:27 -07:00
Alexander Morozov	d89964eed3	Remount /sys/fs/cgroup as RO if MS_RDONLY was passed in m.Flags Signed-off-by: Alexander Morozov <lk4d4@docker.com>	2015-07-22 11:05:40 -07:00
Alexander Morozov	d3217084b5	Create symlinks for merged cgroups This allows software to be unaware of the existence of merged cgroups. Signed-off-by: Alexander Morozov <lk4d4@docker.com>	2015-07-20 16:12:28 -07:00
Andrey Vagin	af4a5e708a	ct: give criu information about cgroup mounts Actually cgroup mounts are bind-mounts, so they should be handled in the same way. Reported-by: Ross Boucher <rboucher@gmail.com> Signed-off-by: Andrey Vagin <avagin@openvz.org>	2015-07-20 22:56:07 +03:00
Mrunal Patel	5b805276c2	Revert "Remount /sys/fs/cgroup as readonly always" This reverts commit `18de1a273e`. Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2015-07-17 17:50:46 -04:00
Alexander Morozov	18de1a273e	Remount /sys/fs/cgroup as readonly always Signed-off-by: Alexander Morozov <lk4d4@docker.com>	2015-07-17 12:45:09 -07:00
Alexander Morozov	40b9b89107	Subtract bindmount path from cgroup dir Signed-off-by: Alexander Morozov <lk4d4@docker.com>	2015-07-15 10:41:25 -07:00
Mrunal Patel	42aa891a6b	Merge pull request #91 from hqhq/hq_add_cgroup_mount Add cgroup mount in the recommended config	2015-07-15 09:51:24 -07:00
Qiang Huang	d7181a73e4	Add cgroup mount in the recommended config And allow the cgroup mount to take flags from user configs. As we show ro in the recommendation, the hard-coded read-only flag should be removed. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2015-07-15 09:31:39 +08:00
Qiang Huang	b1fd78346e	Correct tmpfs mount for cgroup Fixes: https://github.com/docker/docker/issues/14543 Fixes: https://github.com/docker/docker/pull/14610 Before this, we got mount info in container: ``` sysfs /sys sysfs ro,seclabel,nosuid,nodev,noexec,relatime 0 0 /sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0 cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0 ``` It has no mount source, so in `parseInfoFile` in Docker code, we'll get: ``` Error found less than 3 fields post '-' in "84 83 0:41 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs rw,seclabel" ``` After this fix, we have mount info corrected: ``` sysfs /sys sysfs ro,seclabel,nosuid,nodev,noexec,relatime 0 0 tmpfs /sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0 cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0 ``` Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2015-07-15 09:09:09 +08:00
Michael Crosby	080df7ab88	Update import paths for new repository Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-06-21 19:29:59 -07:00
Michael Crosby	8f97d39dd2	Move libcontainer into subdirectory Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-06-21 19:29:15 -07:00

40 Commits
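
Commit c0ad40c5e6 mentions a function that detects whether the process is already inside a user namespace by looking at the uidmap file. The commit message does not spell out the check, so the sketch below is an assumption: it uses the common heuristic that the initial namespace maps the full ID range onto itself ("0 0 4294967295" in /proc/self/uid_map), and anything else indicates a user namespace. The function name inUserNS is hypothetical; it takes the file contents as a string so the logic can be exercised without reading /proc.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// inUserNS is a hypothetical helper (an assumption, not necessarily the
// function added in the commit). It inspects the first line of a uid_map
// file and reports true unless that line is the identity mapping of the
// full 32-bit range, which is what the initial user namespace shows.
// Unparseable or empty input is treated as "not in a user namespace".
func inUserNS(uidMap string) bool {
	firstLine := strings.SplitN(strings.TrimSpace(uidMap), "\n", 2)[0]
	fields := strings.Fields(firstLine)
	if len(fields) != 3 {
		return false
	}
	var vals [3]int64
	for i, f := range fields {
		v, err := strconv.ParseInt(f, 10, 64)
		if err != nil {
			return false
		}
		vals[i] = v
	}
	// Columns: ID inside the namespace, ID outside, length of the range.
	return !(vals[0] == 0 && vals[1] == 0 && vals[2] == 4294967295)
}

func main() {
	data, err := os.ReadFile("/proc/self/uid_map")
	if err != nil {
		// The file is absent on kernels without user namespace support.
		fmt.Println("uid_map not readable; assuming no user namespace")
		return
	}
	fmt.Println("in user namespace:", inUserNS(string(data)))
}
```

A caller could consult such a check before attempting mknod and fall back to bind mounting the host's device nodes, which is the behaviour the commit message describes.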