jasder/runc - runc - Junke (军科) Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Mrunal Patel	a00bf01908	Merge pull request #1862 from AkihiroSuda/decompose-rootless-pr Disable rootless mode except RootlessCgMgr when executed as the root in userns (fix Docker-in-LXD regression)	2018-10-15 17:32:15 -07:00
Dominik Süß	0b412e9482	various cleanups to address linter issues Signed-off-by: Dominik Süß <dominik@suess.wtf>	2018-10-13 21:14:03 +02:00
Adrian Reber	0d01164756	Fix travis Go: tip This fixes libcontainer/container_linux.go:1200: Error call has possible formatting directive %s Signed-off-by: Adrian Reber <areber@redhat.com>	2018-10-13 10:44:07 +00:00
Aleksa Sarai	e40d4635c4	merge branch 'pr-1894' Move spec.Linux.IntelRdt check to spec.Linux != nil block LGTMs: @crosbymichael @cyphar Closes #1894	2018-10-09 02:41:13 +11:00
Jonathan Marler	1499c746a1	Move spec.Linux.IntelRdt check to spec.Linux != nil block Signed-off-by: Jonathan Marler <johnnymarler@gmail.com>	2018-10-04 21:30:55 -06:00
Mike Brown	26bdc0dce7	clarify license information Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2018-10-03 10:39:44 -05:00
Mrunal Patel	2abd837c8c	Merge pull request #1893 from cyphar/keyctl-ignore-enosys keyring: handle ENOSYS with keyctl(KEYCTL_JOIN_SESSION_KEYRING)	2018-09-25 13:35:16 -07:00
Danail Branekov	a1d5398afa	Respect container's cgroup path Respect the container's cgroup path when finding the container's cgroup mount point, which is useful in multi-tenant environments, where containers have their own unique cgroup mounts Signed-off-by: Danail Branekov <danailster@gmail.com> Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io> Signed-off-by: Giuseppe Capizzi <gcapizzi@pivotal.io>	2018-09-25 17:43:36 +01:00
Aleksa Sarai	578fe65e4f	merge branch 'pr-1817' Fix duplicate entries and missing entries in getCgroupMountsHelper Add test for testing cgroup mounts on bedrock linux Stop relying on number of subsystems for cgroups LGTMs: @crosbymichael @cyphar Closes #1817	2018-09-19 19:48:17 +10:00
Michael Crosby	cc8146cf93	Merge pull request #1858 from marcov/nsenter-README Update outdated nsenter README content	2018-09-17 10:53:19 -04:00
Michael Crosby	d77251d5fc	Merge pull request #1892 from Ace-Tang/add_clean_test test: add more test case for CleanPath	2018-09-17 10:51:17 -04:00
Aleksa Sarai	40f1468413	keyring: handle ENOSYS with keyctl(KEYCTL_JOIN_SESSION_KEYRING) While all modern kernels (and I do mean _all_ of them -- this syscall was added in 2.6.10 before git had begun development!) have support for this syscall, LXC has a default seccomp profile that returns ENOSYS for this syscall. For most syscalls this would be a deal-breaker, and our use of session keyrings is security-based there are a few mitigating factors that make this change not-completely-insane: * We already have a flag that disables the use of session keyrings (for older kernels that had system-wide keyring limits and so on). So disabling it is not a new idea. * While the primary justification of using session keys is security-based, it's more of a security-by-obscurity protection. The main defense keyrings have is VFS credentials -- which is something that users already have better security tools for (setuid(2) and user namespaces). * Given the security justification you might argue that we shouldn't silently ignore this. However, the only way for the kernel to return -ENOSYS is either being ridiculously old (at which point we wouldn't work anyway) or that there is a seccomp profile in place blocking it. Given that the seccomp profile (if malicious) could very easily just return 0 or a silly return code (or something even more clever with seccomp-bpf) and trick us without this patch, there isn't much of a significant change in how much seccomp can trick us with or without this patch. Given all of that over-analysis, I'm pretty convinced there isn't a security problem in this very specific case and it will help out the ChromeOS folks by allowing Docker to run inside their LXC container setup. I'd be happy to be proven wrong. Ref: https://bugs.chromium.org/p/chromium/issues/detail?id=860565 Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-09-17 21:38:30 +10:00
Ace-Tang	5963cf2afc	test: add more test case for CleanPath Signed-off-by: Ace-Tang <aceapril@126.com>	2018-09-14 21:37:12 +08:00
Akihiro Suda	06f789cf26	Disable rootless mode except RootlessCgMgr when executed as the root in userns This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and `RootlessCgroups bool`, so as to make "runc-in-userns" more compatible with "rootful" runc. `RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in the current user namespace. `RootlessEUID` is almost identical to the former `Rootless` except cgroups stuff. `RootlessCgroups` denotes that runc is unlikely to have the full access to cgroups. `RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace. Otherwise `RootlessCgroups` is set to true. (Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well) When runc is executed as the root (euid == 0) in a user namespace (e.g. by Docker-in-LXD, Podman, Usernetes), `RootlessEUID` is set to false but `RootlessCgroups` is set to true. So, "runc-in-userns" behaves almost the same as "rootful" runc except that cgroups errors are ignored. This PR does not have any impact on CLI flags and `state.json`. Note about CLI: * Now `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`. * Now `runc spec --rootless` is only required when `RootlessEUID` is set to true. For runc-in-userns, `runc spec` without `--rootless` should work, when sufficient numbers of UID/GID are mapped. Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`): * `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility. (`/run/runc` is used) * If runc is executed as the root (euid == 0) in a user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`. This allows unprivileged users to execute runc as the root in userns, without mounting writable `/run/runc`. Note about `state.json`: * `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`. Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-09-07 15:05:03 +09:00
Yan Zhu	feb90346e0	doc: fix typo Signed-off-by: Yan Zhu <yanzhu@alauda.io>	2018-09-07 11:58:59 +08:00
Michael Crosby	70ca035aa6	Merge pull request #1883 from lifubang/containeridinpath fix delete other file bug when container id is ..	2018-09-05 13:43:21 -04:00
Mrunal Patel	9cda583235	Merge pull request #1832 from giuseppe/runc-drop-invalid-proc-destination-with-chroot linux: drop check for /proc as invalid dest	2018-09-04 09:26:21 -07:00
Lifubang	4eb30fcdbe	code optimization: use securejoin.SecureJoin and CleanPath Signed-off-by: Lifubang <lifubang@acmcoder.com>	2018-09-04 09:02:18 +08:00
Lifubang	4fae8fcce2	code optimization after review Signed-off-by: Lifubang <lifubang@acmcoder.com>	2018-09-03 23:27:31 +08:00
Lifubang	d2d226e8f9	fix unexpected delete bug when container id is .. Signed-off-by: Lifubang <lifubang@acmcoder.com>	2018-08-31 11:17:42 +08:00
ChangFeng	3ce8fac7c4	libcontainer: add /proc/loadavg to the white list of bind mount Signed-off-by: JunLi <lijun.git@gmail.com>	2018-08-30 21:30:23 +08:00
Giuseppe Scrivano	636b664027	linux: drop check for /proc as invalid dest it is now allowed to bind mount /proc. This is useful for rootless containers when the PID namespace is shared with the host. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2018-08-30 09:56:18 +02:00
Akihiro Suda	b34d6d8a7c	libcontainer: CurrentGroupSubGIDs -> CurrentUserSubGIDs subgid is defined per user, not group (see subgid(5)) This commit also adds support for specifying subuid owner with a numeric UID. Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-08-29 07:46:03 +09:00
Michael Crosby	1555a78945	Merge pull request #1874 from mrunalp/drop_unused_code Remove unused veth setup code	2018-08-27 11:07:25 -04:00
Qiang Huang	0228707b77	Merge pull request #1873 from rhatdan/ms_move When doing a copyup, /tmp can not be a shared mount point	2018-08-27 10:08:53 +08:00
Mrunal Patel	fe3d5c4c6e	Remove unused veth setup code Networking is setup by plugins for users of runc so it makes sense to get rid of the veth strategy. Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2018-08-24 15:41:52 -07:00
Adrian Reber	fa43a72aba	criu: restore into existing namespace when specified Using CRIU to checkpoint and restore a container into an existing network namespace is not possible. If the network namespace is defined like { "type": "network", "path": "/run/netns/test" } there is the expectation that the restored container is again running in the network namespace specified with 'path'. This adds the new CRIU 'external namespace' feature to runc, where during checkpointing that specific namespace is referenced and during restore CRIU tries to restore the container in exactly that namespace. This breaks/fixes current runc behavior. If, without this patch, runc restores a container with such a network namespace definition, it is ignored and CRIU recreates a network namespace without a name. With this patch runc uses the network namespace path (if available) to checkpoint and restore the container in just that network namespace. Restore will now fail if a container was checkpointed with a network namespace path set and if that network namespace path does not exist during restore. runc still falls back to the old behavior if CRIU older than 3.11 is installed. Fixes #1786 Related to https://github.com/projectatomic/libpod/pull/469 Thanks to Andrei Vagin for all the help in getting the interface between CRIU and runc right! Signed-off-by: Adrian Reber <areber@redhat.com>	2018-08-22 23:27:20 +02:00
Daniel J Walsh	62a4763a7a	When doing a copyup, /tmp can not be a shared mount point MOVE_MOUNT will fail under certain situations. You are not allowed to MS_MOVE if the parent directory is shared. man mount ... The move operation Move a mounted tree to another place (atomically). The call is: mount --move olddir newdir This will cause the contents which previously appeared under olddir to now be accessible under newdir. The physical location of the files is not changed. Note that olddir has to be a mountpoint. Note also that moving a mount residing under a shared mount is invalid and unsupported. Use findmnt -o TARGET,PROPAGATION to see the current propagation flags. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2018-08-20 17:41:06 -04:00
Aleksa Sarai	20aff4f048	merge branch 'pr-1867' Revert "libcontainer/rootfs_linux: minor cleanup" LGTMs: @hqhq @cyphar Closes #1867	2018-08-15 15:42:56 +10:00
Mrunal Patel	26ec8a9783	Revert "libcontainer/rootfs_linux: minor cleanup" This reverts commit `1b27db67f1`. Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2018-08-14 15:50:18 -07:00
Marco Vedovati	34ed62697b	Update outdated nsenter README content Signed-off-by: Marco Vedovati <mvedovati@suse.com>	2018-08-07 17:53:56 +02:00
Michael Crosby	4056a41f58	Merge pull request #1830 from crosbymichael/procs Pass GOMAXPROCS to init processes	2018-08-01 10:48:06 -04:00
Jay Kamat	a2faaa1317	Fix duplicate entries and missing entries in getCgroupMountsHelper Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2018-07-31 20:12:18 -07:00
Alban Crequy	3321aa1af7	Fix regression with mounts with non-absolute source path PR #1753 introduced a test on the mount flags but the binary operator was wrong, see https://github.com/opencontainers/runc/pull/1753#discussion_r203445652 This was noticed when investigating https://github.com/opencontainers/runtime-tools/issues/651 Symptoms: in the container, /proc/self/mountinfo displays some mounts as follow: 296 279 0:67 / /tmp rw,nosuid - tmpfs /home/dpark/go/src/github.com/opencontainers/runc/tmpfs rw,size=65536k,mode=755 Signed-off-by: Alban Crequy <alban@kinvolk.io>	2018-07-18 18:30:49 +02:00
Michael Crosby	53fddb540a	Pass GOMAXPROCS to init processes This will help runc's init to not spawn many threads on large systems when launched with max procs by the caller. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-06-26 11:23:37 -04:00
Michael Crosby	2c632d1a2d	Merge pull request #1824 from cyphar/fix-mips-build-devNumber libcontainer: devices: fix mips builds	2018-06-25 13:21:28 -04:00
Jay Kamat	e5a7c61f3c	Add test for testing cgroup mounts on bedrock linux Add a mountinfo from a bedrock linux system with 4 strata, and include it for tests Signed-off-by: Jay Kamat <jaygkamat@gmail.com> Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2018-06-24 00:01:07 +01:00
Daniel Dao	5ee0648bfb	Stop relying on number of subsystems for cgroups When there are complicated mount setups, there can be multiple mount points which have the subsystem we are looking for. Instead of counting the mountpoints, tick off subsystems until we have found them all. Without the 'all' flag, ignore duplicate subsystems after the first. Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2018-06-24 00:00:58 +01:00
Aleksa Sarai	823c06eae9	libcontainer: improve "kernel.{domainname,hostname}" sysctl handling These sysctls are namespaced by CLONE_NEWUTS, and we need to use "kernel.domainname" if we want users to be able to set an NIS domainname on Linux. However we disallow "kernel.hostname" because it would conflict with the "hostname" field and cause confusion (but we include a helpful message to make it clearer to the user). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-06-18 21:48:04 +10:00
Aleksa Sarai	a0e99e7a1a	libcontainer: devices: fix mips builds It turns out that MIPS uses uint32 in the device number returned by stat(2), so explicitly wrap everything to make the compiler happy. I really wish that Go had C-like numeric type promotion. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-06-17 11:22:01 +10:00
Mrunal Patel	ad0f525506	Merge pull request #1819 from tiborvass/fix-arm32bit libcontainer: fix compilation on GOARCH=arm GOARM=6 (32 bits)	2018-06-15 07:06:50 -07:00
Tibor Vass	c205e9fb64	libcontainer: fix compilation on GOARCH=arm GOARM=6 (32 bits) This fixes the following compilation error on 32bit ARM: ``` $ GOARCH=arm GOARM=6 go build ./libcontainer/system/ libcontainer/system/linux.go:119:89: constant 4294967295 overflows int ``` Signed-off-by: Tibor Vass <tibor@docker.com>	2018-06-14 18:33:14 +00:00
Giuseppe Scrivano	cbcc85d311	runc: not require uid/gid mappings if euid()==0 When running in a new userNS as root, don't require a mapping to be present in the configuration file. We are already skipping the test for a new userns to be present. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2018-06-12 12:45:54 +02:00
Daniel J Walsh	aa3fee6c80	SELinux labels are tied to the thread We need to lock the threads for the SetProcessLabel to work, should also call SetProcessLabel("") after the container starts to go back to the default SELinux behaviour. Once you call SetProcessLabel, then any process executed by runc will run with this label, even if the process is for setup rather than the container. It is always safest to call the SELinux calls just before the exec of the container, so that other processes do not get started with the incorrect label. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2018-06-11 08:34:58 -04:00
Aleksa Sarai	dd56ece823	merge branch 'pr-1812' Fix race in runc exec LGTMs: @dqminh @cyphar Closes #1812	2018-06-04 19:02:33 +10:00
Daniel, Dao Quang Minh	2e91544060	Merge pull request #1806 from cyphar/cgroup-ignorable-error-fixup cgroup: clean up isIgnorableError for skippable EROFS	2018-06-02 23:57:02 +01:00
Mrunal Patel	bd3c4f844a	Fix race in runc exec There is a race in runc exec when the init process stops just before the check for the container status. It is then wrongly assumed that we are trying to start an init process instead of an exec process. This commit adds an Init field to libcontainer Process to distinguish between init and exec processes to prevent this race. Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2018-06-01 16:25:58 -07:00
Michael Crosby	0e561642f8	Merge pull request #1688 from AkihiroSuda/unshare-m-r main: support rootless mode in userns	2018-05-29 15:41:17 -04:00
Aleksa Sarai	939d5a3753	cgroup: clean up isIgnorableError for skippable EROFS Include a rootless argument for isIgnorableError to avoid people accidentally using isIgnorableError when they shouldn't (we don't ignore any errors when running as root as that really isn't safe). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-05-25 11:31:41 +10:00
Qiang Huang	dd67ab10d7	Merge pull request #1759 from cyphar/rootless-erofs-as-eperm rootless: cgroup: treat EROFS as a skippable error	2018-05-25 09:24:16 +08:00
Daniel, Dao Quang Minh	2e931185f9	Merge pull request #1805 from derekwaynecarr/systemd-cpuquota-fix fix systemd cpu quota for -1	2018-05-24 11:24:27 +01:00
Akihiro Suda	c93815738a	libcontainer: remove extra CAP_SETGID check for SetgroupAttr Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-05-24 14:59:30 +09:00
Derek Carr	b515963c10	systemd cpu quota ignores -1 Signed-off-by: Derek Carr <decarr@redhat.com>	2018-05-23 14:28:39 -04:00
Michael Crosby	fd0febd3ce	Wrap error messages during init Fixes #1437 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-05-10 10:28:10 -04:00
Akihiro Suda	f103de57ec	main: support rootless mode in userns Running rootless containers in userns is useful for mounting filesystems (e.g. overlay) with mapped euid 0, but without actual root privilege. Usage: (Note that `unshare --mount` requires `--map-root-user`) user$ mkdir lower upper work rootfs user$ curl http://dl-cdn.alpinelinux.org/alpine/v3.7/releases/x86_64/alpine-minirootfs-3.7.0-x86_64.tar.gz | tar Cxz ./lower || ( true; echo "mknod errors were ignored" ) user$ unshare --mount --map-root-user mappedroot# runc spec --rootless mappedroot# sed -i 's/"readonly": true/"readonly": false/g' config.json mappedroot# mount -t overlay -o lowerdir=./lower,upperdir=./upper,workdir=./work overlayfs ./rootfs mappedroot# runc run foo Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-05-10 12:16:43 +09:00
Akihiro Suda	9c7d8bc1fd	libcontainer: add parser for /etc/sub{u,g}id and /proc/PID/{u,g}id_map Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-05-10 12:16:43 +09:00
Mrunal Patel	0cbfd8392f	Merge pull request #1562 from cyphar/carry-975-959-ipc-uid-namespaces nsenter: improve namespace creation and SELinux IPC handling	2018-04-26 14:12:33 -07:00
Mrunal Patel	871ba2e58e	Merge pull request #1781 from filbranden/systemd3 Make channel for StartTransientUnit buffered	2018-04-24 11:56:34 -07:00
Michael Crosby	bdbb9fab07	Merge pull request #1693 from AkihiroSuda/leave-setgroups-allow libcontainer: allow setgroup in rootless mode	2018-04-24 11:24:04 -04:00
Michael Crosby	1f11dc5dba	Merge pull request #1785 from dlorenc/seccomp Make the setupSeccomp function public.	2018-04-19 16:00:54 -04:00
Mrunal Patel	63e6708c74	Merge pull request #1784 from pierrchen/master libcontainer/rootfs_linux: minor cleanup	2018-04-17 17:02:10 -07:00
dlorenc	40680b2d37	Make the setupSeccomp function public. This function is useful for converting from the OCI spec format to the one used by runC/libcontainer. Signed-off-by: dlorenc <lorenc.d@gmail.com>	2018-04-17 10:47:22 -07:00
Michael Crosby	d56f6cc202	Merge pull request #1753 from wking/do-not-require-bind-mount-type libcontainer/specconv/spec_linux: Support empty 'type' for bind mounts	2018-04-16 11:01:53 -04:00
Bin Chen	1b27db67f1	libcontainer/rootfs_linux: minor cleanup move variable close to where is used Signed-off-by: Bin Chen <nk@devicu.com>	2018-04-16 22:25:48 +10:00
Filipe Brandenburger	165ee45334	Make channel for StartTransientUnit buffered So that, if a timeout happens and we decide to stop blocking on the operation, the writer will not block when they try to report the result of the operation. This should address Issue #1780 and it's a follow up for PR #1683, PR #1754 and PR #1772. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-14 08:49:50 -07:00
Michael Crosby	f753f300ae	Merge pull request #1779 from runcom/gcc8-fix nsexec.c: fix GCC 8 warning	2018-04-12 12:13:43 -04:00
Michael Crosby	9f0eca2a94	Merge pull request #1777 from nalind/no-config-for-extant-netns Only configure networking when creating a net ns	2018-04-12 10:55:02 -04:00
Antonio Murdaca	1a5064622c	nsexec.c: fix GCC 8 warning Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2018-04-12 12:25:06 +02:00
Nalin Dahyabhai	4521d4b19c	Only configure networking when creating a net ns When joining an existing namespace, don't default to configuring a loopback interface in that namespace. Its creator should have done that, and we don't want to fail to create the container when we don't have sufficient privileges to configure the network namespace. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>	2018-04-11 13:28:19 -04:00
Filipe Brandenburger	0e16bd9b53	Detect whether Delegate is available on both slices and scopes Starting with systemd 237, in preparation for cgroup v2, delegation is only now available for scopes, not slices. Update libcontainer code to detect whether delegation is available on both and use that information when creating new slices. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-10 11:42:55 -07:00
Filipe Brandenburger	8ab251f298	Fix systemd.Apply() to check for DBus error before waiting on a channel. The channel was introduced in #1683 to work around a race condition. However, the check for error in StartTransientUnit ignores the error for an already existing unit, and in that case there will be no notification from DBus (so waiting on the channel will make it hang.) Later PR #1754 added a timeout, which worked around the issue, but we can fix this correctly by only waiting on the channel when there is no error. Fix the code to do so. The timeout handling was kept, since there might be other cases where this situation occurs (https://bugzilla.redhat.com/show_bug.cgi?id=1548358 mentions calling this code from inside a container, it's unclear whether an existing container was in use or not, so not sure whether this would have fixed that bug as well.) Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-09 11:51:59 -07:00
Sebastien Boeuf	985628dda0	libcontainer: Don't set container state to running when exec'ing There is no reason to set the container state to "running" as a temporary value when exec'ing a process on a container in "created" state. The problem doing this is that consumers of the libcontainer library might use it by keeping pointers in memory. In this case, the container state will indicate that the container is running, which is wrong, and this will end up with a failure on the next action because the check for the container state transition will complain. Fixes #1767 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-03-30 09:29:18 -07:00
Akihiro Suda	73f3dc6389	libcontainer: allow setgroup in rootless mode Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-03-27 17:42:05 +09:00
Akihiro Suda	ed58366cc8	libcontainer: fix Boolmsg alignment Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-03-26 14:44:03 +09:00
Tamal Saha	58415b4b12	Fix error message Signed-off-by: Tamal Saha <tamal@appscode.com>	2018-03-21 20:52:09 -07:00
Aleksa Sarai	fd3a6e6c83	libcontainer: handle unset oomScoreAdj correctly Previously if oomScoreAdj was not set in config.json we would implicitly set oom_score_adj to 0. This is not allowed according to the spec: > If oomScoreAdj is not set, the runtime MUST NOT change the value of > oom_score_adj. Change this so that we do not modify oom_score_adj if oomScoreAdj is not present in the configuration. While this modifies our internal configuration types, the on-disk format is still compatible. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-03-17 13:53:42 +11:00
Aleksa Sarai	03e585985f	rootless: cgroup: treat EROFS as a skippable error In some cases, /sys/fs/cgroups is mounted read-only. In rootless containers we can consider this effectively identical to having cgroups that we don't have write permission to -- because the user isn't responsible for the read-only setup and cannot modify it. The rules are identical to when /sys/fs/cgroups is not writable by the unprivileged user. An example of this is the default configuration of Docker, where cgroups are mounted as read-only as a preventative security measure. Reported-by: Vladimir Rutsky <rutsky@google.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-03-17 13:53:42 +11:00
Daniel J Walsh	43aea05946	Label the masked tmpfs with the mount label Currently if a confined container process tries to list these directories AVC's are generated because they are labeled with external labels. Adding the mountlabel will remove these AVC's. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2018-03-09 14:29:06 -05:00
Qiang Huang	9facb87f87	Merge pull request #1754 from vikaschoudhary16/add-timeout Add timeout while waiting for StartTransientUnit completion signal	2018-03-08 09:09:34 +08:00
W. Trevor King	0aa6e4e5d3	libcontainer/specconv/spec_linux: Support empty 'type' for bind mounts From the "Creating a bind mount" section of mount(2) [1]: > If mountflags includes MS_BIND (available since Linux 2.4), then > perform a bind mount... > > The filesystemtype and data arguments are ignored. This commit adds support for configurations that leave the OPTIONAL type [2] unset for bind mounts. There's a related spec-example change in flight with [3], although my personal preference would be a more explicit spec for the whole mount structure [4]. [1]: http://man7.org/linux/man-pages/man2/mount.2.html [2]: https://github.com/opencontainers/runtime-spec/blame/v1.0.1/config.md#L102 [3]: https://github.com/opencontainers/runtime-spec/pull/954 [4]: https://github.com/opencontainers/runtime-spec/pull/771 Signed-off-by: W. Trevor King <wking@tremily.us>	2018-03-07 10:23:42 -08:00
vikaschoudhary16	04e95b526d	Add timeout while waiting for StartTransientUnit completion signal from dbus Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2018-03-07 05:11:38 -05:00
Denys Smirnov	3d26fc3fd7	cgroups/fs: fix NPE on Destroy when no cgroups are set Currently Manager accepts nil cgroups when calling Apply, but it will panic when trying to call Destroy with the same config. Signed-off-by: Denys Smirnov <denys@sourced.tech>	2018-03-06 23:31:31 +01:00
Vincent Batts	bf74951617	libcontainer/user: platform dependent calls This rearranges a bit of the user and group lookup, such that only a basic subset is exposed. Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2018-02-28 14:14:24 -05:00
Aleksa Sarai	757e78bebd	merge branch 'pr-1743' The setupUserNamespace function is always called. LGTMs: @crosbymichael @mrunalp @cyphar Closes #1743	2018-02-27 12:22:52 +11:00
Michael Crosby	8aca07289d	Merge pull request #1736 from allencloud/fix-lint-warning fix lint error in specconv	2018-02-26 14:21:26 -05:00
ynirk	2420eb1f4d	The setupUserNamespace function is always called. The function is called even if the usernamespace is not set. This results in having wrong uid/gid set on devices. This fix adds a test to check if usernamespace is set before calling setupUserNamespace. Fixes #1742 Signed-off-by: Julien Lavesque <julien.lavesque@gmail.com>	2018-02-26 14:27:11 +01:00
Allen Sun	3f32e72963	fix lint error in specconv Signed-off-by: Allen Sun <allensun.shl@alibaba-inc.com>	2018-02-26 15:39:54 +08:00
Michael Crosby	595bea022f	Merge pull request #1722 from ravisantoshgudimetla/fix-systemd-path fix systemd slice expansion so that it could be consumed by cAdvisor	2018-02-20 09:59:24 -05:00
W. Trevor King	50dc7ee96c	libcontainer/capabilities_linux: Drop os.Getpid() call gocapability has supported 0 as "the current PID" since syndtr/gocapability@5e7cce49 (Allow to use the zero value for pid to operate with the current task, 2015-01-15, syndtr/gocapability#2). libcontainer was ported to that approach in `444cc298` (namespaces: allow to use pid namespace without mount namespace, 2015-01-27, docker/libcontainer#358), but the change was clobbered by `22df5551` (Merge branch 'master' into api, 2015-02-19, docker/libcontainer#388) which landed via `5b73860e` (Merge pull request #388 from docker/api, 2015-02-19, docker/libcontainer#388). This commit restores the changes from `444cc298`. Signed-off-by: W. Trevor King <wking@tremily.us>	2018-02-19 15:47:42 -08:00
ravisantoshgudimetla	7019e1de7b	fix systemd slice expansion so that it could be consumed by cAdvisor Signed-off-by: ravisantoshgudimetla <ravisantoshgudimetla@gmail.com>	2018-02-18 21:32:39 -05:00
Mrunal Patel	6e15bc3f92	Merge pull request #1702 from crosbymichael/chroot chroot when no mount namespaces is provided	2018-02-07 10:09:35 -08:00
W. Trevor King	be16b13645	libcontainer/state_linux_test: Add a testTransitions helper The helper DRYs up the transition tests and makes it easy to get complete coverage for invalid transitions. I'm also using t.Run() for subtests. Run() is new in Go 1.7 [1], but runc dropped support for 1.6 back in `e773f96b` (update go version at travis-ci, 2017-02-20, #1335). [1]: https://blog.golang.org/subtests Signed-off-by: W. Trevor King <wking@tremily.us>	2018-01-25 11:18:45 -08:00
Michael Crosby	91ca331474	chroot when no mount namespaces is provided Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-01-25 11:36:37 -05:00
Michael Crosby	c4e4bb0df2	Merge pull request #1699 from AkihiroSuda/indent-c make: validate C format	2018-01-25 10:09:09 -05:00
Aleksa Sarai	5a46c2ba8b	nsenter: move namespace creation after userns creation Technically, this change should not be necessary, as the kernel documentation claims that if you call clone(flags|CLONE_NEWUSER), the new user namespace will be the owner of all other namespaces created in @flags. Unfortunately this isn't always the case, due to various additional semantics and kernel bugs. One particular instance is SELinux, which acts very strangely towards the IPC namespace and mqueue. If you unshare the IPC namespace before you map a user in the user namespace, the IPC namespace's internal kern-mount for mqueue will be labelled incorrectly and the container won't be able to access it. The only way of solving this is to unshare IPC after the user has been mapped and we have changed to that user. I've also heard of this happening to the NET namespace while talking to some LXC folks, though I haven't personally seen that issue. This change matches our handling of user namespaces to be the same as how LXC handles these problems. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-01-25 23:56:49 +11:00
Akihiro Suda	dd5eb3b9e3	make: validate C format Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-01-24 10:49:50 +09:00
Ed King	5c0af14bf8	Return from goroutine when it should terminate Signed-off-by: Craig Furman <cfurman@pivotal.io>	2018-01-23 10:46:31 +00:00
Will Martin	8d3e6c9826	Avoid race when opening exec fifo When starting a container with `runc start` or `runc run`, the stub process (runc[2:INIT]) opens a fifo for writing. Its parent runc process will open the same fifo for reading. In this way, they synchronize. If the stub process exits at the wrong time, the parent runc process will block forever. This can happen when racing 2 runc operations against each other: `runc run/start`, and `runc delete`. It could also happen for other reasons, e.g. the kernel's OOM killer may select the stub process. This commit resolves this race by racing the opening of the exec fifo from the runc parent process against the stub process exiting. If the stub process exits before we open the fifo, we return an error. Another solution is to wait on the stub process. However, it seems it would require more refactoring to avoid calling wait multiple times on the same process, which is an error. Signed-off-by: Craig Furman <cfurman@pivotal.io>	2018-01-22 17:03:02 +00:00
Antonio Murdaca	cd1e7abee2	libcontainer: expose annotations in hooks Annotations weren't passed to hooks. This patch fixes that by passing annotations to stdin for hooks. Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2018-01-11 16:54:01 +01:00
vikaschoudhary16	d5b4a3eddb	Fix race against systemd - T0: runc triggers a systemd unit creation asynchronously from [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L298) - T1: runc then moves ahead and starts creating cgroup paths(.scope directories), [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L348). Kernel creates .scope directory and cgroup.procs file(along with other default files) in the directory automatically, in an atomic manner. - T3: systemd execution thread which was invoked at time `T0`, is still in the process of unit creation. systemd also trying to create cgroup paths and deletes the `.scope` directory which is created at time `T1` by runc from [here](https://github.com/systemd/systemd/blob/v219/src/shared/cgroup-util.c#L1630) in the code Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2018-01-08 09:37:26 -05:00
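
The rules stated in commit 06f789cf26 for deriving `RootlessEUID`, `RootlessCgroups`, and the `rootless` field of `state.json` can be read as a small truth table. The Go sketch below is only an illustration of that reading, not runc's code: `rootlessFlags`, `decideRootless`, `stateRootless`, and the `inUserNS` parameter are hypothetical names, and the sketch assumes the automatic behaviour, ignoring any override from the `--rootless` CLI flag.

```go
package main

import "fmt"

// rootlessFlags holds the two booleans named in commit 06f789cf26.
// The struct itself is hypothetical and exists only for this sketch.
type rootlessFlags struct {
	RootlessEUID    bool
	RootlessCgroups bool
}

// decideRootless is a hypothetical helper (not a runc function).
// euid is the effective UID of the process; inUserNS reports whether
// the process runs in a user namespace other than the initial one.
func decideRootless(euid int, inUserNS bool) rootlessFlags {
	return rootlessFlags{
		// Non-root in the current user namespace.
		RootlessEUID: euid != 0,
		// False only for root in the initial namespace, true otherwise.
		RootlessCgroups: euid != 0 || inUserNS,
	}
}

// stateRootless mirrors the commit's note that `rootless` in state.json
// is true only when both flags are true.
func stateRootless(f rootlessFlags) bool {
	return f.RootlessEUID && f.RootlessCgroups
}

func main() {
	cases := []struct {
		name     string
		euid     int
		inUserNS bool
	}{
		{"root in initial namespace", 0, false},
		{"root in user namespace (runc-in-userns)", 0, true},
		{"non-root user", 1000, false},
	}
	for _, c := range cases {
		f := decideRootless(c.euid, c.inUserNS)
		fmt.Printf("%s: RootlessEUID=%t RootlessCgroups=%t state.rootless=%t\n",
			c.name, f.RootlessEUID, f.RootlessCgroups, stateRootless(f))
	}
}
```

The middle case is the one the commit targets: root inside a user namespace keeps "rootful" behaviour except that cgroup errors may be ignored.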


1211 Commits