jasder/runc - runc - 军科开源项目托管

Commit Graph

Author	SHA1	Message	Date
Michael Crosby	029124da7a	Merge pull request #2031 from lifubang/selinux Add selinux validate in runc exec	2019-04-03 16:09:19 -04:00
lifubang	3e6688f5c9	add selinux label for runc exec Signed-off-by: lifubang <lifubang@acmcoder.com>	2019-04-03 12:09:06 +08:00
Mrunal Patel	6a3f4749b8	Merge pull request #2032 from rhatdan/selinux Fix SELinux failures on disabled SELinux Machines	2019-04-02 13:39:48 -07:00
Daniel J Walsh	dcf994b4f8	Fix SELinux failures on disabled SELinux Machines On some machines, setting the SELinux key labels to "" causes failures that make runc fail, even if SELinux is disabled. This check will ignore callers calling SELinux Set*Label functions with "" when SELinux is disabled. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2019-04-02 10:27:27 -04:00
Aleksa Sarai	da2021132b	merge branch 'pr-2026' VERSION: back to development VERSION: release v1.0.0-rc7 Votes: +5 -0 /0 LGTMs: [unanimous]	2019-03-29 02:19:24 +11:00
Aleksa Sarai	6b5ee713f3	VERSION: back to development Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-03-28 22:46:35 +11:00
Aleksa Sarai	69ae5da6af	VERSION: release v1.0.0-rc7 Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-03-28 22:45:53 +11:00
Michael Crosby	11fc498ffa	Merge pull request #2023 from LittleLightLittleFire/2022-fix-runc-zombie-process-regression Fixes regression causing zombie runc:[1:CHILD] processes	2019-03-22 14:06:31 -04:00
Mrunal Patel	dd22a84864	Merge pull request #2012 from rhatdan/selinux Need to setup labeling of kernel keyrings.	2019-03-20 21:17:18 -07:00
Alex Fang	eab5330908	Fixes regression causing zombie runc:[1:CHILD] processes Whenever processes are spawned using nsexec, a zombie runc:[1:CHILD] process will always be created and will need to be reaped by the parent Signed-off-by: Alex Fang <littlelightlittlefire@gmail.com>	2019-03-21 13:43:38 +11:00
Aleksa Sarai	f56b4cbead	merge branch 'pr-2015' Use getenv not secure_getenv LGTMs: @crosbymichael @cyphar Closes #2015	2019-03-16 17:30:56 +11:00
Daniel, Dao Quang Minh	7341c22d46	Merge pull request #2014 from filbranden/testing1 Add $RUNC_USE_SYSTEMD to run tests using systemd cgroup driver	2019-03-15 10:49:13 +00:00
Filipe Brandenburger	9fe7c939f8	Add a Travis-CI job for systemd cgroup driver The additional test shows as a separate job. It sets environment RUNC_USE_SYSTEMD=1 so it will be clear in Travis-CI that this job is testing the systemd cgroup driver. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-03-14 18:53:27 -07:00
Filipe Brandenburger	5369f9ade3	Skip CRIU tests when $RUNC_USE_SYSTEMD for now These tests sometimes hang, so let's skip them for now. Tested: $ sudo make localintegration TESTPATH='/checkpoint.bats' RUNC_USE_SYSTEMD=1 The 5 tests in this test suite will be skipped. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-03-14 14:53:09 -07:00
Filipe Brandenburger	d4586090c4	Update tests that depend on cgroupfs paths to consider systemd cgroups When $RUNC_USE_SYSTEMD is set, then use a systemd syntax for the cgroupsPath. Also fix $CGROUPS_PATH to look under the actual path to the slice/scope created by systemd. Tested: $ sudo make localintegration TESTPATH='/cgroups.bats' RUNC_USE_SYSTEMD=1 That test will fail without this commit. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-03-14 14:51:24 -07:00
Filipe Brandenburger	a9056a348f	Add $RUNC_USE_SYSTEMD to use systemd cgroup driver in tests This allows us to test runc using libcontainer's systemd driver, by passing an extra `--systemd-cgroup` argument to the calls to runc. Tested: $ sudo make localintegration TESTPATH='/exec.bats' RUNC_USE_SYSTEMD=1 And confirmed that systemd was in use by looking at creation and removal of libcontainer_<pid>_systemd_test_default.slice test slices. Also introduced a breakage in systemd cgroup driver and confirmed that the tests failed as expected. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-03-14 10:26:47 -07:00
Filipe Brandenburger	4b2b978291	Add cgroup name to error message More information should help troubleshoot an issue when this error occurs. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-03-14 10:25:00 -07:00
Justin Cormack	6f714aa928	Use getenv not secure_getenv secure_getenv is a Glibc extension and so this code does not compile on Musl libc any more after this patch. secure_getenv is only intended to be used in setuid binaries, in order that they should not trust their environment. It simply returns NULL if the binary is running setuid. If runc was installed setuid, the user can already do anything as root, so it is game over, so this check is not needed. Signed-off-by: Justin Cormack <justin.cormack@docker.com>	2019-03-14 10:58:10 +00:00
Daniel J Walsh	cd96170c10	Need to setup labeling of kernel keyrings. Work is ongoing in the kernel to support different kernel keyrings per user namespace. We want to allow SELinux to manage kernel keyrings inside of the container. Currently when runc creates the kernel keyring it gets the label which runc is running with, usually `container_runtime_t`; with this change the kernel keyring will be labeled with the container process label container_t:s0:c1,c2. A container running as container_t:s0:c1,c2 can manage keyrings with the same label. This change required a revendoring of the SELinux Go bindings, github.com/opencontainers/selinux. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2019-03-13 17:57:30 -04:00
Mrunal Patel	2b18fe1d88	Merge pull request #1984 from cyphar/memfd-cleanups nsenter: cloned_binary: "memfd" cleanups	2019-03-07 10:18:33 -08:00
Aleksa Sarai	923a8f8a9a	merge branch 'pr-2001' README: link to /org/security/ LGTMs: @crosbymichael @cyphar Closes #2001	2019-03-05 18:45:55 +11:00
Michael Crosby	f739110263	Merge pull request #1968 from adrianreber/podman Create bind mount mountpoints during restore	2019-03-04 11:37:07 -06:00
Michael Crosby	f416cac1fa	Merge pull request #2000 from lifubang/preserve-fds-error fix preserve-fds flag may cause runc hang	2019-03-04 10:45:17 -06:00
Vincent Batts	dbf6e48d0f	README: link to /org/security/ Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2019-03-03 15:01:08 -05:00
Aleksa Sarai	2d4a37b427	nsenter: cloned_binary: userspace copy fallback if sendfile fails There are some circumstances where sendfile(2) can fail (one example is that AppArmor appears to block writing to deleted files with sendfile(2) under some circumstances) and so we need to have a userspace fallback. It's fairly trivial (and handles short-writes). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-03-01 23:29:10 +11:00
Aleksa Sarai	16612d74de	nsenter: cloned_binary: try to ro-bind /proc/self/exe before copying The usage of memfd_create(2) and other copying techniques is quite wasteful, despite attempts to minimise it with _LIBCONTAINER_STATEDIR. memfd_create(2) added ~10M of memory usage to the cgroup associated with the container, which can result in some setups getting OOM'd (or just hogging the hosts' memory when you have lots of created-but-not-started containers sticking around). The easiest way of solving this is by creating a read-only bind-mount of the binary, opening that read-only bindmount, and then umounting it to ensure that the host won't accidentally be re-mounted read-write. This avoids all copying and cleans up naturally like the other techniques used. Unfortunately, like the O_TMPFILE fallback, this requires being able to create a file inside _LIBCONTAINER_STATEDIR (since bind-mounting over the most obvious path -- /proc/self/exe -- is a very bad idea). Unfortunately detecting this isn't fool-proof -- on a system with a read-only root filesystem (that might become read-write during "runc init" execution), we cannot tell whether we have already done an ro remount. As a partial mitigation, we store a _LIBCONTAINER_CLONED_BINARY environment variable which is checked alongside the protection being present. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-03-01 23:29:08 +11:00
Aleksa Sarai	af9da0a450	nsenter: cloned_binary: use the runc statedir for O_TMPFILE Writing a file to tmpfs actually incurs a memcg penalty, and thus the benefit of being able to disable memfd_create(2) with _LIBCONTAINER_DISABLE_MEMFD_CLONE is fairly minimal -- though it should be noted that quite a few distributions don't use tmpfs for /tmp (and instead have it as a regular directory or subvolume of the host filesystem). Since runc must have write access to the state directory anyway (and the state directory is usually not on a tmpfs) we can use that instead of /tmp -- avoiding potential memcg costs with no real downside. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-03-01 23:28:51 +11:00
Aleksa Sarai	2429d59352	nsenter: cloned_binary: expand and add pre-3.11 fallbacks In order to get around the memfd_create(2) requirement, `0a8e4117e7` ("nsenter: clone /proc/self/exe to avoid exposing host binary to container") added an O_TMPFILE fallback. However, this fallback was flawed in two ways: * It required O_TMPFILE which is relatively new (having been added to Linux 3.11). * The fallback choice was made at compile-time, not runtime. This results in several complications when it comes to running binaries on different machines to the ones they were built on. The easiest way to resolve these things is to have fallbacks work in a more procedural way (though it does make the code unfortunately more complicated) and to add a new fallback that uses mkostemp(3). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-03-01 23:28:50 +11:00
lifubang	7cb3cde1f4	fix preserve-fds flag may cause runc hang Signed-off-by: lifubang <lifubang@acmcoder.com>	2019-03-01 17:15:17 +08:00
Aleksa Sarai	5b775bf297	nsenter: cloned_binary: detect and handle short copies For a variety of reasons, sendfile(2) can end up doing a short-copy so we need to just loop until we hit the binary size. Since /proc/self/exe is tautologically our own binary, there's no chance someone is going to modify it underneath us (or change its size). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-02-26 19:51:01 +11:00
Mrunal Patel	f79e211b1d	Merge pull request #1995 from giuseppe/exec-preserve-fds exec: expose --preserve-fds	2019-02-25 17:35:28 -08:00
Giuseppe Scrivano	52f4e0facc	exec: expose --preserve-fds The implementation is already there, we only need to add the CLI option and pass it down. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-02-25 17:33:04 +01:00
Mrunal Patel	5b5130ad76	Merge pull request #1963 from adrianreber/go-criu Vendor in go-criu and use it for CRIU's RPC definition	2019-02-23 10:44:28 -08:00
Michael Crosby	8084f7611e	Merge pull request #1986 from adrianreber/master switched travis to xenial	2019-02-21 15:36:02 -05:00
Adrian Reber	f1da0d3008	switched travis to xenial The CRIU test for lazy migration was always skipped in Travis because the kernel was too old. This switches Travis testing to dist: xenial which provides a newer kernel which enables CRIU lazy migration testing. Signed-off-by: Adrian Reber <areber@redhat.com>	2019-02-16 19:45:22 +01:00
Aleksa Sarai	751f18de2a	merge branch 'pr-1982' nsexec (CVE-2019-5736): avoid parsing environ LGTMs: @cyphar @crosbymichael Closes #1982	2019-02-15 18:40:33 +11:00
Adrian Reber	9edb5494bb	Use vendored in CRIU Go bindings This makes use of the vendored in Go bindings and removes the copy of the CRIU RPC interface definition. runc now relies on go-criu for RPC definition and hopefully more CRIU functions can be used in the future from the CRIU Go bindings. Signed-off-by: Adrian Reber <areber@redhat.com>	2019-02-14 18:20:02 +01:00
Adrian Reber	bfca1e6262	Vendor in go-criu Now that CRIU has released Go bindings, this commit vendors those in. At first it only replaces the copy of RPC interface but the goal is to use CRIU functions from the Go bindings instead of replicating the functionality in runc. Signed-off-by: Adrian Reber <areber@redhat.com>	2019-02-14 18:20:02 +01:00
Christian Brauner	bb7d8b1f41	nsexec (CVE-2019-5736): avoid parsing environ My first attempt to simplify this and make it less costly focussed on the way constructors are called. I was under the impression that the ELF specification mandated that argc, argv, and actually even envp need to be passed to functions located in the .init_array section (aka "constructors"). Actually, the specification is (cf. [2]): SHT_INIT_ARRAY This section contains an array of pointers to initialization functions, as described in ``Initialization and Termination Functions'' in Chapter 5. Each pointer in the array is taken as a parameterless procedure with a void return. which means that this becomes a libc specific decision. Glibc passes down those args, musl doesn't. So this approach can't work. However, we can at least remove the environment parsing part based on POSIX since [1] mandates that there should be an environ variable defined in unistd.h which provides access to the environment. See also the relevant Open Group specification [1]. [1]: http://pubs.opengroup.org/onlinepubs/9699919799/ [2]: http://www.sco.com/developers/gabi/latest/ch4.sheader.html#init_array Fixes: CVE-2019-5736 Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2019-02-14 16:06:21 +01:00
Mrunal Patel	f414f497b5	Merge pull request #1978 from filbranden/systemd5 Remove detection for scope properties, which have always been broken	2019-02-13 11:54:08 -08:00
Daniel, Dao Quang Minh	0a012df867	Merge pull request #1973 from jhowardmsft/jjh/runtimespec Vendor opencontainers/runtime-spec 29686dbc	2019-02-12 17:07:43 +00:00
Filipe Brandenburger	cd41feb46b	Remove detection for scope properties, which have always been broken The detection for scope properties (whether scope units support DefaultDependencies= or Delegate=) has always been broken, since systemd refuses to create scopes unless at least one PID is attached to it (and this has been so since scope units were introduced in systemd v205.) This can be seen in journal logs whenever a container is started with libpod: Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing. Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing. Since this logic never worked, just assume both attributes are supported (which is what the code does when detection fails for this reason, since it's looking for an "unknown attribute" or "read-only attribute" to mark them as false) and skip the detection altogether. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-02-11 16:05:37 -08:00
Adrian Reber	7354546cc8	Create mountpoints also on restore runc creates all missing mountpoints when it starts a container, this commit also creates those mountpoints during restore. Now it is possible to restore a container using the same, but newly created rootfs just as during container start. Signed-off-by: Adrian Reber <areber@redhat.com>	2019-02-08 15:59:51 +01:00
Adrian Reber	f661e02343	factor out bind mount mountpoint creation During rootfs setup all mountpoints (directory and files) are created before bind mounting the bind mounts. This does not happen during container restore via CRIU. If restoring in an identical but newly created rootfs, the restore fails right now. This just factors out the code to create the bind mount mountpoints so that it also can be used during restore. Signed-off-by: Adrian Reber <areber@redhat.com>	2019-02-08 15:59:51 +01:00
Aleksa Sarai	6635b4f0c6	merge branch 'cve-2019-5736' nsenter: clone /proc/self/exe to avoid exposing host binary to container Fixes: CVE-2019-5736 LGTMs: @cyphar @crosbymichael	2019-02-08 18:58:10 +11:00
Aleksa Sarai	0a8e4117e7	nsenter: clone /proc/self/exe to avoid exposing host binary to container There are quite a few circumstances where /proc/self/exe pointing to a pretty important container binary is a _bad_ thing, so to avoid this we have to make a copy (preferably doing self-clean-up and not being writeable). We require memfd_create(2) -- though there is an O_TMPFILE fallback -- but we can always extend this to use a scratch MNT_DETACH overlayfs or tmpfs. The main downside to this approach is no page-cache sharing for the runc binary (which overlayfs would give us) but this is far less complicated. This is only done during nsenter so that it happens transparently to the Go code, and any libcontainer users benefit from it. This also makes ExtraFiles and --preserve-fds handling trivial (because we don't need to worry about it). Fixes: CVE-2019-5736 Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>	2019-02-08 18:57:59 +11:00
Aleksa Sarai	dd023c457d	merge branch 'pr-1972' Update vendored golang.org/x/sys to latest LGTMs: @crosbymichael @cyphar Closes #1972	2019-02-08 18:52:59 +11:00
John Howard	ec069fe332	Vendor opencontainers/runtime-spec 29686dbc Signed-off-by: John Howard <jhoward@microsoft.com>	2019-02-07 14:49:22 -08:00
Filipe Brandenburger	4a600c04ed	Update vendored golang.org/x/sys to latest Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-02-06 17:59:21 -08:00
Mrunal Patel	e4fa8a4575	Merge pull request #1955 from xiaochenshen/rdt-fix-destroy-issue libcontainer: intelrdt: fix null intelrdt path issue in Destroy()	2019-02-01 13:18:56 -08:00
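
Commits 5b775bf297 and 2d4a37b427 above describe looping over sendfile(2) until the whole binary is copied, and falling back to a userspace copy that handles short writes when sendfile(2) fails. The sketch below illustrates that idea in Python; it is not runc's code (which is C inside nsenter). The names `copy_fd`, `roundtrip`, and the `force_userspace` parameter are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def copy_fd(out_fd, in_fd, size, chunk=1 << 16, force_userspace=False):
    """Copy `size` bytes from in_fd to out_fd, tolerating short copies.

    Tries os.sendfile first (where available) and switches to a plain
    read/write loop if it raises OSError. Returns the bytes copied.
    """
    copied = 0
    use_sendfile = hasattr(os, "sendfile") and not force_userspace
    while copied < size:
        want = min(chunk, size - copied)
        if use_sendfile:
            try:
                # May transfer fewer than `want` bytes; keep looping.
                n = os.sendfile(out_fd, in_fd, copied, want)
            except OSError:
                use_sendfile = False  # fall back for the remainder
                continue
            if n == 0:
                break  # source ended before `size` bytes
            copied += n
            continue
        # Userspace fallback: read one chunk at the current offset ...
        buf = os.pread(in_fd, want, copied)
        if not buf:
            break  # source ended before `size` bytes
        # ... and write it out fully, since write may also be short.
        view = memoryview(buf)
        while view:
            written = os.write(out_fd, view)
            view = view[written:]
        copied += len(buf)
    return copied


def roundtrip(data, force_userspace=False):
    """Hypothetical helper: copy `data` between two temp files, return the result."""
    with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
        src.write(data)
        src.flush()
        copy_fd(dst.fileno(), src.fileno(), len(data),
                force_userspace=force_userspace)
        dst.seek(0)
        return dst.read()
```

Calling `roundtrip(payload)` exercises the sendfile path where the platform supports it, and `roundtrip(payload, force_userspace=True)` exercises only the read/write loop. Both should return the original bytes.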


3823 Commits, All Branches