jasder/runc - runc - Junke Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Adrian Reber	944e057025	Update to latest go-criu (4.0.2) This updates to the latest version of go-criu (4.0.2) which is based on CRIU 3.14. As go-criu provides an existing way to query the CRIU binary for its version this also removes all the code from runc to handle CRIU version checking and now relies on go-criu. An important side effect of this change is that this raises the minimum CRIU version to 3.0.0 as that is the first CRIU version that supports CRIU version queries via RPC in contrast to parsing the output of 'criu --version' CRIU 3.0 has been released in April of 2017. Signed-off-by: Adrian Reber <areber@redhat.com>	2020-05-20 13:49:38 +02:00
Akihiro Suda	f369199ff6	Merge pull request #2413 from JFHwang/2392-spec-check Add nil check of spec.Process in validateProcessSpec()	2020-05-19 08:11:22 +09:00
Mrunal Patel	825e91ada6	Merge pull request #2341 from kolyshkin/test-cpt-lazy runc checkpoint: fix --status-fd to accept fd	2020-05-18 10:43:24 -07:00
John Hwang	7fc291fd45	Replace formatted errors when unneeded Signed-off-by: John Hwang <John.F.Hwang@gmail.com>	2020-05-16 18:13:21 -07:00
Aleksa Sarai	859a780d6f	cgroups: add GetFreezerState() helper to Manager This is effectively a nicer implementation of the container.isPaused() helper, but to be used within the cgroup code for handling some fun issues we have to fix with the systemd cgroup driver. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2020-05-13 17:38:45 +10:00
Kir Kolyshkin	ca1d135bd4	runc checkpoint: fix --status-fd to accept fd 1. The command `runc checkpoint --lazy-server --status-fd $FD` actually accepts a file name as an $FD. Make it accept a file descriptor, like its name implies and the documentation states. In addition, since runc itself does not use the result of CRIU status fd, remove the code which relays it, and pass the FD directly to CRIU. Note 1: runc should close this file descriptor itself after passing it to criu, otherwise whoever waits on it might wait forever. Note 2: due to the way criu swrk consumes the fd (it reopens /proc/$SENDER_PID/fd/$FD), runc can't close it as soon as criu swrk has started. There is no good way to know when criu swrk has reopened the fd, so we assume that as soon as we have received something back, the fd is already reopened. 2. Since the meaning of --status-fd has changed, the test case using it needs to be fixed as well. Modify the lazy migration test to remove "sleep 2", actually waiting for the lazy page server to be ready. While at it, - remove the double fork (using shell's background process is sufficient here); - check the exit code for "runc checkpoint" and "criu lazy-pages"; - remove the check for no errors in dump.log after restore, as we are already checking its exit code. [v2: properly close status fd after spawning criu] [v3: move close status fd to after the first read] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-11 15:36:50 -07:00
Kir Kolyshkin	714c91e9f7	Simplify cgroup path handling in v2 via unified API This unties the Gordian Knot of using GetPaths in cgroupv2 code. The problem is, the current code uses GetPaths for three kinds of things: 1. Get all the paths to cgroup v1 controllers to save its state (see (linuxContainer).currentState(), (LinuxFactory).loadState() methods). 2. Get all the paths to cgroup v1 controllers to have the setns process enter the proper cgroups in `(*setnsProcess).start()`. 3. Get the path to a specific controller (for example, `m.GetPaths()["devices"]`). Now, for cgroup v2 instead of a set of per-controller paths, we have only one single unified path, and a dedicated function `GetUnifiedPath()` to get it. This discrepancy between v1 and v2 cgroupManager API leads to the following problems with the code: - multiple if/else code blocks that have to treat v1 and v2 separately; - backward-compatible GetPaths() methods in v2 controllers; - repeated writing of the PID into the same cgroup for v2; Overall, it's hard to write the right code with all this, and the code that is written is kinda hard to follow. The solution is to slightly change the API to do the 3 things outlined above in the same manner for v1 and v2: 1. Use `GetPaths()` for state saving and setns process cgroups entering. 2. Introduce and use Path(subsys string) to obtain a path to a subsystem. For v2, the argument is ignored and the unified path is returned. This commit converts all the controllers to the new API, and modifies all the users to use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-08 12:04:06 -07:00
Kir Kolyshkin	63854b0ea8	newSetnsProcess: reuse state.CgroupPaths c.cgroupManager.GetPaths() are called twice here: once in currentState() and then in newSetnsProcess(). Reuse the result of the first call, which is stored into state.CgroupPaths. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-08 10:05:59 -07:00
Kir Kolyshkin	9a3e632625	notify: simplify usage Instead of passing the whole map of paths, pass the path to the memory controller which these functions actually require. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-08 10:05:58 -07:00
lifubang	657407ff23	fix runc events error in cgroup v2 Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-05-07 22:18:46 +08:00
Ted Yu	db29dce076	Close fd in case fd.Write() returns error Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-05-02 20:06:08 -07:00
Mrunal Patel	634e51b52c	Merge pull request #2335 from kolyshkin/cgroupv2-cpt Fix cgroupv2 checkpoint/restore	2020-04-24 08:47:36 -07:00
Mrunal Patel	c420a3ec7f	Merge pull request #2324 from kolyshkin/criu-freezer libcontainer: fix Checkpoint wrt cgroupv2	2020-04-23 19:24:38 -07:00
Kir Kolyshkin	9280e3566d	checkpoint/restore: fix cgroupv2 handling In case of cgroupv2 unified hierarchy, the /sys/fs/cgroup mount is the real mount with fstype of cgroup2 (rather than a set of external bind mounts like for cgroupv1). So, we should not add it to the list of "external bind mounts" on both checkpoint and restore. Without this fix, checkpoint integration tests fail on cgroup v2. Also, same is true for cgroup v1 + cgroupns. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-22 11:26:43 -07:00
Kir Kolyshkin	af6b9e7fa9	nit: do not use syscall package In many places (not all of them though) we can use `unix.` instead of `syscall.` as these are identical. In particular, x/sys/unix defines: ```go type Signal = syscall.Signal type Errno = syscall.Errno type SysProcAttr = syscall.SysProcAttr const ENODEV = syscall.Errno(0x13) ``` and unix.Exec() calls syscall.Exec(). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-18 16:16:49 -07:00
Kir Kolyshkin	b3a481eb77	libcontainer: fix Checkpoint wrt cgroupv2 Commit `9a0184b10f` meant to enable using cgroup v2 freezer for criu >= 3.14, but it looks like it is doing something else instead. The logic here is: - for cgroup v1, set FreezeCgroup, if available - for cgroup v2, only set it for criu >= 3.14 - do not use GetPaths() in case v2 is used (this method is obsoleted for v2 and will be removed) Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-17 16:17:00 -07:00
Ted Yu	7a978e354a	Defer netns.Close() after error check Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-15 18:33:20 -07:00
Ted Yu	21d7bb95eb	Close criuServer so that even if CRIU crashes or unexpectedly exits, runc will not hang Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-03 15:27:27 -07:00
Kir Kolyshkin	b2272b2cba	libcontainer: use errors.Is() and errors.As() Make use of errors.Is() and errors.As() where appropriate to check the underlying error. The biggest motivation is to simplify the code. The feature requires go 1.13 but since merging #2256 we are already not supporting go 1.12 (which is an unsupported release anyway). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-02 20:34:01 -07:00
Kir Kolyshkin	c39f87a47a	Revert "Merge pull request #2280 from kolyshkin/errors-unwrap" Using errors.Unwrap() is not the best thing to do, since it returns nil in case of an error which was not wrapped. More to say, errors package provides more elegant ways to check for underlying errors, such as errors.As() and errors.Is(). This reverts commit `f8e138855d`, reversing changes made to `6ca9d8e6da`. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-02 19:41:11 -07:00
Michael Crosby	f8e138855d	Merge pull request #2280 from kolyshkin/errors-unwrap Use errors.Unwrap() where possible	2020-04-02 14:39:06 -04:00
Michael Crosby	6ca9d8e6da	Merge pull request #2283 from tedyu/runc-path-in-prefix isPathInPrefixList return value should be reverted	2020-04-02 14:09:49 -04:00
Michael Crosby	b26e4f27c1	Merge pull request #2284 from tedyu/criu-svr-close Avoid double close of criuServer	2020-04-02 14:07:35 -04:00
Mrunal Patel	e3e26cafe9	Merge pull request #2276 from kolyshkin/criu-v2 cgroupv2: don't use GetCgroupMounts for criu c/r	2020-04-01 17:36:24 -07:00
Ted Yu	49896ab0f4	Avoid double close of criuServer Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-01 16:15:23 -07:00
Ted Yu	d02fc48422	isPathInPrefixList return value should be reverted Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-01 15:45:31 -07:00
Kir Kolyshkin	8d7977ee6e	libct/isPaused: don't use GetPaths from v2 code Using GetPaths from cgroupv2 unified hierarchy code is deprecated and this function will (hopefully) be removed. Use GetUnifiedPath() for v2 case. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-31 20:24:28 -07:00
Kir Kolyshkin	12e156f076	libct.isPaused: use errors.Unwrap Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-31 20:07:04 -07:00
Kir Kolyshkin	fc840f199f	cgroupv2: don't use GetCgroupMounts for criu c/r When performing checkpoint or restore of cgroupv2 unified hierarchy, there is no need to call getCgroupMounts() / cgroups.GetCgroupMounts() as there's only a single mount in there. This eliminates the last internal (i.e. runc) use case of cgroups.GetCgroupMounts() for v2 unified. Unfortunately, there are external ones (e.g. moby/moby) so we can't yet let it return an error. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-31 17:05:11 -07:00
Michael Crosby	9ec5b03e5a	Merge pull request #2259 from adrianreber/v2-test Add minimal cgroup2 checkpoint/restore support	2020-03-31 15:01:18 -04:00
Yulia Nedyalkova	2abc6a3605	Actually check for syscall.ENODEV when checking if a container is paused It turns out that ioutil.ReadFile wraps the error in a *os.PathError. Since we cannot guarantee compilation with golang >= v1.13, we are manually unwrapping the error. Signed-off-by: Kieron Browne <kbrowne@pivotal.io>	2020-03-31 15:52:20 +01:00
Adrian Reber	9a0184b10f	cgroup2: use CRIU's new freezer v2 support The newest CRIU version supports freezer v2 and this tells runc to use it if new enough or fall back to non-freezer based process freezing on cgroup v2 system. Signed-off-by: Adrian Reber <areber@redhat.com>	2020-03-31 16:36:35 +02:00
Michael Crosby	88474967d3	Merge pull request #1974 from openSUSE/unreachable-code Remove unreachable code paths	2020-03-16 13:56:05 -04:00
Mrunal Patel	981dbef514	Merge pull request #2226 from avagin/runsc-restore-cmd-wait restore: fix a race condition in process.Wait()	2020-03-15 18:48:16 -07:00
Sascha Grunert	b477a159db	Remove unreachable code paths Signed-off-by: Sascha Grunert <sgrunert@suse.com>	2020-03-12 09:13:03 +01:00
Pradyumna Agrawal	5b2b138d24	Synchronize the call to linuxContainer.Signal() linuxContainer.Signal() can race with another call to say Destroy() which clears the container's initProcess. This can cause a nil pointer dereference in Signal(). This patch will synchronize Signal() and Destroy() by grabbing the container's mutex as part of the Signal() call. Signed-off-by: Pradyumna Agrawal <pradyumnaa@vmware.com>	2020-03-09 11:15:22 -07:00
Andrei Vagin	269ea385a4	restore: fix a race condition in process.Wait() Adrian reported that the checkpoint test started failing: === RUN TestCheckpoint --- FAIL: TestCheckpoint (0.38s) checkpoint_test.go:297: Did not restore the pipe correctly: The problem here is when we start exec.Cmd, we don't call its wait method. This means that we don't wait for cmd.goroutines and so we don't know when all data will be read from process pipes. Signed-off-by: Andrei Vagin <avagin@gmail.com>	2020-02-10 10:21:08 -08:00
Aleksa Sarai	f6fb7a0338	merge branch 'pr-2133' Julia Nedialkova (1): Handle ENODEV when accessing the freezer.state file LGTMs: @crosbymichael @cyphar Closes #2133	2020-01-17 02:07:19 +11:00
Jordan Liggitt	8541d9cf3d	Fix race checking for process exit and waiting for exec fifo Signed-off-by: Jordan Liggitt <liggitt@google.com>	2019-12-18 18:48:18 +00:00
Radostin Stoyanov	a610a84821	criu: Ensure other users cannot read c/r files No checkpoint files should be readable by anyone else but the user creating it. Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>	2019-10-17 07:49:38 +01:00
Radostin Stoyanov	f017e0f9e1	checkpoint: Set descriptors.json file mode to 0600 Prevent unprivileged users from being able to read descriptors.json Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>	2019-10-12 19:29:44 +01:00
Julia Nedialkova	e63b797f38	Handle ENODEV when accessing the freezer.state file ...when checking if a container is paused Signed-off-by: Julia Nedialkova <julianedialkova@hotmail.com>	2019-09-27 17:02:56 +03:00
Michael Crosby	331692baa7	Only allow proc mount if it is procfs Fixes #2128 This allows proc to be bind mounted for host and rootless namespace usecases but it removes the ability to mount over the top of proc with a directory. ```bash > sudo docker run --rm apparmor docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/var/lib/docker/volumes/aae28ea068c33d60e64d1a75916cf3ec2dc3634f97571854c9ed30c8401460c1/_data\\\" to rootfs \\\"/var/lib/docker/overlay2/a6be5ae911bf19f8eecb23a295dec85be9a8ee8da66e9fb55b47c841d1e381b7/merged\\\" at \\\"/proc\\\" caused \\\"\\\\\\\"/var/lib/docker/overlay2/a6be5ae911bf19f8eecb23a295dec85be9a8ee8da66e9fb55b47c841d1e381b7/merged/proc\\\\\\\" cannot be mounted because it is not of type proc\\\"\"": unknown. > sudo docker run --rm -v /proc:/proc apparmor docker-default (enforce) root 18989 0.9 0.0 1288 4 ? Ss 16:47 0:00 sleep 20 ``` Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-09-24 11:00:18 -04:00
Giuseppe Scrivano	1932917b71	libcontainer: add initial support for cgroups v2 allow to set what subsystems are used by libcontainer/cgroups/fs.Manager. subsystemsUnified is used on a system running with cgroups v2 unified mode. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-09-05 13:02:25 +02:00
Georgi Sabev	a146081828	Write logs to stderr by default Minor refactoring to use the filePair struct for both init sock and log pipe Co-authored-by: Julia Nedialkova <julianedialkova@hotmail.com> Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>	2019-04-24 15:18:14 +03:00
Georgi Sabev	ba3cabf932	Improve nsexec logging * Simplify logging function * Logs contain __FUNCTION__:__LINE__ * Bail uses write_log Co-authored-by: Julia Nedialkova <julianedialkova@hotmail.com> Co-authored-by: Danail Branekov <danailster@gmail.com> Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>	2019-04-22 17:53:52 +03:00
Danail Branekov	c486e3c406	Address comments in PR 1861 Refactor configuring logging into a reusable component so that it can be nicely used in both main() and init process init() Co-authored-by: Georgi Sabev <georgethebeatle@gmail.com> Co-authored-by: Giuseppe Capizzi <gcapizzi@pivotal.io> Co-authored-by: Claudia Beresford <cberesford@pivotal.io> Signed-off-by: Danail Branekov <danailster@gmail.com>	2019-04-04 14:57:28 +03:00
Marco Vedovati	9a599f62fb	Support for logging from children processes Add support for children processes logging (including nsexec). A pipe is used to send logs from children to parent in JSON. The JSON format used is the same used by logrus JSON formatted, i.e. children process can use standard logrus APIs. Signed-off-by: Marco Vedovati <mvedovati@suse.com>	2019-04-04 14:53:23 +03:00
Mrunal Patel	2b18fe1d88	Merge pull request #1984 from cyphar/memfd-cleanups nsenter: cloned_binary: "memfd" cleanups	2019-03-07 10:18:33 -08:00
Michael Crosby	f739110263	Merge pull request #1968 from adrianreber/podman Create bind mount mountpoints during restore	2019-03-04 11:37:07 -06:00

221 Commits