jasder/runc - runc - 军科 (Junke) Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Kir Kolyshkin	ca1d135bd4	runc checkpoint: fix --status-fd to accept fd 1. The command `runc checkpoint --lazy-server --status-fd $FD` actually accepts a file name as an $FD. Make it accept a file descriptor, like its name implies and the documentation states. In addition, since runc itself does not use the result of CRIU status fd, remove the code which relays it, and pass the FD directly to CRIU. Note 1: runc should close this file descriptor itself after passing it to criu, otherwise whoever waits on it might wait forever. Note 2: due to the way criu swrk consumes the fd (it reopens /proc/$SENDER_PID/fd/$FD), runc can't close it as soon as criu swrk has started. There is no good way to know when criu swrk has reopened the fd, so we assume that as soon as we have received something back, the fd is already reopened. 2. Since the meaning of --status-fd has changed, the test case using it needs to be fixed as well. Modify the lazy migration test to remove "sleep 2", actually waiting for the lazy page server to be ready. While at it, - remove the double fork (using shell's background process is sufficient here); - check the exit code for "runc checkpoint" and "criu lazy-pages"; - remove the check for no errors in dump.log after restore, as we are already checking its exit code. [v2: properly close status fd after spawning criu] [v3: move close status fd to after the first read] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-11 15:36:50 -07:00
Lifubang	472fe623a7	criu image path permission error in rootless checkpoint Signed-off-by: Lifubang <lifubang@acmcoder.com>	2019-03-11 23:49:52 +08:00
Akihiro Suda	06f789cf26	Disable rootless mode except RootlessCgMgr when executed as the root in userns This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and `RootlessCgroups bool`, so as to make "runc-in-userns" more compatible with "rootful" runc. `RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in the current user namespace. `RootlessEUID` is almost identical to the former `Rootless` except cgroups stuff. `RootlessCgroups` denotes that runc is unlikely to have the full access to cgroups. `RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace. Otherwise `RootlessCgroups` is set to true. (Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well) When runc is executed as the root (euid == 0) in a user namespace (e.g. by Docker-in-LXD, Podman, Usernetes), `RootlessEUID` is set to false but `RootlessCgroups` is set to true. So, "runc-in-userns" behaves almost the same as "rootful" runc except that cgroups errors are ignored. This PR does not have any impact on CLI flags and `state.json`. Note about CLI: * Now `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`. * Now `runc spec --rootless` is only required when `RootlessEUID` is set to true. For runc-in-userns, `runc spec` without `--rootless` should work, when sufficient numbers of UID/GID are mapped. Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`): * `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility. (`/run/runc` is used) * If runc is executed as the root (euid == 0) in a user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`. This allows unprivileged users to execute runc as the root in userns, without mounting writable `/run/runc`.
Note about `state.json`: * `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`. Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-09-07 15:05:03 +09:00
Ace-Tang	4803faf00e	cr: don't restore net namespace by default Since runc doesn't manage net devices and their configuration, and checkpoint also doesn't dump the net namespace by default, set 'nsmask = unix.CLONE_NEWNET' by default in restore. Otherwise, if the user does not pass 'empty-ns network', criu will spend extra time in restore. Signed-off-by: Ace-Tang <aceapril@126.com>	2018-08-17 16:03:21 +08:00
Akihiro Suda	f103de57ec	main: support rootless mode in userns Running rootless containers in userns is useful for mounting filesystems (e.g. overlay) with mapped euid 0, but without actual root privilege. Usage: (Note that `unshare --mount` requires `--map-root-user`) user$ mkdir lower upper work rootfs user$ curl http://dl-cdn.alpinelinux.org/alpine/v3.7/releases/x86_64/alpine-minirootfs-3.7.0-x86_64.tar.gz | tar Cxz ./lower || ( true; echo "mknod errors were ignored" ) user$ unshare --mount --map-root-user mappedroot# runc spec --rootless mappedroot# sed -i 's/"readonly": true/"readonly": false/g' config.json mappedroot# mount -t overlay -o lowerdir=./lower,upperdir=./upper,workdir=./work overlayfs ./rootfs mappedroot# runc run foo Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-05-10 12:16:43 +09:00
Adrian Reber	60ae7091de	checkpoint: support lazy migration With the help of userfaultfd CRIU supports lazy migration. Lazy migration means that memory pages are only transferred from the migration source to the migration destination on page fault. This makes it possible to reduce the downtime during process or container migration to a minimum as the memory does not need to be transferred during migration. Lazy migration currently depends on userfaultfd being available on the current Linux kernel and on the used CRIU version supporting lazy migration. Both dependencies can be checked by querying CRIU via RPC if the lazy migration feature is available. Using feature checking instead of version comparison enables runC to use CRIU features from the criu-dev branch. This way the user can decide if lazy migration should be available by choosing the right kernel and CRIU branch. To use lazy migration the CRIU process during dump needs to dump everything besides the memory pages and then it opens a network port waiting for remote page fault requests: # runc checkpoint httpd --lazy-pages --page-server 0.0.0.0:27 \ --status-fd /tmp/postcopy-pipe In this example CRIU will hang/wait once it has opened the network port and wait for network connection. As runC waits for CRIU to finish it will also hang until the lazy migration has finished. To know when the restore on the destination side can start the '--status-fd' parameter is used: # runc checkpoint --help | grep status --status-fd value criu writes \0 to this FD once lazy-pages is ready The parameter '--status-fd' is directly from CRIU and this way the process outside of runC which controls the migration knows exactly when to transfer the checkpoint (without memory pages) to the destination and that the restore can be started.
On the destination side it is necessary to start CRIU in 'lazy-pages' mode like this: # criu lazy-pages --page-server --address 192.168.122.3 --port 27 \ -D checkpoint and tell runC to do a lazy restore: # runc restore -d --image-path checkpoint --work-path checkpoint \ --lazy-pages httpd If both processes on the restore side have the same working directory 'criu lazy-pages' creates a unix domain socket where it waits for requests from the actual restore. runC starts CRIU restore in lazy restore mode and talks to 'criu lazy-pages' that it wants to restore memory pages on demand. CRIU continues to restore the process and once the process is running and accesses the first non-existing memory page the 'criu lazy-pages' server will request the page from the source system. Thus all pages from the source system will be transferred to the destination system. Once all pages have been transferred runC on the source system will end and the container will have finished migration. This can also be combined with CRIU's pre-copy support. The combination of pre-copy and post-copy (lazy migration) provides the possibility to migrate containers with minimal downtimes. Some additional background about post-copy migration can be found in these articles: https://lisas.de/~adrian/?p=1253 https://lisas.de/~adrian/?p=1183 Signed-off-by: Adrian Reber <areber@redhat.com>	2017-09-06 12:35:38 +00:00
Nikolas Sepos	3f234b15d0	Add auto-dedup flag for checkpoint/restore When doing incremental dumps it is useful to use auto deduplication of memory images to save space. Signed-off-by: Nikolas Sepos <nikolas.sepos@gmail.com>	2017-08-18 16:19:21 +02:00
Andrei Vagin	1c43d091a1	checkpoint: add support for containers with terminals CRIU was extended to report about orphaned master pty-s via RPC. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-05-02 04:48:47 +03:00
Andrei Vagin	a4fcbfb704	Prepare startContainer() to have more action Currently startContainer() is used to create and to run a container. In the next patch it will be used to restore a container. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-05-01 21:55:57 +03:00
Tim Potter	9458b39ca9	Fix misspelling of "properties" in various places Signed-off-by: Tim Potter <tpot@hpe.com>	2017-04-21 13:29:58 +10:00
Aleksa Sarai	d2f49696b0	runc: add support for rootless containers This enables the support for the rootless container mode. There are many restrictions on what rootless containers can do, so many different runC commands have been disabled: * runc checkpoint * runc events * runc pause * runc ps * runc restore * runc resume * runc update The following commands work: * runc create * runc delete * runc exec * runc kill * runc list * runc run * runc spec * runc state In addition, any specification options that imply joining cgroups have also been disabled. This is due to support for unprivileged subtree management not being available from Linux upstream. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:45:24 +11:00
Michael Crosby	00a0ecf554	Add separate console socket Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2017-03-16 10:23:59 -07:00
Mrunal Patel	899b0748f0	Merge pull request #1308 from giuseppe/fix-systemd-notify fix systemd-notify when using a different PID namespace	2017-02-24 11:05:21 -08:00
Giuseppe Scrivano	d5026f0e43	signals: support detach and notify socket together let runc run until READY= is received and then proceed with detaching the process. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2017-02-22 22:28:03 +01:00
Giuseppe Scrivano	892f2ded6f	fix systemd-notify when using a different PID namespace The current support of systemd-notify has a race condition as the message sent to the systemd notify socket might be dropped if the sender process is not running by the time systemd checks for the sender of the datagram. A proper fix of this in systemd would require changes to the kernel to maintain the cgroup of the sender process when it is dead (but that is probably not going to happen...) Generally, the solution to this issue is to specify the PID in the message itself so that systemd does not have to guess the sender, but this wouldn't work when running in a PID namespace as the container will pass the PID known in its namespace (something like PID=1,2,3..) and systemd running on the host is not able to map it to the runc service. The proposed solution is to have a proxy in runc that forwards the messages to the host systemd. Example of this issue: https://github.com/projectatomic/atomic-system-containers/pull/24 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2017-02-22 22:27:59 +01:00
Deng Guangxing	98f004182b	add pre-dump and parent-path to checkpoint CRIU uses pre-dump to complete iterative migration. pre-dump saves process memory info only, and it needs parent-path to specify the former memory files. This patch adds pre-dump and parent-path arguments to runc checkpoint Signed-off-by: Deng Guangxing <dengguangxing@huawei.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2017-02-14 19:45:07 +08:00
Mrunal Patel	c54f1495e3	Fix error shadow and error check warnings Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-01-06 16:21:23 -08:00
Aleksa Sarai	c6d8a2f26f	merge branch 'pr-1158' Closes #1158 LGTMs: @hqhq @cyphar	2016-12-26 13:59:47 +11:00
Aleksa Sarai	244c9fc426	*: console rewrite This implements {createTTY, detach} and all of the combinations and negations of the two that were previously implemented. There are some valid questions about out-of-OCI-scope topics like !createTTY and how things should be handled (why do we dup the current stdio to the process, and how is that not a security issue). However, these will be dealt with in a separate patchset. In order to allow for late console setup, split setupRootfs into the "preparation" section where all of the mounts are created and the "finalize" section where we pivot_root and set things as ro. In between the two we can set up all of the console mountpoints and symlinks we need. We use two-stage synchronisation to ensure that when the syscalls are reordered in a suboptimal way, an out-of-place read() on the parentPipe will not gobble the ancillary information. This patch is part of the console rewrite patchset. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-12-01 15:49:36 +11:00
Zhang Wei	b517076907	Check args numbers before application start Add a general args number validator for all client commands. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2016-11-29 11:18:51 +08:00
xiekeyang	55e783b57a	remove unused returned variables name The names of the returned variables appear to be unused and can be removed. Signed-off-by: xiekeyang <xiekeyang@huawei.com>	2016-06-15 17:41:57 +08:00
Andrew Vagin	acef7461a4	restore: add the empty-ns option For example: ./runc restore --empty-ns network CTID In this case criu creates a network namespace, but doesn't restore it. We are going to use this option to restore docker containers and Docker sets a hook to restore a network namespace. https://github.com/xemul/criu/issues/165 Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>	2016-06-07 20:24:59 +03:00
Mrunal Patel	a753b06645	Replace github.com/codegangsta/cli by github.com/urfave/cli The package got moved to a different repository Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-06-06 11:47:20 -07:00
Qiang Huang	2503fca35d	Update man pages to reflect the latest cli change The major change is the description of options: change it to match the latest cli help message, which specifies a "value" after an option if it takes a value, and adds (default: xxx) if the option has a default value. This also includes some other minor consistency fixes. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-05-28 13:33:57 +08:00
Aleksa Sarai	1a913c7b89	*: correctly chown() consoles In user namespaces, we need to make sure we don't chown() the console to unmapped users. This means we need to get both the UID and GID of the root user in the container when changing the owner. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-05-22 22:37:13 +10:00
Qiang Huang	8477638aab	Update cli package The old one has bug when showing help message for IntFlags. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-05-10 13:58:09 +08:00
Michael Crosby	f417e993d0	Update spec to v0.5.0 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-04-12 14:11:40 -07:00
Michael Crosby	12bd4cffd0	Add --no-pivot option for containers on ramdisk This adds a `--no-pivot` cli flag to runc so that a container's rootfs can be located ontop of ramdisk/tmpfs and not fail because you cannot pivot root. This should be a cli flag and not part of the spec because this is a detail of the host/runtime environment and not an attribute of a container. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-30 12:02:17 -07:00
Ido Yariv	28b21a5988	Export CreateLibcontainerConfig Users of libcontainer other than runc may also require parsing and converting specification configuration files. Since runc cannot be imported, move the relevant functions and definitions to a separate package, libcontainer/specconv. Signed-off-by: Ido Yariv <ido@wizery.com>	2016-03-25 12:19:18 -04:00
Mrunal Patel	7e91a96605	Add support for systemd cgroups in runc Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-03-22 17:08:07 -07:00
Michael Crosby	fdb100d247	Destroy container along with processes before stdio We need to make sure the container is destroyed before closing the stdio for the container. This becomes a big issue when running in the host's pid namespace because the other processes could have inherited the stdio of the initial process. The call to close will just block as they still have the io open. Calling destroy before closing io, especially in the host pid namespace, will cause all additional processes in the container's cgroup to be killed. This will allow the io to be closed successfully. This change makes sure the order for destroy and close is correct, and ensures that any errors encountered during start or exec are handled by terminating the process and destroying the container. We cannot use defers here because we need to enforce the correct ordering on destroy. This also sets the subreaper setting for runc so that when running in pid host, runc can wait on the additional processes launched by the container, useful on destroy, but also good for reaping the additional processes that were launched. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-15 13:17:11 -07:00
Michael Crosby	47eaa08f5a	Update runc usage for new specs changes Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-10 14:18:39 -08:00
Michael Crosby	044e298507	Improve error handling in runc The error handling on the runc cli is currently pretty messy because messages to the user are split between regular stderr format and logrus message format. This changes all the error reporting to the cli to only output on stderr and exit(1) for consumers of the api. By default logrus logs to /dev/null so that it is not seen by the user. If the user wants extra and/or structured logging/errors from runc they can use the `--log` flag to provide a path to the file where they want this information. This allows a consistent behavior on the cli but extra power and information when debugging with logs. This also includes a change to enable the same logging information inside the container's init by adding an init cli command that can share the existing flags for all other runc commands. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-09 11:08:30 -08:00
Michael Crosby	8d0a05b8dd	Wait for pipes to write all data before exit Add a waitgroup to wait for the io.Copy of stdout/err to finish before exiting runc. The problem happens more in exec because it is really fast and the pipe has data buffered but not yet read after the process has already exited. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-02-26 12:14:47 -08:00
Mrunal Patel	90472aeb9e	Merge pull request #546 from mikebrow/usage-updates updating usage for runc, and all runc commands that now use <container id> as the first argument	2016-02-17 21:13:22 +05:30
Mike Brown	f4e37ab63e	updating usage for runc and runc commands Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2016-02-17 09:00:39 -06:00
Michael Crosby	ce72f86a2b	Merge pull request #558 from rajasec/tty-panic panic during start of failed detached container	2016-02-16 16:01:08 -08:00
Julian Friedman	5fbdf6c3fc	Register signal handlers earlier to avoid zombies newSignalHandler needs to be called before the process is started, otherwise when the process exits quickly the SIGCHLD is received (and ignored) before the handler is set up. When this happens the reaper never runs, the process becomes a zombie, and the exit code isn't returned to the user. Signed-off-by: Julian Friedman <julz.friedman@uk.ibm.com>	2016-02-16 18:38:54 +00:00
rajasec	321b842404	panic during start of failed detached container Signed-off-by: rajasec <rajasec79@gmail.com> Adding nil check before closing tty for restore operation Signed-off-by: rajasec <rajasec79@gmail.com>	2016-02-14 19:11:09 +05:30
rajasec	a7ee55b716	Adding tty closure for restore operation Signed-off-by: rajasec <rajasec79@gmail.com>	2016-02-10 09:48:12 +05:30
Michael Crosby	a7278cad98	Require containerd id as arg 1 Closes #532 This requires the container id to always be passed to all runc commands as arg one on the cli. This was the result of the last OCI meeting and how operations work with the spec. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-02-09 11:20:55 -08:00
Mike Brown	c2c0458598	merges latest spec with runc Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2016-02-05 12:47:09 -08:00
Michael Crosby	fbc74c0eba	Add detach and pid-file to restore Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-02-05 11:56:21 -08:00
Michael Crosby	4c4c9b85b7	Add --console to specify path to use from runc This flag allows systems that are running runc to allocate tty's that they own and provide to the container. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-01-07 15:01:36 -08:00
Michael Crosby	4415446c32	Add state pattern for container state transition Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Add state status() method Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Allow multiple checkpoint on restore Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Handle leave-running state Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Fix state transitions for inprocess Because the tests use libcontainer in process between the various states we need to ensure that that usecase works as well as the out of process one. Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Remove isDestroyed method Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Handling Pausing from freezer state Signed-off-by: Rajasekaran <rajasec79@gmail.com> freezer status Signed-off-by: Rajasekaran <rajasec79@gmail.com> Fixing review comments Signed-off-by: Rajasekaran <rajasec79@gmail.com> Added comment when freezer not available Signed-off-by: Rajasekaran <rajasec79@gmail.com> Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Conflicts: libcontainer/container_linux.go Change checkFreezer logic to isPaused() Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Remove state base and factor out destroy func Signed-off-by: Michael Crosby <crosbymichael@gmail.com> Add unit test for state transitions Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-12-17 13:55:38 -08:00
Michael Crosby	29b139f702	Move STDIO initialization to libcontainer.Process Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-12-10 16:11:49 -08:00
Mike Brown	8b19581694	adding support for --bundle -b to start, restore, and spec; fixes issue #310 Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2015-11-13 09:13:57 -06:00
Hui Kang	25da513c4b	Add option to support criu manage cgroups mode for dump and restore CRIU supports cgroup-manage mode from v1.7 Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>	2015-10-11 04:42:54 +00:00
Alexander Morozov	ea5032bc5e	Adjust runc to new opencontainers/specs version I deleted possibility to specify config file from commands for now. Until we decide how it'll be done. Also I changed runc spec interface to write config files instead of output them. Signed-off-by: Alexander Morozov <lk4d4@docker.com>	2015-09-15 08:35:25 -07:00
Rajasekaran	77af09efd6	Restorefixforrunningcontainer Signed-off-by: Rajasekaran <rajasec79@gmail.com>	2015-08-31 22:16:38 +05:30
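
The rules stated in the message of commit 06f789cf26 (the split of `Rootless` into `RootlessEUID` and `RootlessCgroups`) can be read as a small truth table. The following is a minimal sketch of that reading, not runc's actual code: the type `rootlessFlags`, the functions `decideRootless` and `stateRootless`, and the `inUserNS` parameter are hypothetical names introduced for illustration, and the sketch only covers the default case where the `--rootless` CLI flag does not override `RootlessCgroups`.

```go
package main

import "fmt"

// rootlessFlags holds the two booleans named in the commit message.
// The type itself is hypothetical and not part of runc's API.
type rootlessFlags struct {
	RootlessEUID    bool
	RootlessCgroups bool
}

// decideRootless applies the rules as described in the commit message.
// euid is the effective UID in the current user namespace; inUserNS
// reports whether the process runs outside the initial user namespace.
func decideRootless(euid int, inUserNS bool) rootlessFlags {
	return rootlessFlags{
		// Non-root user in the current user namespace.
		RootlessEUID: euid != 0,
		// False only for root in the initial namespace, true otherwise.
		RootlessCgroups: euid != 0 || inUserNS,
	}
}

// stateRootless is the value the commit message says is recorded as
// `rootless` in state.json.
func stateRootless(f rootlessFlags) bool {
	return f.RootlessEUID && f.RootlessCgroups
}

func main() {
	cases := []struct {
		name     string
		euid     int
		inUserNS bool
	}{
		{"root in initial namespace", 0, false},
		{"root in user namespace", 0, true},
		{"non-root user", 1000, false},
	}
	for _, c := range cases {
		f := decideRootless(c.euid, c.inUserNS)
		fmt.Printf("%s: RootlessEUID=%t RootlessCgroups=%t state.json rootless=%t\n",
			c.name, f.RootlessEUID, f.RootlessCgroups, stateRootless(f))
	}
}
```

The "root in user namespace" case is the one the commit targets: `RootlessEUID` stays false, so runc behaves like rootful runc, while `RootlessCgroups` is true, so cgroup errors are ignored.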


62 Commits