jasder/runc - runc - 军科 (Junke) Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Kir Kolyshkin	b19f9cecfe	Merge pull request #2343 from lifubang/updateSystemdScope fix data inconsistency when using runc update in systemd driven cgroup	2020-04-24 23:34:19 -07:00
Akihiro Suda	0fd8d468ea	Merge pull request #2318 from lifubang/linuxResources cgroupv2: use default allowed devices when linux resources is null	2020-04-25 09:00:23 +09:00
Mrunal Patel	634e51b52c	Merge pull request #2335 from kolyshkin/cgroupv2-cpt Fix cgroupv2 checkpoint/restore	2020-04-24 08:47:36 -07:00
Akihiro Suda	49ca1fd074	Merge pull request #2347 from kolyshkin/v2-allow-all-devs cgroupv2: allow to set EnableAllDevices=true	2020-04-24 16:09:40 +09:00
Mrunal Patel	c420a3ec7f	Merge pull request #2324 from kolyshkin/criu-freezer libcontainer: fix Checkpoint wrt cgroupv2	2020-04-23 19:24:38 -07:00
Kir Kolyshkin	440244268b	Merge pull request #2330 from KentaTada/use-linuxnamespace-const libcontainer: use consts of Namespace from runtime-spec	2020-04-23 18:58:29 -07:00
Kir Kolyshkin	55d5c99ca7	libct/mountToRootfs: rm useless code To make a bind mount read-only, it needs to be remounted. This is what the code removed does, but it is not needed here. We have to deal with three cases here: 1. cgroup v2 unified mode. In this case the mount is real mount with fstype=cgroup2, and there is no need to have a bind mount on top, as we pass readonly flag to the mount as is. 2. cgroup v1 + cgroupns (enableCgroupns == true). In this case the "mount" is in fact a set of real mounts with fstype=cgroup, and they are all performed in mountCgroupV1, with readonly flag added if needed. 3. cgroup v1 as is (enableCgroupns == false). In this case mountCgroupV1() calls mountToRootfs() again with an argument from the list obtained from getCgroupMounts(), i.e. a bind mount with the same flags as the original mount has (plus unix.MS_BIND | unix.MS_REC), and mountToRootfs() does remounting (under the case "bind":). So, the code which this patch is removing is not needed -- it essentially does nothing in case 3 above (since the bind mount is already remounted readonly), and in cases 1 and 2 it creates an unneeded extra bind mount on top of a real one (or set of real ones). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-23 16:49:12 -07:00
Kir Kolyshkin	20959b1666	libcontainer/integration/checkpoint_test: simplify Since commit `9280e3566d` it is no longer needed to have a `cgroup2` mount. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-23 15:22:32 -07:00
lifubang	1d4ccc8e0c	fix data inconsistency when using runc update in systemd driven cgroup v1 Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-04-23 19:32:57 +08:00
lifubang	7682a2b2a5	fix data inconsistency when using runc update in systemd driven cgroup v2 Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-04-23 19:32:07 +08:00
Kenta Tada	4474795388	libcontainer: use x/sys/unix instead of the hardcoded value PR_SET_CHILD_SUBREAPER is defined in x/sys/unix. Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>	2020-04-23 10:49:51 +09:00
Kir Kolyshkin	9280e3566d	checkpoint/restore: fix cgroupv2 handling In case of cgroupv2 unified hierarchy, the /sys/fs/cgroup mount is the real mount with fstype of cgroup2 (rather than a set of external bind mounts like for cgroupv1). So, we should not add it to the list of "external bind mounts" on both checkpoint and restore. Without this fix, checkpoint integration tests fail on cgroup v2. Also, same is true for cgroup v1 + cgroupns. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-22 11:26:43 -07:00
Kir Kolyshkin	75a92ea615	cgroupv2: allow to set EnableAllDevices=true In this case we just do not install any eBPF rules checking the devices. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-22 11:05:36 -07:00
Mrunal Patel	46be7b612e	Merge pull request #2299 from kolyshkin/fs2-init-ctrl cgroupv2: fix fs2 driver initialization	2020-04-20 21:27:42 -07:00
Kir Kolyshkin	ab276b1c09	cgroups/fs2/Destroy: use Remove, ignore ENOENT 1. There is no need to try removing it recursively. 2. Do not treat ENOENT as an error (similar to fs and systemd v1 drivers). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	4b4bc995ad	CreateCgroupPath: only enable needed controllers 1. Instead of enabling all available controllers, figure out which ones are required, and only enable those. 2. Amend all setFoo() functions to call isFooSet(). While this might seem unnecessary, it might actually help to uncover a bug. Imagine someone: - adds a cgroup.Resources.CpuFoo setting; - modifies setCpu() to apply the new setting; - but forgets to amend isCpuSet() accordingly <-- BUG In this case, a test case modifying CpuFoo will help to uncover the BUG. This is the reason why it's added. This patch could be amended by enabling controllers on a best-effort basis, i.e. : - do not return an error early if we can't enable some controllers; - if we fail to enable all controllers at once (usually because one of them can't be enabled), try enabling them one by one. Currently this is not implemented, and it's not clear whether this would be a good way to go or not. [v2: add/use is${Controller}Set() functions] [v3: document neededControllers()] [v4: drop "best-effort" part] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	bb47e35843	cgroup/systemd: reorganize 1. Rename the files - v1.go: cgroupv1 aka legacy; - v2.go: cgroupv2 aka unified hierarchy; - unsupported.go: when systemd is not available. 2. Move the code that is common between v1 and v2 to common.go Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	de1134156b	cgroups/fs2/CreateCgroupPath: nit This slightly improves code readability. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	b5c1949f2a	cgroups/fs2/CreateCgroupPath: reinstate check This check was removed in commit `5406833a65`. Now, when this function is called from a few places, it is no longer obvious that the path always starts with /sys/fs/cgroup/, so reinstate the check just to be on the safe side. This check also ensures that elements[3:] can be used safely. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	813cb3eb94	cgroupv2: fix fs2 cgroup init fs2 cgroup driver was not working because it did not enable controllers while creating cgroup directory; instead it was merely doing MkdirAll() and gathered the list of available controllers in NewManager(). Also, cgroup should be created in Apply(), not while creating a new manager instance. To fix: 1. Move the createCgroupsv2Path function from systemd driver to fs2 driver, renaming it to CreateCgroupPath. Use in Apply() from both fs2 and systemd drivers. 2. Delay available controllers map initialization until it is needed. With this patch: - NewManager() only performs minimal initialization (initializing m.dirPath, if not provided); - Apply() properly creates cgroup path, enabling the controllers; - m.controllers is initialized lazily on demand. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	60eaed2ed6	cgroupv2: move sanity path check to common code The fs2 cgroup driver has a sanity check for path. Since systemd driver is relying on the same path, it makes sense to move this check to the common code. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	dbeff89491	cgroupv2/systemd: privatize UnifiedManager ... and its Cgroup field. There is no sense to keep it public. This was generated by gorename. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00
Kir Kolyshkin	88c13c0713	cgroupv2: use SecureJoin in systemd driver It seems that some paths are coming from user and are therefore untrusted. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:20:22 -07:00
Kir Kolyshkin	9c80cd672d	cgroupv2: rm legacy Paths from systemd driver Having map of per-subsystem paths in systemd unified cgroups driver does not make sense and makes the code less readable. To get rid of it, move the systemd v1-or-v2 init code to libcontainer/factory_linux.go which already has a function to deduce unified path out of paths map. End result is much cleaner code. Besides, we no longer write pid to the same cgroup file 7 times in Apply() like we did before. While at it - add `rootless` flag which is passed on to fs2 manager - merge getv2Path() into GetUnifiedPath(), don't overwrite path if it is set during initialization (on Load). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:19:51 -07:00
Kenta Tada	3de8613327	libcontainer: use consts of Namespace from runtime-spec Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>	2020-04-19 23:21:40 +09:00
Kir Kolyshkin	480bca91be	cgroups/fs2: move type decl to beginning It was weird having it somewhere in the middle. No code change, just moving it around. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-18 18:43:41 -07:00
Kir Kolyshkin	353e91770b	cgroups/fs2: do not use securejoin In this very case, the code is writing to cgroup2 filesystem, and the file name is well known and can't possibly be a symlink. So, using securejoin is redundant. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-18 18:43:41 -07:00
Kir Kolyshkin	58f970a01f	cgroups/fscommon: use errors.Is This is a forgotten hunk from PR #2291. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-18 16:16:49 -07:00
Kir Kolyshkin	af6b9e7fa9	nit: do not use syscall package In many places (not all of them though) we can use `unix.` instead of `syscall.` as these are identical. In particular, x/sys/unix defines: `type Signal = syscall.Signal`, `type Errno = syscall.Errno`, `type SysProcAttr = syscall.SysProcAttr`, `const ENODEV = syscall.Errno(0x13)`, and unix.Exec() calls syscall.Exec(). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-18 16:16:49 -07:00
Kir Kolyshkin	b3a481eb77	libcontainer: fix Checkpoint wrt cgroupv2 Commit `9a0184b10f` meant to enable using cgroup v2 freezer for criu >= 3.14, but it looks like it is doing something else instead. The logic here is: - for cgroup v1, set FreezeCgroup, if available - for cgroup v2, only set it for criu >= 3.14 - do not use GetPaths() in case v2 is used (this method is obsoleted for v2 and will be removed) Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-17 16:17:00 -07:00
lifubang	d0f9b9ce42	default join cgroup namespace in runc example Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-04-17 21:37:50 +08:00
Aleksa Sarai	e4981c91b5	merge branch 'pr-2317' Ted Yu (1): Defer netns.Close() after error check LGTMs: @AkihiroSuda @cyphar Closes #2317	2020-04-16 23:35:07 +10:00
lifubang	d2a9c5da37	using default allowed devices when linux resources is null Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-04-16 11:40:44 +08:00
Ted Yu	7a978e354a	Defer netns.Close() after error check Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-15 18:33:20 -07:00
Akihiro Suda	9f6a2d4ddc	Merge pull request #2305 from kolyshkin/fs2-fix-default cgroupv2: fix fs2 driver default path	2020-04-16 10:16:48 +09:00
Michael Crosby	5c6216b1ed	Merge pull request #2278 from iwankgb/memory.numa_stats Exposing memory.numa_stats	2020-04-14 11:32:51 -04:00
Ted Yu	614bb96676	cgroupv2/systemd: Properly remove intermediate directory Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-13 08:32:08 -07:00
Kir Kolyshkin	ea36045fe1	cgroupv2: fix fs2 driver default path When the cgroupv2 fs driver is used without setting cgroupsPath, it picks up a path from /proc/self/cgroup. On a host with systemd, such a path can look like (examples from my machines): - /user.slice/user-1000.slice/session-4.scope - /user.slice/user-1000.slice/user@1000.service/gnome-launched-xfce4-terminal.desktop-4260.scope - /user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service This cgroup already contains processes in it, which prevents enabling controllers for a sub-cgroup (writing to cgroup.subtree_control fails with EBUSY or EOPNOTSUPP). Obviously, a parent cgroup (which does not contain tasks) should be used. Fixes opencontainers/runc/issues/2298 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-09 10:47:19 -07:00
Kenta Tada	e58a406b77	libcontainer: remove unneeded import Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>	2020-04-09 20:14:39 +09:00
Michael Crosby	9a93b7378c	Merge pull request #2288 from kolyshkin/mem-swap cgroupv2: fix setting MemorySwap	2020-04-08 14:54:22 -04:00
iwankgb	7fe0a98e79	Exposing memory.numa_stats Making information on page usage by type and NUMA node available Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>	2020-04-08 17:40:09 +02:00
Kir Kolyshkin	568cd62fa1	cgroupv2: only treat -1 as "max" Commit `6905b72154` treats all negative values as "max", citing cgroup v1 compatibility as a reason. In fact, in cgroup v1 only -1 is treated as "unlimited", and other negative values usually cause an error. Treat -1 as "max", pass other negative values as is (the error will be returned from the kernel). Fixes: `6905b72154` Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-08 04:08:49 -07:00
Kir Kolyshkin	c86be8a2c1	cgroupv2: fix setting MemorySwap The resources.MemorySwap field from OCI is memory+swap, while cgroupv2 has a separate swap limit, so subtract memory from the limit (and make sure values are set and sane). Make sure to set MemorySwapMax for systemd, too. Since systemd does not have MemorySwapMax for cgroupv1, it is only needed for v2 driver. [v2: return -1 on any negative value, add unit test] [v3: treat any negative value other than -1 as error] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-07 20:45:53 -07:00
Giuseppe Scrivano	8b7ac5f4a5	libcontainer: use cgroups.NewStats otherwise the memoryStats and hugetlbStats maps are not initialized and GetStats() segfaults when using them. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2020-04-07 09:45:57 +02:00
Michael Crosby	d5e91b1c22	Merge pull request #2289 from AkihiroSuda/fix-TestGetContainerStateAfterUpdate Fix TestGetContainerStateAfterUpdate on cgroup v2	2020-04-06 17:30:11 -04:00
Mrunal Patel	0c7a9c0267	Merge pull request #2294 from tklauser/unused-consts Remove unused consts testScopeWait and testSliceWait	2020-04-06 13:26:42 -07:00
Ted Yu	21d7bb95eb	Close criuServer so that even if CRIU crashes or unexpectedly exits, runc will not hang Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-04-03 15:27:27 -07:00
Tobias Klauser	3e678c08f9	Remove unused consts testScopeWait and testSliceWait These are unused since commit `518c855833` ("Remove libcontainer detection for systemd features") Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2020-04-03 21:09:43 +02:00
Michael Crosby	e4363b0387	Merge pull request #2291 from kolyshkin/errors-unwrap-v2 Use errors.As() and errors.Is() to unwrap errors	2020-04-03 11:46:11 -04:00
Michael Crosby	ec8c6950c7	Merge pull request #2235 from Zyqsempai/add-hugetlb-controller-to-cgroupv2 Added HugeTlb controller for cgroupv2	2020-04-03 11:15:06 -04:00
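
The MemorySwap handling described in commits `c86be8a2c1` and `568cd62fa1` can be illustrated with a minimal Go sketch. This is not the code from those commits: the names `swapLimitV2` and `limitToString` are hypothetical, and treating 0 as "unset" is an assumption made here for illustration. It only shows the idea that the OCI value is memory+swap, that cgroup v2 wants a swap-only limit, and that only -1 maps to "max".

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// swapLimitV2 is a hypothetical helper: it derives a cgroup v2 swap-only
// limit from an OCI-style memory+swap limit and a memory limit.
func swapLimitV2(memorySwap, memory int64) (int64, error) {
	switch {
	case memorySwap == -1 || memorySwap == 0:
		// -1 means unlimited; 0 is assumed here to mean "not set".
		return memorySwap, nil
	case memorySwap < 0:
		// Any negative value other than -1 is treated as an error.
		return 0, errors.New("invalid memory+swap limit")
	case memory <= 0:
		// Without a concrete memory limit there is nothing to subtract.
		return 0, errors.New("memory limit must be set to derive a swap limit")
	case memorySwap < memory:
		return 0, errors.New("memory+swap limit is smaller than the memory limit")
	}
	return memorySwap - memory, nil
}

// limitToString is a hypothetical helper: -1 becomes "max", 0 becomes ""
// (nothing to write), and every other value is passed through as is.
func limitToString(v int64) string {
	switch v {
	case 0:
		return ""
	case -1:
		return "max"
	}
	return strconv.FormatInt(v, 10)
}

func main() {
	swap, err := swapLimitV2(300<<20, 200<<20)
	fmt.Println(limitToString(swap), err)

	swap, err = swapLimitV2(-1, 200<<20)
	fmt.Println(limitToString(swap), err)

	_, err = swapLimitV2(-2, 200<<20)
	fmt.Println(err)
}
```

Passing other negative values through unchanged in `limitToString` mirrors the commit message's choice to let the kernel report the error rather than silently mapping them to "max".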


1447 Commits