jasder/runc - runc - Junke Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Kir Kolyshkin	0af5cd2041	Nit: fix use of bufio.Scanner.Err The Err() method should be called after the Scan() loop, not inside it. Found by git grep -A3 -F '.Scan()' | grep Err Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-27 00:12:17 -07:00
Qiang Huang	d4a6a1d998	Merge pull request #2258 from masters-of-cats/eintr-retry Retry writing to cgroup files on EINTR error	2020-03-27 11:21:41 +08:00
Kir Kolyshkin	b45db5d3b2	libcontainer/cgroup: obsolete Get*Cgroup for v2 These functions should not be called from any code handling the cgroup2 unified hierarchy. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 19:20:00 -07:00
Kir Kolyshkin	a949e4f22f	cgroupv2: UnifiedManager.Apply: simplify Remove joinCgroupsV2() function, as its name and second parameter are misleading. Use createCgroupsv2Path() directly, do not call getv2Path() twice. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 19:20:00 -07:00
Kir Kolyshkin	5406833a65	cgroupv2/systemd: add getv2Path Function getSubsystemPath(), while it works for the v2 unified case, is suboptimal, as it makes a few unnecessary calls. Add a simplified version of getSubsystemPath(), called getv2Path(), and use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 19:17:09 -07:00
Kir Kolyshkin	ec1f957b23	cgroupv2: don't use getSubsystemPath in Apply This code is a copy-paste from cgroupv1 systemd code. Its aim is to check whether a subsystem is available, and skip those that are not. In case v2 unified hierarchy is used, getSubsystemPath never returns "not found" error, so calling it is useless. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 13:32:34 -07:00
Kir Kolyshkin	6905b72154	cgroupv2: use "max" for negative values Cgroup v1 kernel doc [1] says: > We can write "-1" to reset the ``*.limit_in_bytes(unlimited)``. and cgroup v2 kernel documentation [2] says: > - If a controller implements an absolute resource guarantee and/or > limit, the interface files should be named "min" and "max" > respectively. If a controller implements best effort resource > guarantee and/or limit, the interface files should be named "low" > and "high" respectively. > > In the above four control files, the special token "max" should be > used to represent upward infinity for both reading and writing. Allow -1 value to still be used for v2, converting it to "max" where it makes sense to do so. This fixes the following issue: > runc update test_update --memory-swap -1: > error while setting cgroup v2: [write /sys/fs/cgroup/machine.slice/runc-cgroups-integration-test.scope/memory.swap.max: invalid argument > failed to write "-1" to "/sys/fs/cgroup/machine.slice/runc-cgroups-integration-test.scope/memory.swap.max" > github.com/opencontainers/runc/libcontainer/cgroups/fscommon.WriteFile > /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/fscommon/fscommon.go:21 > github.com/opencontainers/runc/libcontainer/cgroups/fs2.setMemory > /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/fs2/memory.go:20 > github.com/opencontainers/runc/libcontainer/cgroups/fs2.(*manager).Set > /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/fs2/fs2.go:175 > github.com/opencontainers/runc/libcontainer/cgroups/systemd.(*UnifiedManager).Set > /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/systemd/unified_hierarchy.go:290 > github.com/opencontainers/runc/libcontainer.(*linuxContainer).Set > /home/kir/go/src/github.com/opencontainers/runc/libcontainer/container_linux.go:211 [1] linux/Documentation/admin-guide/cgroup-v1/memory.rst [2] linux/Documentation/admin-guide/cgroup-v2.rst Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 11:14:32 -07:00
Kir Kolyshkin	a675b5ebea	cgroupv2: don't try to set kmem for systemd case To the best of my knowledge, it has been decided to drop the kernel memory controller from the cgroupv2 hierarchy, so "kernel memory limits" do not exist if we're using v2 unified. So, we need to ignore kernel memory setting. This was already done in non-systemd case (see commit `88e8350de`), let's do the same for systemd. This fixes the following error: > container_linux.go:349: starting container process caused "process_linux.go:306: applying cgroup configuration for process caused \"open /sys/fs/cgroup/machine.slice/runc-cgroups-integration-test.scope/tasks: no such file or directory\"" Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-25 20:00:23 -07:00
Mrunal Patel	7de5db3dad	Merge pull request #2263 from kolyshkin/nits Assorted minor nits in libcontainer	2020-03-24 14:17:22 -07:00
Akihiro Suda	cc183ca662	Merge pull request #2242 from AkihiroSuda/vendor-systemd vendor: update go-systemd and godbus	2020-03-25 02:40:22 +09:00
Kir Kolyshkin	5542a2c77d	libcontainer/cgroups: GetAllPids: optimize 1. Return earlier if there is an error. 2. Do not use filepath.Split on every entry, use info.Name() instead. 3. Make readProcsFile() accept file name as an argument, to avoid unnecessary file name and directory splitting and merging. 4. Skip on info.IsDir() -- this avoids an error when cgroup name is set to "cgroup.procs". This is still not very good since filepath.Walk() performs an unnecessary stat(2) on every entry, but better than before. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-20 12:27:36 -07:00
Kir Kolyshkin	12dc475dd6	libcontainer: simplify createCgroupsv2Path fmt.Sprintf is slow and is not needed here, string concatenation would be sufficient. It is also redundant to convert []byte from string and back, since `bytes` package now provides the same functions as `strings`. Use Fields() instead of TrimSpace() and Split(), mainly for readability (note Fields() is somewhat slower than Split() but here it doesn't matter much). Use Join() to prepend the plus signs. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-20 11:51:55 -07:00
Mario Nitchev	648295be98	Skip test for cgroups v2 Signed-off-by: Yulia Nedyalkova <julianedialkova@hotmail.com>	2020-03-19 12:54:54 +02:00
Danail Branekov	f34eb2c003	Retry writing to cgroup files on EINTR error Golang 1.14 introduces asynchronous preemption, which results in applications getting frequent EINTR (syscall interrupted) errors when invoking slow syscalls, e.g. when writing to cgroup files. As writing to cgroups is idempotent, it is safe to retry writing to the file whenever the write syscall is interrupted. Signed-off-by: Mario Nitchev <marionitchev@gmail.com>	2020-03-18 13:00:05 +02:00
Akihiro Suda	492d525e55	vendor: update go-systemd and godbus Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-03-16 13:26:03 +09:00
Akihiro Suda	aa269315a4	cgroup2: add CpuMax conversion Fix #2243 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-03-13 02:58:39 +09:00
Akihiro Suda	64e9a97981	cgroup2: fix conversion * TestConvertCPUSharesToCgroupV2Value(0) was returning 70369281052672, while the correct value is 0 * ConvertBlkIOToCgroupV2Value(0) was returning 32, while the correct value is 0 * ConvertBlkIOToCgroupV2Value(1000) was returning 4, while the correct value is 10000 Fix #2244 Follow-up to #2212 #2213 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-03-13 02:57:07 +09:00
Boris Popovschi	89a87adb38	Changed hugetlb pagesizes info source Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-03-10 15:28:45 +02:00
Boris Popovschi	d804611d05	Added failcnt stats Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-03-10 15:19:44 +02:00
Akihiro Suda	6503438fd6	Merge pull request #2212 from Zyqsempai/2211-convert-blkio-weight-properly Convert blkioWeight to io.weight properly	2020-03-05 09:32:45 +09:00
Qiang Huang	3b7e32feba	Merge pull request #2210 from Zyqsempai/2164-remove-deprecated-systemd-resources Exchange deprecated systemd resources with the appropriate for cgroupv2	2020-02-29 10:13:55 +08:00
Boris Popovschi	7f37afa892	Added HugeTlb controller for cgroupv2 Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-02-25 14:50:55 +02:00
Aleksa Sarai	0f32b03dda	merge branch 'pr-2192' Boris Popovschi (2): Fix skip message for cgroupv2 Fix MAJ:MIN io.stat parsing order LGTMs: @hqhq @cyphar Closes #2192	2020-02-21 16:00:17 +11:00
Boris Popovschi	4b8134f63b	Convert blkioWeight to io.weight properly Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-02-18 15:44:07 +02:00
Kir Kolyshkin	4c5c3fb960	Support for setting systemd properties via annotations In case systemd is used to set cgroups for the container, it creates a scope unit dedicated to it (usually named `runc-$ID.scope`). This patch adds an ability to set arbitrary systemd properties for the systemd unit via runtime spec annotations. Initially this was developed as an ability to specify the `TimeoutStopUSec` property, but later generalized to work with arbitrary ones. Example usage: add the following to runtime spec (config.json): ``` "annotations": { "org.systemd.property.TimeoutStopUSec": "uint64 123456789", "org.systemd.property.CollectMode":"'inactive-or-failed'" }, ``` and start the container (e.g. `runc --systemd-cgroup run $ID`). The above will set the following systemd parameters: * `TimeoutStopSec` to 2 minutes and 3 seconds, * `CollectMode` to "inactive-or-failed". The values are in the gvariant format (see [1]). To figure out which type systemd expects for a particular parameter, see systemd sources. In particular, parameters with the `USec` suffix require a `uint64`-typed argument, while gvariant assumes int32 for numeric values, therefore the explicit type is required. NOTE that systemd receives the time-typed parameters as USec but shows them (in `systemctl show`) as Sec. For example, the stop timeout should be set as `TimeoutStopUSec` but is shown as `TimeoutStopSec`. [1] https://developer.gnome.org/glib/stable/gvariant-text.html Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-02-17 16:07:19 -08:00
Boris Popovschi	7c439cc6f6	Added conversion for cpu.weight v2 Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-02-12 11:32:34 +02:00
Boris Popovschi	3b992087b8	Fix skip message for cgroupv2 Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-02-03 14:27:12 +02:00
Boris Popovschi	5b96f314ba	Exchanged deprecated systemd resources with the appropriate for cgroupv2 Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-01-15 18:09:33 +02:00
Boris Popovschi	cf9b7c33e1	Fix MAJ:MIN io.stat parsing order Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-01-15 14:39:14 +02:00
Akihiro Suda	5c20ea1472	fix merging #2177 and #2169 A new method was added to the cgroup interface when #2177 was merged. After #2177 got merged, #2169 was merged without rebase (sorry!) and compilation was failing: libcontainer/cgroups/fs2/fs2.go:208:22: container.Cgroup undefined (type *configs.Config has no field or method Cgroup) Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-01-14 11:13:25 +09:00
Mrunal Patel	5cc0deaf7a	Merge pull request #2169 from AkihiroSuda/split-fs cgroup2: split fs2 from fs	2020-01-13 16:23:27 -08:00
Julio Montes	8ddd892072	libcontainer: add method to get cgroup config from cgroup Manager `configs.Cgroup` contains the configuration used to create cgroups. This configuration must be saved to disk, since it's required to restore the cgroup manager that was used to create the cgroups. Add method to get cgroup configuration from cgroup Manager to allow API users save it to disk and restore a cgroup manager later. fixes #2176 Signed-off-by: Julio Montes <julio.montes@intel.com>	2019-12-17 22:46:03 +00:00
Akihiro Suda	ec49f98d72	fs2: support legacy device spec (to pass CI) Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-12-06 15:53:07 +09:00
Akihiro Suda	88e8350de2	cgroup2: split fs2 from fs split fs2 package from fs, as mixing up fs and fs2 is very likely to result in unmaintainable code. Inspired by containerd/cgroups#109 Fix #2157 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-12-06 15:42:10 +09:00
Michael Crosby	8bb10af481	Merge pull request #2165 from AkihiroSuda/travis-f31 .travis.yml: add Fedora 31 vagrant box (for cgroup2)	2019-12-05 16:26:51 -05:00
Akihiro Suda	faf1e44ea9	cgroup2: ebpf: increase RLIM_MEMLOCK to avoid BPF_PROG_LOAD error Fix #2167 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-11-07 15:43:27 +09:00
Mrunal Patel	46def4cc4c	Merge pull request #2154 from jpeach/2008-remove-static-build-tag Remove the static_build build tag.	2019-11-04 17:10:59 -08:00
Akihiro Suda	ccd4436fc4	.travis.yml: add Fedora 31 vagrant box (for cgroup2) As the baby step, only unit tests are executed. Failing tests are currently skipped and will be fixed in follow-up PRs. Fix #2124 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-31 16:53:01 +09:00
Akihiro Suda	faf673ee45	cgroup2: port over eBPF device controller from crun The implementation is based on https://github.com/containers/crun/blob/0.10.2/src/libcrun/ebpf.c Although ebpf.c is originally licensed under LGPL-3.0-or-later, the author Giuseppe Scrivano agreed to relicense the file in Apache License 2.0: https://github.com/opencontainers/runc/issues/2144#issuecomment-543116397 See libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go for tested configurations. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-31 14:01:46 +09:00
Qiang Huang	e57a774066	Merge pull request #2149 from AkihiroSuda/cgroup2-ps cgroup2: implement `runc ps`	2019-10-31 09:44:39 +08:00
Qiang Huang	d239ca8425	Merge pull request #2148 from AkihiroSuda/cg2-ignore-cpuset-when-no-config cgroup2: cpuset_v2: skip Apply when no limit is specified	2019-10-29 21:57:58 +08:00
Akihiro Suda	74a3fe5d1b	cgroup2: do not parse /proc/cgroups /proc/cgroups is meaningless for v2 and should be ignored. https://github.com/torvalds/linux/blob/v5.3/Documentation/admin-guide/cgroup-v2.rst#deprecated-v1-core-features * Now GetAllSubsystems() parses /sys/fs/cgroup/cgroup.controllers, not /proc/cgroups. The function result also contains "pseudo" controllers: {"devices", "freezer"}. As it is hard to detect availability of pseudo controllers, pseudo controllers are always assumed to be available. * Now IOGroupV2.Name() returns "io", not "blkio" Fix #2155 #2156 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-28 00:00:33 +09:00
James Peach	13919f5dfd	Remove the static_build build tag. The `static_build` build tag was introduced in `e9944d0f` to remove build warnings related to systemd cgroup driver dependencies. Since then, those dependencies have changed and building the systemd cgroup driver no longer imports dlopen. After this change, runc builds will always include the systemd cgroup driver. This fixes #2008. Signed-off-by: James Peach <jpeach@apache.org>	2019-10-26 08:28:45 +11:00
Michael Crosby	c4d8e1688c	Merge pull request #2140 from crosbymichael/fs-unified Set unified mountpoint in find mnt func	2019-10-24 15:20:47 -04:00
Akihiro Suda	dbd771e475	cgroup2: implement `runc ps` Implemented `runc ps` for cgroup v2, using a newly added method `m.GetUnifiedPath()`. Unlike the v1 implementation that checks `m.GetPaths()["devices"]`, the v2 implementation does not require the device controller to be available. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-19 01:59:24 +09:00
Akihiro Suda	d918e7f408	cpuset_v2: skip Apply when no limit is specified Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-19 00:33:31 +09:00
Akihiro Suda	033936ef76	io_v2.go: remove blkio v1 code Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-18 21:33:48 +09:00
Michael Crosby	b28f58f31b	Set unified mountpoint in find mnt func This is needed for the fsv2 cgroups to work when there is a unified mountpoint. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-10-15 15:40:03 -04:00
tianye15	28e58a0f6a	Support different field counts of cpuacct.stats Signed-off-by: skilxnTL <tylxltt@gmail.com>	2019-09-29 10:20:58 +08:00
Giuseppe Scrivano	524cb7c318	libcontainer: add systemd.UnifiedManager Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-09-05 13:02:27 +02:00
Giuseppe Scrivano	ec11136828	libcontainer, cgroups: rename systemd.Manager to LegacyManager Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-09-05 13:02:26 +02:00
Giuseppe Scrivano	1932917b71	libcontainer: add initial support for cgroups v2 allow to set what subsystems are used by libcontainer/cgroups/fs.Manager. subsystemsUnified is used on a system running with cgroups v2 unified mode. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-09-05 13:02:25 +02:00
Mrunal Patel	c61c7370f9	Merge pull request #2103 from sipsma/cgnil cgroups/fs: check nil pointers in cgroup manager	2019-08-26 14:05:44 -07:00
Mrunal Patel	3525eddec5	Merge pull request #2117 from filbranden/detection1 Remove libcontainer detection for systemd features	2019-08-25 13:15:15 -07:00
Filipe Brandenburger	518c855833	Remove libcontainer detection for systemd features Transient units (and transient slice units) have been available for quite a long time and RHEL 7 with systemd v219 (likely the oldest OS we care about at this point) supports that. A system running a systemd without these features is likely to break a lot of other stuff that runc/libcontainer care about. Regarding delegated slices, modern systemd doesn't allow it and runc/libcontainer run fine on it, so we might as well just stop requesting it on older versions of systemd which allowed it. (Those versions never really changed behavior significantly when that option was passed anyways.) Signed-off-by: Filipe Brandenburger <filbranden@gmail.com>	2019-08-22 21:53:24 -07:00
Filipe Brandenburger	588f040a77	Avoid the dependency on cgo through go-systemd/util package This dependency is only needed in package "github.com/coreos/go-systemd/util" and we only use it for IsRunningSystemd(), which is a simple Go function that just stats a file. Let's just borrow it here, so we remove the dependency and can remove that package from vendored build. This also removes dependencies on dlopen and on trying to find libsystemd.so or libsystemd-login.so in the system. Tested that this still builds and works as expected. Signed-off-by: Filipe Brandenburger <filbranden@gmail.com>	2019-08-22 21:07:24 -07:00
Erik Sipsma	9c822e4847	cgroups/fs: check nil pointers in cgroup manager Signed-off-by: Erik Sipsma <sipsma@amazon.com>	2019-08-14 09:50:45 -07:00
Odin Ugedal	6f77e35daf	Export list of HugePageSizeUnits This will allow others to import it instead of copying it. Signed-off-by: Odin Ugedal <odin@ugedal.com>	2019-05-30 20:17:30 +02:00
Odin Ugedal	c6445b1c1c	Add tests for GetHugePageSize Add tests to avoid regressions Signed-off-by: Odin Ugedal <odin@ugedal.com>	2019-05-30 17:27:32 +02:00
Odin Ugedal	273e7b74a7	Fix cgroup hugetlb size prefix for kB The hugetlb cgroup control files (introduced here in 2012: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773) use "KB" and not "kB" (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349). The behavior in the kernel has not changed since the introduction, and the current code using "kB" will therefore fail on devices with small amounts of ram (see https://github.com/kubernetes/kubernetes/issues/77169) running a kernel with config flag CONFIG_HUGETLBFS=y As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB", "MB" and "GB" are used, so the others may be removed as well. Here is a real world example of the files inside the "/sys/kernel/mm/hugepages/" directory: - "hugepages-64kB" - "hugepages-2048kB" - "hugepages-32768kB" - "hugepages-1048576kB" And the corresponding cgroup files: - "hugetlb.64KB._____" - "hugetlb.2MB._____" - "hugetlb.32MB._____" - "hugetlb.1GB._____" Signed-off-by: Odin Ugedal <odin@ugedal.com>	2019-05-29 21:52:43 +02:00
Filipe Brandenburger	46351eb3d1	Move systemd.Manager initialization into a function in that module This will permit us to extend the internals of systemd.Manager to include further information about the system, such as whether cgroupv1, cgroupv2 or both are in effect. Furthermore, it allows a future refactor of moving more of UseSystemd() code into the factory initialization function. Signed-off-by: Filipe Brandenburger <filbranden@gmail.com>	2019-05-01 13:22:19 -07:00
Filipe Brandenburger	cd41feb46b	Remove detection for scope properties, which have always been broken The detection for scope properties (whether scope units support DefaultDependencies= or Delegate=) has always been broken, since systemd refuses to create scopes unless at least one PID is attached to it (and this has been so since scope units were introduced in systemd v205.) This can be seen in journal logs whenever a container is started with libpod: Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing. Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing. Since this logic never worked, just assume both attributes are supported (which is what the code does when detection fails for this reason, since it's looking for an "unknown attribute" or "read-only attribute" to mark them as false) and skip the detection altogether. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-02-11 16:05:37 -08:00
Mrunal Patel	4e4c907193	Merge pull request #1950 from cloudfoundry-incubator/enter-pid-race Resilience in adding of exec tasks to cgroups	2019-02-01 13:18:16 -08:00
Giuseppe Scrivano	f01923376d	systemd: fix setting kernel memory limit since commit `df3fa115f9` it is not possible to set a kernel memory limit when using the systemd cgroups backend as we use cgroup.Apply twice. Skip enabling kernel memory if there are already tasks in the cgroup. Without this patch, runc fails with: container_linux.go:344: starting container process caused "process_linux.go:311: applying cgroup configuration for process caused \"failed to set memory.kmem.limit_in_bytes, because either tasks have already joined this cgroup or it has children\"" Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-01-10 11:33:50 +01:00
Tom Godkin	bdf3524b34	Retry adding pids to cgroups when EINVAL occurs The kernel will sometimes return EINVAL when writing a pid to a cgroup.procs file. It does so when the task being added still has the state TASK_NEW. See: https://elixir.bootlin.com/linux/v4.8/source/kernel/sched/core.c#L8286 Co-authored-by: Danail Branekov <danailster@gmail.com> Signed-off-by: Tom Godkin <tgodkin@pivotal.io> Signed-off-by: Danail Branekov <danailster@gmail.com>	2018-12-17 15:34:47 +00:00
JoeWrightss	769d6c4a75	Fix some typos Signed-off-by: JoeWrightss <zhoulin.xie@daocloud.io>	2018-12-09 23:52:54 +08:00
Aleksa Sarai	8a4629f7b5	cgroups: nokmem: error out on explicitly-set kmemcg limits When built with nokmem we explicitly are disabling support for kmemcg, but it is a strict specification requirement that if we cannot fulfil an aspect of the container configuration that we error out. Completely ignoring explicitly-requested kmemcg limits with nokmem would undoubtedly lead to problems. Fixes: `6a2c155968` ("libcontainer: ability to compile without kmem") Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-12-01 14:31:35 +11:00
Michael Crosby	76520a4bf0	Merge pull request #1872 from masters-of-cats/better-find-cgroup-mountpoint Respect container's cgroup path	2018-11-16 14:06:54 -05:00
Mrunal Patel	4769cdf607	Merge pull request #1916 from crosbymichael/cgns Add support for cgroup namespace	2018-11-13 12:21:38 -08:00
Mrunal Patel	f000fe11ec	Merge pull request #1917 from slp/master libcontainer: map PidsLimit to systemd's TasksMax property	2018-11-13 12:21:23 -08:00
Michael Crosby	aa7917b751	Merge pull request #1911 from theSuess/linter-fixes Various cleanups to address linter issues	2018-11-13 12:13:34 -05:00
Kir Kolyshkin	6a2c155968	libcontainer: ability to compile without kmem Commit `fe898e7862` (PR #1350) enables kernel memory accounting for all cgroups created by libcontainer -- even if kmem limit is not configured. Kernel memory accounting is known to be broken in some kernels, specifically the ones from RHEL7 (including RHEL 7.5). Those kernels do not support kernel memory reclaim, and are prone to oopses. Unconditionally enabling kmem acct on such kernels leads to bugs, such as * https://github.com/opencontainers/runc/issues/1725 * https://github.com/kubernetes/kubernetes/issues/61937 * https://github.com/moby/moby/issues/29638 This commit gives a way to compile runc without kernel memory setting support. To do so, use something like make BUILDTAGS="seccomp nokmem" Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-10-31 20:35:51 -07:00
Yuanhong Peng	df3fa115f9	Add support for cgroup namespace Cgroup namespace can be configured in `config.json` as other namespaces. Here is an example: ``` "namespaces": [ { "type": "pid" }, { "type": "network" }, { "type": "ipc" }, { "type": "uts" }, { "type": "mount" }, { "type": "cgroup" } ], ``` Note that if you want to run a container which has shared cgroup ns with another container, then it's strongly recommended that you set proper `CgroupsPath` of both containers(the second container's cgroup path must be the subdirectory of the first one). Or there might be some unexpected results. Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com> Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-10-31 10:51:43 -04:00
Sergio Lopez	5c6b9c3c1c	libcontainer: map PidsLimit to systemd's TasksMax property Currently runc applies PidsLimit restriction by writing directly to cgroup's pids.max, without notifying systemd. As a consequence, when the latter updates the context of the corresponding scope, pids.max is reset to the value of systemd's TasksMax property. This can be easily reproduced this way (I'm using "postfix" here just as an example, any unrelated but existing service will do): # CTR=`docker run --pids-limit 111 --detach --rm busybox /bin/sleep 8h` # cat /sys/fs/cgroup/pids/system.slice/docker-${CTR}.scope/pids.max 111 # systemctl disable --now postfix # systemctl enable --now postfix # cat /sys/fs/cgroup/pids/system.slice/docker-${CTR}.scope/pids.max max This patch adds TasksAccounting=true and TasksMax=PidsLimit to the properties sent to systemd. Signed-off-by: Sergio Lopez <slp@redhat.com>	2018-10-24 17:20:27 +02:00
Mrunal Patel	a00bf01908	Merge pull request #1862 from AkihiroSuda/decompose-rootless-pr Disable rootless mode except RootlessCgMgr when executed as the root in userns (fix Docker-in-LXD regression)	2018-10-15 17:32:15 -07:00
Dominik Süß	0b412e9482	various cleanups to address linter issues Signed-off-by: Dominik Süß <dominik@suess.wtf>	2018-10-13 21:14:03 +02:00
Danail Branekov	a1d5398afa	Respect container's cgroup path Respect the container's cgroup path when finding the container's cgroup mount point, which is useful in multi-tenant environments, where containers have their own unique cgroup mounts Signed-off-by: Danail Branekov <danailster@gmail.com> Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io> Signed-off-by: Giuseppe Capizzi <gcapizzi@pivotal.io>	2018-09-25 17:43:36 +01:00
Aleksa Sarai	578fe65e4f	merge branch 'pr-1817' Fix duplicate entries and missing entries in getCgroupMountsHelper Add test for testing cgroup mounts on bedrock linux Stop relying on number of subsystems for cgroups LGTMs: @crosbymichael @cyphar Closes #1817	2018-09-19 19:48:17 +10:00
Akihiro Suda	06f789cf26	Disable rootless mode except RootlessCgMgr when executed as the root in userns This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and `RootlessCgroups bool`, so as to make "runc-in-userns" more compatible with "rootful" runc. `RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in the current user namespace. `RootlessEUID` is almost identical to the former `Rootless` except cgroups stuff. `RootlessCgroups` denotes that runc is unlikely to have the full access to cgroups. `RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace. Otherwise `RootlessCgroups` is set to true. (Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well) When runc is executed as the root (euid == 0) in a user namespace (e.g. by Docker-in-LXD, Podman, Usernetes), `RootlessEUID` is set to false but `RootlessCgroups` is set to true. So, "runc-in-userns" behaves almost the same as "rootful" runc except that cgroups errors are ignored. This PR does not have any impact on CLI flags and `state.json`. Note about CLI: * Now `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`. * Now `runc spec --rootless` is only required when `RootlessEUID` is set to true. For runc-in-userns, `runc spec` without `--rootless` should work, when sufficient numbers of UID/GID are mapped. Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`): * `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility. (`/run/runc` is used) * If runc is executed as the root (euid == 0) in a user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`. This allows unprivileged users to execute runc as the root in userns, without mounting writable `/run/runc`. Note about `state.json`: * `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`. Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-09-07 15:05:03 +09:00
Yan Zhu	feb90346e0	doc: fix typo Signed-off-by: Yan Zhu <yanzhu@alauda.io>	2018-09-07 11:58:59 +08:00
Jay Kamat	a2faaa1317	Fix duplicate entries and missing entries in getCgroupMountsHelper Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2018-07-31 20:12:18 -07:00
Jay Kamat	e5a7c61f3c	Add test for testing cgroup mounts on bedrock linux Add a mountinfo from a bedrock linux system with 4 strata, and include it for tests Signed-off-by: Jay Kamat <jaygkamat@gmail.com> Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2018-06-24 00:01:07 +01:00
Daniel Dao	5ee0648bfb	Stop relying on number of subsystems for cgroups When there are complicated mount setups, there can be multiple mount points which have the subsystem we are looking for. Instead of counting the mountpoints, tick off subsystems until we have found them all. Without the 'all' flag, ignore duplicate subsystems after the first. Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2018-06-24 00:00:58 +01:00
Aleksa Sarai	939d5a3753	cgroup: clean up isIgnorableError for skippable EROFS Include a rootless argument for isIgnorableError to avoid people accidentally using isIgnorableError when they shouldn't (we don't ignore any errors when running as root as that really isn't safe). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-05-25 11:31:41 +10:00
Qiang Huang	dd67ab10d7	Merge pull request #1759 from cyphar/rootless-erofs-as-eperm rootless: cgroup: treat EROFS as a skippable error	2018-05-25 09:24:16 +08:00
Derek Carr	b515963c10	systemd cpu quota ignores -1 Signed-off-by: Derek Carr <decarr@redhat.com>	2018-05-23 14:28:39 -04:00
Filipe Brandenburger	165ee45334	Make channel for StartTransientUnit buffered So that, if a timeout happens and we decide to stop blocking on the operation, the writer will not block when they try to report the result of the operation. This should address Issue #1780 and it's a follow up for PR #1683, PR #1754 and PR #1772. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-14 08:49:50 -07:00
Filipe Brandenburger	0e16bd9b53	Detect whether Delegate is available on both slices and scopes Starting with systemd 237, in preparation for cgroup v2, delegation is only now available for scopes, not slices. Update libcontainer code to detect whether delegation is available on both and use that information when creating new slices. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-10 11:42:55 -07:00
Filipe Brandenburger	8ab251f298	Fix systemd.Apply() to check for DBus error before waiting on a channel. The channel was introduced in #1683 to work around a race condition. However, the check for error in StartTransientUnit ignores the error for an already existing unit, and in that case there will be no notification from DBus (so waiting on the channel will make it hang.) Later PR #1754 added a timeout, which worked around the issue, but we can fix this correctly by only waiting on the channel when there is no error. Fix the code to do so. The timeout handling was kept, since there might be other cases where this situation occurs (https://bugzilla.redhat.com/show_bug.cgi?id=1548358 mentions calling this code from inside a container, it's unclear whether an existing container was in use or not, so not sure whether this would have fixed that bug as well.) Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-09 11:51:59 -07:00
Aleksa Sarai	03e585985f	rootless: cgroup: treat EROFS as a skippable error In some cases, /sys/fs/cgroups is mounted read-only. In rootless containers we can consider this effectively identical to having cgroups that we don't have write permission to -- because the user isn't responsible for the read-only setup and cannot modify it. The rules are identical to when /sys/fs/cgroups is not writable by the unprivileged user. An example of this is the default configuration of Docker, where cgroups are mounted as read-only as a preventative security measure. Reported-by: Vladimir Rutsky <rutsky@google.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-03-17 13:53:42 +11:00
Qiang Huang	9facb87f87	Merge pull request #1754 from vikaschoudhary16/add-timeout Add timeout while waiting for StartTransientUnit completion signal	2018-03-08 09:09:34 +08:00
vikaschoudhary16	04e95b526d	Add timeout while waiting for StartTransientUnit completion signal from dbus Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2018-03-07 05:11:38 -05:00
Denys Smirnov	3d26fc3fd7	cgroups/fs: fix NPE on Destroy when no cgroups are set Currently Manager accepts nil cgroups when calling Apply, but it will panic when trying to call Destroy with the same config. Signed-off-by: Denys Smirnov <denys@sourced.tech>	2018-03-06 23:31:31 +01:00
Michael Crosby	595bea022f	Merge pull request #1722 from ravisantoshgudimetla/fix-systemd-path fix systemd slice expansion so that it could be consumed by cAdvisor	2018-02-20 09:59:24 -05:00
ravisantoshgudimetla	7019e1de7b	fix systemd slice expansion so that it could be consumed by cAdvisor Signed-off-by: ravisantoshgudimetla <ravisantoshgudimetla@gmail.com>	2018-02-18 21:32:39 -05:00
vikaschoudhary16	d5b4a3eddb	Fix race against systemd - T0: runc triggers a systemd unit creation asynchronously from [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L298) - T1: runc then moves ahead and starts creating cgroup paths(.scope directories), [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L348). Kernel creates .scope directory and cgroup.procs file(along with other default files) in the directory automatically, in an atomic manner. - T3: systemd execution thread which was invoked at time `T0`, is still in the process of unit creation. systemd also trying to create cgroup paths and deletes the `.scope` directory which is created at time `T1` by runc from [here](https://github.com/systemd/systemd/blob/v219/src/shared/cgroup-util.c#L1630) in the code Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2018-01-08 09:37:26 -05:00
Seth Jennings	bca53e7b49	systemd: adjust CPUQuotaPerSecUSec to compensate for systemd internal handling Signed-off-by: Seth Jennings <sjenning@redhat.com>	2017-11-15 20:20:06 -06:00
Michael Crosby	ff4481dbf6	Merge pull request #1540 from cloudfoundry-incubator/rootless-cgroups Support cgroups with limits as rootless	2017-10-16 12:03:49 -04:00
Sebastien Boeuf	acb93c9c62	libcontainer: cgroups: Write freezer state after every state check This commit ensures we write the expected freezer cgroup state after every state check, in case the state check does not give the expected result. This can happen when a new task is created and prevents the whole cgroup to be FROZEN, leaving the state into FREEZING instead. This patch prevents the case of an infinite loop to happen. Fixes https://github.com/opencontainers/runc/issues/1609 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2017-10-12 07:07:28 -07:00
Will Martin	ca4f427af1	Support cgroups with limits as rootless Signed-off-by: Ed King <eking@pivotal.io> Signed-off-by: Gabriel Rosenhouse <grosenhouse@pivotal.io> Signed-off-by: Konstantinos Karampogias <konstantinos.karampogias@swisscom.com>	2017-10-05 11:22:54 +01:00
Yong Tang	e9944d0f4c	Disable systemd in static build This fix tries to address the warnings caused by static build with go 1.9. As systemd needs dlopen/dlclose, the following warnings will be generated for static build in go 1.9: ``` root@f4b077232050:/go/src/github.com/opencontainers/runc# make static CGO_ENABLED=1 go build -tags "seccomp cgo static_build" -ldflags "-w -extldflags -static -X main.gitCommit="1c81e2a794c6e26a4c650142ae8893c47f619764" -X main.version=1.0.0-rc4+dev " -o runc . /tmp/go-link-113476657/000007.o: In function `_cgo_a5acef59ed3f_Cfunc_dlopen': /tmp/go-build/github.com/opencontainers/runc/vendor/github.com/coreos/pkg/dlopen/_obj/cgo-gcc-prolog:76: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking ``` This fix disables systemd when `static_build` flag is on (apply_nosystemd.go is used instead). This fix also fixes a small bug in `apply_nosystemd.go` for return value. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2017-09-11 18:38:22 +00:00
Qiang Huang	acaf6897f5	Fix systemd cgroup after memory type changed Fixes: #1557 I'm not quite sure about the root cause, looks like systemd still want them to be uint64. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-08-25 01:14:16 -04:00
Michael Crosby	882d8eaba6	Merge pull request #1537 from tklauser/staticcheck Fix issues found by staticcheck	2017-08-02 09:52:11 -04:00
Tobias Klauser	e4e56cb6d8	libcontainer: remove ineffective break statements go's switch statement doesn't need an explicit break. Remove it where that is the case and add a comment to indicate the purpose where the removal would lead to an empty case. Found with honnef.co/go/tools/cmd/staticcheck Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2017-07-28 15:13:39 +02:00
Steven Hartland	ee4f68e302	Updated logrus to v1 Updated logrus to use v1 which includes a breaking name change Sirupsen -> sirupsen. This includes a manual edit of the docker term package to also correct the name there too. Signed-off-by: Steven Hartland <steven.hartland@multiplay.co.uk>	2017-07-19 15:20:56 +00:00
Daniel, Dao Quang Minh	7139b61f7f	Merge pull request #1378 from derekwaynecarr/expose_use_hierarchy Expose memory.use_hierarchy in MemoryStats	2017-06-30 16:08:21 +01:00
Justin Cormack	3d9074ead3	Update memory specs to use int64 not uint64 replace #1492 #1494 fix #1422 Since https://github.com/opencontainers/runtime-spec/pull/876 the memory specifications are now `int64`, as that better matches the visible interface where `-1` is a valid value. Otherwise finding the correct value was difficult as it was kernel dependent. Signed-off-by: Justin Cormack <justin.cormack@docker.com>	2017-06-27 12:16:07 +01:00
Daniel, Dao Quang Minh	67bd2ab554	Merge pull request #1442 from clnperez/libcontainer-sys-unix Move libcontainer to x/sys/unix	2017-05-26 12:18:33 +01:00
Michael Crosby	18cd7e06f7	Merge pull request #1372 from cloudfoundry-incubator/cpuset-mount-root Handle container creation when cgroups have already been mounted in another location	2017-05-25 09:53:57 -07:00
Christy Perez	3d7cb4293c	Move libcontainer to x/sys/unix Since syscall is outdated and broken for some architectures, use x/sys/unix instead. There are still some dependencies on the syscall package that will remain in syscall for the foreseeable future: Errno Signal SysProcAttr Additionally: - os still uses syscall, so it needs to be kept for anything returning *os.ProcessState, such as process.Wait. Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>	2017-05-22 17:35:20 -05:00
Derek Carr	4d6225aec2	Expose memory.use_hierarchy in MemoryStats Signed-off-by: Derek Carr <decarr@redhat.com>	2017-03-31 13:40:34 -04:00
Aleksa Sarai	baeef29858	rootless: add rootless cgroup manager The rootless cgroup manager acts as a noop for all set and apply operations. It is just used for rootless setups. Currently this is far too simple (we need to add opportunistic cgroup management), but is good enough as a first-pass at a noop cgroup manager. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:46:20 +11:00
Qiang Huang	8430cc4f48	Use uint64 for resources to keep consistency with runtime-spec Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-03-20 18:51:39 +08:00
Craig Furman	f5c5aac958	Create containers when cgroups already mounted Runc needs to copy certain files from the top of the cgroup cpuset hierarchy into the container's cpuset cgroup directory. Currently, runc determines which directory is the top of the hierarchy by using the parent dir of the first entry in /proc/self/mountinfo of type cgroup. This creates problems when cgroup subsystems are mounted arbitrarily in different dirs on the host. Now, we use the most deeply nested mountpoint that contains the container's cpuset cgroup directory. Signed-off-by: Konstantinos Karampogias <konstantinos.karampogias@swisscom.com> Signed-off-by: Will Martin <wmartin@pivotal.io>	2017-03-15 10:10:30 +00:00
Qiang Huang	8773c5f9a6	Remove unused function in systemd cgroup Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-03-07 15:11:37 +08:00
Michael Crosby	49a33c41f8	Merge pull request #1344 from xuxinkun/fixCPUQuota20170224 fix cpu.cfs_quota_us changed when systemd daemon-reload using systemd.	2017-03-06 10:02:28 -08:00
xuxinkun	c44aec9b23	fix cpu.cfs_quota_us changed when systemd daemon-reload using systemd. Signed-off-by: xuxinkun <xuxinkun@gmail.com>	2017-03-06 20:08:30 +11:00
Qiang Huang	fe898e7862	Fix kmem accounting when used with cgroupsPath Fixes: #1347 Fixes: #1083 The root cause of #1083 is that we're joining an existing cgroup whose kmem accounting is not initialized, and it has child cgroups or tasks in it. Fix it by checking whether the cgroup is being created for the first time; we should enable kmem accounting if the cgroup is created by libcontainer, with or without a kmem limit configured. Otherwise we'll get issues like #1347 Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-02-25 10:58:18 -08:00
Qiang Huang	6b1d0e76f2	Merge pull request #1127 from boynux/fix-set-mem-to-unlimited Fixes set memory to unlimited	2017-02-16 09:51:23 +08:00
Mohammad Arab	18ebc51b3c	Reset Swap when memory is set to unlimited (-1) Kernel validation fails if memory set to -1 which is unlimited but swap is not set so. Signed-off-by: Mohammad Arab <boynux@gmail.com>	2017-02-15 08:11:57 +01:00
Daniel, Dao Quang Minh	0fefa36f3a	Merge pull request #1278 from datawolf/scanner move error check out of the for loop	2017-01-20 17:49:44 +00:00
Daniel, Dao Quang Minh	b8cefd7d8f	Merge pull request #1266 from mrunalp/ignore_cgroup_v2 Ignore cgroup2 mountpoints	2017-01-20 17:26:46 +00:00
Wang Long	3a71eb0256	move error check out of the for loop The `bufio.Scanner.Scan` method returns false either by reaching the end of the input or an error. After Scan returns false, the Err method will return any error that occurred during scanning, except that if it was io.EOF, Err will return nil. We should check the error once Scan returns false (outside the for loop). Signed-off-by: Wang Long <long.wanglong@huawei.com>	2017-01-18 05:02:39 +00:00
Qiang Huang	a9610f2c02	Merge pull request #1249 from datawolf/small-refactor small refactor	2017-01-13 02:04:59 -06:00
Mrunal Patel	c7ebda72ac	Add a test for testing that we ignore cgroup2 mounts Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-01-11 16:49:53 -08:00
Mrunal Patel	e7b57cb042	Ignore cgroup2 mountpoints Our current cgroup parsing logic assumes cgroup v1 mounts so we should ignore cgroup2 mounts for now Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-01-11 12:34:50 -08:00
Mrunal Patel	7ae521cef0	Merge pull request #1251 from datawolf/update-cgroup-comment cgroups: update the comments	2017-01-09 11:13:39 -08:00
Michael Crosby	44e60af49d	Merge pull request #1196 from hqhq/fix_cgroup_leftover Fix leftover cgroup directory issue	2017-01-09 10:31:04 -08:00
Wang Long	4732f46fd9	small refactor Signed-off-by: Wang Long <long.wanglong@huawei.com>	2017-01-04 11:39:44 +08:00
Wang Long	4dfd350a38	cgroups: update the comments Signed-off-by: Wang Long <long.wanglong@huawei.com>	2017-01-03 22:40:12 +08:00
Qiang Huang	14d58e1e48	Fix leftover cgroup directory issue When a subsystem's Apply fails, some subsystems' cgroup directories are left over. From Docker's point of view, starting a container failed and `docker rm` is used to remove the container, but some cgroup files are left behind. Sometimes we don't want to clean everything up when something goes wrong, because we need the intermediate state to debug what's going on, but cgroup directories are not useful information we want to keep. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-11-22 08:02:43 +08:00
Qiang Huang	aee46862ec	Fix cpuset issue with cpuset.cpu_exclusive This PR fix issue in this scenario: ``` in terminal 1: ~# cd /sys/fs/cgroup/cpuset ~# mkdir test ~# cd test ~# cat cpuset.cpus 0-3 ~# echo 1 > cpuset.cpu_exclusive (make sure you don't have other cgroups under root) in terminal 2: ~# echo $$ > /sys/fs/cgroup/cpuset/test/tasks // set resources.cpu.cpus="0-2" in config.json ~# runc run test1 back to terminal 1: ~# cd test1 ~# cat cpuset.cpus 0-2 ~# echo 1 > cpuset.cpu_exclusive in terminal 3: ~# echo $$ > /sys/fs/cgroup/test/tasks // set resources.cpu.cpus="3" in config.json ~# runc run test2 container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"failed to write 0-3\\n to cpuset.cpus: write /sys/fs/cgroup/cpuset/test2/cpuset.cpus: invalid argument\"" ``` Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-11-18 15:28:40 +08:00
Derek Carr	d223e2adae	Ignore error when starting transient unit that already exists Signed-off-by: Derek Carr <decarr@redhat.com>	2016-10-19 14:55:52 -04:00
Daniel Dao	1b876b0bf2	fix typos with misspell pipe the source through https://github.com/client9/misspell. typos be gone! Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2016-10-11 23:22:48 +00:00
Michael Crosby	11222ee1f1	Don't enable kernel mem if not set Don't enable the kmem limit if it is not specified in the config. Fixes #1083 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-10-07 10:02:19 -07:00
derekwaynecarr	1a75f815d5	systemd cgroup driver supports slice management Signed-off-by: derekwaynecarr <decarr@redhat.com>	2016-09-27 16:01:37 -04:00
Mrunal Patel	5653ced544	Merge pull request #1059 from datawolf/use-WriteCgrougProc cgroup: using WriteCgroupProc to write the specified pid into the cgroup's cgroup.procs file	2016-09-22 11:31:35 -07:00
Wang Long	ce9951834c	cgroup: using WriteCgroupProc to write the specified pid into the cgroup's cgroup.procs file cgroupData.join method using `WriteCgroupProc` to place the pid into the proc file, it can avoid attach any pid to the cgroup if -1 is specified as a pid. so, replace `writeFile` with `WriteCgroupProc` like `cpuset.go`'s ApplyDir method. Signed-off-by: Wang Long <long.wanglong@huawei.com>	2016-09-21 10:57:03 +00:00
Mrunal Patel	f557996401	Add flag to allow getting all mounts for cgroups subsystems Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-09-15 15:19:27 -04:00
Wang Long	fd92846686	move m.GetPaths out of the loop Calling m.GetPaths only once is enough, so move it out of the loop. Signed-off-by: Wang Long <long.wanglong@huawei.com>	2016-09-13 12:19:48 +00:00
Michael Crosby	9a072b611e	Merge pull request #1013 from hqhq/fix_ps_issue Fix runc ps issue	2016-09-12 14:03:21 -07:00
Qiang Huang	b5b6989e9a	Fix runc pause and runc update Fixes: #1034 Fixes: #1031 Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-09-12 16:02:56 +08:00
Qiang Huang	da7bac1c90	Fix runc ps issue After #1009, we don't always set `cgroup.Paths`, so `getCgroupPath()` will return wrong cgroup path because it'll take current process's cgroup as the parent, which would be wrong when we try to find the cgroup path in `runc ps` and `runc kill`. Fix it by using `m.GetPath()` to get the true cgroup paths. Reported-by: Yang Shukui <yangshukui@huawei.com> Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-09-12 15:41:16 +08:00
Yuanhong Peng	a71a301a28	Fix typo. Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>	2016-09-09 16:18:54 +08:00
Alexander Morozov	0c6733d669	Merge pull request #970 from hqhq/fix_race_cgroup_paths Fix race condition when using cgroups.Paths	2016-08-23 10:47:00 -07:00
Michael Crosby	7d8f322fdd	Merge pull request #860 from bgray/806-set_cgroup_cpu_rt_before_joining Set the cpu cgroup RT sched params before joining.	2016-08-12 09:24:15 -07:00
Qiang Huang	6ecb469b2b	Fix race condition when using cgroups.Paths Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-08-02 15:43:04 +08:00
Qiang Huang	50f0a2b1e1	Merge pull request #962 from dubstack/fix_kmem_limits Remove kmem Initialization check while setting memory configuration	2016-08-02 10:04:18 +08:00
Mrunal Patel	56fc0ac9ce	Merge pull request #966 from sjenning/fix-initscope-cgroup-path fix init.scope in cgroup paths	2016-08-01 14:29:47 -07:00
Buddha Prakash	fcd966f501	Remove kmem Initialization check Signed-off-by: Buddha Prakash <buddhap@google.com>	2016-08-01 09:47:34 -07:00
Seth Jennings	4b44b98596	fix init.scope in cgroup paths Signed-off-by: Seth Jennings <sjenning@redhat.com>	2016-08-01 11:14:29 -05:00
Qiang Huang	1a81e9ab1f	Merge pull request #958 from dubstack/skip-devices Skip updates on parent Devices cgroup	2016-07-29 10:31:49 +08:00
Buddha Prakash	d4c67195c6	Add test Signed-off-by: Buddha Prakash <buddhap@google.com>	2016-07-28 17:14:51 -07:00
Buddha Prakash	ef4ff6a8ad	Skip updates on parent Devices cgroup Signed-off-by: Buddha Prakash <buddhap@google.com>	2016-07-25 10:30:46 -07:00
Daniel, Dao Quang Minh	f0e17e9a46	Merge pull request #961 from hqhq/revert_935 Revert "Use update time to detect if kmem limits have been set"	2016-07-21 14:51:21 +01:00
Daniel, Dao Quang Minh	ff88baa42f	Merge pull request #611 from mrunalp/fix_set Fix cgroup Set when Paths are specified	2016-07-21 14:00:22 +01:00
Qiang Huang	15c93ee9e0	Revert "Use update time to detect if kmem limits have been set" Revert: #935 Fixes: #946 I can reproduce #946 on some machines, the problem is on some machines, it could be very fast that modify time of `memory.kmem.limit_in_bytes` could be the same as before it's modified. And now we'll call `SetKernelMemory` twice on container creation which cause the second time failure. Revert this before we find a better solution. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-07-21 19:14:38 +08:00
Buddha Prakash	ebe85bf180	Allow cgroup creation without attaching a pid Signed-off-by: Buddha Prakash <buddhap@google.com>	2016-07-20 13:49:48 -07:00
Mrunal Patel	4dedd09396	Merge pull request #937 from hushan/net_cls-classid fix setting net_cls classid	2016-07-18 17:18:23 -04:00
Hushan Jia	bb42f80a86	fix setting net_cls classid Setting classid of net_cls cgroup failed: ERRO[0000] process_linux.go:291: setting cgroup config for ready process caused "failed to write 𐀁 to net_cls.classid: write /sys/fs/cgroup/net_cls,net_prio/user.slice/abc/net_cls.classid: invalid argument" process_linux.go:291: setting cgroup config for ready process caused "failed to write 𐀁 to net_cls.classid: write /sys/fs/cgroup/net_cls,net_prio/user.slice/abc/net_cls.classid: invalid argument" The spec has classid as a *uint32, the libcontainer configs should match the type. Signed-off-by: Hushan Jia <hushan.jia@gmail.com>	2016-07-11 05:00:35 +08:00
Vishnu kannan	8dd3d63455	Look at modify time to check if kmem limits are initialized. Signed-off-by: Vishnu kannan <vishnuk@google.com>	2016-07-06 15:14:25 -07:00
Ben	14e55d1692	Add unit test for setting the CPU RT sched cgroups values at apply time Added a unit test to verify that 'cpu.rt_runtime_us' and 'cpu.rt_period_us' cgroup values are set when the cgroup is applied to a process. Signed-off-by: Ben Gray <ben.r.gray@gmail.com>	2016-07-04 13:11:53 +01:00
ben	950700e73c	Set the 'cpu.rt_runtime_us' and 'cpu.rt_period_us' values of the cpu cgroup before trying to move the process into the cgroup. This is required if runc itself is running in SCHED_RR mode, as it is not possible to add a process in SCHED_RR mode to a cgroup which hasn't been assigned any RT bandwidth. RT bandwidth is not inherited; each new cgroup starts with 0 bandwidth. Signed-off-by: Ben Gray <ben.r.gray@gmail.com>	2016-07-04 13:10:21 +01:00
Qiang Huang	42dfd60643	Merge pull request #904 from euank/fix-cgroup-parsing-err cgroups: Fix issue if cgroup path contains :	2016-06-14 14:19:20 +08:00
rajasec	146218ab92	Removing unused variable for cgroup subsystem Signed-off-by: rajasec <rajasec79@gmail.com>	2016-06-12 12:35:49 +05:30
Euan Kemp	394610a396	cgroups: Parse correctly if cgroup path contains : Prior to this change a cgroup with a `:` character in its path was not parsed correctly (as occurs on some instances of systemd cgroups under some versions of systemd, e.g. 225 with accounting). This fixes that issue and adds a test. Signed-off-by: Euan Kemp <euank@coreos.com>	2016-06-10 23:09:03 -07:00
Christian Brauner	a1f8e0f184	fail if path to devices subsystem is missing The presence of the "devices" subsystem is a necessary condition for a (privileged) container. Signed-off-by: Christian Brauner <cbrauner@suse.com>	2016-06-08 16:44:15 +02:00
Daniel, Dao Quang Minh	d5ecf5c67c	systemd cgroup: check for Delegate property Delegate is only available in systemd >218, applying it for older systemd will result in an error. Therefore we should check for it when testing systemd properties. Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>	2016-06-01 14:32:24 +00:00
Qiang Huang	6fa490c664	Remove use_hierarchy check when set kernel memory Kernel memory cannot be set in these circumstances (before kernel 4.6): 1. kernel memory is not initialized, and there are tasks in cgroup 2. kernel memory is not initialized, and use_hierarchy is enabled, and there are sub-cgroups While we don't need to cover case 2 because when we set kernel memory in runC, it's either: - in Apply phase when we create the container, and in this case, set kernel memory would definitely be valid; - or in update operation, and in this case, there would be tasks in cgroup, we only need to check if kernel memory is initialized or not. Even if we want to check use_hierarchy, we need to check sub-cgroups as well, but for here, we can just leave it aside. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-05-28 15:22:58 +08:00
Mrunal Patel	4a8f0b4db4	Fix cgroup Set when Paths are specified Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-05-09 16:06:03 -07:00
Kenfe-Mickael Laventure	27814ee120	Allow updating kmem.limit_in_bytes if initialized at cgroup creation Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>	2016-05-06 08:05:15 -07:00
Jim Berlage	c5b0caf76d	Correct outdated URL `libcontainer/cgroups/utils.go` uses an incorrect path to the documentation for cgroups. This updates the comment to use the correct URL. Fixes #794. Signed-off-by: Jim Berlage <james.berlage@gmail.com>	2016-04-29 10:44:27 -05:00
Tatsushi Inagaki	2a1a6cdf44	Cgroup: reduce redundant parsing of mountinfo Avoid parsing the remaining lines of mountinfo after all mountpoints of the target subsystems are found, or when the target subsystem is not enabled. Signed-off-by: Tatsushi Inagaki <e29253@jp.ibm.com>	2016-04-22 09:41:28 +09:00
Michael Crosby	660029b476	Merge pull request #745 from AkihiroSuda/very-trivial-style-fix Fix trivial style errors reported by `go vet` and `golint`	2016-04-12 13:33:00 -07:00
Akihiro Suda	1829531241	Fix trivial style errors reported by `go vet` and `golint` No substantial code change. Note that some style errors reported by `golint` are not fixed due to possible compatibility issues. Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>	2016-04-12 08:13:16 +00:00
Qiang Huang	792251ae38	Fix problem when swap memory unsupported When swap memory is unsupported, Docker will set cgroup.Resources.MemorySwap as -1. Fixes: https://github.com/docker/docker/pull/21937 Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-04-12 15:08:10 +08:00
Mrunal Patel	3f4f4420fd	Merge pull request #592 from hqhq/hq_fix_update_memory Fix problem when update memory and swap memory	2016-04-05 10:19:33 -07:00
Mrunal Patel	857d418b09	Merge pull request #698 from ggaaooppeenngg/gaopeng/format-errorf Use %v for map structure format	2016-03-28 09:28:28 -07:00
Qiang Huang	d8b8f76c4f	Fix problem when update memory and swap memory Currently, if we start a container with: `docker run -ti --name foo --memory 300M --memory-swap 500M busybox sh` Then we want to update it with: `docker update --memory 600M --memory-swap 800M foo` It'll get error because we can't set memory to 600M with the 500M limit of swap memory. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-03-28 10:48:29 +08:00
Peng Gao	ffbc626e53	Use %v for map structure format Based on Golang document, %s is for "the uninterpreted bytes of the string or slice", so %v is more appropriate. Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>	2016-03-26 23:28:59 +08:00
allencloud	10cc27888c	fix typos Signed-off-by: allencloud <allen.sun@daocloud.io>	2016-03-25 11:11:48 +08:00
Qiang Huang	69f8a50081	Merge pull request #669 from mrunalp/fix_test Fix the kmem TCP test	2016-03-22 09:45:13 +08:00
Michael Crosby	e80b6b67e6	Merge pull request #651 from mrunalp/quota_validation Add more information in the error messages when writing to a file	2016-03-21 17:53:49 -07:00
Mrunal Patel	73e48633a3	Fix the kmem TCP test Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-03-21 15:51:42 -07:00
Mrunal Patel	4d7929274d	Merge pull request #644 from cyphar/fix-pids-max-unlimited libcontainer: cgroups: deal with unlimited case for pids.max	2016-03-21 14:55:20 -07:00
Mrunal Patel	35541ebcd2	Add more information in the error messages when writing to a file This is helpful to debug "invalid argument" errors when writing to cgroup files Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-03-21 09:27:24 -07:00
Aleksa Sarai	f5e60cf775	libcontainer: cgroups: add statistics for kmem.tcp Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-03-20 22:04:02 +11:00
Aleksa Sarai	1448fe9568	libcontainer: cgroups: add support for kmem.tcp limits Kernel TCP memory has its own special knobs inside the cgroup. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-03-20 22:03:52 +11:00
Aleksa Sarai	a6d5179f60	libcontainer: cgroups: add tests for pids.max == "max" Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-03-18 08:46:24 +11:00
Aleksa Sarai	087b953dc5	libcontainer: cgroups: deal with unlimited case for pids.max Make sure we don't error out collecting statistics for cases where pids.max == "max". In that case, we can use a limit of 0 which means "unlimited". In addition, change the name of the stats attribute (Max) to mirror the name of the resources attribute in the spec (Limit) so that it's consistent internally. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-03-18 08:46:24 +11:00
Jessica Frazelle	2c5b10189c	remove deadcode Signed-off-by: Jessica Frazelle <acidburn@docker.com>	2016-03-17 13:36:28 -07:00
Mrunal Patel	93d1a1a6ea	Set Delegate to true for cgroups transient units This is required because we manage some of the cgroups ourselves. This recommendation came from talking with systemd devs about some of the issues that we see when using the systemd cgroups driver. Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-03-16 09:44:27 -07:00
Aleksa Sarai	64286b443d	libcontainer: cgroups: add tests for pids.max in PidsStats Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-03-13 14:16:38 +11:00
Aleksa Sarai	2b1e086f62	libcontainer: cgroups: add pids.max to PidsStats In order to allow nice usage statistics (in terms of percentages and other such data), add the value of pids.max to the PidsStats struct returned from the pids cgroup controller. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-03-13 04:53:20 +11:00
Mrunal Patel	5b439d8c48	Merge pull request #491 from hqhq/hq_cleanup_systemd_apply Cleanup systemd apply	2016-03-08 08:32:02 -08:00
Mrunal Patel	6fc66fea48	Merge pull request #601 from LK4D4/fix_stats_race Fix race between Apply and GetStats	2016-02-29 11:01:09 -08:00
Alexander Morozov	e5906f7ed5	Fix race between Apply and GetStats Signed-off-by: Alexander Morozov <lk4d4@docker.com>	2016-02-29 08:50:42 -08:00
rajasec	3b2805834b	Adding linux label to test file Signed-off-by: rajasec <rajasec79@gmail.com> Fixed review comments Signed-off-by: rajasec <rajasec79@gmail.com>	2016-02-25 07:52:32 +05:30
Phil Estes	0b5581fd28	Handle memory swappiness as a pointer to handle default/unset case This prior fix to set "-1" explicitly was lost, and it is simpler to use the same pointer type from the OCI spec to handle nil pointer == -1 == unset case. Also, as a nearly humorous aside, there was a test for MemorySwappiness that was actually setting Memory, and it was passing because of this bug (as it was always setting everyone's MemorySwappiness to zero!) Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)	2016-02-24 09:02:06 -06:00
Michael Crosby	ee6a72df4e	Merge pull request #577 from crosbymichael/m-named-cgroup Move the process outside of the systemd cgroup	2016-02-19 13:51:58 -08:00

436 Commits