jasder/runc - runc - 军科开源项目托管

Commit Graph

Author	SHA1	Message	Date
Kir Kolyshkin	c86be8a2c1	cgroupv2: fix setting MemorySwap The resources.MemorySwap field from OCI is memory+swap, while cgroupv2 has a separate swap limit, so subtract memory from the limit (and make sure values are set and sane). Make sure to set MemorySwapMax for systemd, too. Since systemd does not have MemorySwapMax for cgroupv1, it is only needed for v2 driver. [v2: return -1 on any negative value, add unit test] [v3: treat any negative value other than -1 as error] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-07 20:45:53 -07:00
Tobias Klauser	3e678c08f9	Remove unused consts testScopeWait and testSliceWait These are unused since commit `518c855833` ("Remove libcontainer detection for systemd features") Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2020-04-03 21:09:43 +02:00
Mrunal Patel	d05e5728aa	systemd: Lazy initialize the systemd dbus connection Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2020-03-30 15:24:06 -07:00
Mrunal Patel	33c6125da6	systemd: Export IsSystemdRunning() function Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2020-03-30 15:24:06 -07:00
Kir Kolyshkin	a949e4f22f	cgroupv2: UnifiedManager.Apply: simplify Remove joinCgroupsV2() function, as its name and second parameter are misleading. Use createCgroupsv2Path() directly, do not call getv2Path() twice. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 19:20:00 -07:00
Kir Kolyshkin	5406833a65	cgroupv2/systemd: add getv2Path Function getSubsystemPath(), while works for v2 unified case, is suboptimal, as it does a few unnecessary calls. Add a simplified version of getSubsystemPath(), called getv2Path(), and use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 19:17:09 -07:00
Kir Kolyshkin	ec1f957b23	cgroupv2: don't use getSubsystemPath in Apply This code is a copy-paste from cgroupv1 systemd code. Its aim is to check whether a subsystem is available, and skip those that are not. In case v2 unified hierarchy is used, getSubsystemPath never returns "not found" error, so calling it is useless. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-26 13:32:34 -07:00
Kir Kolyshkin	a675b5ebea	cgroupv2: don't try to set kmem for systemd case To the best of my knowledge, it has been decided to drop the kernel memory controller from the cgroupv2 hierarchy, so "kernel memory limits" do not exist if we're using v2 unified. So, we need to ignore kernel memory setting. This was already done in non-systemd case (see commit `88e8350de`), let's do the same for systemd. This fixes the following error: > container_linux.go:349: starting container process caused "process_linux.go:306: applying cgroup configuration for process caused \"open /sys/fs/cgroup/machine.slice/runc-cgroups-integration-test.scope/tasks: no such file or directory\"" Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-25 20:00:23 -07:00
Mrunal Patel	7de5db3dad	Merge pull request #2263 from kolyshkin/nits Assorted minor nits in libcontainer	2020-03-24 14:17:22 -07:00
Kir Kolyshkin	12dc475dd6	libcontainer: simplify createCgroupsv2Path fmt.Sprintf is slow and is not needed here, string concatenation would be sufficient. It is also redundant to convert []byte from string and back, since `bytes` package now provides the same functions as `strings`. Use Fields() instead of TrimSpace() and Split(), mainly for readability (note Fields() is somewhat slower than Split() but here it doesn't matter much). Use Join() to prepend the plus signs. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-20 11:51:55 -07:00
Akihiro Suda	492d525e55	vendor: update go-systemd and godbus Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-03-16 13:26:03 +09:00
Qiang Huang	3b7e32feba	Merge pull request #2210 from Zyqsempai/2164-remove-deprecated-systemd-resources Exchange deprecated systemd resources with the appropriate for cgroupv2	2020-02-29 10:13:55 +08:00
Kir Kolyshkin	4c5c3fb960	Support for setting systemd properties via annotations In case systemd is used to set cgroups for the container, it creates a scope unit dedicated to it (usually named `runc-$ID.scope`). This patch adds an ability to set arbitrary systemd properties for the systemd unit via runtime spec annotations. Initially this was developed as an ability to specify the `TimeoutStopUSec` property, but later generalized to work with arbitrary ones. Example usage: add the following to runtime spec (config.json): ``` "annotations": { "org.systemd.property.TimeoutStopUSec": "uint64 123456789", "org.systemd.property.CollectMode":"'inactive-or-failed'" }, ``` and start the container (e.g. `runc --systemd-cgroup run $ID`). The above will set the following systemd parameters: * `TimeoutStopSec` to 2 minutes and 3 seconds, * `CollectMode` to "inactive-or-failed". The values are in the gvariant format (see [1]). To figure out which type systemd expects for a particular parameter, see systemd sources. In particular, parameters with `USec` suffix require an `uint64` typed argument, while gvariant assumes int32 for a numeric values, therefore the explicit type is required. NOTE that systemd receives the time-typed parameters as USec but shows them (in `systemctl show`) as Sec. For example, the stop timeout should be set as `TimeoutStopUSec` but is shown as `TimeoutStopSec`. [1] https://developer.gnome.org/glib/stable/gvariant-text.html Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-02-17 16:07:19 -08:00
Boris Popovschi	5b96f314ba	Exchanged deprecated systemd resources with the appropriate for cgroupv2 Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>	2020-01-15 18:09:33 +02:00
Mrunal Patel	5cc0deaf7a	Merge pull request #2169 from AkihiroSuda/split-fs cgroup2: split fs2 from fs	2020-01-13 16:23:27 -08:00
Julio Montes	8ddd892072	libcontainer: add method to get cgroup config from cgroup Manager `configs.Cgroup` contains the configuration used to create cgroups. This configuration must be saved to disk, since it's required to restore the cgroup manager that was used to create the cgroups. Add method to get cgroup configuration from cgroup Manager to allow API users save it to disk and restore a cgroup manager later. fixes #2176 Signed-off-by: Julio Montes <julio.montes@intel.com>	2019-12-17 22:46:03 +00:00
Akihiro Suda	88e8350de2	cgroup2: split fs2 from fs split fs2 package from fs, as mixing up fs and fs2 is very likely to result in unmaintainable code. Inspired by containerd/cgroups#109 Fix #2157 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-12-06 15:42:10 +09:00
Mrunal Patel	46def4cc4c	Merge pull request #2154 from jpeach/2008-remove-static-build-tag Remove the static_build build tag.	2019-11-04 17:10:59 -08:00
Akihiro Suda	faf673ee45	cgroup2: port over eBPF device controller from crun The implementation is based on https://github.com/containers/crun/blob/0.10.2/src/libcrun/ebpf.c Although ebpf.c is originally licensed under LGPL-3.0-or-later, the author Giuseppe Scrivano agreed to relicense the file in Apache License 2.0: https://github.com/opencontainers/runc/issues/2144#issuecomment-543116397 See libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go for tested configurations. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-31 14:01:46 +09:00
James Peach	13919f5dfd	Remove the static_build build tag. The `static_build` build tag was introduced in `e9944d0f` to remove build warnings related to systemd cgroup driver dependencies. Since then, those dependencies have changed and building the systemd cgroup driver no longer imports dlopen. After this change, runc builds will always include the systemd cgroup driver. This fixes #2008. Signed-off-by: James Peach <jpeach@apache.org>	2019-10-26 08:28:45 +11:00
Akihiro Suda	dbd771e475	cgroup2: implement `runc ps` Implemented `runc ps` for cgroup v2 , using a newly added method `m.GetUnifiedPath()`. Unlike the v1 implementation that checks `m.GetPaths()["devices"]`, the v2 implementation does not require the device controller to be available. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-10-19 01:59:24 +09:00
Giuseppe Scrivano	524cb7c318	libcontainer: add systemd.UnifiedManager Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-09-05 13:02:27 +02:00
Giuseppe Scrivano	ec11136828	libcontainer, cgroups: rename systemd.Manager to LegacyManager Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-09-05 13:02:26 +02:00
Mrunal Patel	3525eddec5	Merge pull request #2117 from filbranden/detection1 Remove libcontainer detection for systemd features	2019-08-25 13:15:15 -07:00
Filipe Brandenburger	518c855833	Remove libcontainer detection for systemd features Transient units (and transient slice units) have been available for quite a long time and RHEL 7 with systemd v219 (likely the oldest OS we care about at this point) supports that. A system running a systemd without these features is likely to break a lot of other stuff that runc/libcontainer care about. Regarding delegated slices, modern systemd doesn't allow it and runc/libcontainer run fine on it, so we might as well just stop requesting it on older versions of systemd which allowed it. (Those versions never really changed behavior significantly when that option was passed anyways.) Signed-off-by: Filipe Brandenburger <filbranden@gmail.com>	2019-08-22 21:53:24 -07:00
Filipe Brandenburger	588f040a77	Avoid the dependency on cgo through go-systemd/util package This dependency is only needed in package "github.com/coreos/go-systemd/util" and we only use it for IsRunningSystemd(), which is a simple Go function that just stats a file. Let's just borrow it here, so we remove the dependency and can remove that package from vendored build. This also removes dependencies on dlopen and on trying to find libsystemd.so or libsystemd-login.so in the system. Tested that this still builds and works as expected. Signed-off-by: Filipe Brandenburger <filbranden@gmail.com>	2019-08-22 21:07:24 -07:00
Filipe Brandenburger	46351eb3d1	Move systemd.Manager initialization into a function in that module This will permit us to extend the internals of systemd.Manager to include further information about the system, such as whether cgroupv1, cgroupv2 or both are in effect. Furthermore, it allows a future refactor of moving more of UseSystemd() code into the factory initialization function. Signed-off-by: Filipe Brandenburger <filbranden@gmail.com>	2019-05-01 13:22:19 -07:00
Filipe Brandenburger	cd41feb46b	Remove detection for scope properties, which have always been broken The detection for scope properties (whether scope units support DefaultDependencies= or Delegate=) has always been broken, since systemd refuses to create scopes unless at least one PID is attached to it (and this has been so since scope units were introduced in systemd v205.) This can be seen in journal logs whenever a container is started with libpod: Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing. Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing. Since this logic never worked, just assume both attributes are supported (which is what the code does when detection fails for this reason, since it's looking for an "unknown attribute" or "read-only attribute" to mark them as false) and skip the detection altogether. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2019-02-11 16:05:37 -08:00
Giuseppe Scrivano	f01923376d	systemd: fix setting kernel memory limit since commit `df3fa115f9` it is not possible to set a kernel memory limit when using the systemd cgroups backend as we use cgroup.Apply twice. Skip enabling kernel memory if there are already tasks in the cgroup. Without this patch, runc fails with: container_linux.go:344: starting container process caused "process_linux.go:311: applying cgroup configuration for process caused \"failed to set memory.kmem.limit_in_bytes, because either tasks have already joined this cgroup or it has children\"" Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2019-01-10 11:33:50 +01:00
Michael Crosby	76520a4bf0	Merge pull request #1872 from masters-of-cats/better-find-cgroup-mountpoint Respect container's cgroup path	2018-11-16 14:06:54 -05:00
Sergio Lopez	5c6b9c3c1c	libcontainer: map PidsLimit to systemd's TasksMax property Currently runc applies PidsLimit restriction by writing directly to cgroup's pids.max, without notifying systemd. As a consequence, when the later updates the context of the corresponding scope, pids.max is reset to the value of systemd's TasksMax property. This can be easily reproduced this way (I'm using "postfix" here just an example, any unrelated but existing service will do): # CTR=`docker run --pids-limit 111 --detach --rm busybox /bin/sleep 8h` # cat /sys/fs/cgroup/pids/system.slice/docker-${CTR}.scope/pids.max 111 # systemctl disable --now postfix # systemctl enable --now postfix # cat /sys/fs/cgroup/pids/system.slice/docker-${CTR}.scope/pids.max max This patch adds TasksAccounting=true and TasksMax=PidsLimit to the properties sent to systemd. Signed-off-by: Sergio Lopez <slp@redhat.com>	2018-10-24 17:20:27 +02:00
Danail Branekov	a1d5398afa	Respect container's cgroup path Respect the container's cgroup path when finding the container's cgroup mount point, which is useful in multi-tenant environments, where containers have their own unique cgroup mounts Signed-off-by: Danail Branekov <danailster@gmail.com> Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io> Signed-off-by: Giuseppe Capizzi <gcapizzi@pivotal.io>	2018-09-25 17:43:36 +01:00
Derek Carr	b515963c10	systemd cpu quota ignores -1 Signed-off-by: Derek Carr <decarr@redhat.com>	2018-05-23 14:28:39 -04:00
Filipe Brandenburger	165ee45334	Make channel for StartTransientUnit buffered So that, if a timeout happens and we decide to stop blocking on the operation, the writer will not block when they try to report the result of the operation. This should address Issue #1780 and it's a follow up for PR #1683, PR #1754 and PR #1772. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-14 08:49:50 -07:00
Filipe Brandenburger	0e16bd9b53	Detect whether Delegate is available on both slices and scopes Starting with systemd 237, in preparation for cgroup v2, delegation is only now available for scopes, not slices. Update libcontainer code to detect whether delegation is available on both and use that information when creating new slices. Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-10 11:42:55 -07:00
Filipe Brandenburger	8ab251f298	Fix systemd.Apply() to check for DBus error before waiting on a channel. The channel was introduced in #1683 to work around a race condition. However, the check for error in StartTransientUnit ignores the error for an already existing unit, and in that case there will be no notification from DBus (so waiting on the channel will make it hang.) Later PR #1754 added a timeout, which worked around the issue, but we can fix this correctly by only waiting on the channel when there is no error. Fix the code to do so. The timeout handling was kept, since there might be other cases where this situation occurs (https://bugzilla.redhat.com/show_bug.cgi?id=1548358 mentions calling this code from inside a container, it's unclear whether an existing container was in use or not, so not sure whether this would have fixed that bug as well.) Signed-off-by: Filipe Brandenburger <filbranden@google.com>	2018-04-09 11:51:59 -07:00
vikaschoudhary16	04e95b526d	Add timeout while waiting for StartTransientUnit completion signal from dbus Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2018-03-07 05:11:38 -05:00
Michael Crosby	595bea022f	Merge pull request #1722 from ravisantoshgudimetla/fix-systemd-path fix systemd slice expansion so that it could be consumed by cAdvisor	2018-02-20 09:59:24 -05:00
ravisantoshgudimetla	7019e1de7b	fix systemd slice expansion so that it could be consumed by cAdvisor Signed-off-by: ravisantoshgudimetla <ravisantoshgudimetla@gmail.com>	2018-02-18 21:32:39 -05:00
vikaschoudhary16	d5b4a3eddb	Fix race against systemd - T0: runc triggers a systemd unit creation asynchronously from [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L298) - T1: runc then moves ahead and starts creating cgroup paths(.scope directories), [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L348). Kernel creates .scope directory and cgroup.procs file(along with other default files) in the directory automatically, in an atomic manner. - T3: systemd execution thread which was invoked at time `T0`, is still in the process of unit creation. systemd also trying to create cgroup paths and deletes the `.scope` directory which is created at time `T1` by runc from [here](https://github.com/systemd/systemd/blob/v219/src/shared/cgroup-util.c#L1630) in the code Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2018-01-08 09:37:26 -05:00
Seth Jennings	bca53e7b49	systemd: adjust CPUQuotaPerSecUSec to compensate for systemd internal handling Signed-off-by: Seth Jennings <sjenning@redhat.com>	2017-11-15 20:20:06 -06:00
Yong Tang	e9944d0f4c	Disable systemd in static build This fix tries to address the warnings caused by static build with go 1.9. As systemd needs dlopen/dlclose, the following warnings will be generated for static build in go 1.9: ``` root@f4b077232050:/go/src/github.com/opencontainers/runc# make static CGO_ENABLED=1 go build -tags "seccomp cgo static_build" -ldflags "-w -extldflags -static -X main.gitCommit="1c81e2a794c6e26a4c650142ae8893c47f619764" -X main.version=1.0.0-rc4+dev " -o runc . /tmp/go-link-113476657/000007.o: In function `_cgo_a5acef59ed3f_Cfunc_dlopen': /tmp/go-build/github.com/opencontainers/runc/vendor/github.com/coreos/pkg/dlopen/_obj/cgo-gcc-prolog:76: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking ``` This fix disables systemd when `static_build` flag is on (apply_nosystemd.go is used instead). This fix also fixes a small bug in `apply_nosystemd.go` for return value. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2017-09-11 18:38:22 +00:00
Qiang Huang	acaf6897f5	Fix systemd cgroup after memory type changed Fixes: #1557 I'm not quite sure about the root cause, looks like systemd still want them to be uint64. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-08-25 01:14:16 -04:00
Tobias Klauser	e4e56cb6d8	libcontainer: remove ineffective break statements go's switch statement doesn't need an explicit break. Remove it where that is the case and add a comment to indicate the purpose where the removal would lead to an empty case. Found with honnef.co/go/tools/cmd/staticcheck Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2017-07-28 15:13:39 +02:00
Aleksa Sarai	baeef29858	rootless: add rootless cgroup manager The rootless cgroup manager acts as a noop for all set and apply operations. It is just used for rootless setups. Currently this is far too simple (we need to add opportunistic cgroup management), but is good enough as a first-pass at a noop cgroup manager. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:46:20 +11:00
Qiang Huang	8430cc4f48	Use uint64 for resources to keep consistency with runtime-spec Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-03-20 18:51:39 +08:00
Qiang Huang	8773c5f9a6	Remove unused function in systemd cgroup Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-03-07 15:11:37 +08:00
xuxinkun	c44aec9b23	fix cpu.cfs_quota_us changed when systemd daemon-reload using systemd. Signed-off-by: xuxinkun <xuxinkun@gmail.com>	2017-03-06 20:08:30 +11:00
Derek Carr	d223e2adae	Ignore error when starting transient unit that already exists Signed-off-by: Derek Carr <decarr@redhat.com>	2016-10-19 14:55:52 -04:00
Daniel Dao	1b876b0bf2	fix typos with misspell pipe the source through https://github.com/client9/misspell. typos be gone! Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2016-10-11 23:22:48 +00:00
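
Commit c86be8a2c1 in the table above describes converting the OCI `MemorySwap` value (memory plus swap) into the separate swap limit that cgroupv2 expects. The following is a minimal sketch of that arithmetic, not the code from the commit: the function name `swapLimitV2` and the error messages are hypothetical, and treating 0 as "not set" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// swapLimitV2 is a hypothetical helper (not runc's actual function).
// It converts an OCI-style memory+swap limit into a swap-only limit:
//   - 0 is assumed to mean "not set" and is passed through,
//   - -1 means "unlimited" and is passed through,
//   - any other negative value is rejected as an error.
func swapLimitV2(memorySwap, memory int64) (int64, error) {
	switch {
	case memorySwap == 0 || memorySwap == -1:
		return memorySwap, nil
	case memorySwap < 0:
		return 0, errors.New("invalid memory+swap limit")
	case memory <= 0:
		return 0, errors.New("swap limit requires a positive memory limit")
	case memorySwap < memory:
		return 0, errors.New("memory+swap limit is smaller than memory limit")
	}
	// Subtract the memory part so that only the swap part remains.
	return memorySwap - memory, nil
}

func main() {
	// 300 MiB memory+swap with 200 MiB memory leaves 100 MiB of swap.
	swap, err := swapLimitV2(300<<20, 200<<20)
	fmt.Println(swap, err) // 104857600 <nil>
}
```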


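Commit 12dc475dd6 in the table above mentions replacing `TrimSpace()` plus `Split()` with `Fields()`, and using `Join()` to prepend plus signs. A small sketch of that idea, using `strings` rather than `bytes` for brevity, might look as follows. The function name `controllersToEnable` is hypothetical, and the assumption that the input is a space-separated controller list (as found in a cgroupv2 `cgroup.controllers` file) is not stated in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// controllersToEnable is a hypothetical helper. It turns a whitespace
// separated list such as "cpu io memory\n" into "+cpu +io +memory".
func controllersToEnable(content string) string {
	// Fields splits on any run of whitespace and drops the trailing
	// newline, so no separate TrimSpace call is needed.
	names := strings.Fields(content)
	if len(names) == 0 {
		return ""
	}
	// Join adds " +" between the names; the first "+" is prepended by hand.
	return "+" + strings.Join(names, " +")
}

func main() {
	fmt.Println(controllersToEnable("cpu io memory\n")) // +cpu +io +memory
}
```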
94 Commits