60 Commits

Author SHA1 Message Date
Filipe Brandenburger 0e16bd9b53 Detect whether Delegate is available on both slices and scopes
Starting with systemd 237, in preparation for cgroup v2, delegation is
now available only for scopes, not slices.

Update libcontainer code to detect whether delegation is available on
both and use that information when creating new slices.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
2018-04-10 11:42:55 -07:00
Filipe Brandenburger 8ab251f298 Fix systemd.Apply() to check for DBus error before waiting on a channel.
The channel was introduced in #1683 to work around a race condition.
However, the check for error in StartTransientUnit ignores the error for
an already existing unit, and in that case there will be no notification
from DBus (so waiting on the channel will make it hang.)

Later PR #1754 added a timeout, which worked around the issue, but we
can fix this correctly by only waiting on the channel when there is no
error. Fix the code to do so.

The timeout handling was kept, since there might be other cases where
this situation occurs (https://bugzilla.redhat.com/show_bug.cgi?id=1548358
mentions calling this code from inside a container; it's unclear whether
an existing container was in use or not, so it is not certain that this
would have fixed that bug as well).

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
2018-04-09 11:51:59 -07:00
vikaschoudhary16 04e95b526d Add timeout while waiting for StartTransientUnit completion signal from dbus
Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
2018-03-07 05:11:38 -05:00
Michael Crosby 595bea022f Merge pull request #1722 from ravisantoshgudimetla/fix-systemd-path
fix systemd slice expansion so that it could be consumed by cAdvisor
2018-02-20 09:59:24 -05:00
ravisantoshgudimetla 7019e1de7b fix systemd slice expansion so that it could be consumed by cAdvisor
Signed-off-by: ravisantoshgudimetla <ravisantoshgudimetla@gmail.com>
2018-02-18 21:32:39 -05:00
vikaschoudhary16 d5b4a3eddb Fix race against systemd
- T0: runc triggers a systemd unit creation asynchronously from [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L298)
- T1: runc then moves ahead and starts creating cgroup paths (`.scope` directories), [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L348). The kernel automatically and atomically creates the `.scope` directory and the cgroup.procs file (along with other default files) in that directory.
- T3: the systemd execution thread that was invoked at time `T0` is still in the process of unit creation. systemd also tries to create the cgroup paths and deletes the `.scope` directory that runc created at time `T1`, from [here](https://github.com/systemd/systemd/blob/v219/src/shared/cgroup-util.c#L1630) in the code.

Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
2018-01-08 09:37:26 -05:00
Seth Jennings bca53e7b49 systemd: adjust CPUQuotaPerSecUSec to compensate for systemd internal handling
Signed-off-by: Seth Jennings <sjenning@redhat.com>
2017-11-15 20:20:06 -06:00
Yong Tang e9944d0f4c Disable systemd in static build
This fix tries to address the warnings caused by static build
with go 1.9. As systemd needs dlopen/dlclose, the following warnings
will be generated for static build in go 1.9:
```
root@f4b077232050:/go/src/github.com/opencontainers/runc# make static
CGO_ENABLED=1 go build  -tags "seccomp cgo static_build" -ldflags "-w -extldflags -static -X main.gitCommit="1c81e2a794c6e26a4c650142ae8893c47f619764" -X main.version=1.0.0-rc4+dev " -o runc .
/tmp/go-link-113476657/000007.o: In function `_cgo_a5acef59ed3f_Cfunc_dlopen':
/tmp/go-build/github.com/opencontainers/runc/vendor/github.com/coreos/pkg/dlopen/_obj/cgo-gcc-prolog:76: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
```

This fix disables systemd when `static_build` flag is on (apply_nosystemd.go
is used instead).

This fix also fixes a small bug in `apply_nosystemd.go` for return value.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-09-11 18:38:22 +00:00
Qiang Huang acaf6897f5 Fix systemd cgroup after memory type changed
Fixes: #1557

I'm not quite sure about the root cause; it looks like
systemd still wants them to be uint64.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-08-25 01:14:16 -04:00
Tobias Klauser e4e56cb6d8 libcontainer: remove ineffective break statements
Go's switch statement doesn't need an explicit break. Remove it where
that is the case and add a comment to indicate the purpose where the
removal would lead to an empty case.

Found with honnef.co/go/tools/cmd/staticcheck

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-07-28 15:13:39 +02:00
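
The language rule behind e4e56cb6d8 can be shown with a small, self-contained example. The function `classify` is made up for illustration and does not come from libcontainer. Go cases do not fall through unless `fallthrough` is written explicitly, so a trailing `break` has no effect, and an intentionally empty case is better documented with a comment.

```go
package main

import "fmt"

// classify is a hypothetical function used only to illustrate switch
// semantics; it is not taken from libcontainer.
func classify(n int) string {
	label := "other"
	switch n {
	case 0:
		label = "zero"
		// No break needed: Go does not fall through to the next case.
	case 1:
		// Nothing to do here. The comment documents that this case is
		// intentionally empty, so label keeps its default value.
	case 2:
		label = "two"
	}
	return label
}

func main() {
	for _, n := range []int{0, 1, 2, 3} {
		fmt.Println(n, classify(n))
	}
}
```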
Aleksa Sarai baeef29858 rootless: add rootless cgroup manager
The rootless cgroup manager acts as a noop for all set and apply
operations. It is just used for rootless setups. Currently this is far
too simple (we need to add opportunistic cgroup management), but is good
enough as a first-pass at a noop cgroup manager.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00
Qiang Huang 8430cc4f48 Use uint64 for resources to keep consistency with runtime-spec
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-20 18:51:39 +08:00
Qiang Huang 8773c5f9a6 Remove unused function in systemd cgroup
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-07 15:11:37 +08:00
xuxinkun c44aec9b23 Fix cpu.cfs_quota_us being changed by systemd daemon-reload when using systemd.
Signed-off-by: xuxinkun <xuxinkun@gmail.com>
2017-03-06 20:08:30 +11:00
Derek Carr d223e2adae Ignore error when starting transient unit that already exists
Signed-off-by: Derek Carr <decarr@redhat.com>
2016-10-19 14:55:52 -04:00
Daniel Dao 1b876b0bf2 fix typos with misspell
pipe the source through https://github.com/client9/misspell. typos be gone!

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2016-10-11 23:22:48 +00:00
derekwaynecarr 1a75f815d5 systemd cgroup driver supports slice management
Signed-off-by: derekwaynecarr <decarr@redhat.com>
2016-09-27 16:01:37 -04:00
Qiang Huang 50f0a2b1e1 Merge pull request #962 from dubstack/fix_kmem_limits
Remove kmem Initialization check while setting memory configuration
2016-08-02 10:04:18 +08:00
Buddha Prakash fcd966f501 Remove kmem Initialization check
Signed-off-by: Buddha Prakash <buddhap@google.com>
2016-08-01 09:47:34 -07:00
Seth Jennings 4b44b98596 fix init.scope in cgroup paths
Signed-off-by: Seth Jennings <sjenning@redhat.com>
2016-08-01 11:14:29 -05:00
Daniel, Dao Quang Minh ff88baa42f Merge pull request #611 from mrunalp/fix_set
Fix cgroup Set when Paths are specified
2016-07-21 14:00:22 +01:00
Daniel, Dao Quang Minh d5ecf5c67c systemd cgroup: check for Delegate property
Delegate is only available in systemd >218; applying it on older systemd will
result in an error. Therefore we should check for it when testing systemd
properties.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-06-01 14:32:24 +00:00
Mrunal Patel 4a8f0b4db4 Fix cgroup Set when Paths are specified
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-05-09 16:06:03 -07:00
Kenfe-Mickael Laventure 27814ee120 Allow updating kmem.limit_in_bytes if initialized at cgroup creation
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-05-06 08:05:15 -07:00
Akihiro Suda 1829531241 Fix trivial style errors reported by `go vet` and `golint`
No substantial code change.
Note that some style errors reported by `golint` are not fixed due to possible compatibility issues.

Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>
2016-04-12 08:13:16 +00:00
Jessica Frazelle 2c5b10189c remove deadcode
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-17 13:36:28 -07:00
Mrunal Patel 93d1a1a6ea Set Delegate to true for cgroups transient units
This is required because we manage some of the cgroups ourselves.
This recommendation came from talking with systemd devs about
some of the issues that we see when using the systemd cgroups driver.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-03-16 09:44:27 -07:00
Qiang Huang bda7742019 Cleanup systemd apply
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-02-15 15:56:59 +08:00
Aleksa Sarai 57ba666ef3 cgroup: systemd: further systemd slice validation
Add some further (not critical, since Docker does this already)
validation to systemd slice names, to make sure users don't get cryptic
errors.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-27 19:00:52 +11:00
Aleksa Sarai 8b32914065 cgroup: systemd: properly expand systemd slice names
Rather than using '/' to denote hierarchy in slice names, systemd uses
'-' in an odd way. This results in runC incorrectly assuming that
certain kernel features are missing (and using inconsistent paths for
the cgroups not supported by systemd), because the "subsystem path" used
is not the one that systemd has created. Fix all of this by properly
expanding slice names.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-25 23:18:34 +11:00
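
The expansion described in 8b32914065 can be sketched as follows. This is an illustrative reimplementation rather than the code from the commit: the name `expandSlice`, the error messages, and the choice to return a relative path without a leading or trailing slash are all assumptions. Each `-` in a slice name introduces one more level of nesting, so every prefix of the name becomes a parent directory.

```go
package main

import (
	"fmt"
	"strings"
)

// expandSlice turns a systemd slice name into the nested cgroup path
// that systemd creates for it. Hypothetical helper, written only to
// illustrate how "-" encodes hierarchy in slice names.
func expandSlice(slice string) (string, error) {
	const suffix = ".slice"
	if !strings.HasSuffix(slice, suffix) {
		return "", fmt.Errorf("invalid slice name: %q", slice)
	}
	// Path separators are not part of slice names.
	if strings.Contains(slice, "/") {
		return "", fmt.Errorf("invalid slice name: %q", slice)
	}
	name := strings.TrimSuffix(slice, suffix)

	var parts []string
	prefix := ""
	for _, component := range strings.Split(name, "-") {
		// Empty components (for example "a--b.slice") are rejected.
		if component == "" {
			return "", fmt.Errorf("invalid slice name: %q", slice)
		}
		// Each level repeats the full prefix of its ancestors.
		parts = append(parts, prefix+component+suffix)
		prefix += component + "-"
	}
	return strings.Join(parts, "/"), nil
}

func main() {
	path, err := expandSlice("system-runc.slice")
	fmt.Println(path, err)
}
```

Building the path from the accumulated prefix means the "subsystem path" is the same directory systemd itself creates, which is the mismatch the commit message describes.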
Aleksa Sarai 75e38f94a0 cgroups: set memory cgroups in Set
Modify the memory cgroup code such that kmem is not managed by Set(), in
order to allow updating of memory constraints for containers by Docker.
This also removes the need to make memory a special case cgroup.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-22 07:46:43 +11:00
Mrunal Patel 41d9d26513 Add support for just joining in apply using cgroup paths
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-01-20 14:23:05 -05:00
Mrunal Patel 4c767d7046 Merge pull request #446 from cyphar/18-add-pids-controller
cgroup: add PIDs cgroup controller support
2016-01-11 16:56:00 -08:00
Aleksa Sarai a95483402e libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai f36ed4b174 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

One of the weird cases to deal with is systemd. Systemd sets some of the
cgroup configuration options, but not all of them. Because memory is a
special case, we need to explicitly set memory in the systemd Apply().
Otherwise, the rest can be safely re-applied in .Set() as usual.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai db3159c9d9 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:32 +11:00
Jimmi Dyson 91c7024e52 Revert to non-recursive GetPids, add recursive GetAllPids
Signed-off-by: Jimmi Dyson <jimmidyson@gmail.com>
2016-01-08 19:42:25 +00:00
Mrunal Patel 4124ba9468 Revert "cgroups: add pids controller support"
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-19 07:48:48 -08:00
Aleksa Sarai 88e6d489f6 libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 8a740d5391 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 37789f5bf1 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:38 +11:00
Mrunal Patel 55a49f2110 Move the cgroups setting into a Resources struct
This allows us to distinguish cases where a container
needs to just join the paths or also additionally
set cgroups settings. This will help in implementing
cgroupsPath support in the spec.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-16 15:53:31 -05:00
Qiang Huang 7695a0ddb0 systemd: support cgroup parent with specified slice
Pick up #119
Fixes: docker/docker#16681

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-12-02 23:57:02 -05:00
Michael Crosby ba2ce3b25a Cgroup set order for systemd
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-10-19 13:32:45 -07:00
Alexander Morozov 6dad176d01 Get PIDs from cgroups recursively
Also, the lookup cgroup for systemd is changed to "device" to be consistent
with the fs implementation.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-10-13 10:19:01 -07:00
Mrunal Patel cc84f2cc9b Merge pull request #305 from hqhq/hq_add_softlimit_systemd
Add memory reservation support for systemd
2015-10-05 16:37:32 -07:00
Mrunal Patel 223975564a Merge pull request #276 from runcom/adapt-spec-96bcd043aa8a28f6f64c95ad61329765f01de1ba
Adapt spec 96bcd043aa
2015-10-05 16:36:09 -07:00
Mrunal Patel 79a02e35fb cgroups: Add name=systemd to list of subsystems
This allows getting the path to the subsystem and so is subsequently
used in EnterPid by an exec process.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-10-05 14:24:11 -04:00
Antonio Murdaca c6e406af24 Adjust runc to new opencontainers/specs version
Godeps: Vendor opencontainers/specs 96bcd043aa

Fix a bug where it's impossible to pass multiple devices to blkio
cgroup controller files. See https://github.com/opencontainers/runc/issues/274

Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-10-03 12:25:33 +02:00
Qiang Huang 6a5ba1109c Systemd: Join perf_event cgroup
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-29 15:42:29 +08:00