Commit Graph

53 Commits

Author SHA1 Message Date
Yong Tang e9944d0f4c Disable systemd in static build
This fix tries to address the warnings caused by static build
with go 1.9. As systemd needs dlopen/dlclose, the following warnings
will be generated for static build in go 1.9:
```
root@f4b077232050:/go/src/github.com/opencontainers/runc# make static
CGO_ENABLED=1 go build  -tags "seccomp cgo static_build" -ldflags "-w -extldflags -static -X main.gitCommit="1c81e2a794c6e26a4c650142ae8893c47f619764" -X main.version=1.0.0-rc4+dev " -o runc .
/tmp/go-link-113476657/000007.o: In function `_cgo_a5acef59ed3f_Cfunc_dlopen':
/tmp/go-build/github.com/opencontainers/runc/vendor/github.com/coreos/pkg/dlopen/_obj/cgo-gcc-prolog:76: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
```

This fix disables systemd when `static_build` flag is on (apply_nosystemd.go
is used instead).

This fix also fixes a small bug in `apply_nosystemd.go` for return value.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-09-11 18:38:22 +00:00
Qiang Huang acaf6897f5 Fix systemd cgroup after memory type changed
Fixes: #1557

I'm not quite sure about the root cause, looks like
systemd still want them to be uint64.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-08-25 01:14:16 -04:00
Tobias Klauser e4e56cb6d8 libcontainer: remove ineffective break statements
go's switch statement doesn't need an explicit break. Remove it where
that is the case and add a comment to indicate the purpose where the
removal would lead to an empty case.

Found with honnef.co/go/tools/cmd/staticcheck

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-07-28 15:13:39 +02:00
Aleksa Sarai baeef29858
rootless: add rootless cgroup manager
The rootless cgroup manager acts as a noop for all set and apply
operations. It is just used for rootless setups. Currently this is far
too simple (we need to add opportunistic cgroup management), but is good
enough as a first-pass at a noop cgroup manager.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00
Qiang Huang 8430cc4f48 Use uint64 for resources to keep consistency with runtime-spec
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-20 18:51:39 +08:00
Qiang Huang 8773c5f9a6 Remove unused function in systemd cgroup
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-07 15:11:37 +08:00
xuxinkun c44aec9b23 fix cpu.cfs_quota_us changed when systemd daemon-reload using systemd.
Signed-off-by: xuxinkun <xuxinkun@gmail.com>
2017-03-06 20:08:30 +11:00
Derek Carr d223e2adae Ignore error when starting transient unit that already exists
Signed-off-by: Derek Carr <decarr@redhat.com>
2016-10-19 14:55:52 -04:00
Daniel Dao 1b876b0bf2 fix typos with misspell
pipe the source through https://github.com/client9/misspell. typos be gone!

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2016-10-11 23:22:48 +00:00
derekwaynecarr 1a75f815d5 systemd cgroup driver supports slice management
Signed-off-by: derekwaynecarr <decarr@redhat.com>
2016-09-27 16:01:37 -04:00
Qiang Huang 50f0a2b1e1 Merge pull request #962 from dubstack/fix_kmem_limits
Remove kmem Initialization check while setting memory configuration
2016-08-02 10:04:18 +08:00
Buddha Prakash fcd966f501 Remove kmem Initialization check
Signed-off-by: Buddha Prakash <buddhap@google.com>
2016-08-01 09:47:34 -07:00
Seth Jennings 4b44b98596 fix init.scope in cgroup paths
Signed-off-by: Seth Jennings <sjenning@redhat.com>
2016-08-01 11:14:29 -05:00
Daniel, Dao Quang Minh ff88baa42f Merge pull request #611 from mrunalp/fix_set
Fix cgroup Set when Paths are specified
2016-07-21 14:00:22 +01:00
Daniel, Dao Quang Minh d5ecf5c67c systemd cgroup: check for Delegate property
Delegate is only available in systemd >218, applying it for older systemd will
result in an error. Therefore we should check for it when testing systemd
properties.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-06-01 14:32:24 +00:00
Mrunal Patel 4a8f0b4db4 Fix cgroup Set when Paths are specified
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-05-09 16:06:03 -07:00
Kenfe-Mickael Laventure 27814ee120 Allow updating kmem.limit_in_bytes if initialized at cgroup creation
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-05-06 08:05:15 -07:00
Akihiro Suda 1829531241 Fix trivial style errors reported by `go vet` and `golint`
No substantial code change.
Note that some style errors reported by `golint` are not fixed due to possible compatibility issues.

Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>
2016-04-12 08:13:16 +00:00
Jessica Frazelle 2c5b10189c
remove deadcode
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-17 13:36:28 -07:00
Mrunal Patel 93d1a1a6ea Set Delegate to true for cgroups transient units
This is required because we manage some of the cgroups ourselves.
This recommendation came from talking with systemd devs about
some of the issues that we see when using the systemd cgroups driver.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-03-16 09:44:27 -07:00
Qiang Huang bda7742019 Cleanup systemd apply
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-02-15 15:56:59 +08:00
Aleksa Sarai 57ba666ef3 cgroup: systemd: further systemd slice validation
Add some further (not critical, since Docker does this already)
validation to systemd slice names, to make sure users don't get cryptic
errors.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-27 19:00:52 +11:00
Aleksa Sarai 8b32914065 cgroup: systemd: properly expand systemd slice names
Rather than using '/' to denote hierarchy in slice names, systemd uses
'-' in an odd way. This results in runC incorrectly assuming that
certain kernel features are missing (and using inconsistent paths for
the cgroups not supported by systemd), because the "subsystem path" used
is not the one that systemd has created. Fix all of this by properly
expanding slice names.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-25 23:18:34 +11:00
Aleksa Sarai 75e38f94a0 cgroups: set memory cgroups in Set
Modify the memory cgroup code such that kmem is not managed by Set(), in
order to allow updating of memory constraints for containers by Docker.
This also removes the need to make memory a special case cgroup.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-22 07:46:43 +11:00
Mrunal Patel 41d9d26513 Add support for just joining in apply using cgroup paths
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-01-20 14:23:05 -05:00
Mrunal Patel 4c767d7046 Merge pull request #446 from cyphar/18-add-pids-controller
cgroup: add PIDs cgroup controller support
2016-01-11 16:56:00 -08:00
Aleksa Sarai a95483402e libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai f36ed4b174 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

One of the weird cases to deal with is systemd. Systemd sets some of the
cgroup configuration options, but not all of them. Because memory is a
special case, we need to explicitly set memory in the systemd Apply().
Otherwise, the rest can be safely re-applied in .Set() as usual.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai db3159c9d9 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:32 +11:00
Jimmi Dyson 91c7024e52 Revert to non-recursive GetPids, add recursive GetAllPids
Signed-off-by: Jimmi Dyson <jimmidyson@gmail.com>
2016-01-08 19:42:25 +00:00
Mrunal Patel 4124ba9468 Revert "cgroups: add pids controller support"
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-19 07:48:48 -08:00
Aleksa Sarai 88e6d489f6 libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 8a740d5391 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 37789f5bf1 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:38 +11:00
Mrunal Patel 55a49f2110 Move the cgroups setting into a Resources struct
This allows us to distinguish cases where a container
needs to just join the paths or also additionally
set cgroups settings. This will help in implementing
cgroupsPath support in the spec.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-16 15:53:31 -05:00
Qiang Huang 7695a0ddb0 systemd: support cgroup parent with specified slice
Pick up #119
Fixes: docker/docker#16681

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-12-02 23:57:02 -05:00
Michael Crosby ba2ce3b25a Cgroup set order for systemd
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-10-19 13:32:45 -07:00
Alexander Morozov 6dad176d01 Get PIDs from cgroups recursively
Also lookup cgroup for systemd is changed to "device" to be consistent
with fs implementation.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-10-13 10:19:01 -07:00
Mrunal Patel cc84f2cc9b Merge pull request #305 from hqhq/hq_add_softlimit_systemd
Add memory reservation support for systemd
2015-10-05 16:37:32 -07:00
Mrunal Patel 223975564a Merge pull request #276 from runcom/adapt-spec-96bcd043aa8a28f6f64c95ad61329765f01de1ba
Adapt spec 96bcd043aa
2015-10-05 16:36:09 -07:00
Mrunal Patel 79a02e35fb cgroups: Add name=systemd to list of subsystems
This allows getting the path to the subsystem and so is subsequently
used in EnterPid by an exec process.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-10-05 14:24:11 -04:00
Antonio Murdaca c6e406af24 Adjust runc to new opencontainers/specs version
Godeps: Vendor opencontainers/specs 96bcd043aa

Fix a bug where it's impossible to pass multiple devices to blkio
cgroup controller files. See https://github.com/opencontainers/runc/issues/274

Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-10-03 12:25:33 +02:00
Qiang Huang 6a5ba1109c Systemd: Join perf_event cgroup
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-29 15:42:29 +08:00
Qiang Huang fb5a56fb97 Add memory reservation support for systemd
Seems it's missed in the first place.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-29 10:02:12 +08:00
Michael Crosby b1821a4edc Merge pull request #150 from runcom/update-go-systemd-dbus-v3
Update go systemd dbus v3
2015-08-03 16:11:52 -04:00
Kir Kolyshkin 6f82d4b544 Simplify and fix os.MkdirAll() usage
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.

Quoting MkdirAll documentation:

> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.

This means two things:

1. If a directory to be created already exists, no error is
returned.

2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.

The above is a theory, based on quoted documentation and my UNIX
knowledge.

3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.

Because of #1, IsExist check after MkdirAll is not needed.

Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.

Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.

[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
2015-07-29 18:03:27 -07:00
Antonio Murdaca 5eab2d59d3 Swap check for systemd booted to use go-systemd method
Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-07-25 01:36:14 +02:00
Antonio Murdaca 15741a4ab3 Adapt code to go-systemd/dbus v3
Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-07-24 15:54:59 +02:00
Qiang Huang b4d1df0131 Add oom-kill-disable support for systemd
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-07-08 09:21:46 +08:00
Raghavendra K T 88104a4444 Treat -1 as default value for memory swappiness.
In some older kernels setting swappiness fails. This happens even
when nobody tries to configure swappiness from docker UI because
we would still get some default value from host config.
With this we treat -1 value as default value (set implicitly) and skip
the enforcement of swappiness.

However from the docker UI setting an invalid value anything other than
0-100 including -1 should fail. This patch enables that fix in docker UI.

without this fix container creation with invalid value succeeds with a
default value (60) which in incorrect.

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
2015-07-03 18:19:45 +05:30