This (and the converting function) is only used by one of the four
cgroup drivers. The other three do some checking and conversion in
place, so let fs2 do the same.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The code that adds CpuQuotaPerSecUSec is the same in v1 and v2
systemd cgroup driver. Move it to common.
No functional change.
Note that the comment saying that we always set this property
contradicts the current code, and is therefore removed.
[v2: drop cgroupv1-specific comment]
[v3: drop returning error as it's not used]
[v4: remove an obsoleted comment]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use r instead of c.Resources for readability. No functional change.
This commit has been brought to you by '<,'>s/c\.Resources\./r./g
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Fix #2046
Previously, the test was failing with EINVAL when writing 500001 to `/sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/test-cgroup/cpu.rt_runtime_us`, because `/sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_runtime_us` was initialized with 0.
The issue had not been caught in Ubuntu 18.04 CI because it doesn't support rt.
Tested on Ubuntu 20.04.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
When we use cgroups with the systemd driver, systemd automatically removes
the cgroup path once all processes have exited. So we should check that the
cgroup path exists before accessing it, for example in `kill/ps`; otherwise
we get an error.
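A minimal sketch of such an existence check, assuming a hypothetical helper
named cgroupExists (not a function from this code base). A missing directory
is reported as "does not exist" rather than as an error:

```go
package main

import (
	"errors"
	"os"
)

// cgroupExists is a hypothetical helper: it reports whether the cgroup
// directory at path is still present. A missing directory is not an error
// here, since with the systemd driver it usually means that all processes
// exited and systemd already removed the cgroup.
func cgroupExists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, os.ErrNotExist) {
		return false, nil
	}
	// Any other failure (for example, permission denied) is returned as is.
	return false, err
}
```

A caller such as kill or ps could skip the cgroup access when the helper
returns false, instead of failing.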
Signed-off-by: lifubang <lifubang@acmcoder.com>
Apply() determines and creates cgroup path(s), configures parent cgroups
(for some v1 controllers), and creates a systemd unit (in case of a
systemd cgroup manager), then adds a pid specified to the cgroup
for all configured controllers.
This is a relatively heavy procedure (in particular, for cgroups v1 it
involves parsing /proc/self/mountinfo about a dozen times), and it seems
there is no need to do it twice.
Moreover, even merely adding the child pid to the same cgroup seems
redundant, as we added the parent pid to the cgroup before sending the
data to the child (runc init process), and it waits for the data before
doing clone(), so its children will be in the same cgroup anyway.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case swap cgroup control is not available, the "events oom" test gives
the following error:
> # not ok 30 events oom
> # (in test file tests/integration/events.bats, line 134)
> # `[ "$status" -eq 0 ]' failed
> # <....>
> # runc run -d --console-socket /tmp/console.sock test_busybox (status=1):
> # time="2020-05-29T02:10:20Z" level=warning msg="signal: killed"
> # time="2020-05-29T02:10:20Z" level=error msg="container_linux.go:353: starting container process caused: process_linux.go:437: container init caused: process_linux.go:403: setting cgroup config for procHooks process caused: failed to write \"33554432\" to \"/sys/fs/cgroup/memory/test_busybox/memory.memsw.limit_in_bytes\": open /sys/fs/cgroup/memory/test_busybox/memory.memsw.limit_in_bytes: permission denied"
When I try to run the test without setting the swap limit, the shell
process is still getting killed, but the test hangs. I am not sure what
the reason is, but realistically this test is hard to perform without
the swap limit, so let's require cgroup swap for it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For v2, mem+swap is always present. For v1, check it once and set a
variable which is used below.
This also removes CGROUP_MEMORY for v2 case since it's no longer used.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The "unlimited" value is the same for memory and memory+swap,
so let's use SYSTEM_MEM for both.
In fact, it was already used in one place to check swap, probably due to
a typo.
This also fixes the following failure on a cgroup v1 system without
mem+swap control (Ubuntu 19.04):
> # not ok 78 update cgroup v1/v2 common limits
> # (in test file tests/integration/update.bats, line 72)
> # `SYSTEM_MEM_SWAP=$(cat "${CGROUP_MEMORY_BASE_PATH}/$MEM_SWAP")' failed
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a regression from commit 1d4ccc8e0. We only need to enable
kernel memory accounting once, from (*legacyManager).Apply(),
and there is no need to do it in (*legacyManager).Set().
While at it, rename the method to better reflect what it's doing.
This saves 1 call to mountinfo parser.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 4e65e0e90a added a check for cpu shares. Apparently, the
kernel allows setting a value higher than max or lower than min without
an error, but the value read back is always within the limits.
The check (which was later moved out to a separate CheckCpushares()
function) is always performed after setting the cpu shares, so let's
move it to the very place where it is set.
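The set-then-verify pattern described above could look roughly like the
sketch below. The names setCPUShares and checkCPUShares, the error texts,
and the file mode are assumptions made for illustration, not the actual code:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// checkCPUShares (hypothetical) compares the requested value with the value
// read back from the kernel. A mismatch means the kernel clamped the value.
func checkCPUShares(requested, effective uint64) error {
	switch {
	case requested > effective:
		return fmt.Errorf("the maximum allowed cpu-shares is %d", effective)
	case requested < effective:
		return fmt.Errorf("the minimum allowed cpu-shares is %d", effective)
	}
	return nil
}

// setCPUShares (hypothetical) writes shares to the file at path (for
// example, a cgroup v1 cpu.shares file), then immediately reads the value
// back and reports silent clamping as an error.
func setCPUShares(path string, shares uint64) error {
	val := strconv.FormatUint(shares, 10)
	if err := os.WriteFile(path, []byte(val), 0o644); err != nil {
		return err
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	effective, err := strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
	if err != nil {
		return err
	}
	return checkCPUShares(shares, effective)
}
```

Keeping the check next to the write means no caller can forget to run it.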
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. __runc does not set $status, so the check is misleading.
2. Add set -eux to the nest.sh script so we can error out early, and see
what is going on.
3. Doing "echo +io" > cgroup.controllers is giving an error on my
machine ("sh: write error: Operation not supported"). It is probably
fine to just enable pids controller.
4. Add status check for runc exec nest.sh
5. Remove the second check for cgroup.threads contents -- it was already
checked earlier (the output of nest.sh script).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. In case there are no sub-cgroups, a single rmdir should be faster
than iterating through the list of files.
2. Use unix.Rmdir() to save one more syscall since os.Remove() tries
unlink(2) first which fails on a directory, and only then tries
rmdir(2).
3. Re-use rmdir.
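A rough sketch of the fast path described above. To stay standard-library
only, it uses syscall.Rmdir where the commit uses unix.Rmdir; the function
names and the recursion are assumptions for illustration:

```go
package main

import (
	"errors"
	"os"
	"path/filepath"
	"syscall"
)

// rmdir removes a directory with a single rmdir(2) call, skipping the
// unlink(2) attempt that os.Remove makes first. A directory that is
// already gone counts as success.
func rmdir(path string) error {
	err := syscall.Rmdir(path)
	if err == nil || errors.Is(err, syscall.ENOENT) {
		return nil
	}
	return &os.PathError{Op: "rmdir", Path: path, Err: err}
}

// removeCgroupTree (hypothetical) tries a plain rmdir first, which is
// enough when there are no sub-cgroups. Only if that fails does it list
// the directory and remove sub-directories before retrying. Regular files
// are not touched: a cgroup directory can be removed by rmdir even though
// it contains control files.
func removeCgroupTree(dir string) error {
	if rmdir(dir) == nil {
		return nil
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return nil
		}
		return err
	}
	for _, e := range entries {
		if e.IsDir() {
			if err := removeCgroupTree(filepath.Join(dir, e.Name())); err != nil {
				return err
			}
		}
	}
	return rmdir(dir)
}
```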
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a quick-n-dirty fix for the regression introduced by commit
06d7c1d, which made it impossible to only set CpuQuota
(without the CpuPeriod). It partially reverts the above commit,
and adds a test case.
The proper fix will follow.
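For illustration only, here is one way a quota given without a period could
be turned into a per-second value. This is a sketch, not the code from this
commit: the function name is hypothetical, and falling back to the kernel's
default CFS period of 100000 us when no period is given is an assumption.

```go
package main

import "math"

// defaultCPUPeriodUSec is the kernel's default CFS period (100 ms).
// Using it when no period is given is an assumption of this sketch.
const defaultCPUPeriodUSec uint64 = 100000

// cpuQuotaPerSecUSec (hypothetical) converts a quota/period pair, both in
// microseconds, into CPU time allowed per second of wall-clock time. The
// second result is false when neither value is set, meaning the property
// should not be added at all.
func cpuQuotaPerSecUSec(quota int64, period uint64) (uint64, bool) {
	if quota == 0 && period == 0 {
		return 0, false
	}
	if quota <= 0 {
		// Only a period, or a negative quota: treated as "no limit" here.
		return math.MaxUint64, true
	}
	if period == 0 {
		period = defaultCPUPeriodUSec
	}
	return uint64(quota) * 1000000 / period, true
}
```

With these assumptions, a quota of 50000 and no period yields 500000, that
is, half a CPU.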
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This relies on https://github.com/checkpoint-restore/criu/pull/1069
and emulates the previous behavior by writing \0 and closing status
fd (as it was done by criu).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
... and mem+swap is not explicitly set otherwise.
This ensures compatibility with cgroupv1 controller which interprets
things this way.
With this fixed, we can finally enable swap tests for cgroupv2.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Partially revert "CreateCgroupPath: only enable needed controllers"
   If we update a resource which was not limited in the beginning,
   the update has no effect.
2. Return an error if a controller that is not enabled is used;
   otherwise the user may think the update succeeded while it actually had no effect.
Signed-off-by: lifubang <lifubang@acmcoder.com>
Commit 18ebc51b3cc3 "Reset Swap when memory is set to unlimited (-1)"
added handling of the case when a user updates the container limits
to set memory to unlimited (-1) but does not set any other limits.
Apparently, in this case, if swap limit was previously set, kernel fails
to set memory.limit_in_bytes to -1 if memory.memsw.limit_in_bytes is
not set to -1.
What the above commit fails to handle correctly is the request when
Memory is set to -1 and MemorySwap is set to some specific limit N
(where N > 0). In this case, the value of N is silently discarded
and MemorySwap is set to -1 instead.
This is the wrong thing to do, as the limit set, even if incorrectly,
should not be ignored.
Fix this by only assigning MemorySwap == -1 in case it was not
explicitly set.
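A minimal sketch of the intended logic. The Resources type below is a
trimmed-down, hypothetical stand-in with only the two fields discussed,
and it assumes that 0 means "not explicitly set":

```go
package main

// Resources is a hypothetical stand-in for the real configuration type.
type Resources struct {
	Memory     int64 // memory limit in bytes; -1 is unlimited, 0 is not set
	MemorySwap int64 // memory+swap limit in bytes; -1 is unlimited, 0 is not set
}

// resetSwapIfUnlimited (hypothetical name) makes memory+swap unlimited
// only when memory is being set to unlimited and the user did not give a
// memory+swap value. An explicit MemorySwap, even a questionable one, is
// left untouched so the kernel can accept or reject it.
func resetSwapIfUnlimited(r *Resources) {
	if r.Memory == -1 && r.MemorySwap == 0 {
		r.MemorySwap = -1
	}
}
```

With this check, a request of Memory = -1 and MemorySwap = N (N > 0) keeps
N, while Memory = -1 alone still resets memory+swap to -1.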
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>