In the cases that we got failure on a subsystem's Apply,
we'll get some subsystems' cgroup directories leftover.
On Docker's point of view, start a container failed, use
`docker rm` to remove the container, but some cgroup files
are leftover.
Sometimes we don't want to clean everyting up when something
went wrong, because we need these inter situation
information to debug what's going on, but cgroup directories
are not useful information we want to keep.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
This PR fix issue in this scenario:
```
in terminal 1:
~# cd /sys/fs/cgroup/cpuset
~# mkdir test
~# cd test
~# cat cpuset.cpus
0-3
~# echo 1 > cpuset.cpu_exclusive (make sure you don't have other cgroups under root)
in terminal 2:
~# echo $$ > /sys/fs/cgroup/cpuset/test/tasks
// set resources.cpu.cpus="0-2" in config.json
~# runc run test1
back to terminal 1:
~# cd test1
~# cat cpuset.cpus
0-2
~# echo 1 > cpuset.cpu_exclusive
in terminal 3:
~# echo $$ > /sys/fs/cgroup/test/tasks
// set resources.cpu.cpus="3" in config.json
~# runc run test2
container_linux.go:247: starting container process caused "process_linux.go:258:
applying cgroup configuration for process caused \"failed to write 0-3\\n to
cpuset.cpus: write /sys/fs/cgroup/cpuset/test2/cpuset.cpus: invalid argument\""
```
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
cgroupData.join method using `WriteCgroupProc` to place the pid into
the proc file, it can avoid attach any pid to the cgroup if -1 is
specified as a pid.
so, replace `writeFile` with `WriteCgroupProc` like `cpuset.go`'s
ApplyDir method.
Signed-off-by: Wang Long <long.wanglong@huawei.com>
After #1009, we don't always set `cgroup.Paths`, so
`getCgroupPath()` will return wrong cgroup path because
it'll take current process's cgroup as the parent, which
would be wrong when we try to find the cgroup path in
`runc ps` and `runc kill`.
Fix it by using `m.GetPath()` to get the true cgroup
paths.
Reported-by: Yang Shukui <yangshukui@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Revert: #935Fixes: #946
I can reproduce #946 on some machines, the problem is on
some machines, it could be very fast that modify time
of `memory.kmem.limit_in_bytes` could be the same as
before it's modified.
And now we'll call `SetKernelMemory` twice on container
creation which cause the second time failure.
Revert this before we find a better solution.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Setting classid of net_cls cgroup failed:
ERRO[0000] process_linux.go:291: setting cgroup config for ready process caused "failed to write 𐀁 to net_cls.classid: write /sys/fs/cgroup/net_cls,net_prio/user.slice/abc/net_cls.classid: invalid argument"
process_linux.go:291: setting cgroup config for ready process caused "failed to write 𐀁 to net_cls.classid: write /sys/fs/cgroup/net_cls,net_prio/user.slice/abc/net_cls.classid: invalid argument"
The spec has classid as a *uint32, the libcontainer configs should match the type.
Signed-off-by: Hushan Jia <hushan.jia@gmail.com>
Added a unit test to verify that 'cpu.rt_runtime_us' and 'cpu.rt_runtime_us'
cgroup values are set when the cgroup is applied to a process.
Signed-off-by: Ben Gray <ben.r.gray@gmail.com>
before trying to move the process into the cgroup.
This is required if runc itself is running in SCHED_RR mode, as it is not
possible to add a process in SCHED_RR mode to a cgroup which hasn't been
assigned any RT bandwidth. And RT bandwidth is not inherited, each new
cgroup starts with 0 b/w.
Signed-off-by: Ben Gray <ben.r.gray@gmail.com>
Kernel memory cannot be set in these circumstances (before kernel 4.6):
1. kernel memory is not initialized, and there are tasks in cgroup
2. kernel memory is not initialized, and use_hierarchy is enabled,
and there are sub-cgroups
While we don't need to cover case 2 because when we set kernel
memory in runC, it's either:
- in Apply phase when we create the container, and in this case,
set kernel memory would definitely be valid;
- or in update operation, and in this case, there would be tasks
in cgroup, we only need to check if kernel memory is initialized
or not.
Even if we want to check use_hierarchy, we need to check sub-cgroups
as well, but for here, we can just leave it aside.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
No substantial code change.
Note that some style errors reported by `golint` are not fixed due to possible compatibility issues.
Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>
When swap memory is unsupported, Docker will set
cgroup.Resources.MemorySwap as -1.
Fixes: https://github.com/docker/docker/pull/21937
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Currently, if we start a container with:
`docker run -ti --name foo --memory 300M --memory-swap 500M busybox sh`
Then we want to update it with:
`docker update --memory 600M --memory-swap 800M foo`
It'll get error because we can't set memory to 600M with
the 500M limit of swap memory.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Make sure we don't error out collecting statistics for cases where
pids.max == "max". In that case, we can use a limit of 0 which means
"unlimited".
In addition, change the name of the stats attribute (Max) to mirror the
name of the resources attribute in the spec (Limit) so that it's
consistent internally.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
In order to allow nice usage statistics (in terms of percentages and
other such data), add the value of pids.max to the PidsStats struct
returned from the pids cgroup controller.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This prior fix to set "-1" explicitly was lost, and it is simpler to use
the same pointer type from the OCI spec to handle nil pointer == -1 ==
unset case.
Also, as a nearly humorous aside, there was a test for MemorySwappiness
that was actually setting Memory, and it was passing because of this
bug (as it was always setting everyone's MemorySwappiness to zero!)
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
If you don't move the process out of the named cgroup for systemd then
systemd will try to delete all the cgroups that the process is currently
in.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>