Make use of errors.Is() and errors.As() where appropriate to check
the underlying error. The biggest motivation is to simplify the code.
The feature requires go 1.13 but since merging #2256 we are already
not supporting go 1.12 (which is an unsupported release anyway).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using errors.Unwrap() is not the best thing to do, since it returns
nil in case of an error which was not wrapped. More to say,
errors package provides more elegant ways to check for underlying
errors, such as errors.As() and errors.Is().
This reverts commit f8e138855d, reversing
changes made to 6ca9d8e6da.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Return earlier if there is an error.
2. Do not use filepath.Split on every entry, use info.Name() instead.
3. Make readProcsFile() accept file name as an argument, to avoid
unnecessary file name and directory splitting and merging.
4. Skip on info.IsDir() -- this avoids an error when cgroup name is
set to "cgroup.procs".
This is still not very good since filepath.Walk() performs an unnecessary
stat(2) on every entry, but better than before.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
* TestConvertCPUSharesToCgroupV2Value(0) was returning 70369281052672, while the correct value is 0
* ConvertBlkIOToCgroupV2Value(0) was returning 32, while the correct value is 0
* ConvertBlkIOToCgroupV2Value(1000) was returning 4, while the correct value is 10000
Fix#2244
Follow-up to #2212#2213
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
/proc/cgroups is meaningless for v2 and should be ignored.
https://github.com/torvalds/linux/blob/v5.3/Documentation/admin-guide/cgroup-v2.rst#deprecated-v1-core-features
* Now GetAllSubsystems() parses /sys/fs/cgroup/cgroup.controller, not /proc/cgroups.
The function result also contains "pseudo" controllers: {"devices", "freezer"}.
As it is hard to detect availability of pseudo controllers, pseudo controllers
are always assumed to be available.
* Now IOGroupV2.Name() returns "io", not "blkio"
Fix#2155#2156
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
allow to set what subsystems are used by
libcontainer/cgroups/fs.Manager.
subsystemsUnified is used on a system running with cgroups v2 unified
mode.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).
The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with small
amounts of ram (see
https://github.com/kubernetes/kubernetes/issues/77169) running a kernel
with config flag CONFIG_HUGETLBFS=y
As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.
Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:
- "hugepages-64kB"
- "hugepages-2048kB"
- "hugepages-32768kB"
- "hugepages-1048576kB"
And the corresponding cgroup files:
- "hugetlb.64KB._____"
- "hugetlb.2MB._____"
- "hugetlb.32MB._____"
- "hugetlb.1GB._____"
Signed-off-by: Odin Ugedal <odin@ugedal.com>
The kernel will sometimes return EINVAL when writing a pid to a
cgroup.procs file. It does so when the task being added still has the
state TASK_NEW.
See: https://elixir.bootlin.com/linux/v4.8/source/kernel/sched/core.c#L8286
Co-authored-by: Danail Branekov <danailster@gmail.com>
Signed-off-by: Tom Godkin <tgodkin@pivotal.io>
Signed-off-by: Danail Branekov <danailster@gmail.com>
Cgroup namespace can be configured in `config.json` as other
namespaces. Here is an example:
```
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
},
{
"type": "cgroup"
}
],
```
Note that if you want to run a container which has shared cgroup ns with
another container, then it's strongly recommended that you set
proper `CgroupsPath` of both containers(the second container's cgroup
path must be the subdirectory of the first one). Or there might be
some unexpected results.
Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Respect the container's cgroup path when finding the container's
cgroup mount point, which is useful in multi-tenant environments, where
containers have their own unique cgroup mounts
Signed-off-by: Danail Branekov <danailster@gmail.com>
Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io>
Signed-off-by: Giuseppe Capizzi <gcapizzi@pivotal.io>
Fix duplicate entries and missing entries in getCgroupMountsHelper
Add test for testing cgroup mounts on bedrock linux
Stop relying on number of subsystems for cgroups
LGTMs: @crosbymichael @cyphar
Closes#1817
When there are complicated mount setups, there can be multiple mount
points which have the subsystem we are looking for. Instead of
counting the mountpoints, tick off subsystems until we have found them
all.
Without the 'all' flag, ignore duplicate subsystems after the first.
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
The rootless cgroup manager acts as a noop for all set and apply
operations. It is just used for rootless setups. Currently this is far
too simple (we need to add opportunistic cgroup management), but is good
enough as a first-pass at a noop cgroup manager.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Runc needs to copy certain files from the top of the cgroup cpuset hierarchy
into the container's cpuset cgroup directory. Currently, runc determines
which directory is the top of the hierarchy by using the parent dir of
the first entry in /proc/self/mountinfo of type cgroup.
This creates problems when cgroup subsystems are mounted arbitrarily in
different dirs on the host.
Now, we use the most deeply nested mountpoint that contains the
container's cpuset cgroup directory.
Signed-off-by: Konstantinos Karampogias <konstantinos.karampogias@swisscom.com>
Signed-off-by: Will Martin <wmartin@pivotal.io>
The `bufio.Scanner.Scan` method returns false either by reaching the
end of the input or an error. After Scan returns false, the Err method
will return any error that occurred during scanning, except that if it
was io.EOF, Err will return nil.
We should check the error when Scan return false(out of the for loop).
Signed-off-by: Wang Long <long.wanglong@huawei.com>
Prior to this change a cgroup with a `:` character in it's path was not
parsed correctly (as occurs on some instances of systemd cgroups under
some versions of systemd, e.g. 225 with accounting).
This fixes that issue and adds a test.
Signed-off-by: Euan Kemp <euank@coreos.com>
`libcontainer/cgroups/utils.go` uses an incorrect path to the
documentation for cgroups. This updates the comment to use the correct
URL. Fixes#794.
Signed-off-by: Jim Berlage <james.berlage@gmail.com>
Avoid parsing the whole lines of mountinfo after all mountpoints
of the target subsytems are found, or when the target subsystem
is not enabled.
Signed-off-by: Tatsushi Inagaki <e29253@jp.ibm.com>
No substantial code change.
Note that some style errors reported by `golint` are not fixed due to possible compatibility issues.
Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>
Based on Golang document, %s is for "the uninterpreted bytes of the
string or slice", so %v is more appropriate.
Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
GetMounts is very cpu-expensive. I'll change other funcs in this package
to reuse code from GetCgroupMounts later.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Bug was introduced in #250
According to: http://man7.org/linux/man-pages/man5/proc.5.html
36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
(1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11)
...
(7) optional fields: zero or more fields of the form
"tag[:value]".
The 7th field is optional. We should skip it when parsing mount info.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>