With the fix in the previous commit and criu patched with support for
cgroupv2, these tests should now pass.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case of cgroupv2 unified hierarchy, the /sys/fs/cgroup mount
is the real mount with fstype of cgroup2 (rather than a set of
external bind mounts like for cgroupv1).
So, we should not add it to the list of "external bind mounts"
on both checkpoint and restore.
Without this fix, checkpoint integration tests fail on cgroup v2.
Also, same is true for cgroup v1 + cgroupns.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Same test as the first one, just with cgroupns enabled.
Since in case of cgroupv2 `runc spec` enables cgroupns,
this case was already tested by the first checkpoint test,
so skip it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit a9e15e7e0 adds a check that stdin/out/err pipes
are restored correctly. Commit ec260653b7 copy/pastes
the same code to one more another test.
Problem is (as pointed out in commit 5369f9ade3) these tests
sometimes hang. I have also seen them fail.
Apparently, the code used to create pipes and open them to fds
is racy:
```shell
cat $fifo | cat $fifo &
pid=$!
exec 50</proc/$pid/fd/0
exec 51>/proc/$pid/fd/0
```
Since `cat | cat` is spawned asynchronously, by the time exec is used,
the second cat process (i.e. $pid) is already fork'ed but it might
not be exec'ed yet. As a result, we get this (`ls -l /proc/self/fd`):
```
lr-x------. 1 root root 64 Apr 20 02:39 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:39 51 -> /dev/pts/1
```
or, in some cases:
```
lr-x------. 1 root root 64 Apr 20 02:45 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```
instead of expected set of pipes:
```
> lr-x------. 1 root root 64 Apr 20 02:45 50 -> 'pipe:[215791]'
> l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```
One possible workaround is to add `sleep 0.1` or so after cat|cat,
but it is outright ugly (besides, we already have one sleep in
the test code).
The solution is to not use any external processes to create pipes.
I admit this still looks not very comprehensible, but at least it
is easier than before, and it works.
While at it, remove code duplication, moving the setup and check
code into a pair of functions.
Finally, since the tests are working now, remove the skip.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since all the criu tests have the same requirements,
move them to setup().
While at it, remove an obviously redundant comment.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Introduce a special case for `testcontainer` to test
for container that is not present (checkpointed), use it.
Fix one place where testcontainer was not used.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
runc in this file is actually a function that does `run runc ...`,
and `run` sets variable `$status` as the exit code, so `$status`
is what should be checked.
If calling runc directly (as in `__runc ...`), then $? is legit.
While at it, remove an obsoleted comment, and an unneeded
`ret=$?` assignment (check `$?` directly).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. There is no need to try removing it recursively.
2. Do not treat ENOENT as an error (similar to fs
and systemd v1 drivers).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Run in the same environment as systemd tests.
Disable CRIU tests because:
- they all fail with cgroup v2;
- CRIU v3.14 is required and it's not yet released, and
rebuilding it from sources with patches applied (like
it is currently done in Dockerfile) is too much work.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Instead of enabling all available controllers, figure out which
ones are required, and only enable those.
2. Amend all setFoo() functions to call isFooSet(). While this might
seem unnecessary, it might actually help to uncover a bug.
Imagine someone:
- adds a cgroup.Resources.CpuFoo setting;
- modifies setCpu() to apply the new setting;
- but forgets to amend isCpuSet() accordingly <-- BUG
In this case, a test case modifying CpuFoo will help
to uncover the BUG. This is the reason why it's added.
This patch *could be* amended by enabling controllers on a best-effort
basis, i.e. :
- do not return an error early if we can't enable some controllers;
- if we fail to enable all controllers at once (usually because one
of them can't be enabled), try enabling them one by one.
Currently this is not implemented, and it's not clear whether this
would be a good way to go or not.
[v2: add/use is${Controller}Set() functions]
[v3: document neededControllers()]
[v4: drop "best-effort" part]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Rename the files
- v1.go: cgroupv1 aka legacy;
- v2.go: cgroupv2 aka unified hierarchy;
- unsupported.go: when systemd is not available.
2. Move the code that is common between v1 and v2 to common.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This check was removed in commit 5406833a65. Now, when this
function is called from a few places, it is no longer obvious
that the path always starts with /sys/fs/cgroup/, so reinstate
the check just to be on the safe side.
This check also ensures that elements[3:] can be used safely.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
fs2 cgroup driver was not working because it did not enable controllers
while creating cgroup directory; instead it was merely doing MkdirAll()
and gathered the list of available controllers in NewManager().
Also, cgroup should be created in Apply(), not while creating a new
manager instance.
To fix:
1. Move the createCgroupsv2Path function from systemd driver to fs2 driver,
renaming it to CreateCgroupPath. Use in Apply() from both fs2 and
systemd drivers.
2. Delay available controllers map initialization to until it is needed.
With this patch:
- NewManager() only performs minimal initialization (initializin
m.dirPath, if not provided);
- Apply() properly creates cgroup path, enabling the controllers;
- m.controllers is initialized lazily on demand.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The fs2 cgroup driver has a sanity check for path.
Since systemd driver is relying on the same path,
it makes sense to move this check to the common code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Having map of per-subsystem paths in systemd unified cgroups
driver does not make sense and makes the code less readable.
To get rid of it, move the systemd v1-or-v2 init code to
libcontainer/factory_linux.go which already has a function
to deduce unified path out of paths map.
End result is much cleaner code. Besides, we no longer write pid
to the same cgroup file 7 times in Apply() like we did before.
While at it
- add `rootless` flag which is passed on to fs2 manager
- merge getv2Path() into GetUnifiedPath(), don't overwrite
path if it is set during initialization (on Load).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. it was not working previously because of a typo
2. when a typo is removed, important packages such
as container-selinux are not updated, so let's
just remove this flag to avoid confusion.
Fixes: 84583eb1a4
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In this very case, the code is writing to cgroup2 filesystem,
and the file name is well known and can't possibly be a symlink.
So, using securejoin is redundant.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In many places (not all of them though) we can use `unix.`
instead of `syscall.` as these are indentical.
In particular, x/sys/unix defines:
```go
type Signal = syscall.Signal
type Errno = syscall.Errno
type SysProcAttr = syscall.SysProcAttr
const ENODEV = syscall.Errno(0x13)
```
and unix.Exec() calls syscall.Exec().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Those needs to be run on the (Vagrant Fedora 31) host
(since we need real systemd running), and so we have
to have all the tools needed to compile runc and run
the tests.
The good news is Fedora packages a decent and recent release
of bats-core (1.1.0), which we can use (Debian does not),
and we can also use golang (currently 1.13.9) from Fedora.
The bad news are
1. Currently cgroups tests are only working with
RUNC_USE_SYSTEMD=yes (addressed by #2299, #2305)
2. Tests in events.bats do not work (need cgroupv2
memory.events support)
3. Fedora 31 image is 6 months old (and has broken
container-selinux policy) so we need `dnf update`,
which adds ~5 min to test time.
[v2: add -t to ssh to enforce pty]
[v3: disable events tests for cgroupv2]
[v4: update fedora packages, use a single dnf transation]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Add `cgroups_v1` and `cgroups_v2` options to `requires`.
2. Modify `check_cgroup_value` to be able to work with v2.
3. Split `test "update"` into two:
- (1) testing cgroupv1-only cpu shares and cfs
- (2) testing limits that are more or less common
between v1 and v2: memory/swap, pids, cpusets.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
... and add kmem-tcp to cgroups kmem test.
First, we already have a separate kmem test in cgroups.bats.
Second, making kmem a requirement leads to skipping all the other
test cases in the update.bats test.
Third, kmem limit is being removed from the kernel, so it makes sense
to handle it separately.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This comment was added by commit 6cd425be2b (Allow update rt_period_us
and rt_runtime_us, Nov 4 2016), and the test case was added by commit
51baedf3f3 (Add integration for update rt period and runtime,
Nov 28 2016), making the comment obsolete.
Remove it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Consolidate all the cgroup-related initialization code to
a single place, init_cgroup_paths(), so we can see which
variables are set.
2. Lazily call init_cgroup_paths() from all places that require it.
3. Don't set globals KMEM and RT_PERIOD.
4. Slightly clarlify variables naming:
- use OCI_CGROUPS_PATH for cgroupsPath in config.json
- use REL_CGROUPS_PATH for relative cgroups path
5. Do not hardcode the list of cgroup subsystems -- get it from
/proc/cgroup.
6. Preliminary support for cgroupv2 unified hierarchy (not yet working).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Consolidate two implementations of check_cgroup_value()
into one, putting it into helpers.
Remove the first parameter, deducing the variable to get
the path from by the parameter name.
This should help in future cgroupv2 support.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
If container's config.json have `"terminal": true` setting in its
"process" section, runc exec assumes that stdin (fd 0) is a terminal
and tries to use it.
This leads to the following error in case stdin is not a terminal:
> ERRO[0000] exec failed: provided file is not a console
So, even if -t/--tty is not set, exec uses stdin as a terminal.
It does not help that urfave/cli v1 parser we use does not allow
to use `-t no` or `-t false`.
Since the settings in config.json is probably for the container run/start,
not for the auxiliary process started inside a container with exec, do
not use a setting from there, only treating stdin as a terminal in case
`-t` is explicitly given.
Tests that use runc exec with a terminal are amended with -t.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When the cgroupv2 fs driver is used without setting cgroupsPath,
it picks up a path from /proc/self/cgroup. On a host with systemd,
such a path can look like (examples from my machines):
- /user.slice/user-1000.slice/session-4.scope
- /user.slice/user-1000.slice/user@1000.service/gnome-launched-xfce4-terminal.desktop-4260.scope
- /user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
This cgroup already contains processes in it, which prevents to enable
controllers for a sub-cgroup (writing to cgroup.subtree_control fails
with EBUSY or EOPNOTSUPP).
Obviously, a parent cgroup (which does not contain tasks) should be used.
Fixes opencontainers/runc/issues/2298
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>