Commit Graph

4196 Commits

Author SHA1 Message Date
Kir Kolyshkin 32d52a0fab tests/checkpoint: enable for Fedora 31 / cgroup v2
With the fix in the previous commit and criu patched with support for
cgroupv2, these tests should now pass.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:40:28 -07:00
Kir Kolyshkin 9280e3566d checkpoint/restore: fix cgroupv2 handling
In case of cgroupv2 unified hierarchy, the /sys/fs/cgroup mount
is the real mount with fstype of cgroup2 (rather than a set of
external bind mounts like for cgroupv1).

So, we should not add it to the list of "external bind mounts"
on both checkpoint and restore.

Without this fix, checkpoint integration tests fail on cgroup v2.

Also, same is true for cgroup v1 + cgroupns.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:26:43 -07:00
Kir Kolyshkin 00a2844ab4 tests/checkpoint: add simple c/r test for cgroupns
Same test as the first one, just with cgroupns enabled.

Since in case of cgroupv2 `runc spec` enables cgroupns,
this case was already tested by the first checkpoint test,
so skip it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:06:24 -07:00
Akihiro Suda cdce577dcf
Merge pull request #2332 from kolyshkin/cgroupv2-cr
Fix/improve checkpoint integration tests
2020-04-23 00:47:18 +09:00
Kir Kolyshkin d5e68ceb0c tests/checkpoint.bats: fix test hang/failure
Commit a9e15e7e0 adds a check that stdin/out/err pipes
are restored correctly. Commit ec260653b7 copy/pastes
the same code to one more another test.

Problem is (as pointed out in commit 5369f9ade3) these tests
sometimes hang. I have also seen them fail.

Apparently, the code used to create pipes and open them to fds
is racy:

```shell
cat $fifo | cat $fifo &
pid=$!
exec 50</proc/$pid/fd/0
exec 51>/proc/$pid/fd/0
```

Since `cat | cat` is spawned asynchronously, by the time exec is used,
the second cat process (i.e. $pid) is already fork'ed but it might
not be exec'ed yet. As a result, we get this (`ls -l /proc/self/fd`):

```
lr-x------. 1 root root 64 Apr 20 02:39 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:39 51 -> /dev/pts/1
```

or, in some cases:
```
lr-x------. 1 root root 64 Apr 20 02:45 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```

instead of expected set of pipes:

```
> lr-x------. 1 root root 64 Apr 20 02:45 50 -> 'pipe:[215791]'
> l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```

One possible workaround is to add `sleep 0.1` or so after cat|cat,
but it is outright ugly (besides, we already have one sleep in
the test code).

The solution is to not use any external processes to create pipes.
I admit this still looks not very comprehensible, but at least it
is easier than before, and it works.

While at it, remove code duplication, moving the setup and check
code into a pair of functions.

Finally, since the tests are working now, remove the skip.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-21 02:16:23 -07:00
Kir Kolyshkin bf172ef44f tests/checkpoint.bats: consolidate requires checks
Since all the criu tests have the same requirements,
move them to setup().

While at it, remove an obviously redundant comment.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-21 02:15:04 -07:00
Kir Kolyshkin e216457eea tests/checkpoint.bats: simplify status checks
Introduce a special case for `testcontainer` to test
for container that is not present (checkpointed), use it.

Fix one place where testcontainer was not used.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-21 01:55:54 -07:00
Kir Kolyshkin 69d599ddbd tests/checkpoint.bats: fix $? checks
runc in this file is actually a function that does `run runc ...`,
and `run` sets variable `$status` as the exit code, so `$status`
is what should be checked.

If calling runc directly (as in `__runc ...`), then $? is legit.

While at it, remove an obsoleted comment, and an unneeded
`ret=$?` assignment (check `$?` directly).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-21 01:55:54 -07:00
Mrunal Patel 46be7b612e
Merge pull request #2299 from kolyshkin/fs2-init-ctrl
cgroupv2: fix fs2 driver initialization
2020-04-20 21:27:42 -07:00
Akihiro Suda 5b38ef7173
Merge pull request #2320 from kolyshkin/vgr
CI cleanups
2020-04-20 09:43:03 +09:00
Kir Kolyshkin ab276b1c09 cgroups/fs2/Destroy: use Remove, ignore ENOENT
1. There is no need to try removing it recursively.

2. Do not treat ENOENT as an error (similar to fs
   and systemd v1 drivers).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin 992d5cadfb travis: enable fs2 driver test on fedora
Run in the same environment as systemd tests.

Disable CRIU tests because:

 - they all fail with cgroup v2;

 - CRIU v3.14 is required and it's not yet released, and
   rebuilding it from sources with patches applied (like
   it is currently done in Dockerfile) is too much work.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin 4b4bc995ad CreateCgroupPath: only enable needed controllers
1. Instead of enabling all available controllers, figure out which
   ones are required, and only enable those.

2. Amend all setFoo() functions to call isFooSet(). While this might
   seem unnecessary, it might actually help to uncover a bug.
   Imagine someone:
    - adds a cgroup.Resources.CpuFoo setting;
    - modifies setCpu() to apply the new setting;
    - but forgets to amend isCpuSet() accordingly <-- BUG

   In this case, a test case modifying CpuFoo will help
   to uncover the BUG. This is the reason why it's added.

This patch *could be* amended by enabling controllers on a best-effort
basis, i.e. :

 - do not return an error early if we can't enable some controllers;
 - if we fail to enable all controllers at once (usually because one
   of them can't be enabled), try enabling them one by one.

Currently this is not implemented, and it's not clear whether this
would be a good way to go or not.

[v2: add/use is${Controller}Set() functions]
[v3: document neededControllers()]
[v4: drop "best-effort" part]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin bb47e35843 cgroup/systemd: reorganize
1. Rename the files
  - v1.go: cgroupv1 aka legacy;
  - v2.go: cgroupv2 aka unified hierarchy;
  - unsupported.go: when systemd is not available.

2. Move the code that is common between v1 and v2 to common.go

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin de1134156b cgroups/fs2/CreateCgroupPath: nit
This slightly improves code readability.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin b5c1949f2a cgroups/fs2/CreateCgroupPath: reinstate check
This check was removed in commit 5406833a65. Now, when this
function is called from a few places, it is no longer obvious
that the path always starts with /sys/fs/cgroup/, so reinstate
the check just to be on the safe side.

This check also ensures that elements[3:] can be used safely.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin 813cb3eb94 cgroupv2: fix fs2 cgroup init
fs2 cgroup driver was not working because it did not enable controllers
while creating cgroup directory; instead it was merely doing MkdirAll()
and gathered the list of available controllers in NewManager().

Also, cgroup should be created in Apply(), not while creating a new
manager instance.

To fix:

1. Move the createCgroupsv2Path function from systemd driver to fs2 driver,
   renaming it to CreateCgroupPath. Use in Apply() from both fs2 and
   systemd drivers.

2. Delay available controllers map initialization to until it is needed.

With this patch:
 - NewManager() only performs minimal initialization (initializin
   m.dirPath, if not provided);
 - Apply() properly creates cgroup path, enabling the controllers;
 - m.controllers is initialized lazily on demand.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin 60eaed2ed6 cgroupv2: move sanity path check to common code
The fs2 cgroup driver has a sanity check for path.
Since systemd driver is relying on the same path,
it makes sense to move this check to the common code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin dbeff89491 cgroupv2/systemd: privatize UnifiedManager
... and its Cgroup field. There is no sense to keep it public.

This was generated by gorename.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:27:40 -07:00
Kir Kolyshkin 88c13c0713 cgroupv2: use SecureJoin in systemd driver
It seems that some paths are coming from user and are therefore
untrusted.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:20:22 -07:00
Kir Kolyshkin 9c80cd672d cgroupv2: rm legacy Paths from systemd driver
Having map of per-subsystem paths in systemd unified cgroups
driver does not make sense and makes the code less readable.

To get rid of it, move the systemd v1-or-v2 init code to
libcontainer/factory_linux.go which already has a function
to deduce unified path out of paths map.

End result is much cleaner code. Besides, we no longer write pid
to the same cgroup file 7 times in Apply() like we did before.

While at it
 - add `rootless` flag which is passed on to fs2 manager
 - merge getv2Path() into GetUnifiedPath(), don't overwrite
   path if it is set during initialization (on Load).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 16:19:51 -07:00
Kir Kolyshkin b6cc3975de travis: rm BUILDTAGS
It is not needed since commit 89c108b1be.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 15:35:11 -07:00
Kir Kolyshkin 5f0424c94b Vagrantfile: rm disabling weak deps
1. it was not working previously because of a typo

2. when a typo is removed, important packages such
   as container-selinux are not updated, so let's
   just remove this flag to avoid confusion.

Fixes: 84583eb1a4
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-19 15:34:09 -07:00
Akihiro Suda cd5f4fd93c
Merge pull request #2325 from kolyshkin/nits-2
Nits
2020-04-20 01:43:15 +09:00
Kir Kolyshkin 480bca91be cgroups/fs2: move type decl to beginning
It was weird having it somewhere in the middle.

No code change, just moving it around.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-18 18:43:41 -07:00
Kir Kolyshkin 353e91770b cgroups/fs2: do not use securejoin
In this very case, the code is writing to cgroup2 filesystem,
and the file name is well known and can't possibly be a symlink.
So, using securejoin is redundant.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-18 18:43:41 -07:00
Kir Kolyshkin 58f970a01f cgroups/fscommon: use errors.Is
This is a forgotten hunk from PR #2291.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-18 16:16:49 -07:00
Kir Kolyshkin af6b9e7fa9 nit: do not use syscall package
In many places (not all of them though) we can use `unix.`
instead of `syscall.` as these are indentical.

In particular, x/sys/unix defines:

```go
type Signal = syscall.Signal
type Errno = syscall.Errno
type SysProcAttr = syscall.SysProcAttr

const ENODEV      = syscall.Errno(0x13)
```

and unix.Exec() calls syscall.Exec().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-18 16:16:49 -07:00
Akihiro Suda bf0a8e1747
Merge pull request #2322 from lifubang/forceCgroupNS
cgroupv2: default join cgroup namespace in runc example
2020-04-18 05:47:33 +09:00
lifubang d0f9b9ce42 default join cgroup namespace in runc example
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-04-17 21:37:50 +08:00
Aleksa Sarai e4981c91b5
merge branch 'pr-2317'
Ted Yu (1):
  Defer netns.Close() after error check

LGTMs: @AkihiroSuda @cyphar
Closes #2317
2020-04-16 23:35:07 +10:00
Ted Yu 7a978e354a Defer netns.Close() after error check
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
2020-04-15 18:33:20 -07:00
Akihiro Suda 9f6a2d4ddc
Merge pull request #2305 from kolyshkin/fs2-fix-default
cgroupv2: fix fs2 driver default path
2020-04-16 10:16:48 +09:00
Mrunal Patel 191def7029
Merge pull request #2308 from kolyshkin/exec-no-tty
runc exec: don't enable terminal unless -t is set
2020-04-15 14:43:50 -07:00
Mrunal Patel 56aca5aa50
Merge pull request #2295 from kolyshkin/integration-cgroups
Initial integration tests for cgroupv2
2020-04-14 16:01:32 -07:00
Michael Crosby 5c6216b1ed
Merge pull request #2278 from iwankgb/memory.numa_stats
Exposing memory.numa_stats
2020-04-14 11:32:51 -04:00
Kir Kolyshkin 84583eb1a4 Enable integration tests in cgroupv2 env
Those needs to be run on the (Vagrant Fedora 31) host
(since we need real systemd running), and so we have
to have all the tools needed to compile runc and run
the tests.

The good news is Fedora packages a decent and recent release
of bats-core (1.1.0), which we can use (Debian does not),
and we can also use golang (currently 1.13.9) from Fedora.

The bad news are

 1. Currently cgroups tests are only working with
    RUNC_USE_SYSTEMD=yes (addressed by #2299, #2305)

 2. Tests in events.bats do not work (need cgroupv2
    memory.events support)

 3. Fedora 31 image is 6 months old (and has broken
    container-selinux policy) so we need `dnf update`,
    which adds ~5 min to test time.

[v2: add -t to ssh to enforce pty]
[v3: disable events tests for cgroupv2]
[v4: update fedora packages, use a single dnf transation]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 18:40:08 -07:00
Kir Kolyshkin 0965c970fa tests/integration: disable swap tests for v2
Swap setting for cgroupv2 is currently broken, so let's temporarily
disable this part of test.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Kir Kolyshkin 483f9a0c50 tests/integration: add some cgroup v2 tests
1. Add `cgroups_v1` and `cgroups_v2` options to `requires`.

2. Modify `check_cgroup_value` to be able to work with v2.

3. Split `test "update"` into two:

   - (1) testing cgroupv1-only cpu shares and cfs
   - (2) testing limits that are more or less common
         between v1 and v2: memory/swap, pids, cpusets.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Kir Kolyshkin 3dfa5434fc tests/integration/update.bats: simplify file creation
There's no need for an intermediate variable

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Kir Kolyshkin b8b46419ce tests/integration: rm kmem from upgrade tests
... and add kmem-tcp to cgroups kmem test.

First, we already have a separate kmem test in cgroups.bats.

Second, making kmem a requirement leads to skipping all the other
test cases in the update.bats test.

Third, kmem limit is being removed from the kernel, so it makes sense
to handle it separately.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Kir Kolyshkin ba3ee7fe04 tests/integration/update.bats: rm obsoleted comment
This comment was added by commit 6cd425be2b (Allow update rt_period_us
and rt_runtime_us, Nov 4 2016), and the test case was added by commit
51baedf3f3 (Add integration for update rt period and runtime,
Nov 28 2016), making the comment obsolete.

Remove it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Kir Kolyshkin 3f6a31b71e tests/integration: simplify cgroup paths init
1. Consolidate all the cgroup-related initialization code to
   a single place, init_cgroup_paths(), so we can see which
   variables are set.

2. Lazily call init_cgroup_paths() from all places that require it.

3. Don't set globals KMEM and RT_PERIOD.

4. Slightly clarlify variables naming:
    - use OCI_CGROUPS_PATH for cgroupsPath in config.json
    - use REL_CGROUPS_PATH for relative cgroups path

5. Do not hardcode the list of cgroup subsystems -- get it from
   /proc/cgroup.

6. Preliminary support for cgroupv2 unified hierarchy (not yet working).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Kir Kolyshkin 3ae9358054 tests/integration: check_cgroup_value: simplify
Consolidate two implementations of check_cgroup_value()
into one, putting it into helpers.

Remove the first parameter, deducing the variable to get
the path from by the parameter name.

This should help in future cgroupv2 support.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-13 16:34:45 -07:00
Michael Crosby 13431e0ece
Merge pull request #2312 from tedyu/cgrp-path-rollback
Properly remove intermediate directory
2020-04-13 12:28:32 -04:00
Ted Yu 614bb96676 cgroupv2/systemd: Properly remove intermediate directory
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
2020-04-13 08:32:08 -07:00
Kir Kolyshkin 939bed2a3e runc exec: don't enable terminal unless -t is set
If container's config.json have `"terminal": true` setting in its
"process" section, runc exec assumes that stdin (fd 0) is a terminal
and tries to use it.

This leads to the following error in case stdin is not a terminal:

> ERRO[0000] exec failed: provided file is not a console

So, even if -t/--tty is not set, exec uses stdin as a terminal.
It does not help that urfave/cli v1 parser we use does not allow
to use `-t no` or `-t false`.

Since the settings in config.json is probably for the container run/start,
not for the auxiliary process started inside a container with exec, do
not use a setting from there, only treating stdin as a terminal in case
`-t` is explicitly given.

Tests that use runc exec with a terminal are amended with -t.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-11 14:22:07 -07:00
Mrunal Patel ccbb3364d4
Merge pull request #2304 from AkihiroSuda/travis-do-not-ignore-cgroup2-failures
travis: move `cgroup-v2` out of `allow_failures`
2020-04-09 14:10:37 -07:00
Michael Crosby d65ba5fa0c
Merge pull request #2303 from KentaTada/remove-unneeded-syscall-import
libcontainer: remove unneeded import
2020-04-09 14:43:24 -04:00
Kir Kolyshkin ea36045fe1 cgroupv2: fix fs2 driver default path
When the cgroupv2 fs driver is used without setting cgroupsPath,
it picks up a path from /proc/self/cgroup. On a host with systemd,
such a path can look like (examples from my machines):

 - /user.slice/user-1000.slice/session-4.scope
 - /user.slice/user-1000.slice/user@1000.service/gnome-launched-xfce4-terminal.desktop-4260.scope
 - /user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service

This cgroup already contains processes in it, which prevents to enable
controllers for a sub-cgroup (writing to cgroup.subtree_control fails
with EBUSY or EOPNOTSUPP).

Obviously, a parent cgroup (which does not contain tasks) should be used.

Fixes opencontainers/runc/issues/2298

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-09 10:47:19 -07:00