Commit Graph

4552 Commits

Author SHA1 Message Date
Adrian Reber 610c5ad75c
Factor out checkpointing with external namespace code
To checkpoint and restore a container with an external network namespace
(like with Podman and CNI), runc tells CRIU to ignore the network
namespace during checkpoint and restore.

This commit moves that code to its own functions to be able to reuse
the same code path for external PID namespaces which are necessary for
checkpointing and restoring containers out of a pod in cri-o.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-07-27 10:14:07 +02:00
Kir Kolyshkin d65df61dc5
Merge pull request #2521 from zvier/master
cleancode: clean code for utils_linux.go
2020-07-23 12:58:24 -07:00
zvier 92e2175de1 cleancode: clean code for utils_linux.go
Signed-off-by: Jeff Zvier <zvier20@gmail.com>
2020-07-23 06:12:27 +08:00
Kir Kolyshkin 86d9399c80
Merge pull request #2524 from adrianreber/fix-travis
Fix .travis.yml warnings
2020-07-22 11:16:24 -07:00
Adrian Reber b7683d6b0f
Fix .travis.yml warnings
Travis reports the following warnings, which are fixed by this commit.

   root: deprecated key sudo (The key `sudo` has no effect anymore.)
   root: missing os, using the default linux
   root: key matrix is an alias for jobs, using jobs

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-07-21 10:27:48 +02:00
Aleksa Sarai f8749ba098
merge branch 'pr-2509'
Kir Kolyshkin (2):
  tests/int/checkpoint: fds and pids cleanup
  tests/int/checkpoint: don't remove readonly flag

LGTMs: @mrunalp @AkihiroSuda @cyphar
Closes #2509
2020-07-20 13:03:38 +10:00
Kir Kolyshkin f9850afa91
Merge pull request #2518 from XiaodongLoong/redundant_chroot_param
remove redundant parameter of chroot function
2020-07-15 17:26:24 -07:00
Xiaodong Liu af283b3f47 remove redundant the parameter of chroot function
Signed-off-by: Xiaodong Liu <liuxiaodong@loongson.cn>
2020-07-15 16:22:07 +08:00
Mrunal Patel b7d8f3bf0d
Merge pull request #2516 from ide-rea/fix-typo
fix small typo
2020-07-13 09:04:31 -07:00
Mrunal Patel 47fbafb7bc
Merge pull request #2510 from kolyshkin/criu-el7
tests/centos7: add criu
2020-07-13 07:51:08 -07:00
Xiaoyu Zhang 76b05e6d13 fix small typo
Signed-off-by: Xiaoyu Zhang <mateuszhang@tencent.com>
2020-07-11 16:36:32 +08:00
Mrunal Patel cf1273abf4
Merge pull request #2498 from kolyshkin/v1-code-cleanups
libct/cgroups/fs: code cleanups
2020-07-09 15:58:06 -07:00
Mrunal Patel 545ebdd14a
Merge pull request #2511 from kolyshkin/fedora-dnf-fix
tests/fedora32: retry dnf
2020-07-08 21:20:05 -07:00
Kir Kolyshkin fbf047bf2f
Merge pull request #2501 from XiaodongLoong/systemderror-fix
fix TestPidsSystemd and TestRunWithKernelMemorySystemd test error
2020-07-08 20:39:39 -07:00
Xiaodong Liu f57bb2fe3d fix TestPidsSystemd and TestRunWithKernelMemorySystemd test error
Signed-off-by: Xiaodong Liu <liuxiaodong@loongson.cn>
2020-07-09 09:36:03 +08:00
Mrunal Patel ce54a9d4d7
Merge pull request #2514 from rhatdan/windows
Allow libcontainer/configs to be imported on Windows
2020-07-08 14:00:54 -07:00
Kir Kolyshkin 6d5125f8b4 tests/int/checkpoint: don't remove readonly flag
This should no longer be necessary (in theory, at least),
let's see how it goes.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-08 12:56:25 -07:00
Kir Kolyshkin 9806eb5567
Merge pull request #2513 from lsm5/custom-PREFIX-in-Makefile
allow customizable PREFIX variable
2020-07-08 12:54:11 -07:00
Daniel J Walsh d78ee47154
Allow libcontainer/configs to be imported on Windows
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-07-08 15:20:37 -04:00
Kir Kolyshkin 5517d1d71d
Merge pull request #2505 from XiaodongLoong/redundant-copy-src
fix redundant source code copy issue
2020-07-08 07:37:55 -07:00
Kir Kolyshkin ffe9f0b0fb Vagrantfile.centos7: do not ignore script failures
Add `set -e -u -o pipefail` so the script will fail early
if there's an error.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-08 07:32:41 -07:00
Lokesh Mandvekar bc1a9c11a2 allow customizable PREFIX variable
This change would let me specify my own PREFIX so that I can reuse
Makefile targets for building rpm packages.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2020-07-08 09:20:03 -04:00
Kir Kolyshkin a73ce38d16 cgroupv1/FindCgroupMountpoint: add a fast path
In case cgroupPath is under the default cgroup prefix, let's try to
guess the mount point by adding the subsystem name to the default
prefix, and resolving the resulting path in case it's a symlink.

In most cases, given the default cgroup setup, this trick
should result in returning the same result faster, and avoiding
/proc/self/mountinfo parsing which is relatively slow and problematic.

Be very careful with the default path, checking it is
 - a directory;
 - a mount point;
 - has cgroup fstype.

If something is not right, fall back to parsing mountinfo.

While at it, remove the obsoleted comment about mountinfo parsing.  The
comment belongs to findCgroupMountpointAndRootFromReader(), but rather
than moving it there, let's just remove it, since it does not add any
value in understanding the current code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 13:57:33 -07:00
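The fast path described in the commit above could look roughly like the following. This is a minimal sketch, not the code from the commit: `findMountpoint`, the `checks` hooks, and the `slow` callback are hypothetical names, and the real checks (directory, mount point, cgroup fstype) are replaced by injected functions so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// defaultPrefix is the conventional cgroup v1 mount root (assumed here).
const defaultPrefix = "/sys/fs/cgroup"

// checks holds hypothetical hooks for the validations listed in the commit.
type checks struct {
	resolve       func(p string) (string, error) // follows symlinks
	isCgroupMount func(dir string) bool          // directory, mount point, cgroup fstype
}

// findMountpoint tries the guessed path first and falls back to the slow
// lookup (standing in for mountinfo parsing) whenever a check fails.
func findMountpoint(cgroupPath, subsystem string, c checks, slow func(string) (string, error)) (string, error) {
	if strings.HasPrefix(cgroupPath, defaultPrefix) {
		guess := path.Join(defaultPrefix, subsystem)
		if resolved, err := c.resolve(guess); err == nil && c.isCgroupMount(resolved) {
			return resolved, nil
		}
	}
	return slow(subsystem)
}

func main() {
	// Pretend result of parsing mountinfo.
	slow := func(subsystem string) (string, error) {
		return "/mnt/cgroup/" + subsystem, nil
	}
	ok := checks{
		resolve:       func(p string) (string, error) { return p, nil },
		isCgroupMount: func(string) bool { return true },
	}
	broken := checks{
		resolve:       func(string) (string, error) { return "", errors.New("dangling symlink") },
		isCgroupMount: func(string) bool { return true },
	}

	fast, _ := findMountpoint("/sys/fs/cgroup/memory/demo", "memory", ok, slow)
	fallback, _ := findMountpoint("/sys/fs/cgroup/memory/demo", "memory", broken, slow)
	fmt.Println(fast)
	fmt.Println(fallback)
}
```

The point of the design is that a failed check is never an error by itself; it only means the slower lookup is used.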
Kir Kolyshkin c27b8e7fe7 tests/fedora32: retry dnf
Fedora mirrors are not very stable recently, leading to CI failures
that usually look like this:

> sudo: make: command not found

In fact it's caused by dnf failure to read metadata from mirrors:

> Errors during downloading metadata for repository 'updates':
>    - Downloading successful, but checksum doesn't match. Calculated: <....>
> Error: Failed to download metadata for repo 'updates': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried

The error went undetected due to lack of exit code check.

This commit:
 - adds `set -e -u -o pipefail` so the script will fail early;
 - adds a retry loop with a sleep around dnf invocation.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 12:31:52 -07:00
Kir Kolyshkin 92f498210a tests/centos7: add criu
Enable criu tests on centos 7 by using criu from Adrian's repo
(https://copr.fedorainfracloud.org/coprs/adrian/criu-el7/)

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 11:45:41 -07:00
Kir Kolyshkin 98c7c01df9 tests/int/checkpoint: require cgroupns
Otherwise the test will fail on e.g. CentOS 7.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 11:24:36 -07:00
Kir Kolyshkin c1adc99a20 cgroup/fs: rework Apply()
In manager.Apply() method, a path to each subsystem is obtained by
calling d.path(sys.Name()), and the sys.Apply() is called that does
the same call to d.path() again.

d.path() is an expensive call, so rather than calling it twice, let's
reuse the result.

This reduces the number of times we parse mountinfo during container
start from 62 to 34 on my setup.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 10:58:37 -07:00
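The idea in the commit above, resolving each subsystem path once and passing the result along, can be illustrated as follows. This is a simplified sketch under assumed names (`resolver`, `applyTwice`, `applyOnce`, `applySubsystem` are all hypothetical); the call counter stands in for the cost of parsing mountinfo.

```go
package main

import "fmt"

// resolver stands in for the expensive path lookup; calls counts its uses.
type resolver struct {
	calls int
}

// path pretends to resolve a subsystem's cgroup path (hypothetical layout).
func (r *resolver) path(subsystem string) string {
	r.calls++
	return "/sys/fs/cgroup/" + subsystem + "/demo"
}

// applyTwice mirrors the old shape: the caller resolves the path, and the
// per-subsystem step resolves it again.
func applyTwice(r *resolver, subsystems []string) map[string]string {
	paths := make(map[string]string)
	for _, name := range subsystems {
		paths[name] = r.path(name)
		_ = r.path(name) // second lookup inside the subsystem's apply step
	}
	return paths
}

// applySubsystem receives the already resolved path instead of looking it up.
func applySubsystem(name, path string) {}

// applyOnce resolves each path a single time and reuses the result.
func applyOnce(r *resolver, subsystems []string) map[string]string {
	paths := make(map[string]string)
	for _, name := range subsystems {
		p := r.path(name)
		paths[name] = p
		applySubsystem(name, p)
	}
	return paths
}

func main() {
	subsystems := []string{"cpu", "memory", "pids"}

	before := &resolver{}
	applyTwice(before, subsystems)

	after := &resolver{}
	applyOnce(after, subsystems)

	fmt.Println("lookups before:", before.calls)
	fmt.Println("lookups after:", after.calls)
}
```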
Kir Kolyshkin 417f5ff40d tests/int/checkpoint: fds and pids cleanup
1. Do not use hardcoded fd numbers; rely instead on the bash feature of
   assigning an fd to a variable.

   This looks very weird, but the rule of thumb here is:
   - if this is in exec, use {var} (i.e. no $);
   - otherwise, use as normal ($var or ${var}).

2. Add killing the background processes and closing the fds to teardown.
   This is helpful in case of a test failure, in order to not affect the
   subsequent tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 10:54:23 -07:00
Kir Kolyshkin 335f0806c0 tests/int/delete: cgroupv1 with sub-cgroups removal case
This is similar to what we did before for v2.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 21:08:04 -07:00
Aleksa Sarai 819fcc687e
merge branch 'pr-2495'
Kir Kolyshkin (1):
  cgroups/fs/path: optimize

LGTMs: @mrunalp @cyphar
Closes #2495
2020-07-07 11:51:06 +10:00
Kir Kolyshkin 2a322e91ec cgroupv1: remove subsystemSet.Get()
Instead of iterating over m.paths, iterate over subsystems and look up
the path for each. This is faster since a map lookup is faster than
iterating over the names in Get. A quick benchmark shows that the new
way is 2.5x faster than the old one.

Note though that this is not done to make things faster, as savings are
negligible, but to make things simpler by removing some code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:31:46 -07:00
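The change of iteration order described in the commit above can be sketched like this. The types and function names (`subsystem`, `get`, `visitByPaths`, `visitBySubsystems`) are hypothetical stand-ins, not the identifiers from the repository; the sketch only shows that both shapes visit the same pairs while the second one needs no linear search helper.

```go
package main

import (
	"fmt"
	"sort"
)

// subsystem is a hypothetical stand-in for a cgroup controller.
type subsystem struct{ name string }

// get is the kind of linear lookup the commit removes.
func get(set []subsystem, name string) (subsystem, bool) {
	for _, s := range set {
		if s.name == name {
			return s, true
		}
	}
	return subsystem{}, false
}

// visitByPaths is the old shape: iterate the map, search the slice by name.
func visitByPaths(set []subsystem, paths map[string]string) []string {
	var out []string
	for name, p := range paths {
		if s, ok := get(set, name); ok {
			out = append(out, s.name+"="+p)
		}
	}
	sort.Strings(out) // map iteration order is random, so sort for comparison
	return out
}

// visitBySubsystems is the new shape: iterate the slice, look up the map.
func visitBySubsystems(set []subsystem, paths map[string]string) []string {
	var out []string
	for _, s := range set {
		if p, ok := paths[s.name]; ok {
			out = append(out, s.name+"="+p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	set := []subsystem{{"cpu"}, {"memory"}, {"pids"}}
	paths := map[string]string{
		"cpu":    "/sys/fs/cgroup/cpu/demo",
		"memory": "/sys/fs/cgroup/memory/demo",
	}
	fmt.Println(visitByPaths(set, paths))
	fmt.Println(visitBySubsystems(set, paths))
}
```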
Kir Kolyshkin daf30cb7ca cgroups/fs: rm getSubsystems
It does not add any value.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:29:14 -07:00
Kir Kolyshkin 2e22579946 libct/cgroups/fs.GetStats: drop PathExists check
Half of controllers' GetStats just return nil, and most of the others
ignore ENOENT on files, so it will be cheaper to not check that the
path exists in the main GetStats method, offloading that to the
controllers.

Drop PathExists check from GetStats, add it to those controllers'
GetStats where it was missing.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:02:17 -07:00
Kir Kolyshkin 11fb94965c cgroups/fs: rm Remove method from controllers
To my surprise, those are not used anywhere in the code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:02:17 -07:00
Kir Kolyshkin 19be8e5ba5 libct/cgroups.RemovePaths: speedup
Using os.RemoveAll has the following two issues:

 1. it tries to remove all files, which does not make sense for cgroups;
 2. it tries unlink(2), which fails on directories, and then rmdir(2).

Let's reuse our RemovePath instead, and add warnings and errors logging.

PS I am somewhat hesitant to remove the weird checking by means of stat,
as it might break something. Unfortunately, neither commit 6feb7bda04
nor the PR that contains it [1] explains what kind of weird errors were
seen from os.RemoveAll. Most probably our code won't return any bogus
errors, but let's keep the old code to be on the safe side.

[1] https://github.com/docker-archive/libcontainer/pull/308

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 17:54:44 -07:00
Kir Kolyshkin 3f14242e0a libct/cgroups: move RemovePath from fs2
This is to be used by RemovePaths.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 17:54:44 -07:00
Kir Kolyshkin 254d23b964 libc/cgroups: empty map in RemovePaths
RemovePaths() deletes elements from the paths map for paths that have
been successfully removed.

However, it does not empty the map itself (which is needed because, AFAIK,
the Go garbage collector does not shrink a map); all its callers do.

Move this operation from callers to RemovePaths.

No functional change, except the old map should be garbage collected now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 17:54:44 -07:00
Mrunal Patel 30dc54a995
Merge pull request #2503 from giuseppe/cgroup-fixes
cgroup, systemd: cleanup cgroups
2020-07-06 15:14:29 -07:00
Mrunal Patel 3f81131845
Merge pull request #2490 from kolyshkin/dev-opt
libct/cgroups: add SkipDevices to Resources
2020-07-06 14:28:30 -07:00
Giuseppe Scrivano 32034481ea
cgroup, systemd: cleanup cgroups
some hierarchies were created directly by .Apply() on top of systemd
managed cgroups.  systemd doesn't manage these and as a result we leak
these cgroups.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-07-06 23:06:16 +02:00
Mrunal Patel 46a304b592
Merge pull request #2502 from tjucoder/master
make sure pty.Close() will be called and fix comment
2020-07-06 11:49:20 -07:00
Mrunal Patel e638eda0cb
Merge pull request #2496 from kolyshkin/freeze-nits
libct/cgroups/fs: simplify/speedup freezer code
2020-07-06 11:30:01 -07:00
Xiaodong Liu a4cb88f307 redundant souce code copy
There is a docker -v flag for test in Makefile

Signed-off-by: Xiaodong Liu <liuxiaodong@loongson.cn>
2020-07-06 19:03:26 +08:00
Giuseppe Scrivano 2deaeab08f
cgroup: store the result of IsRunningSystemd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-07-05 12:42:27 +02:00
tjucoder ab35cfe23c make sure pty.Close() will be called and fix comment
Signed-off-by: tjucoder <chinesecoder@foxmail.com>
2020-07-05 16:37:21 +08:00
Kir Kolyshkin 62a30709d2 cgroups/fs/path: optimize
The result of cgroupv1.FindCgroupMountpoint() call (which is relatively
expensive) is only used in case raw.innerPath is absolute, so it only
makes sense to call it in that case.

This drastically reduces the number of calls to FindCgroupMountpoint
during container start (from 116 to 62 in my setup).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-03 14:07:27 -07:00
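The optimization in the commit above amounts to moving an expensive call inside the only branch that uses its result. A minimal sketch, assuming hypothetical helpers (`findMountpoint`, `ownCgroupPath`, `cgroupPath`) and a made-up directory layout:

```go
package main

import (
	"fmt"
	"path"
)

// lookup counts how often the expensive mount point search runs.
type lookup struct {
	mountCalls int
}

// findMountpoint stands in for the relatively expensive call.
func (l *lookup) findMountpoint(subsystem string) string {
	l.mountCalls++
	return "/sys/fs/cgroup/" + subsystem
}

// ownCgroupPath stands in for the parent cgroup of the calling process.
func (l *lookup) ownCgroupPath(subsystem string) string {
	return "/sys/fs/cgroup/" + subsystem + "/user.slice"
}

// cgroupPath only searches for the mount point when innerPath is absolute,
// because that is the only case in which the result is needed.
func (l *lookup) cgroupPath(subsystem, innerPath string) string {
	if path.IsAbs(innerPath) {
		return path.Join(l.findMountpoint(subsystem), innerPath)
	}
	return path.Join(l.ownCgroupPath(subsystem), innerPath)
}

func main() {
	l := &lookup{}
	fmt.Println(l.cgroupPath("memory", "/demo"))
	fmt.Println(l.cgroupPath("memory", "demo"))
	fmt.Println("mount point searches:", l.mountCalls)
}
```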
Kir Kolyshkin 46b26bc05d cgroups/fs/Freeze: simplify
In here, defer looks like an overkill, since the code is very simple and
we already have an error path.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-03 14:02:57 -07:00
Kir Kolyshkin cd479f9d14 cgroupv1/freezer: don't use subsystemSet.Get()
Iterating over the list of subsystems and comparing their names to get an
instance of fs.cgroupFreezer is useless and a waste of time, since it is
a shallow type (i.e. does not have any data/state) and we can create an
instance in place.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-03 14:00:44 -07:00
Akihiro Suda 3cb1909c70
Merge pull request #2493 from thaJeztah/bump_ebpf
vendor: update cilium/ebpf v0.0.0-20200702112145-1c8d4c9ef775
2020-07-03 11:43:59 +09:00
Kir Kolyshkin 108ee85b82 libct/cgroups: add SkipDevices to Resources
The kubelet uses libct/cgroups code to set up cgroups. It creates a
parent cgroup (kubepods) to put the containers into.

The problem (for cgroupv2 that uses eBPF for device configuration) is
that the hard requirement to have the devices cgroup configured results in
leaking an eBPF program upon every kubelet restart. If kubelet
is restarted 64+ times, the cgroup can't be configured anymore.

Work around this by adding a SkipDevices flag to Resources.

A check was added so that if SkipDevices is set, such a "container"
can't be started (to make sure it is only used for non-containers).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-02 15:19:31 -07:00
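The guard mentioned in the commit above (refusing to start a "container" whose resources have SkipDevices set) might be sketched as follows. Only the SkipDevices flag on Resources comes from the commit message; the surrounding types, the `checkStartable` function, and the error text are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// resources and config are simplified, hypothetical stand-ins for the
// cgroup configuration types.
type resources struct {
	SkipDevices bool // when true, the devices cgroup is left unconfigured
}

type config struct {
	Resources resources
}

// errSkipDevices is an illustrative error value, not the library's own.
var errSkipDevices = errors.New("cannot start a container with SkipDevices set")

// checkStartable rejects configurations that skip device setup, so the flag
// can only be used for cgroups that never run a container.
func checkStartable(c config) error {
	if c.Resources.SkipDevices {
		return errSkipDevices
	}
	return nil
}

func main() {
	parent := config{Resources: resources{SkipDevices: true}}
	container := config{}

	fmt.Println("parent cgroup:", checkStartable(parent))
	fmt.Println("container:", checkStartable(container))
}
```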