TL;DR: this makes it possible to show logs from a failed runc restore.
Bats scripts are run with `set -e`. This is well known and obvious,
and yet there are a few errors with respect to that, including a few
"gems" by yours truly.
1. bats scripts are run with `set -e`, meaning that `[ $? -eq 0 ]` is
useless since the execution won't ever reach this line in case of
non-zero exit code from a preceding command. So, remove all such
checks; they are useless and misleading.
2. bats scripts are run with `set -e`, meaning that `ret=$?` is useless
since the execution won't ever reach this line in case of non-zero
exit code from a preceding command.
In particular, the code that calls runc restore needs to save the exit
code, show the errors in the log, and only then check the exit code and
fail if it's non-zero. It cannot use `run` (or `runc` which uses `run`)
because of the shell redirection that we need to set up.
The solution, implemented in this patch, is to use code like this:
```bash
ret=0
__runc ... || ret=$?
show_logs
[ $ret -eq 0 ]
```
In case __runc exits with a non-zero exit code, `ret=$?` is executed, and
it always succeeds, so we won't fail just yet and have a chance to show
logs before checking the value of $ret.
In case __runc succeeds, `ret=$?` is never executed, so $ret will still
be zero (this is the reason why it needs to be set explicitly).
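The pattern can be tried outside of bats with a minimal sketch like the
one below. Here `fake_restore` and `show_logs` are hypothetical stand-ins
(they are not part of runc or its test suite) for the real `__runc` call
and the log dump.

```bash
set -e

# Hypothetical stand-in for the `__runc ...` call; always fails with code 3.
fake_restore() { return 3; }

# Hypothetical stand-in for the helper that prints the logs.
show_logs() { echo "restore log: something went wrong"; }

ret=0
# A command on the left side of `||` does not trigger `set -e`;
# $? still holds its exit code when the right side runs.
fake_restore || ret=$?

# We get here even though fake_restore failed, so the logs can be shown.
logs=$(show_logs)
echo "$logs"
echo "saved exit code: $ret"

# A real test would now end with `[ "$ret" -eq 0 ]`, which fails the test
# only after the logs have been printed.
```

With a plain `fake_restore; ret=$?` instead, the script would exit at the
failing call and neither the logs nor the saved code would be printed.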
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
I have noticed that `go vet` from golang 1.13 ignores the vendor/
subdir, downloading all the modules when invoked in Travis CI env.
Like the other go commands, in 1.13 it needs an explicit -mod=vendor
flag, so let's provide one.
PS: once golang 1.13 is no longer supported, we can drop the flag.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
If the CRIU binary is in a non-$PATH location and passed to runc via
'--criu /path/to/criu', this information was not passed on to go-criu,
so non-$PATH CRIU usage has been broken since the switch to go-criu for
CRIU version detection. This commit uses the newly added go-criu
interface to pass the location of the binary to go-criu.
Signed-off-by: Adrian Reber <areber@redhat.com>
...by checking the default path first.
A quick benchmark shows it's about 5x faster on an idle system, and the
gain should be much greater on a system doing mounts etc.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(mode&S_IFCHR == S_IFCHR) is the wrong way of checking the type of an
inode because the S_IF* bits are actually not a bitmask; instead, the
type must be checked by masking the mode with S_IFMT and comparing the
result against S_IF*. This bug was neatly hidden behind a (major == 0)
sanity-check but that was removed by [1].
In addition, add a test that makes sure that HostDevices() doesn't give
rubbish results -- because we broke this and fixed this before [2].
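The difference between the two device-type checks can be illustrated
with plain shell arithmetic. This is only a sketch of the bit logic, not
the code changed by this commit; the octal values are the usual Linux
ones from <sys/stat.h>, and the helper names `is_chr_buggy` and `is_chr`
are made up for the example.

```bash
# File-type constants in octal (a leading 0 means octal in shell arithmetic).
S_IFMT=0170000   # mask that selects the file-type field
S_IFCHR=0020000  # character device
S_IFBLK=0060000  # block device

# Buggy check: treats S_IFCHR as if it were an independent flag bit.
is_chr_buggy() { [ $(( $1 & S_IFCHR )) -eq $(( S_IFCHR )) ]; }

# Correct check: extract the whole type field first, then compare.
is_chr() { [ $(( $1 & S_IFMT )) -eq $(( S_IFCHR )) ]; }

# Mode of a typical block device (brw-rw----).
blk_mode=$(( S_IFBLK | 0660 ))

if is_chr_buggy "$blk_mode"; then
    echo "buggy check: block device misreported as a char device"
fi
if ! is_chr "$blk_mode"; then
    echo "correct check: block device is not a char device"
fi
```

The buggy variant misfires because the S_IFBLK value contains the
S_IFCHR bit, so any block device passes it as well.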
[1]: 24388be71e ("configs: use different types for .Devices and .Resources.Devices")
[2]: 3ed492ad33 ("Handle non-devices correctly in DeviceFromPath")
Fixes: b0d014d0e1 ("libcontainer: one more switch from syscall to x/sys/unix")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Trying to checkpoint a container out of a pod in cri-o fails with:
Error (criu/namespaces.c:1081): Can't dump a pid namespace without the process init
Starting with the upcoming CRIU release 3.15, CRIU can be told to ignore
the PID namespace during checkpointing and to restore processes into an
existing PID namespace.
With the changes from this commit and CRIU 3.15 it is possible to
checkpoint a container out of a pod in cri-o.
Signed-off-by: Adrian Reber <areber@redhat.com>
To checkpoint and restore a container with an external network namespace
(like with Podman and CNI), runc tells CRIU to ignore the network
namespace during checkpoint and restore.
This commit moves that code into its own functions to be able to reuse
the same code path for external PID namespaces, which are necessary for
checkpointing and restoring containers out of a pod in cri-o.
Signed-off-by: Adrian Reber <areber@redhat.com>
Travis reports the following warnings, which are fixed by this commit:
root: deprecated key sudo (The key `sudo` has no effect anymore.)
root: missing os, using the default linux
root: key matrix is an alias for jobs, using jobs
Signed-off-by: Adrian Reber <areber@redhat.com>
This change would let me specify my own PREFIX so that I can reuse
Makefile targets for building rpm packages.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
In case cgroupPath is under the default cgroup prefix, let's try to
guess the mount point by adding the subsystem name to the default
prefix, and resolving the resulting path in case it's a symlink.
In most cases, given the default cgroup setup, this trick should return
the same result faster and avoid parsing /proc/self/mountinfo, which is
relatively slow and problematic.
Be very careful with the default path, checking that it
- is a directory;
- is a mount point;
- has cgroup fstype.
If something is not right, fall back to parsing mountinfo.
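The checks above can be sketched in shell as follows. This is an
illustration only, not the code from the patch: the function name
`try_default_path`, the `/sys/fs/cgroup` default prefix, the use of a
device-ID comparison with the parent directory as the mount point test,
and GNU `stat -f` for the fstype are all assumptions made for the example.

```bash
# Assumed default cgroup prefix; can be overridden for experiments.
DEFAULT_PREFIX=${DEFAULT_PREFIX:-/sys/fs/cgroup}

# Print the guessed mount point and return 0, or return 1 so that the
# caller can fall back to parsing /proc/self/mountinfo.
try_default_path() {
    cgroup_path=$1
    subsystem=$2

    # Only guess when cgroup_path is under the default prefix.
    case "$cgroup_path" in
        "$DEFAULT_PREFIX"/*) ;;
        *) return 1 ;;
    esac

    # Resolve a possible symlink (for example cpu -> cpu,cpuacct).
    path=$(readlink -f "$DEFAULT_PREFIX/$subsystem") || return 1

    # (1) The path must be a directory.
    [ -d "$path" ] || return 1

    # (2) The path must be a mount point; here that is approximated by
    # its device ID differing from the parent directory's.
    dev=$(stat -c %d "$path") || return 1
    parent_dev=$(stat -c %d "$path/..") || return 1
    [ "$dev" != "$parent_dev" ] || return 1

    # (3) The path must be on a cgroup filesystem (GNU stat prints "cgroupfs").
    [ "$(stat -f -c %T "$path")" = "cgroupfs" ] || return 1

    echo "$path"
}
```

A caller would use the printed path when the function succeeds and run
the slower mountinfo parser only when it returns non-zero.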
While at it, remove the obsolete comment about mountinfo parsing. The
comment belongs to findCgroupMountpointAndRootFromReader(), but rather
than moving it there, let's just remove it, since it does not add any
value in understanding the current code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Fedora mirrors have not been very stable recently, leading to CI
failures that usually look like this:
> sudo: make: command not found
In fact, it's caused by dnf failing to read metadata from mirrors:
> Errors during downloading metadata for repository 'updates':
> - Downloading successful, but checksum doesn't match. Calculated: <....>
> Error: Failed to download metadata for repo 'updates': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
The error went undetected due to the lack of an exit code check.
This commit:
- adds `set -e -u -o pipefail` so the script will fail early;
- adds a retry loop with a sleep around the dnf invocation.
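A retry loop of this kind might look like the sketch below. The `retry`
helper, its argument order, and the attempt count and delay in the usage
comment are assumptions made for illustration; the actual script may be
structured differently.

```bash
set -e -u -o pipefail

# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while true; do
        # A command tested by `if` does not trigger `set -e` on failure.
        if "$@"; then
            return 0
        fi
        echo "attempt $i/$attempts failed: $*" >&2
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        sleep "$delay"
        i=$((i + 1))
    done
}

# Example usage (package names are placeholders):
# retry 3 10 dnf -y install make gcc
```

Because of `set -e`, a final failure of `retry` aborts the script right
away instead of surfacing later as a missing `make`.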
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>