This fixes the following bug:
> $ GO111MODULE=off make
> go build "-mod=vendor" -buildmode=pie -tags "seccomp selinux apparmor" -ldflags "-X main.gitCommit="19ba7688cb4e0922d53029e2f7c1f2af45d40938-dirty" -X main.version=1.0.0-rc10+dev " -o runc .
> build flag -mod=vendor only valid when using modules
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since go has its own way to track dependencies and rebuild if needed,
and it is efficient enough, let's drop using SOURCES variable, mark
all targets as PHONY and let golang do its job.
The primary motivation for this was concern about using find on every
make invocation to build the list of all sources.
Some unscientific performance analisys:
Before:
> $ time make
> make: 'runc' is up to date.
>
> real 0m0.202s
> user 0m0.177s
> sys 0m0.031s
After:
> $ time make
> go build -mod=vendor -buildmode=pie -tags "seccomp selinux apparmor" -ldflags "-X main.gitCommit="5a8210a58bd0f07cc987e6201b4174e5b93fa115" -X main.version=1.0.0-rc10+dev " -o runc .
>
> real 0m0.149s
> user 0m0.315s
> sys 0m0.106s
So, it is slightly faster using the wall clock, uses more CPU, but
we can be sure the binary is always up to date.
This also fixes the Makefile to mark all targets as PHONY. The list
was generated by `grep -E '^[a-z-]+:' Makefile | sed 's/:.*//'`.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This was added by commit 993cbf9db but since some time ago (go 1.13
for sure, but may be earlier) is no longer needed since all the tools
are correctly skipping vendor subdir.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since we already have to build everything and run integration tests
on the Vagrant Fedora 31 host (in order to test how runc talks to
systemd), let's do the same for unit tests (otherwise we build
everything twice).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since we carry vendor/ subdir, let's actually use it. Should speed up CI
a bit, possibly also making it a tad more stable.
This is actually implemented in go 1.14 already (i.e. it turns mod=vendor
automatically if it sees vendor/ dir), but we still use go 1.13.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It's hard to read otherwise (at least for me).
While at it, replace ${FOO} with $(FOO) -- both are
identical, but the second style looks to be used more.
No functional change.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There are way to many arguments to go build, and they are repeatedly
used across the makefile. Separate them out to GO_BUILD and
GO_BUILD_STATIC variables.
While at it, let's be consistem about the style and use $(FOO) everywhere
(there is no difference from ${FOO}).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
To make a bind mount read-only, it needs to be remounted. This is what
the code removed does, but it is not needed here.
We have to deal with three cases here:
1. cgroup v2 unified mode. In this case the mount is real mount with
fstype=cgroup2, and there is no need to have a bind mount on top,
as we pass readonly flag to the mount as is.
2. cgroup v1 + cgroupns (enableCgroupns == true). In this case the
"mount" is in fact a set of real mounts with fstype=cgroup, and
they are all performed in mountCgroupV1, with readonly flag
added if needed.
3. cgroup v1 as is (enableCgroupns == false). In this case
mountCgroupV1() calls mountToRootfs() again with an argument
from the list obtained from getCgroupMounts(), i.e. a bind
mount with the same flags as the original mount has (plus
unix.MS_BIND | unix.MS_REC), and mountToRootfs() does remounting
(under the case "bind":).
So, the code which this patch is removing is not needed -- it
essentially does nothing in case 3 above (since the bind mount
is already remounted readonly), and in cases 1 and 2 it
creates an unneeded extra bind mount on top of a real one (or set of
real ones).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The bats testing framework we use for integration test is not maintained
since 2015 and was superceded by bats-core [1]. More to say, we were
using an unreleased version and relying on some features of it
(unfortunately I don't remember now what are those features exactly).
As Debian still packages very old version of bats from the old repo,
so let's Use recent bats-core from the new, supposedly better maintained,
github repo.
[1] https://github.com/sstephenson/bats/pull/269
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
With the fix in the previous commit and criu patched with support for
cgroupv2, these tests should now pass.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case of cgroupv2 unified hierarchy, the /sys/fs/cgroup mount
is the real mount with fstype of cgroup2 (rather than a set of
external bind mounts like for cgroupv1).
So, we should not add it to the list of "external bind mounts"
on both checkpoint and restore.
Without this fix, checkpoint integration tests fail on cgroup v2.
Also, same is true for cgroup v1 + cgroupns.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Same test as the first one, just with cgroupns enabled.
Since in case of cgroupv2 `runc spec` enables cgroupns,
this case was already tested by the first checkpoint test,
so skip it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit a9e15e7e0 adds a check that stdin/out/err pipes
are restored correctly. Commit ec260653b7 copy/pastes
the same code to one more another test.
Problem is (as pointed out in commit 5369f9ade3) these tests
sometimes hang. I have also seen them fail.
Apparently, the code used to create pipes and open them to fds
is racy:
```shell
cat $fifo | cat $fifo &
pid=$!
exec 50</proc/$pid/fd/0
exec 51>/proc/$pid/fd/0
```
Since `cat | cat` is spawned asynchronously, by the time exec is used,
the second cat process (i.e. $pid) is already fork'ed but it might
not be exec'ed yet. As a result, we get this (`ls -l /proc/self/fd`):
```
lr-x------. 1 root root 64 Apr 20 02:39 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:39 51 -> /dev/pts/1
```
or, in some cases:
```
lr-x------. 1 root root 64 Apr 20 02:45 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```
instead of expected set of pipes:
```
> lr-x------. 1 root root 64 Apr 20 02:45 50 -> 'pipe:[215791]'
> l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```
One possible workaround is to add `sleep 0.1` or so after cat|cat,
but it is outright ugly (besides, we already have one sleep in
the test code).
The solution is to not use any external processes to create pipes.
I admit this still looks not very comprehensible, but at least it
is easier than before, and it works.
While at it, remove code duplication, moving the setup and check
code into a pair of functions.
Finally, since the tests are working now, remove the skip.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since all the criu tests have the same requirements,
move them to setup().
While at it, remove an obviously redundant comment.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Introduce a special case for `testcontainer` to test
for container that is not present (checkpointed), use it.
Fix one place where testcontainer was not used.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
runc in this file is actually a function that does `run runc ...`,
and `run` sets variable `$status` as the exit code, so `$status`
is what should be checked.
If calling runc directly (as in `__runc ...`), then $? is legit.
While at it, remove an obsoleted comment, and an unneeded
`ret=$?` assignment (check `$?` directly).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. There is no need to try removing it recursively.
2. Do not treat ENOENT as an error (similar to fs
and systemd v1 drivers).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Run in the same environment as systemd tests.
Disable CRIU tests because:
- they all fail with cgroup v2;
- CRIU v3.14 is required and it's not yet released, and
rebuilding it from sources with patches applied (like
it is currently done in Dockerfile) is too much work.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Instead of enabling all available controllers, figure out which
ones are required, and only enable those.
2. Amend all setFoo() functions to call isFooSet(). While this might
seem unnecessary, it might actually help to uncover a bug.
Imagine someone:
- adds a cgroup.Resources.CpuFoo setting;
- modifies setCpu() to apply the new setting;
- but forgets to amend isCpuSet() accordingly <-- BUG
In this case, a test case modifying CpuFoo will help
to uncover the BUG. This is the reason why it's added.
This patch *could be* amended by enabling controllers on a best-effort
basis, i.e. :
- do not return an error early if we can't enable some controllers;
- if we fail to enable all controllers at once (usually because one
of them can't be enabled), try enabling them one by one.
Currently this is not implemented, and it's not clear whether this
would be a good way to go or not.
[v2: add/use is${Controller}Set() functions]
[v3: document neededControllers()]
[v4: drop "best-effort" part]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Rename the files
- v1.go: cgroupv1 aka legacy;
- v2.go: cgroupv2 aka unified hierarchy;
- unsupported.go: when systemd is not available.
2. Move the code that is common between v1 and v2 to common.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This check was removed in commit 5406833a65. Now, when this
function is called from a few places, it is no longer obvious
that the path always starts with /sys/fs/cgroup/, so reinstate
the check just to be on the safe side.
This check also ensures that elements[3:] can be used safely.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
fs2 cgroup driver was not working because it did not enable controllers
while creating cgroup directory; instead it was merely doing MkdirAll()
and gathered the list of available controllers in NewManager().
Also, cgroup should be created in Apply(), not while creating a new
manager instance.
To fix:
1. Move the createCgroupsv2Path function from systemd driver to fs2 driver,
renaming it to CreateCgroupPath. Use in Apply() from both fs2 and
systemd drivers.
2. Delay available controllers map initialization to until it is needed.
With this patch:
- NewManager() only performs minimal initialization (initializin
m.dirPath, if not provided);
- Apply() properly creates cgroup path, enabling the controllers;
- m.controllers is initialized lazily on demand.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The fs2 cgroup driver has a sanity check for path.
Since systemd driver is relying on the same path,
it makes sense to move this check to the common code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Having map of per-subsystem paths in systemd unified cgroups
driver does not make sense and makes the code less readable.
To get rid of it, move the systemd v1-or-v2 init code to
libcontainer/factory_linux.go which already has a function
to deduce unified path out of paths map.
End result is much cleaner code. Besides, we no longer write pid
to the same cgroup file 7 times in Apply() like we did before.
While at it
- add `rootless` flag which is passed on to fs2 manager
- merge getv2Path() into GetUnifiedPath(), don't overwrite
path if it is set during initialization (on Load).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>