Commit Graph

4246 Commits

Author SHA1 Message Date
Ted Yu db29dce076 Close fd in case fd.Write() returns error
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
2020-05-02 20:06:08 -07:00
Mrunal Patel dd8d48ede8
Merge pull request #2358 from kolyshkin/fs2-nit
cgroups/fs2: don't always parse /proc/self/cgroup
2020-04-29 08:45:26 -07:00
Kir Kolyshkin c3b0b13fe9 cgroups/fs2: don't always parse /proc/self/cgroup
Function defaultPath always parses /proc/self/cgroup, but
the resulting value is not always used.

Avoid unnecessary reading/parsing by moving the code
to just before its use.

Modify the test case accordingly.

[v2: test: use UnifiedMountpoint, skip test if not on v2]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-28 22:16:36 -07:00
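The change in commit c3b0b13fe9 (read and parse only when the value is needed) can be sketched in shell. This is a hedged illustration, not runc code: `own_cgroup`, `default_path`, and the rule that an absolute path makes the parse unnecessary are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch, not runc code: resolve a cgroup v2 directory and
# read the caller's own cgroup only when the result depends on it.

# own_cgroup FILE: print the path from the "0::<path>" line of FILE,
# which is how the unified (v2) hierarchy appears in /proc/self/cgroup.
own_cgroup() {
    sed -n 's/^0:://p' "$1"
}

# default_path ROOT PATH FILE: an absolute PATH is used as is, so FILE
# is never opened; a relative PATH is resolved against the own cgroup.
default_path() {
    case "$2" in
        /*) printf '%s%s\n' "$1" "$2" ;;
        *)  printf '%s%s/%s\n' "$1" "$(own_cgroup "$3")" "$2" ;;
    esac
}

# Demo with a temporary file standing in for /proc/self/cgroup.
demo=$(mktemp)
printf '0::/user.slice\n' > "$demo"
abs_result=$(default_path /sys/fs/cgroup /runc/ctr /nonexistent)
rel_result=$(default_path /sys/fs/cgroup ctr "$demo")
rm -f "$demo"
echo "$abs_result"
echo "$rel_result"
```

The first call passes a file that does not exist and still succeeds, because the parse sits inside the only branch that uses its result.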
Kir Kolyshkin 051d6705a7
Merge pull request #2363 from AkihiroSuda/vagrant-f32
Vagrantfile: use Fedora 32 (and remove unused Podman)

LGTMs: @cyphar @kolyshkin
2020-04-28 22:01:44 -07:00
Akihiro Suda 85c44b190e Vagrantfile: use Fedora 32
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-29 12:36:03 +09:00
Akihiro Suda c18485ada6
Merge pull request #2359 from cyphar/terminal-docs-subreaper
docs: terminals: mention subreaper requirement
2020-04-29 10:53:01 +09:00
Kir Kolyshkin 0a4dcc0203
Merge pull request #2331 from lifubang/StartTransientUnit
check that StartTransientUnit/StopUnit succeeds

LGTMs: @AkihiroSuda @kolyshkin 
Closes #2313, #2309
2020-04-28 10:47:52 -07:00
Aleksa Sarai eea0fbfec1
docs: terminals: mention subreaper requirement
I realised that the terminal documentation which covers detached
terminals fails to mention that callers need to make themselves a
subreaper. Probably a good idea to mention this. I've also included a
minor comparison to LXC.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-04-28 22:53:59 +10:00
lifubang bfa1b2aab3 check that StartTransientUnit and StopUnit succeeds
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-04-28 15:46:28 +08:00
Mrunal Patel 80e2d1f145
Merge pull request #2357 from kolyshkin/makefile-2
Makefile fixes and improvements
2020-04-27 21:21:25 -07:00
Mrunal Patel a1f007e067
Merge pull request #2340 from AkihiroSuda/fix-2339
fs2: fix cgroup.subtree_control EPERM on rootless + add CI
2020-04-27 21:20:23 -07:00
Kir Kolyshkin 772d090930 Makefile: rm RELEASE_DIR and SHELL
RELEASE_DIR is only used once, so it doesn't make sense to have it.

SHELL was introduced in commit 54390f89a7 and was used
implicitly (since Makefile contained some bash-specific code),
but is no longer needed since commit ed68ee1e10.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 14:17:18 -07:00
Kir Kolyshkin 731947d5ec Makefile: fix/clean install-man
Target `install-man` was not dependent on `man`, meaning no man pages
were installed unless one called `make man` beforehand. Fix this.

Remove many man-related variables, only leaving MANDIR, which is
an installation directory for man pages.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 14:17:18 -07:00
Kir Kolyshkin df72e8989c Makefile: rm uninstall* targets
These targets are not very reliable and, depending on environment
variables, might result in data loss. For example:

 make DESTDIR=`pwd`/tmp install
 ...
 make uninstall

The first make command will install $CURDIR/tmp/usr/local/bin/runc,
while the last command will remove /usr/local/bin/runc.

One way to support uninstall would be to write a temp file during
installation, which would contain the files we have installed.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 14:17:18 -07:00
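Commit df72e8989c suggests recording installed files at install time so that uninstall removes exactly those paths. A minimal sketch of that idea follows; the function names and the manifest file name are hypothetical, and nothing here exists in the runc Makefile.

```shell
#!/bin/sh
# Hypothetical sketch of a manifest-based uninstall.

# install_file SRC DEST MANIFEST: copy SRC to DEST and record DEST.
install_file() {
    mkdir -p "$(dirname "$2")" &&
    cp "$1" "$2" &&
    printf '%s\n' "$2" >> "$3"
}

# uninstall_from MANIFEST: remove exactly the recorded paths,
# then the manifest itself.
uninstall_from() {
    while IFS= read -r path; do
        rm -f "$path"
    done < "$1"
    rm -f "$1"
}

# Demo in a scratch directory standing in for DESTDIR.
destdir=$(mktemp -d)
manifest="$destdir/install-manifest.txt"
printf 'fake binary\n' > "$destdir/runc.src"
install_file "$destdir/runc.src" "$destdir/usr/local/bin/runc" "$manifest"
[ -f "$destdir/usr/local/bin/runc" ] && installed=yes || installed=no
uninstall_from "$manifest"
[ -e "$destdir/usr/local/bin/runc" ] && removed=no || removed=yes
rm -rf "$destdir"
echo "installed=$installed removed=$removed"
```

Because the manifest stores the full destination path, a later uninstall does not depend on DESTDIR being set the same way as during install.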
Kir Kolyshkin a036e890b9 Makefile: add -mod=vendor to go test
Otherwise, in case go < 1.14 is used, all the go deps are downloaded
instead of using vendor subdir.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 14:16:22 -07:00
Kir Kolyshkin 2fe9e31aa9 Makefile: don't use -mod=vendor if GO111MODULE=off
This fixes the following bug:

> $ GO111MODULE=off make
> go build "-mod=vendor" -buildmode=pie  -tags "seccomp selinux apparmor" -ldflags "-X main.gitCommit="19ba7688cb4e0922d53029e2f7c1f2af45d40938-dirty" -X main.version=1.0.0-rc10+dev " -o runc .
> build flag -mod=vendor only valid when using modules

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 13:17:20 -07:00
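The condition behind commit 2fe9e31aa9 can be expressed as a small shell check. This is only a sketch of the logic, assuming the flag is chosen from the value of GO111MODULE; `mod_vendor_flag` is a hypothetical helper, and the real fix lives in the Makefile.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether -mod=vendor may be passed to go.

# mod_vendor_flag VALUE: print "-mod=vendor" unless VALUE is "off",
# since the flag is only valid when modules are in use.
mod_vendor_flag() {
    if [ "$1" != "off" ]; then
        printf '%s\n' '-mod=vendor'
    fi
}

flag_default=$(mod_vendor_flag "")
flag_off=$(mod_vendor_flag "off")
echo "default: go build $flag_default -o runc ."
echo "off:     go build $flag_off -o runc ."
```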
Kir Kolyshkin 19ba7688cb Makefile: test, localtest: no need to invoke make
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 13:04:32 -07:00
Kir Kolyshkin fc54f6d7db Makefile: rm $(SOURCES), mark targets as PHONY
Since go has its own way to track dependencies and rebuild if needed,
and it is efficient enough, let's drop using SOURCES variable, mark
all targets as PHONY and let golang do its job.

The primary motivation for this was concern about using find on every
make invocation to build the list of all sources.

Some unscientific performance analysis:

Before:
> $ time make
> make: 'runc' is up to date.
>
> real	0m0.202s
> user	0m0.177s
> sys	0m0.031s

After:
> $ time make
> go build -mod=vendor -buildmode=pie  -tags "seccomp selinux apparmor" -ldflags "-X main.gitCommit="5a8210a58bd0f07cc987e6201b4174e5b93fa115" -X main.version=1.0.0-rc10+dev " -o runc .
>
> real	0m0.149s
> user	0m0.315s
> sys	0m0.106s

So, it is slightly faster using the wall clock, uses more CPU, but
we can be sure the binary is always up to date.

This also fixes the Makefile to mark all targets as PHONY. The list
was generated by `grep -E '^[a-z-]+:' Makefile | sed 's/:.*//'`.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 12:53:44 -07:00
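To see what the `grep | sed` pipeline quoted in commit fc54f6d7db extracts, it can be run on a small made-up Makefile. The sample content is an assumption for illustration and is not the runc Makefile.

```shell
#!/bin/sh
# Run the pipeline from the commit message on a made-up Makefile.
sample=$(mktemp)
cat > "$sample" <<'EOF'
GO := go

runc:
	$(GO) build -o runc .

install-man: man
	install -d $(MANDIR)

.PHONY: runc install-man
EOF
# Keep lines that start with a lowercase target name followed by a
# colon, then cut everything from the colon onwards.
targets=$(grep -E '^[a-z-]+:' "$sample" | sed 's/:.*//')
rm -f "$sample"
echo "$targets"
```

Variable assignments, recipe lines, and `.PHONY` itself do not match the pattern, so only the target names `runc` and `install-man` remain.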
Kir Kolyshkin b7dadf0f7b Makefile: rm $(allpackages)
This was added by commit 993cbf9db, but for some time now (since go 1.13
for sure, possibly earlier) it has no longer been needed, as all the tools
correctly skip the vendor subdir.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-27 12:24:39 -07:00
Akihiro Suda 60c647e3b8 fs2: fix cgroup.subtree_control EPERM on rootless + add CI
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-27 13:30:15 +09:00
Mrunal Patel 53fb4a5e2d
Merge pull request #2342 from kolyshkin/vagrant-rm-ct
travis: run vagrant tests on the host
2020-04-26 21:13:53 -07:00
Kir Kolyshkin b19f9cecfe
Merge pull request #2343 from lifubang/updateSystemdScope
fix data inconsistency when using runc update in systemd driven cgroup
2020-04-24 23:34:19 -07:00
Akihiro Suda 0fd8d468ea
Merge pull request #2318 from lifubang/linuxResources
cgroupv2: use default allowed devices when linux resources is null
2020-04-25 09:00:23 +09:00
Akihiro Suda baa200264b
Merge pull request #2327 from kolyshkin/cpt-err
checkpoint: don't print error if --pre-dump is set
2020-04-25 08:56:09 +09:00
Kir Kolyshkin 084144a64a travis: run vagrant tests on the host
Since we already have to build everything and run integration tests
on the Vagrant Fedora 31 host (in order to test how runc talks to
systemd), let's do the same for unit tests (otherwise we build
everything twice).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-24 11:25:51 -07:00
Mrunal Patel 634e51b52c
Merge pull request #2335 from kolyshkin/cgroupv2-cpt
Fix cgroupv2 checkpoint/restore
2020-04-24 08:47:36 -07:00
lifubang 10ba72a61f add integration test for runc update with systemd
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-04-24 16:58:29 +08:00
Akihiro Suda 49ca1fd074
Merge pull request #2347 from kolyshkin/v2-allow-all-devs
cgroupv2: allow to set EnableAllDevices=true
2020-04-24 16:09:40 +09:00
Akihiro Suda 78ff2797b5
Merge pull request #2334 from kolyshkin/makefile
Makefile nits
2020-04-24 11:25:30 +09:00
Mrunal Patel c420a3ec7f
Merge pull request #2324 from kolyshkin/criu-freezer
libcontainer: fix Checkpoint wrt cgroupv2
2020-04-23 19:24:38 -07:00
Akihiro Suda 5b4bff966f
Merge pull request #2336 from kolyshkin/bats-core-2
Dockerfile: use bats-core
2020-04-24 11:24:26 +09:00
Kir Kolyshkin 440244268b
Merge pull request #2330 from KentaTada/use-linuxnamespace-const
libcontainer: use consts of Namespace from runtime-spec
2020-04-23 18:58:29 -07:00
Kir Kolyshkin fbeed52283 Makefile: add -mod=vendor
Since we carry vendor/ subdir, let's actually use it. Should speed up CI
a bit, possibly also making it a tad more stable.

This is actually implemented in go 1.14 already (i.e. it turns on -mod=vendor
automatically if it sees vendor/ dir), but we still use go 1.13.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-23 18:00:25 -07:00
Kir Kolyshkin 1fe709a0bf Makefile: use $(FOO) not ${FOO}
The first style seems to be prevalent.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-23 18:00:25 -07:00
Kir Kolyshkin d09a6ea95e Makefile: split long lines
It's hard to read otherwise (at least for me).

While at it, replace ${FOO} with $(FOO) -- both are
identical, but the second style looks to be used more.

No functional change.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-23 18:00:25 -07:00
Kir Kolyshkin 64ec355716 Makefile: abstract go build flags
There are way too many arguments to go build, and they are repeatedly
used across the makefile. Separate them out to GO_BUILD and
GO_BUILD_STATIC variables.

While at it, let's be consistent about the style and use $(FOO) everywhere
(there is no difference from ${FOO}).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-23 18:00:25 -07:00
Kir Kolyshkin 55d5c99ca7 libct/mountToRootfs: rm useless code
To make a bind mount read-only, it needs to be remounted. This is what
the code removed does, but it is not needed here.

We have to deal with three cases here:

1. cgroup v2 unified mode. In this case the mount is a real mount with
   fstype=cgroup2, and there is no need to have a bind mount on top,
   as we pass readonly flag to the mount as is.

2. cgroup v1 + cgroupns (enableCgroupns == true). In this case the
   "mount" is in fact a set of real mounts with fstype=cgroup, and
   they are all performed in mountCgroupV1, with readonly flag
   added if needed.

3. cgroup v1 as is (enableCgroupns == false). In this case
   mountCgroupV1() calls mountToRootfs() again with an argument
   from the list obtained from getCgroupMounts(), i.e. a bind
   mount with the same flags as the original mount has (plus
   unix.MS_BIND | unix.MS_REC), and mountToRootfs() does remounting
   (under the case "bind":).

So, the code which this patch is removing is not needed -- it
essentially does nothing in case 3 above (since the bind mount
is already remounted readonly), and in cases 1 and 2 it
creates an unneeded extra bind mount on top of a real one (or set of
real ones).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-23 16:49:12 -07:00
Kir Kolyshkin 20959b1666 libcontainer/integration/checkpoint_test: simplify
Since commit 9280e3566d it is no longer needed to have a `cgroup2`
mount.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-23 15:22:32 -07:00
lifubang 1d4ccc8e0c fix data inconsistent when runc update in systemd driven cgroup v1
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-04-23 19:32:57 +08:00
lifubang 7682a2b2a5 fix data inconsistent when runc update in systemd driven cgroup v2
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-04-23 19:32:07 +08:00
Aleksa Sarai dbe44cbb10
merge branch 'pr-2348'
Kenta Tada (1):
  libcontainer: use x/sys/unix instead of the hardcoded value

LGTMs: @AkihiroSuda @cyphar
Closes #2348
2020-04-23 18:10:47 +10:00
Aleksa Sarai fb99bbc7cd
merge branch 'pr-2326'
Akihiro Suda (1):
  MAINTAINERS: add Kir Kolyshkin

LGTMs: @AkihiroSuda @hqhq @mrunalp @cyphar @crosbymichael
Closes #2326
2020-04-23 18:09:12 +10:00
Kenta Tada 4474795388 libcontainer: use x/sys/unix instead of the hardcoded value
PR_SET_CHILD_SUBREAPER is defined in x/sys/unix.

Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2020-04-23 10:49:51 +09:00
Kir Kolyshkin d4bc7c10ec Dockerfile: use bats-core
The bats testing framework we use for integration tests has not been
maintained since 2015 and was superseded by bats-core [1]. What is more, we
were using an unreleased version and relying on some features of it
(unfortunately I don't remember now exactly what those features are).

As Debian still packages a very old version of bats from the old repo,
let's use a recent bats-core from the new, supposedly better maintained,
GitHub repo.

[1] https://github.com/sstephenson/bats/pull/269

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 15:34:44 -07:00
Kir Kolyshkin 32d52a0fab tests/checkpoint: enable for Fedora 31 / cgroup v2
With the fix in the previous commit and criu patched with support for
cgroupv2, these tests should now pass.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:40:28 -07:00
Kir Kolyshkin 9280e3566d checkpoint/restore: fix cgroupv2 handling
In case of cgroupv2 unified hierarchy, the /sys/fs/cgroup mount
is the real mount with fstype of cgroup2 (rather than a set of
external bind mounts like for cgroupv1).

So, we should not add it to the list of "external bind mounts"
on both checkpoint and restore.

Without this fix, checkpoint integration tests fail on cgroup v2.

Also, the same is true for cgroup v1 + cgroupns.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:26:43 -07:00
Kir Kolyshkin 00a2844ab4 tests/checkpoint: add simple c/r test for cgroupns
Same test as the first one, just with cgroupns enabled.

Since in case of cgroupv2 `runc spec` enables cgroupns,
this case was already tested by the first checkpoint test,
so skip it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:06:24 -07:00
Kir Kolyshkin 75a92ea615 cgroupv2: allow to set EnableAllDevices=true
In this case we just do not install any eBPF rules
checking the devices.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-22 11:05:36 -07:00
Akihiro Suda cdce577dcf
Merge pull request #2332 from kolyshkin/cgroupv2-cr
Fix/improve checkpoint integration tests
2020-04-23 00:47:18 +09:00
Kir Kolyshkin d5e68ceb0c tests/checkpoint.bats: fix test hang/failure
Commit a9e15e7e0 adds a check that stdin/out/err pipes
are restored correctly. Commit ec260653b7 copy/pastes
the same code to one more test.

Problem is (as pointed out in commit 5369f9ade3) these tests
sometimes hang. I have also seen them fail.

Apparently, the code used to create pipes and open them to fds
is racy:

```shell
cat $fifo | cat $fifo &
pid=$!
exec 50</proc/$pid/fd/0
exec 51>/proc/$pid/fd/0
```

Since `cat | cat` is spawned asynchronously, by the time exec is used,
the second cat process (i.e. $pid) is already fork'ed but it might
not be exec'ed yet. As a result, we get this (`ls -l /proc/self/fd`):

```
lr-x------. 1 root root 64 Apr 20 02:39 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:39 51 -> /dev/pts/1
```

or, in some cases:
```
lr-x------. 1 root root 64 Apr 20 02:45 50 -> /dev/pts/1
l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```

instead of expected set of pipes:

```
lr-x------. 1 root root 64 Apr 20 02:45 50 -> 'pipe:[215791]'
l-wx------. 1 root root 64 Apr 20 02:45 51 -> 'pipe:[215791]'
```

One possible workaround is to add `sleep 0.1` or so after cat|cat,
but it is outright ugly (besides, we already have one sleep in
the test code).

The solution is to not use any external processes to create pipes.
I admit this still does not look very comprehensible, but at least it
is easier than before, and it works.

While at it, remove code duplication, moving the setup and check
code into a pair of functions.

Finally, since the tests are working now, remove the skip.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-21 02:16:23 -07:00
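The race described in commit d5e68ceb0c comes from inspecting a background process's file descriptors before that process has finished starting. A minimal sketch of a race-free setup follows, in which the current shell opens both ends of a pipe itself, synchronously. This is not the code from the commit: it still calls `mkfifo`, it assumes that opening a FIFO read-write does not block (true on Linux, unspecified by POSIX), and the fd numbers 7 to 9 are arbitrary.

```shell
#!/bin/sh
# Hypothetical sketch: open both ends of a FIFO in the current shell,
# so there is no background process whose startup must be awaited.
tmpdir=$(mktemp -d)
fifo="$tmpdir/fifo"
mkfifo "$fifo"

# Open read-write first so that neither of the next two opens blocks.
exec 9<>"$fifo"
exec 8<"$fifo"     # read end
exec 7>"$fifo"     # write end
exec 9>&-          # the helper descriptor is no longer needed
rm -rf "$tmpdir"   # the open descriptors keep the pipe alive

# Round trip: what is written to fd 7 can be read back from fd 8.
echo "ping" >&7
read -r reply <&8
echo "$reply"

exec 7>&- 8<&-
```

Every `exec` here completes before the next line runs, so by the time the descriptors are used they already refer to the pipe rather than to an inherited terminal.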