Commit Graph

4051 Commits

Author SHA1 Message Date
Kir Kolyshkin 4c5c3fb960 Support for setting systemd properties via annotations
In case systemd is used to set cgroups for the container,
it creates a scope unit dedicated to it (usually named
`runc-$ID.scope`).

This patch adds an ability to set arbitrary systemd properties
for the systemd unit via runtime spec annotations.

Initially this was developed as an ability to specify the
`TimeoutStopUSec` property, but later generalized to work with
arbitrary ones.

Example usage: add the following to runtime spec (config.json):

```
	"annotations": {
		"org.systemd.property.TimeoutStopUSec": "uint64 123456789",
		"org.systemd.property.CollectMode":"'inactive-or-failed'"
	},
```

and start the container (e.g. `runc --systemd-cgroup run $ID`).

The above will set the following systemd parameters:
* `TimeoutStopSec` to 2 minutes and 3 seconds,
* `CollectMode` to "inactive-or-failed".

The values are in the gvariant format (see [1]). To figure out
which type systemd expects for a particular parameter, see
systemd sources.

In particular, parameters with `USec` suffix require an `uint64`
typed argument, while gvariant assumes int32 for a numeric values,
therefore the explicit type is required.

NOTE that systemd receives the time-typed parameters as *USec
but shows them (in `systemctl show`) as *Sec. For example,
the stop timeout should be set as `TimeoutStopUSec` but
is shown as `TimeoutStopSec`.

[1] https://developer.gnome.org/glib/stable/gvariant-text.html

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-02-17 16:07:19 -08:00
Mrunal Patel 81ef5024f8
Merge pull request #2213 from Zyqsempai/2166-convert-cpu-weight-poperly
Added conversion for cpu.weight v2
2020-02-17 07:49:39 -08:00
Boris Popovschi 7c439cc6f6 Added conversion for cpu.weight v2
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-02-12 11:32:34 +02:00
Andrei Vagin 269ea385a4 restore: fix a race condition in process.Wait()
Adrian reported that the checkpoint test stated failing:
=== RUN   TestCheckpoint
--- FAIL: TestCheckpoint (0.38s)
    checkpoint_test.go:297: Did not restore the pipe correctly:

The problem here is when we start exec.Cmd, we don't call its wait
method. This means that we don't wait cmd.goroutines ans so we don't
know when all data will be read from process pipes.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-10 10:21:08 -08:00
wanghuaiqing f27c4e15f6 Fix the value corresponding to rlimitmap [key]
These values depend on the specific arch

Signed-off-by: wanghuaiqing <wanghuaiqing@loongson.cn>
2020-02-07 13:02:14 +08:00
Aleksa Sarai dc7d0bfa0f
travis: update configuration
Update the set of Go versions (and use 1.x to always test the latest
release), as well as making the cgroupv2 tests allowable failures (the
vagrant setup seems to break pretty often, causing flaky failures).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-02-05 13:41:28 +11:00
Boris Popovschi 3b992087b8 Fix skip message for cgroupv2
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-02-03 14:27:12 +02:00
Aleksa Sarai e6555cc01a
merge branch 'pr-2184'
Kenta Tada (1):
  README.md: modify the explanation of make flags

LGTMs: @hqhq @cyphar
Closes #2184
2020-02-03 22:41:07 +11:00
Kenta Tada e03859022a README.md: modify the explanation of make flags
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2020-02-03 15:03:26 +09:00
Aleksa Sarai ff107ee0c1
merge branch 'pr-2190'
Amye Scavarda Perrin (2):
  Update README.md
  Adding .pdf of audit

LGTMs: @caniszczyk @cyphar
Closes #2190
2020-01-31 11:17:42 +11:00
Amye Scavarda Perrin 7d23d1e172
Update README.md
Signed-off-by: Amye Scavarda Perrin <amye@linuxfoundation.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2020-01-31 10:59:57 +11:00
Amye Scavarda Perrin 0061cad878
Adding .pdf of audit
Signed-off-by: Amye Scavarda Perrin <amye@linuxfoundation.org>
2020-01-31 10:59:43 +11:00
Mrunal Patel 2b5730a5a6
Merge pull request #2221 from inductor/feature/fix_path_security
Fix path for security report line
2020-01-27 14:40:21 -08:00
Mrunal Patel e4c4935a78
Merge pull request #2217 from cyphar/release-rc10
VERSION: release 1.0.0~rc10
2020-01-27 14:39:52 -08:00
Kohei Ota ed4a3e9bc6 Apply review
Signed-off-by: Kohei Ota <kela@inductor.me>
2020-01-26 23:03:13 +09:00
Kohei Ota c8ba985325 Fix path for security report line
Signed-off-by: Kohei Ota <kela@inductor.me>
2020-01-26 16:13:05 +09:00
Aleksa Sarai e4de2b2555
VERSION: back to development
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-01-23 03:19:29 +11:00
Aleksa Sarai dc9208a330
VERSION: update to 1.0.0~rc10
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-01-23 03:19:15 +11:00
Mrunal Patel 2fc03cc11c
Merge pull request #2207 from cyphar/fix-double-volume-attack
rootfs: do not permit /proc mounts to non-directories
2020-01-22 08:06:10 -08:00
Aleksa Sarai 3291d66b98
rootfs: do not permit /proc mounts to non-directories
mount(2) will blindly follow symlinks, which is a problem because it
allows a malicious container to trick runc into mounting /proc to an
entirely different location (and thus within the attacker's control for
a rename-exchange attack).

This is just a hotfix (to "stop the bleeding"), and the more complete
fix would be finish libpathrs and port runc to it (to avoid these types
of attacks entirely, and defend against a variety of other /proc-related
attacks). It can be bypased by someone having "/" be a volume controlled
by another container.

Fixes: CVE-2019-19921
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-01-17 14:00:30 +11:00
Aleksa Sarai f6fb7a0338
merge branch 'pr-2133'
Julia Nedialkova (1):
  Handle ENODEV when accessing the freezer.state file

LGTMs: @crosbymichael @cyphar
Closes #2133
2020-01-17 02:07:19 +11:00
Boris Popovschi 5b96f314ba Exchanged deprecated systemd resources with the appropriate for cgroupv2
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-01-15 18:09:33 +02:00
Boris Popovschi cf9b7c33e1 Fix MAJ:MIN io.stat parsing order
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-01-15 14:39:14 +02:00
Qiang Huang 709377ca55
Merge pull request #2198 from AkihiroSuda/criu-master
temporarily disable CRIU tests
2020-01-14 18:57:19 +08:00
Akihiro Suda 55f8c254be temporarily disable CRIU tests
Ubuntu kernel is temporarily broken: https://github.com/opencontainers/runc/pull/2198#issuecomment-571124087

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-14 11:18:44 +09:00
Akihiro Suda 5c20ea1472 fix merging #2177 and #2169
A new method was added to the cgroup interface when #2177 was merged.

After #2177 got merged, #2169 was merged without rebase (sorry!) and compilation was failing:

  libcontainer/cgroups/fs2/fs2.go:208:22: container.Cgroup undefined (type *configs.Config has no field or method Cgroup)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-14 11:13:25 +09:00
Mrunal Patel 5cc0deaf7a
Merge pull request #2169 from AkihiroSuda/split-fs
cgroup2: split fs2 from fs
2020-01-13 16:23:27 -08:00
Michael Crosby 2b52db7527
Merge pull request #2177 from devimc/topic/libcontainer/kata-containers
libcontainer: export and add new methods to allow cgroups manipulation
2020-01-02 11:47:12 -05:00
Michael Crosby a88592a634
Merge pull request #2185 from liggitt/exec-race
Fix race checking for process exit and waiting for exec fifo
2019-12-26 10:41:07 -05:00
Jordan Liggitt 8541d9cf3d Fix race checking for process exit and waiting for exec fifo
Signed-off-by: Jordan Liggitt <liggitt@google.com>
2019-12-18 18:48:18 +00:00
Jordan Liggitt 52951a7c19 Fix race in tty integration test with slow startup
Signed-off-by: Jordan Liggitt <liggitt@google.com>
2019-12-18 16:54:54 +00:00
Julio Montes 8ddd892072 libcontainer: add method to get cgroup config from cgroup Manager
`configs.Cgroup` contains the configuration used to create cgroups. This
configuration must be saved to disk, since it's required to restore the
cgroup manager that was used to create the cgroups.
Add method to get cgroup configuration from cgroup Manager to allow API users
save it to disk and restore a cgroup manager later.

fixes #2176

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-12-17 22:46:03 +00:00
Julio Montes cd7c59d042 libcontainer: export createCgroupConfig
A `config.Cgroups` object is required to manipulate cgroups v1 and v2 using
libcontainer.
Export `createCgroupConfig` to allow API users to create `config.Cgroups`
objects using directly libcontainer API.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-12-17 22:46:03 +00:00
Aleksa Sarai 7496a96825
merge branch 'pr-2086'
* Kurnia D Win (1):
  fix permission denied

LGTMs: @crosbymichael @cyphar
Closes #2086
2019-12-17 20:49:52 +11:00
Aleksa Sarai 201b063745
merge branch 'pr-2141'
Radostin Stoyanov (1):
  criu: Ensure other users cannot read c/r files

LGTMs: @crosbymichael @cyphar
Closes #2141
2019-12-07 09:32:58 +11:00
Michael Crosby e1b5af0652
Merge pull request #2161 from AkihiroSuda/makefile-overrride-docker
Makefile: allow overriding `docker` command
2019-12-06 10:42:24 -05:00
Akihiro Suda ec49f98d72 fs2: support legacy device spec (to pass CI)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-12-06 15:53:07 +09:00
Akihiro Suda 88e8350de2 cgroup2: split fs2 from fs
split fs2 package from fs, as mixing up fs and fs2 is very likely to result in
unmaintainable code.

Inspired by containerd/cgroups#109

Fix #2157

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-12-06 15:42:10 +09:00
Aleksa Sarai 5e63695384
merge branch 'pr-2174'
Sascha Grunert (1):
  Expose network interfaces via runc events

LGTMs: @cyphar @mrunalp
Closes #2174
2019-12-06 13:07:44 +11:00
Michael Crosby 8bb10af481
Merge pull request #2165 from AkihiroSuda/travis-f31
.travis.yml: add Fedora 31 vagrant box (for cgroup2)
2019-12-05 16:26:51 -05:00
Sascha Grunert 41a20b5852
Expose network interfaces via runc events
The libcontainer network statistics are unreachable without manually
creating a libcontainer instance. To retrieve them via the CLI interface
of runc, we now expose them as well.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-12-05 13:20:51 +01:00
Akihiro Suda 48b055c40a Makefile: allow overriding `docker` command
e.g. `make CONTAINER_ENGINE="sudo podman" unittest` (for ease of cgroup2 testing)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-12-03 23:59:14 +09:00
Aleksa Sarai c35c2c9cec
merge branch 'pr-2172'
Sascha Grunert (1):
  Make event types public

LGTMs: @crosbymichael @cyphar
Closes #2172
2019-12-03 02:10:37 +11:00
Sascha Grunert 42690e6853
Make event types public
The event types are now part of a dedicated public `types` package
within runc to be able to unmarshal the output `runc events` directly.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-11-26 14:47:31 +01:00
Qiang Huang 2186cfa3cd
Merge pull request #2168 from AkihiroSuda/ebpf-fix-rlimit
cgroup2: ebpf: increase RLIM_MEMLOCK to avoid BPF_PROG_LOAD error
2019-11-16 11:33:40 +08:00
Akihiro Suda faf1e44ea9 cgroup2: ebpf: increase RLIM_MEMLOCK to avoid BPF_PROG_LOAD error
Fix #2167

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-11-07 15:43:27 +09:00
Mrunal Patel 46def4cc4c
Merge pull request #2154 from jpeach/2008-remove-static-build-tag
Remove the static_build build tag.
2019-11-04 17:10:59 -08:00
Michael Crosby b133feaeeb
Merge pull request #2145 from AkihiroSuda/ebpf
cgroup2: port over eBPF device controller from crun
2019-10-31 13:10:55 -04:00
Akihiro Suda ccd4436fc4 .travis.yml: add Fedora 31 vagrant box (for cgroup2)
As the baby step, only unit tests are executed.

Failing tests are currently skipped and will be fixed in follow-up PRs.

Fix #2124

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-10-31 16:53:01 +09:00
Akihiro Suda faf673ee45 cgroup2: port over eBPF device controller from crun
The implementation is based on https://github.com/containers/crun/blob/0.10.2/src/libcrun/ebpf.c

Although ebpf.c is originally licensed under LGPL-3.0-or-later, the author
Giuseppe Scrivano agreed to relicense the file in Apache License 2.0:
https://github.com/opencontainers/runc/issues/2144#issuecomment-543116397

See libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go for tested configurations.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-10-31 14:01:46 +09:00