Commit Graph

1065 Commits

Author SHA1 Message Date
Ed King 5c0af14bf8 Return from goroutine when it should terminate
Signed-off-by: Craig Furman <cfurman@pivotal.io>
2018-01-23 10:46:31 +00:00
Will Martin 8d3e6c9826 Avoid race when opening exec fifo
When starting a container with `runc start` or `runc run`, the stub
process (runc[2:INIT]) opens a fifo for writing. Its parent runc process
will open the same fifo for reading. In this way, they synchronize.

If the stub process exits at the wrong time, the parent runc process
will block forever.

This can happen when racing 2 runc operations against each other: `runc
run/start`, and `runc delete`. It could also happen for other reasons,
e.g. the kernel's OOM killer may select the stub process.

This commit resolves this race by racing the opening of the exec fifo
from the runc parent process against the stub process exiting. If the
stub process exits before we open the fifo, we return an error.

Another solution is to wait on the stub process. However, it seems it
would require more refactoring to avoid calling wait multiple times on
the same process, which is an error.

Signed-off-by: Craig Furman <cfurman@pivotal.io>
2018-01-22 17:03:02 +00:00
Antonio Murdaca cd1e7abee2
libcontainer: expose annotations in hooks
Annotations weren't passed to hooks. This patch fixes that by passing
annotations to stdin for hooks.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2018-01-11 16:54:01 +01:00
vikaschoudhary16 d5b4a3eddb Fix race against systemd
- T0: runc triggers a systemd unit creation asynchronously from [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L298)
- T1: runc then moves ahead and starts creating cgroup paths(.scope directories), [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L348). Kernel creates .scope directory and cgroup.procs file(along with other default files) in the directory automatically, in an atomic manner.
- T3: systemd execution thread which was invoked at time `T0`, is still in the process of unit creation. systemd also trying to create cgroup paths and deletes the `.scope` directory which is created at time `T1` by runc from [here](https://github.com/systemd/systemd/blob/v219/src/shared/cgroup-util.c#L1630) in the code

Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
2018-01-08 09:37:26 -05:00
Mrunal Patel e6516b3d5d
Merge pull request #1678 from sboeuf/sboeuf/subreaper
libcontainer: Do not wait for signalled processes if subreaper is set
2017-12-15 08:47:07 -08:00
Michael Crosby 7f24b40cc5
Merge pull request #1675 from tklauser/apparmor-no-cgo
RFC: libcontainer: remove dependency on libapparmor
2017-12-15 11:23:35 -05:00
Tobias Klauser db093f621f libcontainer: remove dependency on libapparmor
libapparmor is integrated in libcontainer using cgo but is only used to
call a single function: aa_change_onexec. It turns out this function is
simple enough (writing a string to a file in /proc/<n>/attr/...) to be
re-implemented locally in libcontainer in plain Go.

This allows to drop the dependency on libapparmor and the corresponding
cgo integration.

Fixes #1674

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-12-15 09:59:58 +01:00
Sebastien Boeuf bb912eb00c libcontainer: Do not wait for signalled processes if subreaper is set
When a subreaper is enabled, it might expect to reap a process and
retrieve its exit code. That's the reason why this patch is giving
the possibility to define the usage of a subreaper as a consumer of
libcontainer. Relying on this information, libcontainer will not
wait for signalled processes in case a subreaper has been set.

Fixes #1677

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2017-12-14 10:37:38 -08:00
Mrunal Patel c6e4a1ebeb
Merge pull request #1665 from Mashimiao/gidmapping-valid-fix
specconv: avoid skipping gidmappings applied when uidmappings is empty
2017-12-11 09:50:54 -08:00
Mrunal Patel b028413c35
Merge pull request #1655 from Mashimiao/add-propagation-more
support unbindable,runbindable for rootfs propagation
2017-12-11 09:21:41 -08:00
Allen Sun fec6b0fea5 Update criu_opts_linux.go
Signed-off-by: Allen Sun <shlallen1990@gmail.com>
2017-12-05 15:16:26 +08:00
Michael Crosby 91e9795013
Merge pull request #1654 from dqminh/only-linux
remove placeholder for non-linux platforms
2017-11-30 09:51:47 -05:00
Ma Shimiao 57edfbbaf2 specconv: avoid skipping gidmappings applied when uidmappings is empty
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-11-30 16:24:36 +08:00
Aleksa Sarai e8149af291
merge branch 'pr-1661'
Ensure container tests do not write on the host

LGTMs: @hqhq @cyphar
Closes #1661
2017-11-27 20:10:48 +11:00
Danail Branekov 0495fece57 Ensure container tests do not write on the host
TestGetContainerStateAfterUpdate creates its state.json file on the current
directory which turns out to be the host runc directory. Thus whenever
the test completes it leaves the state.json file behind thus
a) poluting the local git repository
b) changing the host file system violating the principle of doing
everything in an isolated container environment

This change would create a new temporary (in-container) directory and use it as
linuxContainer.root

Signed-off-by: Tom Godkin <tgodkin@pivotal.io>
2017-11-27 10:43:10 +02:00
Daniel Dao 8898b6b446 remove placeholder for non-linux platforms
runc currently only support Linux platform, and since we dont intend to expose
the support to other platform, removing all other platforms placeholder code.

`libcontainer/configs` still being used in
https://github.com/moby/moby/blob/master/daemon/daemon_windows.go so
keeping it for now.

After this, we probably should also rename files to drop linux suffices
if possible.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2017-11-24 18:14:51 +00:00
Daniel, Dao Quang Minh fb871d9cd0
Merge pull request #1664 from tklauser/drop-freebsd
libcontainer: drop FreeBSD support
2017-11-24 18:08:21 +00:00
Tobias Klauser 4d27f20db0 libcontainer: drop FreeBSD support
runc is not supported on FreeBSD, so remove all FreeBSD specific bits.

As suggested by @crosbymichael in #1653

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-11-24 14:51:05 +01:00
Danail Branekov 38d1e6ec27 Delete xattr related code
Selinux related code has been moved to the selinux package
(https://github.com/opencontainers/selinux) and therefore xattr related
code can be deleted from libcontainer

Signed-off-by: Danail Branekov <danailster@gmail.com>
2017-11-21 12:49:28 +02:00
Ma Shimiao 17db6560be support unbindable,runbindable for rootfs propagation
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-11-17 16:14:15 +08:00
Seth Jennings bca53e7b49 systemd: adjust CPUQuotaPerSecUSec to compensate for systemd internal handling
Signed-off-by: Seth Jennings <sjenning@redhat.com>
2017-11-15 20:20:06 -06:00
Vincent Demeester 3ca4c78b1a
Import docker/docker/pkg/mount into runc
This will help get rid of docker/docker dependency in runc 👼

Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2017-11-08 16:25:58 +01:00
Michael Crosby 2f010ecf19
Merge pull request #1622 from vdemeester/import-symlink-from-docker
Remove pkg/symlink from docker/docker and use cyphar/filepath-securejoin
2017-11-08 10:07:00 -05:00
Akihiro Suda 0aac2368e4 specconv.Example(): add /proc/scsi to masked paths
Port over https://github.com/moby/moby/pull/35399

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2017-11-04 17:38:14 +00:00
Michael Crosby 0232e38342
Merge pull request #1629 from masters-of-cats/busybox-inflation
Avoid disk usage explosion when copying busybox
2017-11-01 09:15:22 -04:00
Danail Branekov fdbb9e3e55 Avoid disk usage explosion when copying busybox
When running runc tests with temp directory with size 500M copying
busybox without preserving hardlinks causes the folder to inflate to
roughly 330M. Copying busybox twice in certain tests causes the /tmp
directory to overfill. Using `-a` preserves links which busybox uses to
implement its choice of binary to run.

Signed-off-by: Tom Godkin <tgodkin@pivotal.io>
2017-11-01 09:52:05 +00:00
Vincent Demeester 594501475e
Use cyphar/filepath-securejoin instead of docker pkg/symlink
runc shouldn't depend on docker and be more self-contained.
Removing github.com/pkg/symlink dep is the first step to not depend on docker anymore

Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2017-10-31 16:53:45 +01:00
Lorenzo Fontana 780f8ef567
Specconv: Test create command hooks and seccomp setup
Signed-off-by: Lorenzo Fontana <lo@linux.com>
2017-10-28 21:46:46 +02:00
Mrunal Patel 9a1186d128 Merge pull request #1619 from fntlnz/spec-linux-testing
WIP: Better testsuite for specconv
2017-10-25 15:23:19 -07:00
Lorenzo Fontana c0e6e12f9d
Test Cgroup creation and memory allocations
Signed-off-by: Lorenzo Fontana <lo@linux.com>
2017-10-25 01:58:10 +02:00
Aleksa Sarai ff5075c33f
init: correctly handle unmapped stdio with multiple mappings
Previously we would handle the "unmapped stdio" case by just doing a
simple check, however this didn't handle cases where the overflow_uid
was actually mapped in the user namespace. Instead of doing some
userspace checks, just try to do the fchown(2) and ignore EINVAL
(unmapped) or EPERM (lacking privilege over inode) errors.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-10-25 00:12:21 +11:00
Qiang Huang 74a1729647 Merge pull request #1607 from crosbymichael/term-err
libcontainer: handler errors from terminate
2017-10-20 15:15:38 +08:00
Qiang Huang e8b9b92f57 Merge pull request #1206 from YuPengZTE/devMD026
trailing punctuation in header
2017-10-20 14:47:09 +08:00
Mrunal Patel 80ee9e50b5 Merge pull request #1616 from mheon/seccomp_fix_breakage
Fix breaking change in Seccomp profile behavior
2017-10-19 14:15:04 -07:00
Aleksa Sarai c05f6368af
merge branch 'pr-1615'
libcontainer: intelrdt: fix a GetStats() issue

LGTMs: @crosbymichael @cyphar
Closes #1615
2017-10-19 03:41:16 +11:00
Matthew Heon e9193ba6e6 Fix breaking change in Seccomp profile behavior
Multiple conditions were previously allowed to be placed upon the
same syscall argument. Restore this behavior.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2017-10-18 11:53:56 -04:00
Qiang Huang 3409d5c555 Merge pull request #1606 from cyphar/rootfs-propagation-no-pivot
specconv: emit an error when using MS_PRIVATE with --no-pivot
2017-10-18 09:52:04 +08:00
Xiaochen Shen d89217515b libcontainer: intelrdt: fix a GetStats() issue
This fixes a GetStats() issue introduced in #1590:
If Intel RDT is enabled by hardware and kernel, but intelRdt is not
specified in original config, GetStats() will return error unexpectedly
because we haven't called Apply() to create intelrdt group or attach
tasks for this container. As a result, runc events command will have no
output.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2017-10-17 17:37:07 +08:00
Tobias Klauser 0eed453b21 libcontainer: use Major/Minor from x/sys/unix
The Major and Minor functions were added for Linux in golang/sys@85d1495
which is already vendored in. Use these functions instead of the local
re-implementation.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-10-17 09:06:42 +02:00
Aleksa Sarai 9b13f5cc7f
merge branch 'pr-1453'
propagate argv0 when re-execing from /proc/self/exe

LGTMs: @crosbymichael @cyphar
Closes #1453
2017-10-17 03:12:22 +11:00
Michael Crosby ff4481dbf6 Merge pull request #1540 from cloudfoundry-incubator/rootless-cgroups
Support cgroups with limits as rootless
2017-10-16 12:03:49 -04:00
Petros Angelatos 8098828680
propagate argv0 when re-execing from /proc/self/exe
This allows runc to be used as a target for docker's reexec module that
depends on a correct argv0 to select which process entrypoint to invoke.
Without this patch, when runc re-execs argv0 is set to "/proc/self/exe"
and the reexec module doesn't know what to do with it.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
2017-10-16 14:00:26 +02:00
Tobias Klauser d2bc081420 libcontainer: merge common syscall implementations
There are essentially two possible implementations for Setuid/Setgid on
Linux, either using SYS_SETUID32/SYS_SETGID32 or SYS_SETUID/SYS_SETGID,
depending on the architecture (see golang/go#1435 for why Setuid/Setgid
aren currently implemented for Linux neither in syscall nor in
golang.org/x/sys/unix).

Reduce duplication by merging the currently implemented variants and
adjusting the build tags accordingly.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-10-16 11:11:18 +02:00
Aleksa Sarai 6d30f7a01b
merge branch 'pr-1424'
Update Travis config to use trusty-backports libseccomp
  Add integration tests for multi-argument Seccomp filters
  Vendor updated libseccomp-golang for bugfix

LGTMs: @crosbymichael @cyphar
Closes #1424
2017-10-16 03:01:37 +11:00
Aleksa Sarai d2ac52fe52
merge branch 'pr-1475'
Add support for mips/mips64
  Put signalMap in a separate file, so it may be arch-specific

LGTMs: @crosbymichael @cyphar
Closes #1475
2017-10-16 02:59:34 +11:00
Aleksa Sarai 2430a98e64
merge branch 'pr-1500'
rootfs: switch ms_private remount of oldroot to ms_slave

LGTMs: @crosbymichael @hqhq
Closes opencontainers/runc#1500
2017-10-14 09:32:59 +11:00
Sebastien Boeuf acb93c9c62 libcontainer: cgroups: Write freezer state after every state check
This commit ensures we write the expected freezer cgroup state after
every state check, in case the state check does not give the expected
result. This can happen when a new task is created and prevents the
whole cgroup to be FROZEN, leaving the state into FREEZING instead.

This patch prevents the case of an infinite loop to happen.

Fixes https://github.com/opencontainers/runc/issues/1609

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2017-10-12 07:07:28 -07:00
Matthew Heon bbc847a457 Add integration tests for multi-argument Seccomp filters
Signed-off-by: Matthew Heon <mheon@redhat.com>
2017-10-10 15:49:08 -04:00
Michael Crosby bfe3058fc9 Make process check more forgiving
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-10-10 15:36:19 -04:00
Steven Hartland eb68b900bc Prevent invalid errors from terminate
Both Process.Kill() and Process.Wait() can return errors that don't impact the correct behaviour of terminate.

Instead of letting these get returned and logged, which causes confusion, silently ignore them.

Currently the test needs to be a string test as the errors are private to the runtime packages, so its our only option.

This can be seen if init fails during the setns.

Signed-off-by: Steven Hartland <steven.hartland@multiplay.co.uk>
2017-10-10 15:32:46 -04:00