Commit Graph

3483 Commits

Author SHA1 Message Date
Steven Hartland eb68b900bc Prevent invalid errors from terminate
Both Process.Kill() and Process.Wait() can return errors that don't impact the correct behaviour of terminate.

Instead of letting these get returned and logged, which causes confusion, silently ignore them.

Currently the test needs to be a string test as the errors are private to the runtime packages, so it's our only option.

This can be seen if init fails during the setns.

Signed-off-by: Steven Hartland <steven.hartland@multiplay.co.uk>
2017-10-10 15:32:46 -04:00
Michael Crosby 4693fae411 Merge pull request #1590 from xiaochenshen/rdt-cat-support-update-command
libcontainer: intelrdt: add update command support
2017-10-10 15:25:22 -04:00
Aleksa Sarai d4f0f9a52b
specconv: emit an error when using MS_PRIVATE with --no-pivot
Due to the semantics of chroot(2) when it comes to mount namespaces, it
is not generally safe to use MS_PRIVATE as a mount propagation when using
chroot(2). The reason for this is that this effectively results in a set
of mount references being held by the chroot'd namespace which the
namespace cannot free. pivot_root(2) does not have this issue because
the @old_root can be unmounted by the process.

Ultimately, --no-pivot is not really necessary anymore as a commonly
used option since f8e6b5af5e ("rootfs: make pivot_root not use a
temporary directory") resolved the read-only issue. But if someone
really needs to use it, MS_PRIVATE is never a good idea.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-10-08 17:50:55 +11:00
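As a rough illustration of the check described in the specconv entry above, the sketch below rejects a private root propagation when pivot_root is disabled. The function name `checkNoPivot`, its parameters, and the error text are hypothetical; only the flag values come from the Linux mount headers.

```go
package main

import (
	"errors"
	"fmt"
)

// Propagation flag values as defined by the Linux mount(2) headers.
const (
	msRec     = 0x4000
	msPrivate = 1 << 18
)

// checkNoPivot is a hypothetical validation step: it returns an error when
// pivot_root is disabled and the root propagation includes MS_PRIVATE.
func checkNoPivot(noPivotRoot bool, rootPropagation int) error {
	if noPivotRoot && rootPropagation&msPrivate != 0 {
		// Illustrative message, not the text used by the actual commit.
		return errors.New("private root propagation is not safe without pivot_root")
	}
	return nil
}

func main() {
	fmt.Println(checkNoPivot(true, msPrivate|msRec))
	fmt.Println(checkNoPivot(false, msPrivate|msRec))
}
```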
Michael Crosby f53ad9cec9 Merge pull request #1604 from AkihiroSuda/cwd
libcontainer: create Cwd when it does not exist
2017-10-05 11:15:10 -04:00
Will Martin ca4f427af1 Support cgroups with limits as rootless
Signed-off-by: Ed King <eking@pivotal.io>
Signed-off-by: Gabriel Rosenhouse <grosenhouse@pivotal.io>
Signed-off-by: Konstantinos Karampogias <konstantinos.karampogias@swisscom.com>
2017-10-05 11:22:54 +01:00
Akihiro Suda 2edd36fdff libcontainer: create Cwd when it does not exist
The benefit of doing this within runc is that it works well with
userns.
Actually, runc already does the same thing for mount points.

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2017-10-05 05:31:46 +00:00
Aleksa Sarai dc1552a6f3
merge branch 'pr-1275'
Set initial console size based on process spec

LGTMs: @crosbymichael @cyphar
Closes #1275
2017-10-05 02:33:30 +11:00
Konstantinos Karampogias 605dc5c811 Set initial console size based on process spec
Signed-off-by: Will Martin <wmartin@pivotal.io>
Signed-off-by: Petar Petrov <pppepito86@gmail.com>
Signed-off-by: Ed King <eking@pivotal.io>
Signed-off-by: Roberto Jimenez Sanchez <jszroberto@gmail.com>
Signed-off-by: Thomas Godkin <tgodkin@pivotal.io>
2017-10-04 12:32:16 +01:00
Daniel, Dao Quang Minh 0351df1c5a Merge pull request #1600 from crosbymichael/console
Bump console and sys deps
2017-09-26 10:15:10 +01:00
Michael Crosby f364c1a58c Set ClearONLCR in tests
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-25 13:35:22 -04:00
Michael Crosby 9ba16b6d5a Update console and golang/sys deps
This bumps the console and golang/sys deps for runc.

The major change is that the console package does not clear ONLCR within
the package and leaves it up to the client to handle this if they
please.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-25 13:23:50 -04:00
Daniel, Dao Quang Minh 2ae0fa7187 Merge pull request #1599 from tklauser/unconvert
libcontainer: remove unnecessary type conversions
2017-09-25 16:38:43 +01:00
Tobias Klauser d713652bda libcontainer: remove unnecessary type conversions
Generated using github.com/mdempsky/unconvert

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-09-25 10:41:57 +02:00
Qiang Huang 79ad714374 Merge pull request #1598 from euank/ragent
libcontainer: default mount propagation correctly
2017-09-25 11:55:29 +08:00
Euan Kemp 4301b440d6 libcontainer: default mount propagation correctly
The code in prepareRoot (e385f67a0e/libcontainer/rootfs_linux.go (L599-L605))
attempts to default the rootfs mount to `rslave`. However, since the spec
conversion has already defaulted it to `rprivate`, that code doesn't
actually ever do anything.

This changes the spec conversion code to accept "" and treat it as 0.

Implicitly, this makes rootfs propagation default to `rslave`, which is
a part of fixing the moby bug https://github.com/moby/moby/issues/34672

Alternate implementations include changing this defaulting to be
`rslave` and removing the defaulting code in prepareRoot, or skipping
the mapping entirely for "", but I think this change is the cleanest of
those options.

Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
2017-09-22 13:36:23 -07:00
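The interaction described in the "default mount propagation correctly" entry above can be sketched as two steps: a conversion table that maps "" to 0, and a later defaulting step that only applies when the value is still 0. The names `propagationFlags`, `convertPropagation`, and `effectiveRootPropagation` are hypothetical and do not reflect the actual function layout in libcontainer; the flag values are the Linux mount(2) constants.

```go
package main

import "fmt"

// Propagation flag values as defined by the Linux mount(2) headers.
const (
	msRec     = 0x4000
	msPrivate = 1 << 18
	msSlave   = 1 << 19
	msShared  = 1 << 20
)

// propagationFlags is a hypothetical conversion table. The empty string maps
// to 0, meaning "not set", instead of being defaulted to rprivate here.
var propagationFlags = map[string]int{
	"":         0,
	"private":  msPrivate,
	"rprivate": msPrivate | msRec,
	"slave":    msSlave,
	"rslave":   msSlave | msRec,
	"shared":   msShared,
	"rshared":  msShared | msRec,
}

// convertPropagation looks up the flag for a spec string.
func convertPropagation(s string) (int, error) {
	flag, ok := propagationFlags[s]
	if !ok {
		return 0, fmt.Errorf("unknown rootfs propagation %q", s)
	}
	return flag, nil
}

// effectiveRootPropagation mirrors the defaulting the commit message
// attributes to prepareRoot: fall back to rslave only when nothing was set.
func effectiveRootPropagation(configured int) int {
	if configured != 0 {
		return configured
	}
	return msSlave | msRec
}

func main() {
	for _, s := range []string{"", "rprivate"} {
		flag, _ := convertPropagation(s)
		fmt.Printf("%q -> %#x\n", s, effectiveRootPropagation(flag))
	}
}
```

If the conversion step returned rprivate for "", the fallback branch would never run, which is the dead-code situation the commit message describes.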
Michael Crosby e385f67a0e Merge pull request #1597 from s7v7nislands/unused_var
Delete unused variable
2017-09-22 09:53:11 -04:00
leitwolf7 e6e2439261 Merge branch 'master' into fix-integration
2017-09-21 22:25:58 +02:00
s7v7nislands 4155902a82 Delete unused variable
Signed-off-by: Xiaobing.Jiang <s7v7nislands@gmail.com>
2017-09-22 04:21:02 +08:00
Xiaochen Shen 65918b02a9 intelrdt: add update command support
Add runc update command support for Intel RDT/CAT.

for example:
runc update --l3-cache-schema "L3:0=f;1=f" <container-id>

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2017-09-20 01:59:06 +08:00
Xiaochen Shen 2549545df5 intelrdt: always init IntelRdtManager if Intel RDT is enabled
In the current implementation:
If Intel RDT is not enabled by hardware and kernel, or intelRdt is not
specified in the original config, we don't init IntelRdtManager in the
container to handle the intelrdt constraint. This is a tradeoff, because
Intel RDT hardware supports only a limited number of groups.

This patch makes a minor change to support update command:
Whether or not intelRdt is specified in config, we always init
IntelRdtManager in the container if Intel RDT is enabled. If intelRdt is
not specified in original config, we just don't Apply() to create
intelrdt group or attach tasks for this container.

In update command, we could re-enable through IntelRdtManager.Apply()
and then update intelrdt constraint.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2017-09-20 01:37:31 +08:00
Michael Crosby 593914b8bd Merge pull request #1593 from s7v7nislands/drop_go1.5
Drop support golang 1.5
2017-09-12 15:22:00 -04:00
s7v7nislands 00ad8e1e56 Drop support golang 1.5
Signed-off-by: Xiaobing Jiang <s7v7nislands@gmail.com>
2017-09-12 20:56:51 +08:00
Qiang Huang 68e00e906b Merge pull request #1586 from crosbymichael/set-cgroups
Apply cgroups earlier
2017-09-12 12:13:29 +08:00
Aleksa Sarai f1e19e9744
merge branch 'pr-1579'
Disable systemd in static build

LGTMs: @crosbymichael @cyphar
Closes #1579
2017-09-12 08:01:24 +10:00
Aleksa Sarai f756d904ce
merge branch 'pr-1577'
Add `-installsuffix netgo` in static build
  Use `netgo` for static build

LGTMs: @crosbymichael @cyphar
Closes #1577
2017-09-12 08:00:00 +10:00
Yong Tang e9944d0f4c Disable systemd in static build
This fix tries to address the warnings caused by static build
with go 1.9. As systemd needs dlopen/dlclose, the following warnings
will be generated for static build in go 1.9:
```
root@f4b077232050:/go/src/github.com/opencontainers/runc# make static
CGO_ENABLED=1 go build  -tags "seccomp cgo static_build" -ldflags "-w -extldflags -static -X main.gitCommit="1c81e2a794c6e26a4c650142ae8893c47f619764" -X main.version=1.0.0-rc4+dev " -o runc .
/tmp/go-link-113476657/000007.o: In function `_cgo_a5acef59ed3f_Cfunc_dlopen':
/tmp/go-build/github.com/opencontainers/runc/vendor/github.com/coreos/pkg/dlopen/_obj/cgo-gcc-prolog:76: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
```

This fix disables systemd when `static_build` flag is on (apply_nosystemd.go
is used instead).

This fix also fixes a small bug in `apply_nosystemd.go` for return value.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-09-11 18:38:22 +00:00
Mrunal Patel d5b43c3981 Merge pull request #1455 from dqminh/epoll-io
tty: move IO of master pty to be done with epoll
2017-09-11 11:32:42 -07:00
Yong Tang ec42eaa427 Add `-installsuffix netgo` in static build
This fix adds `-installsuffix netgo` in static build in combination
with `-tags netgo`. See the following for the reason:
https://github.com/golang/go/issues/9369#issuecomment-69864440

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-09-11 18:20:19 +00:00
Yong Tang 337c3fb88c Use `netgo` for static build
This fix adds `netgo` to tags for static build so that
the following warning could be addressed:
```
/tmp/go-link-355596637/000000.o: In function `_cgo_b0c710f30cfd_C2func_getaddrinfo':
/tmp/go-build/net/_obj/cgo-gcc-prolog:46: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
```

The above warning appears when building `make static` with
go 1.9.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-09-11 18:20:19 +00:00
Michael Crosby 8b47a242a9 Merge pull request #1529 from giuseppe/rootless-improvements
Support multiple users/groups mapped for the rootless case
2017-09-11 14:01:31 -04:00
Aleksa Sarai eb5bd4fa6a
tests: add tests for rootless multi-mapping configurations
Enable several previously disabled tests (for the idmap execution mode)
for rootless containers, in addition to making all tests use the
additional mappings. At the moment there's no strong need to add any
additional tests purely for rootless_idmap.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:33 +10:00
Aleksa Sarai d0aec23c7e
tests: generalise rootless runner
This is necessary in order to add proper opportunistic tests, and is a
placeholder until we add tests for new{uid,gid}map configurations.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:33 +10:00
Aleksa Sarai 1a5fdc1c5f
init: support setting -u with rootless containers
Now that rootless containers have support for multiple uid and gid
mappings, allow --user to work as expected. If the user is not mapped,
an error occurs (as usual).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:33 +10:00
Aleksa Sarai 969bb49cc3
nsenter: do not resolve path in nsexec context
With the addition of our new{uid,gid}map support, we used to call
execvp(3) from inside nsexec. This would mean that the path resolution
for the binaries would happen in nsexec. Move the resolution to the
initial setup code, and pass the absolute path to nsexec.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:33 +10:00
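The idea in the "do not resolve path in nsexec context" entry above, resolving a tool's path during initial setup and handing over an absolute path, might look like this in Go. The helper name `resolveMappingTool` is hypothetical, and how the result is passed to nsexec is not shown because the commit message does not describe it.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// resolveMappingTool is a hypothetical helper that searches PATH once, in the
// setup code, and returns an absolute path for later use.
func resolveMappingTool(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", err
	}
	return filepath.Abs(path)
}

func main() {
	path, err := resolveMappingTool("newuidmap")
	if err != nil {
		fmt.Println("not found:", err)
		return
	}
	fmt.Println("absolute path to hand over:", path)
}
```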
Aleksa Sarai 6097ce74d8
nsenter: correctly handle newgidmap path for rootless containers
After quite a bit of debugging, I found that previous versions of this
patchset did not include newgidmap in a rootless setting. Fix this by
passing it whenever group mappings are applied, and also providing some
better checking for try_mapping_tool. This commit also includes some
stylistic improvements.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:32 +10:00
Giuseppe Scrivano 3282f5a7c1
tests: fix for rootless multiple uids/gids
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2017-09-09 12:45:32 +10:00
Giuseppe Scrivano d8b669400a
rootless: allow multiple user/group mappings
Take advantage of the newuidmap/newgidmap tools to allow multiple
users/groups to be mapped into the new user namespace in the rootless
case.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
[ rebased to handle intelrdt changes. ]
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:32 +10:00
Giuseppe Scrivano fdf85e35b3
main: honor XDG_RUNTIME_DIR for rootless containers
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2017-09-09 12:44:34 +10:00
Mrunal Patel 13fa5d2953 Merge pull request #1588 from s7v7nislands/delete_unused
Delete unused function
2017-09-08 17:34:00 -07:00
Michael Crosby b82d07e816 Merge pull request #1587 from Mashimiao/fix-namespace-empty
Fixes #1585 config.Namespaces is empty when accessed
2017-09-08 10:50:16 -04:00
Michael Crosby 9755e0065f Merge pull request #1589 from xiaochenshen/rdt-cat-bug-fix
libcontainer: intelrdt: use init() to avoid race condition
2017-09-08 10:41:45 -04:00
Xiaochen Shen 88d22fde40 libcontainer: intelrdt: use init() to avoid race condition
This is the follow-up PR of #1279 to fix remaining issues:

Use init() to avoid race condition in IsIntelRdtEnabled().
Also rename some variables and functions.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2017-09-08 17:15:31 +08:00
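A minimal sketch of the init() pattern mentioned in the entry above: the flag is written once during package initialization, before any goroutine started by the program can read it, so later readers never race with the write. The names `isEnabled`, `detect`, and `IsEnabled`, and the probed path, are assumptions for illustration rather than the actual intelrdt code.

```go
package main

import (
	"fmt"
	"os"
)

// isEnabled is written only in init() and is read-only afterwards.
var isEnabled bool

// detect is a hypothetical probe. The resctrl mount point used here is an
// assumption; the real check may inspect other sources.
func detect() bool {
	_, err := os.Stat("/sys/fs/resctrl")
	return err == nil
}

// init runs once, before main, so no lazy first-call detection is needed.
func init() {
	isEnabled = detect()
}

// IsEnabled only reads the cached value, so concurrent callers do not race.
func IsEnabled() bool {
	return isEnabled
}

func main() {
	fmt.Println("enabled:", IsEnabled())
}
```

The alternative, detecting lazily on the first call and caching the result in an unguarded variable, would let two goroutines write the cache at the same time.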
s7v7nislands c795b8690b Delete unused function
Signed-off-by: Xiaobing Jiang <s7v7nislands@gmail.com>
2017-09-08 10:35:46 +08:00
Ma Shimiao c3d20e7817 Fixes #1585 config.Namespaces is empty when accessed
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-09-08 09:30:07 +08:00
Mrunal Patel deb9d7fd96 Merge pull request #1569 from cyphar/delay-seccomp
init: delay seccomp application as late as possible
2017-09-07 13:27:37 -07:00
Mrunal Patel 7e036aa0b0 Merge pull request #1541 from adrianreber/lazy
checkpoint: support lazy migration
2017-09-07 13:25:04 -07:00
Michael Crosby 7062c7556b Apply cgroups earlier
This applies cgroups earlier for container creation before the init
process starts running and forking off any additional processes.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-07 11:27:33 -04:00
Mrunal Patel 5274430fee Merge pull request #1279 from xiaochenshen/rdt-cat-resource-manager-v1
libcontainer: add support for Intel RDT/CAT in runc
2017-09-06 14:36:02 -07:00
Adrian Reber ec260653b7 lazy-migration: add test case
The lazy-pages test case is not as straightforward as the other test
cases. This is related to the fact that restoring requires a different
name if restored on the same host. During 'runc checkpoint' the
container is not destroyed before all memory pages have been transferred
to the destination and thus the same container name cannot be used.

As real world usage will rather migrate a container from one system to
another than lazy migrate a container on the same host this is only
problematic for this test case.

Another reason is that it requires starting 'runc checkpoint' and 'criu
lazy-pages' in the background as those processes need to be running to
start the final restore 'runc restore'.

CRIU upstream is currently discussing starting 'criu lazy-pages'
automatically, which would simplify the lazy-pages test case a bit.

The handling and checking of the background processes make the test case
not the most elegant as at one point a 'sleep 2' is required to make
sure that 'runc checkpoint' had time to do its thing before looking at
log files.

Before running the actual test, criu is called in feature checking mode
to make sure lazy migration is enabled in the criu used by the test
case. If not, the test is skipped.

Signed-off-by: Adrian Reber <areber@redhat.com>
2017-09-06 12:35:39 +00:00
Adrian Reber 60ae7091de checkpoint: support lazy migration
With the help of userfaultfd CRIU supports lazy migration. Lazy
migration means that memory pages are only transferred from the
migration source to the migration destination on page fault.

This makes it possible to reduce the downtime during process or
container migration to a minimum, as the memory does not need to be
transferred during migration.

Lazy migration currently depends on userfaultfd being available in the
running Linux kernel and on the CRIU version in use supporting lazy
migration. Both dependencies can be checked by querying CRIU via RPC if
the lazy migration feature is available. Using feature checking instead
of version comparison enables runC to use CRIU features from the
criu-dev branch. This way the user can decide if lazy migration should
be available by choosing the right kernel and CRIU branch.

To use lazy migration the CRIU process during dump needs to dump
everything besides the memory pages and then it opens a network port
waiting for remote page fault requests:

 # runc checkpoint httpd --lazy-pages --page-server 0.0.0.0:27 \
  --status-fd /tmp/postcopy-pipe

In this example CRIU will block once it has opened the network port
and wait for a network connection. As runC waits for CRIU to finish it
will also hang until the lazy migration has finished. To know when the
restore on the destination side can start the '--status-fd' parameter is
used:

 # runc checkpoint --help | grep status
  --status-fd value   criu writes \0 to this FD once lazy-pages is ready

The parameter '--status-fd' is directly from CRIU and this way the
process outside of runC which controls the migration knows exactly when
to transfer the checkpoint (without memory pages) to the destination and
that the restore can be started.

On the destination side it is necessary to start CRIU in 'lazy-pages'
mode like this:

 # criu lazy-pages --page-server --address 192.168.122.3 --port 27 \
  -D checkpoint

and tell runC to do a lazy restore:

 # runc restore -d --image-path checkpoint --work-path checkpoint \
  --lazy-pages httpd

If both processes on the restore side have the same working directory
'criu lazy-pages' creates a unix domain socket where it waits for
requests from the actual restore. runC starts CRIU restore in lazy
restore mode and tells 'criu lazy-pages' that it wants to restore
memory pages on demand. CRIU continues to restore the process and once
the process is running and accesses the first non-existing memory page
the 'criu lazy-pages' server will request the page from the source
system. Thus all pages from the source system will be transferred to the
destination system. Once all pages have been transferred runC on the
source system will end and the container will have finished migration.

This can also be combined with CRIU's pre-copy support. The combination
of pre-copy and post-copy (lazy migration) provides the possibility to
migrate containers with minimal downtimes.

Some additional background about post-copy migration can be found in
these articles:

 https://lisas.de/~adrian/?p=1253
 https://lisas.de/~adrian/?p=1183

Signed-off-by: Adrian Reber <areber@redhat.com>
2017-09-06 12:35:38 +00:00