Commit Graph

3167 Commits

Author SHA1 Message Date
Aleksa Sarai cbc4f9865a
libcontainer: rewrite cmsg to use sys/unix
The original implementation is in C, which increases cognitive load and
might cause us problems in the future. Since sys/unix is better
maintained than the syscall standard library, switching makes more sense.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-30 16:03:21 +11:00
Aleksa Sarai 85de7ec363
vendor: add golang.org/x/sys/unix@9a7256cb28ed514b4e1e5f68959914c4c28a92e0
It turns out that the standard "syscall" library is not recommended for
new programs. runC will need to eventually move to this, but for now
include it in vendor so we can use it for new features.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-29 22:39:38 +11:00
Mrunal Patel 653207bc29 Merge pull request #774 from cyphar/rootless-containers
Rootless Containers
2017-03-27 11:58:03 -07:00
Aleksa Sarai ba38383a39
tests: add rootless integration tests
This adds targets for rootless integration tests, as well as all of the
required setup in order to get the tests to run. This includes quite a
few changes, because of a lot of assumptions about things running as
root within the bats scripts (which is not true when setting up rootless
containers).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:22 +11:00
Aleksa Sarai 2ce33574d0
integration: added root requires
This is in preparation for allowing us to run the integration test suite
on rootless containers.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:21 +11:00
Aleksa Sarai d04cbc49d2
rootless: add autogenerated rootless config from `runc spec`
Since this is a runC-specific feature, this belongs here rather than in
opencontainers/ocitools (which is for generic OCI runtimes).

In addition, we don't create a new network namespace. This is because
currently if you want to set up a veth bridge you need CAP_NET_ADMIN in
both network namespaces' pinned user namespace to create the necessary
interfaces in each network namespace.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:21 +11:00
Aleksa Sarai 76aeaf8181
libcontainer: init: fix unmapped console fchown
If the stdio of the container is owned by a group which is not mapped in
the user namespace, attempting to fchown the file descriptor will result
in EINVAL. Counteract this by simply not doing an fchown if the group
owner of the file descriptor has no host mapping according to the
configured GIDMappings.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:21 +11:00
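
The check described in this commit can be sketched roughly as follows. This is an illustration only, not the libcontainer code: the `IDMap` type and the `hostIDMapped` helper are hypothetical names, and the sketch assumes each mapping covers the host range from HostID up to (but not including) HostID+Size.

```go
package main

import "fmt"

// IDMap is a hypothetical stand-in for one entry of the GID mappings.
type IDMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// hostIDMapped reports whether a host ID falls inside any mapped range.
func hostIDMapped(hostID int, maps []IDMap) bool {
	for _, m := range maps {
		if hostID >= m.HostID && hostID < m.HostID+m.Size {
			return true
		}
	}
	return false
}

func main() {
	gidMappings := []IDMap{{ContainerID: 0, HostID: 1000, Size: 1}}
	// Host GID 5 has no mapping here, so an fchown to it would be skipped.
	for _, gid := range []int{1000, 5} {
		if hostIDMapped(gid, gidMappings) {
			fmt.Printf("gid %d is mapped: fchown is safe\n", gid)
		} else {
			fmt.Printf("gid %d is unmapped: skip fchown\n", gid)
		}
	}
}
```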
Aleksa Sarai f0876b0427
libcontainer: configs: add proper HostUID and HostGID
Previously Host{U,G}ID only gave you the root mapping, which isn't very
useful if you are trying to do other things with the IDMaps.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00
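
For illustration, translating an arbitrary container ID (not just root) through a list of ID mappings might look like the sketch below. The `IDMap` type and `hostID` function are hypothetical names chosen for this example and are not taken from libcontainer.

```go
package main

import "fmt"

// IDMap is a hypothetical stand-in for one UID or GID mapping entry.
type IDMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// hostID translates a container ID to the host ID it maps to.
// The boolean is false when no mapping covers the container ID.
func hostID(containerID int, maps []IDMap) (int, bool) {
	for _, m := range maps {
		if containerID >= m.ContainerID && containerID < m.ContainerID+m.Size {
			return m.HostID + (containerID - m.ContainerID), true
		}
	}
	return -1, false
}

func main() {
	maps := []IDMap{
		{ContainerID: 0, HostID: 1000, Size: 1},
		{ContainerID: 1, HostID: 100000, Size: 65536},
	}
	for _, id := range []int{0, 33, 70000} {
		host, ok := hostID(id, maps)
		fmt.Println(id, host, ok)
	}
}
```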
Aleksa Sarai baeef29858
rootless: add rootless cgroup manager
The rootless cgroup manager acts as a noop for all set and apply
operations. It is just used for rootless setups. Currently this is far
too simple (we need to add opportunistic cgroup management), but is good
enough as a first-pass at a noop cgroup manager.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00
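
A no-op manager of the kind described here could be sketched as below. The `Manager` interface is a hypothetical, trimmed-down interface invented for this example; the real libcontainer interface has more methods and different signatures.

```go
package main

import "fmt"

// Manager is a hypothetical, simplified cgroup manager interface.
type Manager interface {
	Apply(pid int) error
	Set(limits map[string]uint64) error
	GetPaths() map[string]string
}

// rootlessManager accepts every request and does nothing with it.
type rootlessManager struct{}

func (rootlessManager) Apply(pid int) error                { return nil }
func (rootlessManager) Set(limits map[string]uint64) error { return nil }
func (rootlessManager) GetPaths() map[string]string        { return map[string]string{} }

func main() {
	var m Manager = rootlessManager{}
	if err := m.Apply(1234); err != nil {
		fmt.Println("apply failed:", err)
	}
	if err := m.Set(map[string]uint64{"memory.limit": 1 << 20}); err != nil {
		fmt.Println("set failed:", err)
	}
	fmt.Println("cgroup paths joined:", len(m.GetPaths()))
}
```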
Aleksa Sarai d2f49696b0
runc: add support for rootless containers
This enables the support for the rootless container mode. There are many
restrictions on what rootless containers can do, so many different runC
commands have been disabled:

* runc checkpoint
* runc events
* runc pause
* runc ps
* runc restore
* runc resume
* runc update

The following commands work:

* runc create
* runc delete
* runc exec
* runc kill
* runc list
* runc run
* runc spec
* runc state

In addition, any specification options that imply joining cgroups have
also been disabled. This is due to support for unprivileged subtree
management not being available from Linux upstream.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:45:24 +11:00
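
One way to picture the command restrictions listed in this commit is a simple lookup performed before a command runs. The sketch below is an assumption about structure, not how runc implements it; `rootlessDisabled`, `errRootless` and `checkCommand` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// rootlessDisabled lists the commands the commit message names as disabled.
var rootlessDisabled = map[string]bool{
	"checkpoint": true, "events": true, "pause": true, "ps": true,
	"restore": true, "resume": true, "update": true,
}

var errRootless = errors.New("command not supported for rootless containers")

// checkCommand rejects a disabled command only when running rootless.
func checkCommand(name string, rootless bool) error {
	if rootless && rootlessDisabled[name] {
		return fmt.Errorf("runc %s: %w", name, errRootless)
	}
	return nil
}

func main() {
	for _, cmd := range []string{"pause", "run"} {
		if err := checkCommand(cmd, true); err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println("runc", cmd, "is allowed")
	}
}
```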
Aleksa Sarai 6bd4bd9030
*: handle unprivileged operations and !dumpable
Effectively, !dumpable makes implementing rootless containers quite
hard, due to a bunch of different operations on /proc/self no longer
being possible without reordering everything.

!dumpable only really makes sense when you are switching between
different security contexts, which is only the case when we are joining
namespaces. Unfortunately this means that !dumpable will still have
issues in this instance, and it should only be necessary to set
!dumpable if we are not joining USER namespaces (new kernels have
protections that make !dumpable no longer necessary). But that's a topic
for another time.

This also includes code to unset and then re-set dumpable when doing the
USER namespace mappings. This should also be safe because in principle
processes in a container can't see us until after we fork into the PID
namespace (which happens after the user mapping).

In rootless containers, it is not possible to set a non-dumpable
process's /proc/self/oom_score_adj (it's owned by root and thus not
writeable). Thus, it needs to be set inside nsexec before we set
ourselves as non-dumpable.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:45:19 +11:00
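
The dumpable attribute discussed in this commit is controlled through prctl(2). The Linux-only sketch below shows the general idea of making a process temporarily dumpable around one operation; it is not the nsexec code (which is written in C), and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"syscall"
)

// setDumpable sets the process "dumpable" attribute via prctl(2).
func setDumpable(on bool) error {
	var v uintptr
	if on {
		v = 1
	}
	_, _, errno := syscall.RawSyscall(syscall.SYS_PRCTL, syscall.PR_SET_DUMPABLE, v, 0)
	if errno != 0 {
		return errno
	}
	return nil
}

// getDumpable returns the current attribute (0 means non-dumpable).
func getDumpable() (int, error) {
	r, _, errno := syscall.RawSyscall(syscall.SYS_PRCTL, syscall.PR_GET_DUMPABLE, 0, 0)
	if errno != 0 {
		return -1, errno
	}
	return int(r), nil
}

// withDumpable runs fn while the process is dumpable, then clears the
// attribute again so /proc/self is once more protected.
func withDumpable(fn func() error) error {
	if err := setDumpable(true); err != nil {
		return err
	}
	defer setDumpable(false)
	return fn()
}

func main() {
	if err := setDumpable(false); err != nil {
		fmt.Println("prctl failed:", err)
		return
	}
	err := withDumpable(func() error {
		v, err := getDumpable()
		fmt.Println("inside:", v)
		return err
	})
	v, _ := getDumpable()
	fmt.Println("after:", v, err)
	setDumpable(true) // restore the usual default for this demo process
}
```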
Michael Crosby ef9a4b3155 Merge pull request #1383 from wking/automatic-git-validation-commit-range
.travis.yml: Don't require FETCH_HEAD (partial fix for failing master tests)
2017-03-22 13:57:37 -07:00
W. Trevor King d1fb97fb91 .travis.yml: Don't require FETCH_HEAD
Master builds only have a 'git clone ...' [1] so FETCH_HEAD isn't
defined and git-validation crashes [2].  We don't want to be
hard-coding a range here, and should update git-validation to handle
these cases automatically.

Also echo TRAVIS_* variables during testing to make debugging
git-validation easier.

[1]: https://travis-ci.org/opencontainers/runc/jobs/213508696#L243
[2]: https://travis-ci.org/opencontainers/runc/jobs/213508696#L347

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-03-21 15:26:20 -07:00
Aleksa Sarai 6b574d5759
merge branch 'pr-1377'
LGTMs: @cyphar @hqhq @crosbymichael
Closes #1377
2017-03-22 07:25:06 +11:00
Michael Crosby a4c49f5617 Merge pull request #1382 from vbatts/fix_travis_var
travis: use alternate commit range
2017-03-21 10:46:33 -07:00
Vincent Batts 36b61ae590
travis: use alternate commit range
Signed-off-by: Vincent Batts <vbatts@redhat.com>
2017-03-21 09:45:43 -04:00
Mrunal Patel 75f8da7c88 Bump up runc version to v1.0.0-rc3
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-20 19:26:03 -07:00
Qiang Huang 3f3dbc50c1 Merge pull request #1380 from dqminh/fix-empty-cap-panic
fix panic regression when config doesnt have caps
2017-03-20 20:23:24 -05:00
Daniel Dao 09c72cea69
fix panic regression when config doesnt have caps
When the process config doesn't specify capabilities anywhere, we should not
panic, because setting capabilities is optional.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2017-03-21 00:45:26 +00:00
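
The kind of guard this fix calls for can be illustrated with a nil check before the optional field is dereferenced. The `Process` and `Capabilities` types below are simplified, hypothetical stand-ins rather than the actual runtime-spec or libcontainer types.

```go
package main

import "fmt"

// Capabilities is a hypothetical, simplified capability set.
type Capabilities struct {
	Bounding  []string
	Effective []string
}

// Process is a hypothetical process config; Capabilities is optional.
type Process struct {
	Capabilities *Capabilities
}

// boundingCaps returns the bounding set, or nil when none was configured,
// instead of dereferencing a nil pointer.
func boundingCaps(p *Process) []string {
	if p == nil || p.Capabilities == nil {
		return nil
	}
	return p.Capabilities.Bounding
}

func main() {
	withCaps := &Process{Capabilities: &Capabilities{Bounding: []string{"CAP_KILL"}}}
	withoutCaps := &Process{}
	fmt.Println(len(boundingCaps(withCaps)), len(boundingCaps(withoutCaps)))
}
```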
Michael Crosby 767783a631 Merge pull request #1375 from hqhq/use_uint64_for_resources
Use uint64 for resources to keep consistency with runtime-spec
2017-03-20 12:47:21 -07:00
Michael Crosby dbfc5be208 Merge pull request #1374 from cyphar/revert-1373
Revert "fix minor issue"
2017-03-20 10:24:42 -07:00
Qiang Huang 8430cc4f48 Use uint64 for resources to keep consistency with runtime-spec
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-20 18:51:39 +08:00
Aleksa Sarai c651512ad8
Revert "fix minor issue"
This reverts commit d4091ef151.

d4091ef151 ("fix minor issue") doesn't make any sense, and makes the
code more confusing.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-20 12:28:43 +11:00
Qiang Huang d270940363 Merge pull request #1356 from crosbymichael/console-socket
Add separate console socket
2017-03-18 04:03:03 -05:00
Mrunal Patel c266f1470c Merge pull request #1373 from moypray/minor
fix minor issue
2017-03-16 12:15:46 -07:00
Wentao Zhang d4091ef151 fix minor issue
When attaching the veth pair fails, the veth device should be removed.

Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
2017-03-17 03:18:44 +08:00
Michael Crosby 957ef9cc73 Remove terminal info
This may be a nice extra, but it complicates the use case.  The
contract is: listen on the socket and you get an fd to the pty master,
and that is that.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-03-16 10:23:59 -07:00
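
Passing a file descriptor over a Unix socket, as the console socket contract above relies on, is done with an SCM_RIGHTS control message. The Linux-oriented sketch below uses only the standard `syscall` package and a socket pair; a pipe's write end stands in for the pty master, and the helper names (`sendFD`, `recvFD`, `roundTrip`) are hypothetical and not taken from runc.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD passes an open file descriptor over a Unix socket (SCM_RIGHTS).
func sendFD(sock, fd int) error {
	rights := syscall.UnixRights(fd)
	// One byte of ordinary data accompanies the control message.
	return syscall.Sendmsg(sock, []byte{0}, rights, nil, 0)
}

// recvFD receives exactly one file descriptor from a Unix socket.
func recvFD(sock int) (int, error) {
	data := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit fd
	_, oobn, _, _, err := syscall.Recvmsg(sock, data, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 fd, got %d", len(fds))
	}
	return fds[0], nil
}

// roundTrip sends a pipe's write end across a socket pair, writes through
// the received descriptor, and returns what arrives on the read end.
func roundTrip() (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	if err := sendFD(pair[0], int(w.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	received := os.NewFile(uintptr(fd), "received")
	defer received.Close()

	if _, err := received.WriteString("hello"); err != nil {
		return "", err
	}
	buf := make([]byte, 5)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	fmt.Println(roundTrip())
}
```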
Michael Crosby 00a0ecf554 Add separate console socket
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-03-16 10:23:59 -07:00
Daniel, Dao Quang Minh 697cb97cb7 Merge pull request #1370 from mrunalp/update_spec_rc5
Update runtime spec to rc5
2017-03-16 13:23:07 +00:00
Mrunal Patel 4f903a21c4 Remove ambient build tag
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-15 11:38:43 -07:00
Mrunal Patel 4f9cb13b64 Update runtime spec to 1.0.0.rc5
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-15 11:38:37 -07:00
Daniel, Dao Quang Minh 31980a53ae Merge pull request #1366 from hqhq/remove_ExecFifoPath
Remove unused ExecFifoPath
2017-03-09 18:13:34 +00:00
Qiang Huang b7932a2e07 Remove unused ExecFifoPath
In container process's Init function, we use
fd + execFifoFilename to open exec fifo, so this
field in init config is never used.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-09 10:58:16 +08:00
Qiang Huang df4d872dd9 Merge pull request #1327 from CarltonSemple/lxd-fix
Update devices_unix.go for LXD
2017-03-08 19:34:31 -06:00
Michael Crosby 4815f67a5f Merge pull request #1363 from hqhq/allow_single_cont_oper
Only allow single container operation
2017-03-08 10:43:14 -08:00
Carlton-Semple 0590736890 Added comment linking to LXD issue 2825
Signed-off-by: Carlton-Semple <carlton.semple@ibm.com>
2017-03-08 10:25:37 -05:00
Qiang Huang e0c7b6ceb7 Only allow single container operation
As per the discussions in #1156, we think it's a bad
idea to allow multi-container operations in runc. So
revert it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-08 10:02:39 +08:00
Mrunal Patel 66781a7810 Merge pull request #1362 from crosbymichael/remove-alex
Remove lk4d4 as a maintainer
2017-03-07 16:09:29 -08:00
Michael Crosby d81f5a6b18 Remove lk4d4 as a maintainer
Closes #1361

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-03-07 13:12:52 -08:00
Mrunal Patel a0da8e28e9 Merge pull request #1360 from hqhq/remove_unused_systemd_func
Remove unused function in systemd cgroup
2017-03-07 11:39:34 -08:00
Qiang Huang 8773c5f9a6 Remove unused function in systemd cgroup
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-07 15:11:37 +08:00
Michael Crosby 49a33c41f8 Merge pull request #1344 from xuxinkun/fixCPUQuota20170224
fix cpu.cfs_quota_us changed when systemd daemon-reload using systemd.
2017-03-06 10:02:28 -08:00
xuxinkun c44aec9b23 fix cpu.cfs_quota_us changed when systemd daemon-reload using systemd.
Signed-off-by: xuxinkun <xuxinkun@gmail.com>
2017-03-06 20:08:30 +11:00
Daniel, Dao Quang Minh 291bf60110 Merge pull request #1354 from crosbymichael/dup-io
Don't fchown when inheriting io
2017-03-03 14:22:04 +00:00
Michael Crosby eebdb644f9 Don't fchown when inheriting io
This is a fix for rootless containers and general io handling.  The
higher level systems must prepare the IO for the container in the
detach case and make sure it is set up correctly for the container's
process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-03-02 10:06:10 -08:00
Aleksa Sarai dcbcdf2470
merge branch 'pr-1353'
Closes #1353
LGTMs: @hqhq @cyphar
2017-03-01 19:57:17 +11:00
CuiHaozhi f82a38e160 container can be in stopped status from create process.
Signed-off-by: CuiHaozhi <cuihaozhi@chinacloud.com.cn>
2017-02-28 22:21:43 +08:00
Michael Crosby c50d024500 Merge pull request #1280 from datawolf/user
user: fix the parameter error
2017-02-27 11:22:58 -08:00
Daniel, Dao Quang Minh 770e37fb32 Merge pull request #1350 from hqhq/fix_kmem_accouting
Fix kmem accouting when use with cgroupsPath
2017-02-27 15:25:41 +00:00
Qiang Huang fe898e7862 Fix kmem accouting when use with cgroupsPath
Fixes: #1347
Fixes: #1083

The root cause of #1083 is that we're joining an
existing cgroup whose kmem accounting is not initialized,
and it has child cgroups or tasks in it.

Fix it by checking whether the cgroup is being created for the
first time; we should enable kmem accounting if the cgroup is
created by libcontainer, with or without a kmem limit
configured. Otherwise we'll get issues like #1347.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-02-25 10:58:18 -08:00
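
The "first time created" check described above can be pictured as detecting whether a directory already existed before creating it. The sketch below works on a temporary directory rather than a real cgroup hierarchy; `ensureDir` and `demo` are hypothetical names, and the actual libcontainer logic may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDir creates path if needed and reports whether this call created it.
// Only a freshly created cgroup directory would get kmem accounting enabled.
func ensureDir(path string) (bool, error) {
	if _, err := os.Stat(path); err == nil {
		return false, nil // already existed: joining, not creating
	} else if !os.IsNotExist(err) {
		return false, err
	}
	if err := os.MkdirAll(path, 0755); err != nil {
		return false, err
	}
	return true, nil
}

// demo calls ensureDir twice on the same path inside a temporary directory.
func demo() (bool, bool, error) {
	base, err := os.MkdirTemp("", "cgroup-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(base)

	path := filepath.Join(base, "memory", "container1")
	first, err := ensureDir(path)
	if err != nil {
		return false, false, err
	}
	second, err := ensureDir(path)
	return first, second, err
}

func main() {
	first, second, err := demo()
	fmt.Println("created on first call:", first)
	fmt.Println("created on second call:", second)
	fmt.Println("error:", err)
}
```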