Commit Graph

436 Commits

Author SHA1 Message Date
Aleksa Sarai 1448fe9568 libcontainer: cgroups: add support for kmem.tcp limits
Kernel TCP memory has its own special knobs inside the cgroup.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-03-20 22:03:52 +11:00
Mrunal Patel 54a6e56004 Merge pull request #647 from rajasec/valid-id
Fixing valid-id in regex
2016-03-18 09:38:56 -07:00
Jessica Frazelle 2c5b10189c
remove deadcode
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-17 13:36:28 -07:00
Alexander Morozov bbde9c426f Merge pull request #646 from crosbymichael/pid-host-block
Destroy container along with processes before stdio
2016-03-17 09:51:59 -07:00
Mrunal Patel 93d1a1a6ea Set Delegate to true for cgroups transient units
This is required because we manage some of the cgroups ourselves.
This recommendation came from talking with systemd devs about
some of the issues that we see when using the systemd cgroups driver.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-03-16 09:44:27 -07:00
Michael Crosby fdb100d247 Destroy container along with processes before stdio
We need to make sure the container is destroyed before closing the stdio
for the container.  This becomes a big issues when running in the host's
pid namespace because the other processes could have inherited the stdio
of the initial process.  The call to close will just block as they still
have the io open.

Calling destroy before closing io, especially in the host pid namespace
will cause all additional processes to be killed in the container's
cgroup.  This will allow the io to be closed successfuly.

This change makes sure the order for destroy and close is correct as
well as ensuring that if any errors encoutered during start or exec will
be handled by terminating the process and destroying the container.  We
cannot use defers here because we need to enforce the correct ordering
on destroy.

This also sets the subreaper setting for runc so that when running in
pid host, runc can wait on the addiontal processes launched by the
container, useful on destroy, but also good for reaping the additional
processes that were launched.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-15 13:17:11 -07:00
Mrunal Patel 69fe79de10 Merge pull request #637 from crosbymichael/flush-logs
Ensure logs are flushed
2016-03-15 11:05:10 -07:00
Michael Crosby 732a0fb440 Merge pull request #638 from hqhq/hq_fix_bootstrapData
Fix encoding gid mappings
2016-03-14 11:55:12 -07:00
Mrunal Patel 459efccb0a Merge pull request #576 from avagin/cr
Call Prestart hooks before restoring processes
2016-03-14 11:21:29 -07:00
Michael Crosby 8f206929b2 Ensure logs are flushed
This ensures that anything written to the logs are synced as they
happen.

This also changes the error message of the libcontainer error.  The
original idea was to have this extra information in the message but it
makes it hard to parse and if the caller needed this information they
can just get it from the error type.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-14 11:06:16 -07:00
rajasec d4be3405c7 Fixing valid-id in regex
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-03-14 08:48:41 +05:30
Aleksa Sarai 64286b443d libcontainer: cgroups: add tests for pids.max in PidsStats
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-03-13 14:16:38 +11:00
Aleksa Sarai 2b1e086f62 libcontainer: cgroups: add pids.max to PidsStats
In order to allow nice usage statistics (in terms of percentages and
other such data), add the value of pids.max to the PidsStats struct
returned from the pids cgroup controller.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-03-13 04:53:20 +11:00
Qiang Huang 2f2c83a2a0 Fix encoding gid mappings
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-03-12 13:18:42 +08:00
Tonis Tiigi 04da969aa8 Clear groups after entering userns
Clears supplementary groups that have effect on the
mount permissions before joining the user specified
groups happens.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-03-10 22:23:38 -08:00
Mrunal Patel 1beb2410db Merge pull request #633 from crosbymichael/bump-spec-v4
Bump spec v0.4
2016-03-10 16:42:46 -08:00
Michael Crosby 4bef923fdb Merge pull request #630 from crosbymichael/revert-exit-status
Revert "Return proper exit code for exec errors"
2016-03-10 14:42:30 -08:00
Michael Crosby 20422c9bd9 Update libcontainer to support rlimit per process
This updates runc and libcontainer to handle rlimits per process and set
them correctly for the container.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-10 14:35:16 -08:00
Andrey Vagin 080eac3d2a nsexec: don't use CLONE_PARENT and CLONE_NEWPID together
The rhel6 kernel returns EINVAL in this case

Known issue:
* CT with userns doesn't work

This is a copy of
d31e97fa28
to address https://github.com/opencontainers/runc/issues/613

Signed-off-by: Andrey Vagin <avagin@virtuozzo.com>
Signed-off-by: Andrew Fernandes <andrew@fernandes.org>
2016-03-10 14:28:10 -05:00
Michael Crosby 213c1a1a4a Revert "Return proper exit code for exec errors"
This reverts commit 6bb653a6e8.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-10 11:00:48 -08:00
Michael Crosby 3af08519d0 Merge pull request #616 from hqhq/hq_remove_dup_headfile
Remove duplicated included head file
2016-03-08 10:54:31 -08:00
Michael Crosby 8cc43a6c69 Merge pull request #618 from cloudfoundry-incubator/serialize-hooks
Serialize CommandHooks to state so that PostStop hooks execute during 'runc delete'
2016-03-08 10:51:54 -08:00
Mrunal Patel 5b439d8c48 Merge pull request #491 from hqhq/hq_cleanup_systemd_apply
Cleanup systemd apply
2016-03-08 08:32:02 -08:00
Mrunal Patel 7b6c4c418d Merge pull request #621 from estesp/remove-dead-code
Remove no longer used uid/gid mapping functions
2016-03-04 08:53:41 -08:00
Phil Estes 3cd0987dca Remove no longer used uid/gid mapping functions
Now that all the user namespace code is moved into C, these routines are
no longer used.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2016-03-04 11:21:34 -05:00
Phil Estes 178bad5e71 Properly setuid/setgid after entering userns
The re-work of namespace entering lost the setuid/setgid that was part
of the Go-routine based process exec in the prior code. A side issue was
found with setting oom_score_adj before execve() in a userns that is
also solved here.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2016-03-04 11:12:26 -05:00
Mrunal Patel a55b03e85a Merge pull request #620 from estesp/runinuserns-nonlinux
Stub RunningInUserNS for non-Linux
2016-03-04 07:17:38 -08:00
Phil Estes 009d2835cf Stub RunningInUserNS for non-Linux
Add a stub for non-Linux that always returns false

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2016-03-03 16:33:43 -05:00
Michael Crosby 3cc90bd2d8 Add support for process overrides of settings
This commit adds support to libcontainer to allow caps, no new privs,
apparmor, and selinux process label to the process struct so that it can
be used together of override the base settings on the container config
per individual process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-03 11:41:33 -08:00
Ed King b8d48474a9 Serialize CommandHooks to state
This is needed to make 'runc delete' correctly run the post-stop hooks.

Signed-off-by: Julian Friedman <julz.friedman@uk.ibm.com>
Signed-off-by: Ed King <eking@pivotal.io>
2016-03-03 16:57:51 +00:00
Qiang Huang 87e05b84e2 Remove duplicated included head file
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-03-03 11:08:18 +08:00
Mrunal Patel b86570a4d4 Merge pull request #610 from rajasec/checkpoint-state
Eliminating checkpoint state in container
2016-03-02 13:34:07 -08:00
Mrunal Patel 51fb686147 Merge pull request #609 from hustcat/centos6
Fix build error on centos6
2016-03-02 12:18:10 -08:00
Rajasekaran 0bda2c6af5 Eliminating checkpoint state in container
Signed-off-by: Rajasekaran <rajasec79@gmail.com>
2016-03-02 22:32:27 +05:30
Ido Yariv 78f5148c67 Fix handling of unsupported namespaces
currentState() always adds all possible namespaces to the state,
regardless of whether they are supported.
If orderNamespacePaths detects an unsupported namespace, an error is
returned that results in initialization failure.

Fix this by only adding paths of supported namespaces to the state.

Signed-off-by: Ido Yariv <ido@wizery.com>
2016-03-02 10:16:51 -05:00
Ye Yin 394fb55d85 Fix build error on centos6
Signed-off-by: Ye Yin <eyniy@qq.com>
2016-03-02 18:32:19 +08:00
Mrunal Patel b1872a068e Merge pull request #454 from mlaventure/libcontainer-pidns
Move setns within nsexec
2016-02-29 15:34:19 -08:00
Mrunal Patel 5f8fd8e04e Merge pull request #600 from duglin/FixTest
Fix to allow for build in different path
2016-02-29 11:01:31 -08:00
Mrunal Patel 6fc66fea48 Merge pull request #601 from LK4D4/fix_stats_race
Fix race between Apply and GetStats
2016-02-29 11:01:09 -08:00
Michael Crosby 5a701e9c13 Merge pull request #579 from rajasec/flagchange
Adding linux label to test file
2016-02-29 10:56:19 -08:00
Michael Crosby cda03a7ef1 Merge pull request #598 from rajasec/readme-swap
Updating swapiness value in README
2016-02-29 10:55:00 -08:00
Alexander Morozov e5906f7ed5 Fix race between Apply and GetStats
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-02-29 08:50:42 -08:00
Doug Davis 3e46977ec1 Fix to allow for build in different path
The path in the stacktrace might not be:
  "github.com/opencontainers/runc/libcontainer/stacktrace"
For example, for me its:
  "_/go/src/github.com/opencontainers/runc/libcontainer/stacktrace"
so I changed the check to make sure the tail end of the path matches instead
of the entire thing

Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-02-29 06:45:33 -08:00
Kenfe-Mickael Laventure 6325ab96e7 Call Prestart hook after namespaces have been set
This simply move the call to the Prestart hooks to be made once we
receive the procReady message from the client.

This is necessary as we had to move the setns calls within nsexec in
order to be accomodate joining namespaces that only affect future
children (e.g. NEWPID).

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-02-28 12:26:53 -08:00
Kenfe-Mickael Laventure 08c3c6ebe2 Refactor nsexec
Cut nsexec in smaller chunk routines to make it more readable.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh 002b6c2fe8 Reorder and remove unused imports in nsexec.c
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh 42d5d04801 Sets custom namespaces for init processes
An init process can join other namespaces (pidns, ipc etc.). This leverages
C code defined in nsenter package to spawn a process with correct namespaces
and clone if necessary.

This moves all setns and cloneflags related code to nsenter layer, which mean
that we dont use Go os/exec to create process with cloneflags and set
uid/gid_map or setgroups anymore. The necessary data is passed from Go to C
using a netlink binary-encoding format.

With this change, setns and init processes are almost the same, which brings
some opportunity for refactoring.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
[mickael.laventure@docker.com: adapted to apply on master @ d97d5e]
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@docker.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh d6bf4049f8 OrderNamespacePaths gets correct order of ns
This adds orderNamespacePaths to get correct order of namespaces for the
bootstrap program to join.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh 2d32210620 Integration tests for joining namespaces
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh 4217b9c121 Do not override the specified userns path
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 11:59:48 -08:00