Commit Graph

297 Commits

Author SHA1 Message Date
Michael Crosby 556f798a19 Fix various state bugs for pause and destroy
There were issues where a process could die before pausing completed
leaving the container in an inconsistent state and unable to be
destoryed.  This makes sure that if the container is paused and the
process is dead it will unfreeze the cgroup before removing them.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-21 16:43:33 -08:00
Mrunal Patel 27132f2e51 Merge pull request #486 from duglin/removeHardCode
Remove some hard coded strings
2016-01-21 14:53:17 -08:00
Qiang Huang 8bbe901045 Fix comment of swap limit
Set `-1` doesn't mean disable swap, disable swap means you
can't use swap memory, set `-1` really means you can use
unlimited swap memory.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-01-21 14:02:03 +08:00
Mrunal Patel 41d9d26513 Add support for just joining in apply using cgroup paths
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-01-20 14:23:05 -05:00
Doug Davis 49dfa1b62d Remove some hard coded strings
Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-01-19 19:02:31 -08:00
Mrunal Patel e91b055623 Merge pull request #476 from hqhq/hq_embed_resource
Embed Resources for backward compatibility
2016-01-19 14:59:39 -08:00
Michael Crosby 5637f38b8a Merge pull request #471 from jfrazelle/add-seccomp-enabled-check
add seccomp.IsEnabled() function
2016-01-19 14:52:51 -08:00
Michael Crosby 9c41e8388c
Handle seccomp proc parsing errors
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-19 11:43:49 -08:00
Qiang Huang f048eaf87a Embed Resources for backward compatibility
Fixes: docker/docker#19329

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-01-19 19:08:14 +08:00
Jessica Frazelle 41edbeb25e
add seccomp.IsEnabled() function
This is much like apparmor.IsEnabled() function and a nice helper.

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-01-18 10:44:31 -08:00
Jessica Frazelle ecf03fafa5
cleanup old hack dir
looks like this was left around from the libcontainer days ;)

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-01-15 16:39:38 -08:00
Alexander Morozov 54b07da69e Merge pull request #475 from mrunalp/set_cwd
Make cwd required
2016-01-15 13:54:35 -08:00
Alexander Morozov 6c9532f063 Merge pull request #461 from ahmetalpbalkan/selinux-setenforce
selinux: add SelinuxSetEnforceMode implementation
2016-01-15 13:01:27 -08:00
Alexander Morozov f2f8f0e4e6 Merge pull request #462 from hqhq/hq_fix_libcontainer_readme
Update README of libcontainer
2016-01-15 13:00:44 -08:00
Mrunal Patel 6259f09e97 Merge pull request #426 from gitido/pressure_level
libcontainer: Add support for memcg pressure notifications
2016-01-14 16:23:07 -08:00
Mrunal Patel 269a717555 Make cwd required
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-01-14 19:06:56 -05:00
Alexander Morozov 8962f371d6 Merge pull request #472 from dadgar/b-find-cgroup-mount
Only validate post-hyphen field length on cgroup mounts
2016-01-14 15:08:11 -08:00
Alexander Morozov 3b42992948 Merge pull request #455 from hallyn/tty01
Do not allow access to /dev/tty{0,1}
2016-01-14 14:35:46 -08:00
Qiang Huang d87ac4a2ca Update README of libcontainer
Fixes: #438

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-01-14 14:53:29 +08:00
Alex Dadgar a42f3236d5 Only validate post-hyphen field length on cgroup mounts
Signed-off-by: Alex Dadgar <alex.dadgar@gmail.com>
2016-01-13 11:28:49 -08:00
Mrunal Patel 4c767d7046 Merge pull request #446 from cyphar/18-add-pids-controller
cgroup: add PIDs cgroup controller support
2016-01-11 16:56:00 -08:00
Aleksa Sarai 103853ead7 libcontainer: set cgroup config late
Due to the fact that the init is implemented in Go (which seemingly
randomly spawns new processes and loves eating memory), most cgroup
configurations are required to have an arbitrary minimum dictated by the
init. This confuses users and makes configuration more annoying than it
should. An example of this is pids.max, where Go spawns multiple
processes that then cause init to violate the pids cgroup constraint
before the container can even start.

Solve this problem by setting the cgroup configurations as late as
possible, to avoid hitting as many of the resources hogged by the Go
init as possible. This has to be done before seccomp rules are applied,
as the parent and child must synchronise in order for the parent to
correctly set the configurations (and writes might be blocked by seccomp).

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai a95483402e libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai f36ed4b174 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

One of the weird cases to deal with is systemd. Systemd sets some of the
cgroup configuration options, but not all of them. Because memory is a
special case, we need to explicitly set memory in the systemd Apply().
Otherwise, the rest can be safely re-applied in .Set() as usual.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:35 +11:00
Aleksa Sarai db3159c9d9 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-12 10:06:32 +11:00
Alexander Morozov c0cad6aa5e Merge pull request #451 from cyphar/fix-infinite-recursion
cgroups: fs: fix cgroup.Parent path sanitisation
2016-01-11 08:52:26 -08:00
Mrunal Patel d43108184e Merge pull request #458 from hallyn/userns
Handle running nested in a user namespace
2016-01-11 08:41:46 -08:00
Aleksa Sarai bf899fef45 cgroups: fs: fix cgroup.Parent path sanitisation
Properly sanitise the --cgroup-parent path, to avoid potential issues
(as it starts creating directories and writing to files as root). In
addition, fix an infinite recursion due to incomplete base cases.

It might be a good idea to move pathClean to a separate library (which
deals with path safety concerns, so all of runC and Docker can take
advantage of it).

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-11 23:10:35 +11:00
Alexander Morozov 910752f1f5 Merge pull request #463 from jimmidyson/non-recursive-pids
Revert to non-recursive GetPids, add recursive GetAllPids
2016-01-08 13:55:00 -08:00
Serge Hallyn c0ad40c5e6 Do not create devices when in user namespace
When we launch a container in a new user namespace, we cannot create
devices, so we bind mount the host's devices into place instead.

If we are running in a user namespace (i.e. nested in a container),
then we need to do the same thing.  Add a function to detect that
and check for it before doing mknod.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
---
 Changelog - add a comment clarifying what's going on with the
	     uidmap file.
2016-01-08 12:54:08 -08:00
Jimmi Dyson 91c7024e52 Revert to non-recursive GetPids, add recursive GetAllPids
Signed-off-by: Jimmi Dyson <jimmidyson@gmail.com>
2016-01-08 19:42:25 +00:00
Ahmet Alp Balkan c8b5e150f1 selinux: add SelinuxSetEnforceMode implementation
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
2016-01-08 16:48:30 +00:00
Mrunal Patel 749928a0a1 Merge pull request #421 from rajasec/selinux-compileflag
Adding selinux label
2016-01-07 17:57:54 -08:00
Serge Hallyn 2e13570679 Do not allow access to /dev/tty{0,1}
These are the real host devices, container should not generally
have or need them.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2016-01-06 18:42:17 -08:00
Mrunal Patel f03b7f8317 Merge pull request #419 from rajasec/selinux-teststepfix
make localtest failure with selinux enabled
2016-01-06 12:44:03 -08:00
Mrunal Patel 4fda64bc07 Merge pull request #452 from hqhq/hq_bindmount_whitelist
Add white list for bind mount check
2016-01-06 11:16:10 -08:00
Qiang Huang 9c1242ecba Add white list for bind mount chec
Fixes: #400

It would be useful to use fuse to isolate proc info.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-01-06 14:48:40 +08:00
Mrunal Patel fa24ebf26c Merge pull request #311 from crosbymichael/destory-state
Implement Container States
2016-01-04 09:59:28 -08:00
Kai Qiang WU(Kennan) c71d8e69f1 Fix typo word in SPEC.md
Signed-off-by: Kai Qiang WU(Kennan) <wkq5325@gmail.com>
2015-12-30 00:30:58 +00:00
Ido Yariv 55a8d686a9 libcontainer: Add support for memcg pressure notifications
It may be desirable to receive memory pressure levels notifications
before the container depletes all memory. This may be useful for
handling cases where the system thrashes when reaching the container's
memory limits.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-12-28 13:36:55 -05:00
Mrunal Patel 4124ba9468 Revert "cgroups: add pids controller support"
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-19 07:48:48 -08:00
Mrunal Patel bc465742ac Merge pull request #58 from cyphar/18-add-pids-controller
cgroups: add pids controller support
2015-12-18 19:55:51 -08:00
Aleksa Sarai 14ed8696c1 libcontainer: set cgroup config late
Due to the fact that the init is implemented in Go (which seemingly
randomly spawns new processes and loves eating memory), most cgroup
configurations are required to have an arbitrary minimum dictated by the
init. This confuses users and makes configuration more annoying than it
should. An example of this is pids.max, where Go spawns multiple
processes that then cause init to violate the pids cgroup constraint
before the container can even start.

Solve this problem by setting the cgroup configurations as late as
possible, to avoid hitting as many of the resources hogged by the Go
init as possible. This has to be done before seccomp rules are applied,
as the parent and child must synchronise in order for the parent to
correctly set the configurations (and writes might be blocked by seccomp).

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:48 +11:00
Aleksa Sarai 88e6d489f6 libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 8a740d5391 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 37789f5bf1 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:38 +11:00
Michael Crosby 766e4c5250 Merge pull request #437 from clnperez/nlahdrlen-fix-for-gccgo
Add NLA_HDRLEN workaround for gccgo
2015-12-18 15:57:26 -08:00
Christy Perez ced8e5e7ba Caclulate NLA_HDRLEN as gccgo workaround
syscall.NLA_HDRLEN is not in gccgo (as of 5.3), so in the meantime
use the #defines taken from linux/netlink.h.

See https://github.com/golang/go/issues/13629

Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
2015-12-17 17:36:47 -06:00
Michael Crosby 4415446c32 Add state pattern for container state transition
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add state status() method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Allow multiple checkpoint on restore

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handle leave-running state

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Fix state transitions for inprocess

Because the tests use libcontainer in process between the various states
we need to ensure that that usecase works as well as the out of process
one.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove isDestroyed method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handling Pausing from freezer state

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

freezer status

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Fixing review comments

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Added comment when freezer not available

Signed-off-by: Rajasekaran <rajasec79@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Conflicts:
	libcontainer/container_linux.go

Change checkFreezer logic to isPaused()

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove state base and factor out destroy func

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add unit test for state transitions

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-17 13:55:38 -08:00
Qiang Huang 9d6ce7168a Merge pull request #434 from mrunalp/resources
Move the cgroups setting into a Resources struct
2015-12-17 09:34:29 +08:00