Commit Graph

421 Commits

Author SHA1 Message Date
Alexander Morozov 910752f1f5 Merge pull request #463 from jimmidyson/non-recursive-pids
Revert to non-recursive GetPids, add recursive GetAllPids
2016-01-08 13:55:00 -08:00
Serge Hallyn c0ad40c5e6 Do not create devices when in user namespace
When we launch a container in a new user namespace, we cannot create
devices, so we bind mount the host's devices into place instead.

If we are running in a user namespace (i.e. nested in a container),
then we need to do the same thing.  Add a function to detect that
and check for it before doing mknod.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
---
 Changelog - add a comment clarifying what's going on with the
	     uidmap file.
2016-01-08 12:54:08 -08:00
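The detection step described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `inUserNS` is hypothetical, and it assumes the usual convention that the initial user namespace shows a single identity mapping covering the full 32-bit range in /proc/self/uid_map.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// inUserNS reports whether uidMap (the contents of /proc/self/uid_map)
// describes a user namespace other than the initial one.
// Hypothetical helper for illustration only.
func inUserNS(uidMap string) bool {
	// Only the first mapping line is inspected.
	line := strings.SplitN(uidMap, "\n", 2)[0]

	var inside, outside, length int64
	if _, err := fmt.Sscanf(line, "%d %d %d", &inside, &outside, &length); err != nil {
		// Empty or unparseable map: assume the initial namespace.
		return false
	}
	// Assumption: the initial namespace maps 0..2^32-2 onto itself.
	if inside == 0 && outside == 0 && length == 4294967295 {
		return false
	}
	return true
}

func main() {
	data, err := os.ReadFile("/proc/self/uid_map")
	if err != nil {
		fmt.Println("cannot read uid_map, assuming no user namespace:", err)
		return
	}
	fmt.Println("in user namespace:", inUserNS(string(data)))
}
```

A caller would check this before attempting mknod and fall back to bind mounting the host's device nodes when it returns true.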
Jimmi Dyson 91c7024e52 Revert to non-recursive GetPids, add recursive GetAllPids
Signed-off-by: Jimmi Dyson <jimmidyson@gmail.com>
2016-01-08 19:42:25 +00:00
Ahmet Alp Balkan c8b5e150f1 selinux: add SelinuxSetEnforceMode implementation
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
2016-01-08 16:48:30 +00:00
xlgao-zju cdc53051a3 update date in README
Signed-off-by: xlgao-zju <xlgao@zju.edu.cn>
2016-01-08 10:48:11 +08:00
Mrunal Patel 749928a0a1 Merge pull request #421 from rajasec/selinux-compileflag
Adding selinux label
2016-01-07 17:57:54 -08:00
Serge Hallyn 2e13570679 Do not allow access to /dev/tty{0,1}
These are the real host devices, container should not generally
have or need them.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2016-01-06 18:42:17 -08:00
Mrunal Patel f03b7f8317 Merge pull request #419 from rajasec/selinux-teststepfix
make localtest failure with selinux enabled
2016-01-06 12:44:03 -08:00
Mrunal Patel 4fda64bc07 Merge pull request #452 from hqhq/hq_bindmount_whitelist
Add white list for bind mount check
2016-01-06 11:16:10 -08:00
Qiang Huang 9c1242ecba Add white list for bind mount check
Fixes: #400

It would be useful to use fuse to isolate proc info.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-01-06 14:48:40 +08:00
Mrunal Patel fa24ebf26c Merge pull request #311 from crosbymichael/destory-state
Implement Container States
2016-01-04 09:59:28 -08:00
Kai Qiang WU(Kennan) c71d8e69f1 Fix typo word in SPEC.md
Signed-off-by: Kai Qiang WU(Kennan) <wkq5325@gmail.com>
2015-12-30 00:30:58 +00:00
Ido Yariv 55a8d686a9 libcontainer: Add support for memcg pressure notifications
It may be desirable to receive memory pressure level notifications
before the container depletes all memory. This may be useful for
handling cases where the system thrashes when reaching the container's
memory limits.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-12-28 13:36:55 -05:00
Mrunal Patel 4124ba9468 Revert "cgroups: add pids controller support"
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-19 07:48:48 -08:00
Mrunal Patel bc465742ac Merge pull request #58 from cyphar/18-add-pids-controller
cgroups: add pids controller support
2015-12-18 19:55:51 -08:00
Aleksa Sarai 14ed8696c1 libcontainer: set cgroup config late
Due to the fact that the init is implemented in Go (which seemingly
randomly spawns new processes and loves eating memory), most cgroup
configurations are required to have an arbitrary minimum dictated by the
init. This confuses users and makes configuration more annoying than it
should be. An example of this is pids.max, where Go spawns multiple
processes that then cause init to violate the pids cgroup constraint
before the container can even start.

Solve this problem by setting the cgroup configurations as late as
possible, to avoid hitting as many of the resources hogged by the Go
init as possible. This has to be done before seccomp rules are applied,
as the parent and child must synchronise in order for the parent to
correctly set the configurations (and writes might be blocked by seccomp).

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:48 +11:00
Aleksa Sarai 88e6d489f6 libcontainer: cgroups: loudly fail with Set
It is vital to loudly fail when a user attempts to set a cgroup limit
(rather than using the system default). Otherwise the user will assume
they have security they do not actually have. This mirrors the original
Apply() (that would set cgroup configs) semantics.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 8a740d5391 libcontainer: cgroups: don't Set in Apply
Apply and Set are two separate operations, and it doesn't make sense to
group the two together (especially considering that the bootstrap
process is added to the cgroup as well). The only exception to this is
the memory cgroup, which requires the configuration to be set before
processes can join.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:47 +11:00
Aleksa Sarai 37789f5bf1 libcontainer: cgroups: add pids controller support
Add support for the pids cgroup controller to libcontainer, a recent
feature that is available in Linux 4.3+.

Unfortunately, due to the init process being written in Go, it can spawn
an unknown number of threads due to blocked syscalls. This results in
the init process being unable to run properly, and thus small pids.max
configs won't work properly.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2015-12-19 11:30:38 +11:00
Michael Crosby 766e4c5250 Merge pull request #437 from clnperez/nlahdrlen-fix-for-gccgo
Add NLA_HDRLEN workaround for gccgo
2015-12-18 15:57:26 -08:00
Christy Perez ced8e5e7ba Calculate NLA_HDRLEN as gccgo workaround
syscall.NLA_HDRLEN is not in gccgo (as of 5.3), so in the meantime
use the #defines taken from linux/netlink.h.

See https://github.com/golang/go/issues/13629

Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
2015-12-17 17:36:47 -06:00
Michael Crosby 4415446c32 Add state pattern for container state transition
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add state status() method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Allow multiple checkpoint on restore

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handle leave-running state

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Fix state transitions for inprocess

Because the tests use libcontainer in process between the various states,
we need to ensure that this use case works as well as the out-of-process
one.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove isDestroyed method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handling Pausing from freezer state

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

freezer status

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Fixing review comments

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Added comment when freezer not available

Signed-off-by: Rajasekaran <rajasec79@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Conflicts:
	libcontainer/container_linux.go

Change checkFreezer logic to isPaused()

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove state base and factor out destroy func

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add unit test for state transitions

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-17 13:55:38 -08:00
Qiang Huang 9d6ce7168a Merge pull request #434 from mrunalp/resources
Move the cgroups setting into a Resources struct
2015-12-17 09:34:29 +08:00
Mrunal Patel 55a49f2110 Move the cgroups setting into a Resources struct
This allows us to distinguish cases where a container
needs to just join the paths or also additionally
set cgroups settings. This will help in implementing
cgroupsPath support in the spec.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-16 15:53:31 -05:00
David Calavera 77c36f4b34 Move linux only Process.InitializeIO behind the linux build flag.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-15 15:12:29 -05:00
David Calavera 977991d36f Replace docker units package with new docker/go-units.
It's the same library but it won't live in docker/docker anymore.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-14 20:45:30 -05:00
Mrunal Patel 11f8fdca33 Merge pull request #430 from crosbymichael/pipes
Move STDIO initialization to libcontainer.Process
2015-12-11 14:30:42 -08:00
Alexander Morozov cb04f03854 Merge pull request #336 from hqhq/hq_parent_cgroup_systemd
systemd: support cgroup parent with specified slice
2015-12-11 10:13:47 -08:00
xlgao-zju ff29daafc0 fix minor typo
Signed-off-by: xlgao-zju <xlgao@zju.edu.cn>
2015-12-11 21:37:32 +08:00
Michael Crosby 29b139f702 Move STDIO initialization to libcontainer.Process
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-10 16:11:49 -08:00
Mrunal Patel 0267ad05b0 Merge pull request #340 from dqminh/replace-env-netlink
nsexec: replace usage of environment variable with netlink message
2015-12-09 14:21:45 -08:00
Michael Crosby 9c9aac5385 Export console New func
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-09 11:59:10 -08:00
Daniel, Dao Quang Minh 7d423cb7a1 setns: replace env with netlink for bootstrap data
replace passing of pid and console path via environment variable with passing
them with netlink message via an established pipe.

this change requires us to set _LIBCONTAINER_INITTYPE and
_LIBCONTAINER_INITPIPE in the environment of the bootstrap process, as we
only send the bootstrap data for setns process right now. When init and setns
bootstrap process are unified (i.e., init use nsexec instead of Go to clone new
process), we can remove _LIBCONTAINER_INITTYPE.

Note:
- we read nlmsghdr first before reading the content so we can get the total
  length of the payload and allocate buffer properly instead of allocating
  one large buffer.

- check read bytes vs the wanted number. It's an error if we failed to read
  the desired number of bytes from the pipe into the buffer.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2015-12-03 18:03:48 +00:00
Qiang Huang 7695a0ddb0 systemd: support cgroup parent with specified slice
Pick up #119
Fixes: docker/docker#16681

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-12-02 23:57:02 -05:00
Mrunal Patel 3317785f56 Merge pull request #420 from runcom/cgroups-unsupported
libcontainer: configs: create cgroup_unsupported.go in order to build on darwin as well
2015-11-30 09:20:23 -08:00
Alexander Morozov decba54d78 Merge pull request #424 from runcom/fix-go-vet
libcontainer: network_linux.go: fix go vet
2015-11-30 09:06:41 -08:00
Antonio Murdaca 3029587085 libcontainer: network_linux.go: fix go vet
This patch fixes the following go vet warnings:
```
libcontainer/network_linux.go:96: github.com/vishvananda/netlink.Device
composite literal uses unkeyed fields
libcontainer/network_linux.go:114: github.com/vishvananda/netlink.Device
composite literal uses unkeyed fields
```

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2015-11-30 12:31:18 +01:00
Rajasekaran 49ff2711e1 Fixing xattr test step issue
Signed-off-by: Rajasekaran <rajasec79@gmail.com>
2015-11-29 09:24:42 +05:30
rajasec a6614ba40f Fixing TestSetFilecon in selinux test step
Signed-off-by: rajasec <rajasec79@gmail.com>
2015-11-28 13:51:46 +05:30
Antonio Murdaca 112493115f libcontainer: configs: create cgroup_unsupported.go in order to build on darwin as well
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2015-11-27 10:28:29 +01:00
rajasec 9f4d5340f4 Adding selinux label
Signed-off-by: rajasec <rajasec79@gmail.com>
2015-11-26 19:44:51 +05:30
rajasec ce68f7aef7 make localtest failure with selinux enabled
Signed-off-by: rajasec <rajasec79@gmail.com>
2015-11-24 23:24:30 +05:30
Daniel, Dao Quang Minh d914bf7347 setns: add bootstrap data
add bootstrap data to setns process. If we have any bootstrap data then copy it
to the bootstrap process (i.e. nsexec) using the sync pipe. This will allow us
to eventually replace environment variable usage with more structured data
to setup namespaces, write pid/gid map, setgroup etc.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2015-11-22 11:36:58 +00:00
rajasec 949d822675 Adding error conditions when apparmor disabled
Signed-off-by: rajasec <rajasec79@gmail.com>

Add the changes to errors in lower case

Signed-off-by: rajasec <rajasec79@gmail.com>
2015-11-22 13:14:18 +05:30
Antonio Murdaca 400e05fe5b libcontainer: configs: extend unsupported os
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2015-11-19 18:24:34 +01:00
Alexander Morozov 776791463d Merge pull request #357 from ashahab-altiscale/350-container-in-container
Bind mount device nodes on EPERM
2015-11-16 14:54:02 -08:00
Qiang Huang 96f0eefa1a Fix comment to be consistent with the code
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-11-16 19:16:27 +08:00
Abin Shahab 28c9d0252c Userns container in containers
Enables launching userns containers by catching EPERM errors for writing
to devices cgroups, and for mknod invocations.

Signed-off-by: Abin Shahab <ashahab@altiscale.com>
2015-11-15 14:42:35 -08:00
Alexander Morozov 48fdc50d09 Merge pull request #398 from crosbymichael/seccomp-trace
Add seccomp trace support
2015-11-13 10:54:18 -08:00
Alexander Morozov bda4ca2f8f Merge pull request #388 from hqhq/hq_cgroup_cleanups
Some cgroup cleanups
2015-11-13 09:06:18 -08:00
Michael Crosby caca840972 Add seccomp trace support
Closes #347

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-11-12 17:03:53 -08:00
Michael Crosby 2be14dc963 Merge pull request #392 from mrunalp/poststart
Add poststart hooks
2015-11-12 16:34:38 -08:00
Michael Crosby 879dfdd980 Fix race setting process opts
When starting and querying for pids a container can start and exit before
this is set.  So set the opts after the process is started and while
libcontainer still has the container's process blocking on the pipe.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-11-06 16:51:59 -08:00
Mrunal Patel 452e8a73c5 Integrate poststart hooks with spec
* Call poststart hooks after the container is started
* Tie in with spec configuration

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-11-06 18:03:32 -05:00
Mrunal Patel bb2d3cd1be Add Poststart hook to libcontainer config
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-11-06 18:02:50 -05:00
Qiang Huang 209c8d9979 Add some comments about cgroup
We fixed some bugs and introduced some code that is hard to
understand, so add some comments for it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-11-05 19:12:53 +08:00
Qiang Huang 8c98ae27ac Refactor cgroupData
The former cgroup entry is confusing, separate it to parent
and name.
Rename entry `c` to `config`.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-11-05 19:12:53 +08:00
Qiang Huang a263afaf6c Rename parent and data
'parent' function is confusing with parent cgroup, it's actually
parent path, so rename it to parentPath.

The name 'data' is too common to be identified, rename it to cgroupData
which is exactly what it is.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-11-05 19:12:53 +08:00
John Howard a919bd3f67 Windows: Refactor Container interface
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-11-02 15:12:16 -08:00
Mrunal Patel c42a2952c4 Merge pull request #361 from jhowardmsft/jjh/criu_opts
Windows: Factor down criu_opts
2015-11-02 15:05:27 -08:00
Mrunal Patel 7caef5626b Merge pull request #359 from jhowardmsft/jjh/state_struct
Windows: Refactor state struct
2015-11-02 15:04:12 -08:00
Mrunal Patel cf73b32eeb Merge pull request #343 from hqhq/hq_unify_behavior_for_memory
Unify behavior for memory cgroup
2015-11-02 14:58:31 -08:00
Michael Crosby 26eb6a1bcd Merge pull request #377 from rhatdan/label
Docker needs to know whether the user requested a relabel
2015-11-02 14:55:27 -08:00
Doug Davis e5dc12a0c9 Add more context around some error cases
Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-10-30 10:55:48 -07:00
Dan Walsh 69c3ea4e17 Docker needs to know whether the user requested a relabel
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2015-10-28 15:44:38 -04:00
John Howard fe1cce69b3 Windows: Refactor state struct
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-26 14:45:20 -07:00
Mrunal Patel 6c36d666a1 Merge pull request #365 from jhowardmsft/jjh/devices
Windows: Tidy libcontainer\devices
2015-10-24 19:36:26 -07:00
Mrunal Patel 0d155ba0fb Merge pull request #362 from jhowardmsft/jjh/configs-cgroup
Windows: Refactor configs/cgroup.go
2015-10-24 19:34:54 -07:00
Mrunal Patel 6d85c27599 Merge pull request #364 from jhowardmsft/jjh/fs-build-tags
Fixes build tags on cgroups\fs\*.go
2015-10-24 19:33:52 -07:00
John Howard 37675129ba Windows: Tidy libcontainer\devices
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-23 13:50:24 -07:00
Alexander Morozov 34fe03fa8a Merge pull request #238 from adrianreber/master
Add criu related debug output
2015-10-23 13:44:03 -07:00
John Howard fb5a8febce Fixes build tags on cgroups\fs\*.go
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-23 13:41:10 -07:00
Mrunal Patel b741e3dc9d Merge pull request #337 from alban/alban/stdio
libcontainer/SPEC.md: fix /dev/stdio symlinks
2015-10-23 13:40:56 -07:00
John Howard 8690e9cc8c Windows: Refactor configs/cgroup.go
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-23 13:08:18 -07:00
John Howard 78351a8e3d Windows: Factor down criu_opts
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-23 12:58:59 -07:00
Mrunal Patel bed70ca579 Merge pull request #358 from rajasec/exit-typo
Fixing typo in the comment for exit
2015-10-23 11:12:17 -07:00
Alexander Morozov 97929bd6dd Merge pull request #335 from crosbymichael/cgroup-order
Add name to cgroup subsystem and set order
2015-10-23 10:38:29 -07:00
yangshukui e5ef8d239a Add the conversion of architectures for seccomp config
Signed-off-by: yangshukui <yangshukui@huawei.com>
2015-10-23 10:17:39 +08:00
rajasec 58e3cde8f3 Fixing typo in the comment for exit
Signed-off-by: rajasec <rajasec79@gmail.com>
2015-10-22 19:08:03 +05:30
Alban Crequy f381717120 libcontainer/SPEC.md: fix /dev/stdio symlinks
The spec uses symlinks to "/proc/1/..." but the implementation uses
"/proc/self/...": see setupDevSymlinks (libcontainer/rootfs_linux.go).

The implementation is more correct, so I'm changing the spec to match
the implementation.

Signed-off-by: Alban Crequy <alban.crequy@coreos.com>
2015-10-21 11:10:24 +02:00
Qiang Huang 34cff6f2f3 Correct intuition for setupDev
Minor fix, the former setupDev=true means not setup dev,
which is contrary to intuition, just correct it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-10-21 16:06:26 +08:00
Qiang Huang 194e0e4db6 Unify behavior for memory cgroup
We have a rule that for optional cgroups, don't fail if some
of them are not mounted, but we want it to fail hard when a
user specifies an option and we are unable to fulfill the
request.

Memory cgroup should also follow this rule.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-10-20 14:01:48 +08:00
Michael Crosby ba2ce3b25a Cgroup set order for systemd
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-10-19 13:32:45 -07:00
Michael Crosby 2554f49d5e Use array instead of map for cgroup subsystems
Also add cpuset as the first in the list to address issues setting the
pid in any cgroup before the cpuset is populated.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-10-15 15:24:53 -07:00
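Why a slice rather than a map matters here: Go randomizes map iteration order, so only an ordered slice can guarantee that cpuset is handled first. A minimal sketch, with hypothetical type names and an arbitrary choice of the other subsystems:

```go
package main

import "fmt"

// subsystem is a hypothetical stand-in for a cgroup subsystem that can
// report its own name.
type subsystem interface {
	Name() string
}

type cpusetGroup struct{}
type memoryGroup struct{}
type devicesGroup struct{}

func (cpusetGroup) Name() string  { return "cpuset" }
func (memoryGroup) Name() string  { return "memory" }
func (devicesGroup) Name() string { return "devices" }

// subsystems is a slice, so iteration order is exactly the declaration
// order; cpuset comes first on purpose.
var subsystems = []subsystem{
	cpusetGroup{},
	memoryGroup{},
	devicesGroup{},
}

// order returns the subsystem names in the order they would be applied.
func order() []string {
	names := make([]string, 0, len(subsystems))
	for _, s := range subsystems {
		names = append(names, s.Name())
	}
	return names
}

func main() {
	fmt.Println(order())
}
```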
Michael Crosby 02fdc70837 Add Name() to cgroup subsystems
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-10-15 15:19:23 -07:00
Mrunal Patel 3be7f87b1b Merge pull request #334 from hqhq/hq_set_cpus_mems_first
Set cpuset.cpus and cpuset.mems before join the cgroup
2015-10-15 14:33:28 -07:00
Qiang Huang be6764508e Set cpuset.cpus and cpuset.mems before join the cgroup
It can avoid unnecessary task migration, see this scenario:
 - container init task is on cpu 1, and we assigned it to cpu 1,
   but parent cgroup's cpuset.cpus=2
 - we created the cgroup dir and inherited cpuset.cpus from parent as 2
 - write container init task's pid to cgroup.procs
 - [it's possible the container init task migrated to cpu 2 here]
 - set cpuset.cpus as assigned to cpu 1
 - [the container init task has to be migrated back to cpu 1]

So we should set cpuset.cpus and cpuset.mems before writing pids
to cgroup.procs to avoid such a problem.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-10-15 11:16:56 +08:00
Alexander Morozov 6c198ae2d0 Reorder checks in Walk to avoid panics
Also added test for host PID namespace

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-10-13 15:06:57 -07:00
Alexander Morozov 6dad176d01 Get PIDs from cgroups recursively
Also lookup cgroup for systemd is changed to "device" to be consistent
with fs implementation.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-10-13 10:19:01 -07:00
Adrian Reber c42ef59bf9 Add criu related debug output
While testing different versions of criu it helps to know which
criu binary with which options is currently used. Therefore additional
debug output to display this information is added.

v2: increase readability of printed out criu options

Signed-off-by: Adrian Reber <adrian@lisas.de>
2015-10-13 10:41:00 +02:00
Alexander Morozov d9ba9cebac Merge pull request #184 from huikang/criu-cgroup-manage-mode
Add option to support criu manage cgroups mode for dump and restore
2015-10-12 10:51:16 -07:00
Mrunal Patel bfe2bacbf4 Merge pull request #320 from rhatdan/label
Validate label options
2015-10-11 20:54:38 -07:00
Hui Kang 25da513c4b Add option to support criu manage cgroups mode for dump and restore
CRIU supports cgroup-manage mode from v1.7

Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
2015-10-11 04:42:54 +00:00
Dan Walsh f8b34352fe Validate label options
Only valid options to --security-opt for label should be
disable, user, role, type, level.

Return error on invalid entry

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2015-10-10 06:51:49 -04:00
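The validation rule in the commit above can be sketched as follows. The function name `validateLabelOpts` and the `key:value` option syntax are assumptions made for illustration; only the list of valid options (disable, user, role, type, level) comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// validLabelKeys lists the keys accepted in "key:value" label options.
var validLabelKeys = map[string]bool{
	"user":  true,
	"role":  true,
	"type":  true,
	"level": true,
}

// validateLabelOpts returns an error for the first option that is neither
// "disable" nor a "key:value" pair with a known key.
// Hypothetical helper; the "key:value" form is an assumption.
func validateLabelOpts(opts []string) error {
	for _, opt := range opts {
		if opt == "disable" {
			continue
		}
		parts := strings.SplitN(opt, ":", 2)
		if len(parts) != 2 || !validLabelKeys[parts[0]] {
			return fmt.Errorf("bad label option %q: valid options are disable, user, role, type, level", opt)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateLabelOpts([]string{"type:example_t", "level:s0:c1,c2"}))
	fmt.Println(validateLabelOpts([]string{"colour:blue"}))
}
```

Splitting on the first colon only keeps values such as `s0:c1,c2` intact.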
Mrunal Patel f152edcb1c Merge pull request #316 from cpuguy83/race_on_output_start_error
Fix for race from error on process start
2015-10-08 13:51:54 -07:00
xlgao-zju 02fc164456 change named to names
Signed-off-by: xlgao-zju <xlgao@zju.edu.cn>
2015-10-08 21:44:23 +08:00
Brian Goff 7632c4585f Fix for race from error on process start
This rather naively fixes an error observed where a process's stdio
streams are not written to when there is an error upon starting up the
process, such as when the executable doesn't exist within the
container's rootfs.

Before the "fix", when an error occurred on start, `terminate` is called
immediately, which calls `cmd.Process.Kill()`, then calling `Wait()` on
the process. In some cases when this `Kill` is called the stdio streams
have not yet been written to, causing non-deterministic output. The
error itself is properly preserved but users attached to the process
will not see this error.

With the fix it is just calling `Wait()` when an error occurs rather
than trying to `Kill()` the process first. This seems to preserve stdio.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2015-10-07 21:28:26 -04:00
Alexander Morozov 902c012e85 Merge pull request #319 from dodgerblue/dodgerblue-arm64
nsexec: Align clone child stack ptr to 16
2015-10-06 08:28:24 -07:00
Bogdan Purcareata 4c5eb45862 nsexec: Align clone child stack ptr to 16
This is required on ARM64 builds that use the clone syscall. Check [1].

[1] http://lxr.free-electrons.com/source/arch/arm64/kernel/process.c#L264

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
2015-10-06 10:41:18 +00:00
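The alignment in the commit above is a one-line bit operation. The actual change is in the C nsexec code; the Go sketch below only illustrates the arithmetic, with a hypothetical helper name and a made-up address. It assumes the alignment is a power of two and that the stack grows downwards, so the pointer is rounded down.

```go
package main

import "fmt"

// alignDown rounds addr down to the nearest multiple of align.
// align must be a power of two. Hypothetical helper for illustration.
func alignDown(addr, align uintptr) uintptr {
	return addr &^ (align - 1)
}

func main() {
	top := uintptr(0x7ffd1239) // made-up stack top
	fmt.Printf("%#x -> %#x\n", top, alignDown(top, 16))
}
```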
Antonio Murdaca c5b80bddf1 bump docker pkgs
Docker pkgs were updated while golinting the whole docker code base.
Now when trying to bump libcontainer/runc in docker, it fails compiling
with the following error:
```
vendor/src/github.com/opencontainers/runc/libcontainer/rootfs_linux.go:424:
undefined: mount.MountInfo
```
This is because, for instance, the mount pkg was updated here
0f5c9d301b (diff-49294d05afa48e2f7c0d2f02c6f7614c)
and now that type is only `mount.Info`.
This patch bumps the docker pkgs commit and adapts the code to it.

Signed-off-by: Antonio Murdaca <amurdaca@redhat.com>
2015-10-06 10:48:12 +02:00
Mrunal Patel cc84f2cc9b Merge pull request #305 from hqhq/hq_add_softlimit_systemd
Add memory reservation support for systemd
2015-10-05 16:37:32 -07:00
Mrunal Patel 223975564a Merge pull request #276 from runcom/adapt-spec-96bcd043aa8a28f6f64c95ad61329765f01de1ba
Adapt spec 96bcd043aa
2015-10-05 16:36:09 -07:00
Alexander Morozov d7ce356411 Merge pull request #315 from mrunalp/systemd_name
Systemd name
2015-10-05 15:12:28 -07:00
Mrunal Patel 0b9e7af763 Merge pull request #313 from swagiaal/fix-GetAdditionalGroups
Allow numeric groups for containers without /etc/group
2015-10-05 11:47:36 -07:00
Mrunal Patel 79a02e35fb cgroups: Add name=systemd to list of subsystems
This allows getting the path to the subsystem and so is subsequently
used in EnterPid by an exec process.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-10-05 14:24:11 -04:00
Mrunal Patel 1940c73777 cgroups: Add a name cgroup
This is meant to be used in retrieving the paths so an exec
process enters all the cgroup paths correctly.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-10-05 14:23:05 -04:00
Sami Wagiaalla c25c38cc80 Allow numeric groups for containers without /etc/group
/etc/group is not needed when specifying numeric group ids. This
change allows containers without /etc/group to specify numeric
supplemental groups.

Signed-off-by: Sami Wagiaalla <swagiaal@redhat.com>
2015-10-04 19:02:35 -04:00
xlgao-zju 4b360d6300 change uid to gid in func HostGID
Signed-off-by: xlgao-zju <xlgao@zju.edu.cn>
2015-10-05 01:11:48 +08:00
Antonio Murdaca c6e406af24 Adjust runc to new opencontainers/specs version
Godeps: Vendor opencontainers/specs 96bcd043aa

Fix a bug where it's impossible to pass multiple devices to blkio
cgroup controller files. See https://github.com/opencontainers/runc/issues/274

Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-10-03 12:25:33 +02:00
Alexander Morozov c573ffbd05 Merge pull request #208 from rhvgoyal/config-rootfsPropagation
Create container_private, container_slave and container_shared modes for rootfsPropagation
2015-10-02 13:42:20 -07:00
Vivek Goyal 6a851e1195 exec_test.go: Test case for rootfsPropagation="private"
A test case to test rootfsPropagation="private" and making sure shared
volumes work.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-01 17:03:02 -04:00
Vivek Goyal 175e4b8aec exec_test.go: Test cases for rootfsPropagation=rslave
test case to test rootfsPropagation=rslave

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-01 17:03:02 -04:00
Vivek Goyal da8d776c08 Make pivotDir rprivate
pivotDir is the one where pivot_root() call puts the old root. We will
unmount pivotDir() and delete it.

Previously we always made / either rslave or rprivate. That meant
that pivotDir() could never have mounts shared with the parent mount
namespace. It also meant that unmounting pivotDir() was safe: none of
the unmounts would propagate to the parent namespace and unmount things
we did not want unmounted.

But now the user can request private, shared, or slave propagation on /.
That means some of the mounts we inherited from the parent could be shared,
so if we umount pivotDir/, those mounts will get unmounted in the parent
too. That's not what we want.

Instead make pivotDir rprivate so that unmounts don't propagate back to
parent.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-01 17:03:02 -04:00
Vivek Goyal 23ec72a426 Make parent mount of container root private if it is shared.
pivot_root() imposes a number of restrictions and fails if they are not
met. In particular, the parent mount of the container root cannot be
shared, otherwise pivot_root() will fail.

So far parent could not be shared as we marked everything either private
or slave. But now we have introduced new propagation modes where parent
mount of container rootfs could be shared and pivot_root() will fail.

So check if parent mount is shared and if yes, make it private. This will
make sure pivot_root() works.

Also it will make sure that when we bind mount container rootfs, it does
not propagate to parent mount namespace. Otherwise cleanup becomes a 
problem.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-01 17:03:02 -04:00
Vivek Goyal 5dd6caf6cf Replace config.Privatefs with config.RootPropagation
Right now config.Privatefs is a boolean which determines if / is applied
with propagation flag syscall.MS_PRIVATE | syscall.MS_REC or not.

Soon we want to represent other propagation states like private, [r]slave,
and [r]shared. So either we can introduce more boolean variable or keep
track of propagation flags in an integer variable. Keeping an integer
variable is more versatile and can allow various kind of propagation flags
to be specified. So replace Privatefs with RootPropagation which is an
integer.

Note, this will require changes in docker. Instead of setting Privatefs
to true, they will need to set:

config.RootPropagation = syscall.MS_PRIVATE | syscall.MS_REC
 
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-01 17:03:02 -04:00
Alexander Morozov 0954faba13 Merge pull request #306 from hqhq/hq_join_perfevent_systemd
Systemd: Join perf_event cgroup
2015-10-01 10:05:35 -07:00
Alexander Morozov 4d5079b9dc Merge pull request #309 from chenchun/fix_reOpenDevNull
Fix reOpenDevNull
2015-09-30 19:06:43 -07:00
Alexander Morozov fba07bce72 Merge pull request #307 from estesp/no-remount-if-unecessary
Only remount if requested flags differ from current
2015-09-30 11:40:06 -07:00
Mrunal Patel 74ded3660b Merge pull request #304 from rhatdan/mountproc
/proc and /sys do not support labeling
2015-09-30 11:36:20 -07:00
Michael Crosby 146916ca93 Merge pull request #308 from LK4D4/fix_tlb_tests
Run tests for all HugetlbSizes
2015-09-30 11:26:40 -07:00
Chun Chen 06d91f546f Fix reOpenDevNull
We should open /dev/null with os.O_RDWR, otherwise it won't be
possible to write to it

Signed-off-by: Chun Chen <ramichen@tencent.com>
2015-09-30 16:05:49 +08:00
Phil Estes 97f5ee4e6a Only remount if requested flags differ from current
Do not remount a bind mount to enable flags unless non-default flags are
provided for the requested mount. This solves a problem with user
namespaces and remount of bind mount permissions.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-09-29 23:13:04 -04:00
Alexander Morozov e32b3442ec Run tests for all HugetlbSizes
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-09-29 17:08:41 -07:00
Qiang Huang 6a5ba1109c Systemd: Join perf_event cgroup
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-29 15:42:29 +08:00
Qiang Huang fb5a56fb97 Add memory reservation support for systemd
It seems this was missed in the first place.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-29 10:02:12 +08:00
Dan Walsh cab342f0de Check for failure on /dev/mqueue and try again without labeling
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2015-09-28 12:31:52 -04:00
Dan Walsh b4dcb75503 /proc and /sys do not support labeling
This is causing docker to crash when --selinux-enforcing mode is set.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2015-09-28 12:31:52 -04:00
Mrunal Patel f7d1401a69 Add validation for sysctl
/proc/sys isn't completely namespaced and only some properties are allowed
per linux namespace.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-25 14:04:18 -04:00
Alexander Morozov 902ccd0f18 Merge pull request #302 from mrunalp/cap_list
Update github.com/syndtr/gocapability/capability to 2c00daeb6c3b4
2015-09-25 08:49:44 -07:00
Mrunal Patel c5d3bda7e1 Merge pull request #292 from keloyang/rpid
no need to use p.cmd.Process.Pid in function, use p.pid() instead.
2015-09-24 15:59:39 -07:00
Mrunal Patel 34d3e2b948 Update github.com/syndtr/gocapability/capability to 2c00daeb6c3b45114c80ac44119e7b8801fdd852
This allows us to use the capability.List() function to construct capability list
dynamically.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-24 18:44:01 -04:00
Alexander Morozov aac9179bba Merge pull request #160 from mrunalp/feature/hooks
Add prestart/poststop hooks to runc
2015-09-24 14:52:30 -07:00
Michael Crosby 203d3e258e Move mount methods out of configs pkg
Do not have methods and actions that require syscalls in the configs
package because it breaks cross compile.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-24 09:43:12 -07:00
Mrunal Patel dcafe48737 Add version to HookState to make it json-compatible with spec State
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-23 17:13:00 -07:00
Alexander Morozov 83b2975c8b Merge pull request #295 from mheon/seccomp_architecture
Libcontainer: Add support for multiple architectures in Seccomp
2015-09-23 11:08:47 -07:00
Matthew Heon 795a6c9702 Libcontainer: Add support for multiple architectures in Seccomp
This commit allows additional architectures to be added to Seccomp filters
created by containers. This allows containers to make syscalls using these
architectures. For example, in a container on an AMD64 system, only AMD64
syscalls would be usable unless x86 was added to the filter using this patch,
which would allow both 32-bit and 64-bit syscalls to be used.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2015-09-23 13:54:24 -04:00
Michael Crosby 5765dcd086 Merge pull request #296 from crosbymichael/mount-resolv-symlink
Change mount dest after resolving symlinks
2015-09-23 10:21:25 -07:00
Michael Crosby b3bb606513 Change mount dest after resolving symlinks
We need to update the mount's destination after we resolve symlinks so
that it properly creates and mounts the correct location.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-23 10:07:18 -07:00
keloyang 69a5b2df9e no need to use p.cmd.Process.Pid in function, use p.pid() instead.
Signed-off-by: keloyang <yangshukui@huawei.com>
2015-09-23 10:48:36 +08:00
Mrunal Patel d8b7deaf4c Merge pull request #283 from runcom/cleanup-unused-func-args
Cleanup unused func arguments
2015-09-22 16:53:19 -07:00
Mrunal Patel 7570169548 Merge pull request #288 from gitido/fix_userns
Enter existing user namespace if present
2015-09-22 16:27:57 -07:00
Michael Crosby 219b6c99e0 Ignore changing /dev/null permissions if used in STDIO
Whenever /dev/null is used as one of the main process's STDIO streams, do not
try to change its permissions via fchown, because we should not do it
in the first place and it will also fail if the container is supposed
to be read-only.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-22 15:32:31 -07:00
Ido Yariv 08366a8597 Enter existing user namespace if present
When executing an additional process in a container, all namespaces are
entered except the user namespace. As a result, the process may be
executed as the host's root user. This has both functionality and
security implications.

Fix this by adding the missing user namespace to the array of
namespaces. Since joining a user namespace in which the caller is
already a member yields an error, skip namespaces we're already in.

Last, remove a needless and buggy AT_SYMLINK_NOFOLLOW in the code.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-09-21 21:49:52 -04:00
Antonio Murdaca d6e6462478 Cleanup unused func arguments
Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-09-21 11:50:29 +02:00
Michael Crosby 0dad64f7ad Fix STDIO permissions when container user not root
Fix the permissions of the container's main process's STDIO when the
process is not run as the root user.  This changes the permissions right
before switching to the specified user so that its STDIO matches its
UID and GID.

Add a test for checking that the STDIO of the process is owned by the
specified user.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-18 14:11:29 -07:00
Vivek Goyal d1f4a5b8b5 libcontainer: Allow passing mount propagation flags
Right now, if one passes a mount propagation flag in the spec file, it
does not take effect. For example, try the following in the spec JSON file.

{
  "type": "bind",
  "source": "/root/mnt-source",
  "destination": "/root/mnt-dest",
  "options": "rbind,shared"
}

One would expect that /root/mnt-dest will be shared inside the container
but that's not the case.

#findmnt -o TARGET,PROPAGATION
`-/root/mnt-dest                      private

The reason is that propagation flags can't be passed in along with other
regular flags. They need to be passed in a separate call to the mount syscall,
and only one propagation flag at a time (see the mount man page).

Hence, store propagation flags separately in a slice and apply these
in that order after the mount call wherever appropriate. This allows
the user to control the propagation property of a mount point inside
the container.

Storing them separately also solves another problem, where the recursive flag
(syscall.MS_REC) can get mixed up. For example, options "rbind,private"
and "bind,rprivate" would be the same, and there would be no way to
differentiate between them if all the flags were stored in a single integer.

This patch would allow one to pass propagation flags "[r]shared,[r]slave,
[r]private,[r]unbindable" in spec file as per mount property.
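
The separation described above can be illustrated with a minimal Go sketch. It is not the code from this patch: the function name splitMountOptions and the local constants are assumptions made for illustration (the constants mirror the Linux MS_* values), and only bind and propagation options are handled.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the Linux mount flag values, defined here so the sketch
// does not depend on a platform-specific package.
const (
	msBind       = 1 << 12 // MS_BIND
	msRec        = 1 << 14 // MS_REC
	msUnbindable = 1 << 17 // MS_UNBINDABLE
	msPrivate    = 1 << 18 // MS_PRIVATE
	msSlave      = 1 << 19 // MS_SLAVE
	msShared     = 1 << 20 // MS_SHARED
)

// splitMountOptions (hypothetical helper) keeps regular flags in one integer
// and collects propagation flags in a slice, one entry per option, in order.
func splitMountOptions(options string) (flags int, propagation []int) {
	regular := map[string]int{
		"bind":  msBind,
		"rbind": msBind | msRec,
	}
	prop := map[string]int{
		"shared":      msShared,
		"rshared":     msShared | msRec,
		"slave":       msSlave,
		"rslave":      msSlave | msRec,
		"private":     msPrivate,
		"rprivate":    msPrivate | msRec,
		"unbindable":  msUnbindable,
		"runbindable": msUnbindable | msRec,
	}
	for _, o := range strings.Split(options, ",") {
		o = strings.TrimSpace(o)
		if f, ok := regular[o]; ok {
			flags |= f
		} else if f, ok := prop[o]; ok {
			propagation = append(propagation, f)
		}
	}
	return flags, propagation
}

func main() {
	for _, opts := range []string{"rbind,private", "bind,rprivate"} {
		flags, prop := splitMountOptions(opts)
		fmt.Printf("%-14s flags=%#x propagation=%#x\n", opts, flags, prop)
	}
}
```

With the two sets kept apart, "rbind,private" and "bind,rprivate" stay distinguishable, whereas OR-ing everything into one integer would give the same value for both.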

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-09-16 15:53:23 -04:00
Mrunal Patel ec37110957 Update README for the CAP prefix change
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-15 14:44:12 -04:00
Mrunal Patel 859abee0c8 Add CAP prefix for capabilities
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-15 14:43:03 -04:00
Mrunal Patel 4d8e13fc3e Merge pull request #43 from LK4D4/new_netlink
New netlink library
2015-09-14 14:01:07 -07:00
Mrunal Patel 486ac97618 Merge pull request #236 from hqhq/hq_fix_cgroup_rw
Always remount for bind mount
2015-09-14 12:08:34 -07:00
Rajasekaran 2940f73a14 make localtest failure on removing seccomp flag
Signed-off-by: Rajasekaran <rajasec79@gmail.com>
2015-09-12 14:43:55 +05:30
Mrunal Patel ef9471fd5b Merge pull request #253 from avagin/cr-cgroups
c/r: create cgroups to restore a container
2015-09-11 18:03:40 -07:00
Alexander Morozov b0fd9fb75a Merge pull request #220 from crosbymichael/build-tags
Add seccomp build tag
2015-09-11 12:06:27 -07:00
Michael Crosby a8e0185d97 Add seccomp build tag
Add a seccomp build tag and also support in the Makefile to add or
remove build tags.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-11 12:03:57 -07:00
David Calavera 0f28592b35 Turn hook pointers into values.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-11 11:34:34 -07:00
Michael Crosby dd969cbacd Add test for function based hooks
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-10 18:15:00 -07:00
Mrunal Patel 1dca365393 Add test for prestart hook
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>

Conflicts:
	libcontainer/integration/exec_test.go
2015-09-10 17:59:36 -07:00
Michael Crosby 05567f2c94 Implement hooks in libcontainer
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-10 17:57:31 -07:00
Andrey Vagin df39686c93 c/r: create cgroups to restore a container
Here are two reasons:
* If we use systemd, we need to ask it to create cgroups
* If a container is restored with another ID, we need to
  change paths to cgroups.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
2015-09-10 21:00:27 +03:00
Andrey Vagin da2535f2d1 mount: don't read /proc/self/cgroup many times
Signed-off-by: Andrey Vagin <avagin@openvz.org>
2015-09-10 21:00:22 +03:00
Andrey Vagin e49c1dc559 Rework ParseCgroupFile
Currently we parse /proc/self/cgroup for each controller.
This is inefficient.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
2015-09-10 20:59:27 +03:00
Alexander Morozov 24f4d5d1fd Remove old netlink library
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-09-09 19:38:02 -07:00
Alexander Morozov 916bd6bd68 Use github.com/vishvananda/netlink for networking
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-09-09 19:32:46 -07:00
Qiang Huang b94fe5b7f8 Fix bug in find cgroup mount point dir
Bug was introduced in #250

According to: http://man7.org/linux/man-pages/man5/proc.5.html

36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
(1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)
...
(7)  optional fields: zero or more fields of the form
       "tag[:value]".
The 7th field is optional. We should skip it when parsing mount info.
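
One way to skip the optional fields is to split on the " - " separator first, as in this Go sketch. It is an illustration under assumptions, not the fix from this commit: the function name parseMountinfoLine is hypothetical, and only the root, mount point and filesystem type are extracted.

```go
package main

import (
	"fmt"
	"strings"
)

// parseMountinfoLine (hypothetical helper) extracts the root (4), the mount
// point (5) and the filesystem type (9) from one mountinfo line. Splitting
// on the " - " separator means any number of optional fields (7) is
// skipped without counting them.
func parseMountinfoLine(line string) (root, mountpoint, fstype string, err error) {
	parts := strings.SplitN(line, " - ", 2)
	if len(parts) != 2 {
		return "", "", "", fmt.Errorf("no separator in mountinfo line: %q", line)
	}
	pre := strings.Fields(parts[0])  // fields (1) to (6), plus optional fields
	post := strings.Fields(parts[1]) // fields (9) to (11)
	if len(pre) < 6 || len(post) < 1 {
		return "", "", "", fmt.Errorf("too few fields in mountinfo line: %q", line)
	}
	return pre[3], pre[4], post[0], nil
}

func main() {
	line := "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue"
	root, mountpoint, fstype, err := parseMountinfoLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(root, mountpoint, fstype)
}
```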

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-10 08:29:12 +08:00
Qiang Huang f2ec7eff7e Rename FindCgroupMountpointAndSource
Rename it to FindCgroupMountpointAndRoot.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-09 09:29:11 +08:00
Qiang Huang bc67941c72 Parse directly in FindCgroupMountpointDir
Unify it with FindCgroupMountpoint, and add comments explaining why
we should do this.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-09-09 09:28:50 +08:00
Alexander Morozov 05b1cda5dd Merge pull request #235 from hqhq/hq_fix_cgroup_test
Fix cgroup mount tests
2015-09-01 14:57:44 -07:00
Vishnu Kannan cc232c4707 Adding oom_score_adj as a container config param.
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2015-08-31 14:02:59 -07:00
Qiang Huang 085f465c00 Fix cgroup mount tests
I got:
```
exec_test.go:823: Mode expected to contain 'ro,nosuid,nodev,noexec': tmpfs on /sys/fs/cgroup type tmpfs (ro,seclabel,nosuid,nodev,noexec,relatime,mode=755
```

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-08-31 11:23:18 +08:00
Qiang Huang b7385e291c Always remount for bind mount
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-08-31 11:10:34 +08:00
Michael Crosby b1e7041957 Merge pull request #165 from calavera/context_labels
Make label.Relabel safer.
2015-08-28 14:20:00 -07:00
Matthew Heon 2ee6d1e8b6 Connect Seccomp configuration in Spec to configuration in Libcontainer
Signed-off-by: Matthew Heon <mheon@redhat.com>
2015-08-25 17:35:06 -04:00
Mrunal Patel 2f4c229a8c Merge pull request #215 from boucher/huikang-patch
Add hooks for passing explicit veth pairs for forwarding to CRIU
2015-08-24 21:23:29 -07:00
Hui Kang 7f23085c82 Add hooks for passing explicit veth pairs for forwarding to CRIU.
Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
2015-08-24 09:26:39 -07:00
boucher 8c812d0f50 Add the criu log file path to the failure message.
Signed-off-by: Ross Boucher <rboucher@gmail.com>
2015-08-21 14:20:59 -07:00
Mrunal Patel e7663a673e Merge pull request #70 from mheon/seccomp
Convert Seccomp support to use Libseccomp
2015-08-21 12:25:33 -07:00
Lai Jiangshan e48363d777 simplify a variable declaration
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
2015-08-20 08:21:44 +08:00
Mrunal Patel ca8831fa75 Merge pull request #183 from rajasec/securityfs
Adding securityfs mount
2015-08-18 14:24:38 -07:00
Mrunal Patel c20bda3f71 Merge pull request #206 from mountkin/ensure-cleanup
Ensure the cleanup jobs in the deferrer are executed on error
2015-08-18 14:16:31 -07:00
Michael Crosby b0ca535f75 Merge pull request #194 from LK4D4/fix_cgroups_again
Fix cgroups again
2015-08-18 13:49:31 -07:00
Michael Crosby c6b6be21c5 Merge pull request #199 from clnperez/ifrdatabyte-sign-pr
Fixing netlink build error on ppc64le with gccgo
2015-08-18 13:48:59 -07:00
rajasec 8cdc409715 Fixing tmpfs
Signed-off-by: rajasec <rajasec79@gmail.com>
2015-08-17 06:22:48 +05:30
Shijiang Wei f0679089b9 Ensure the cleanup jobs in the deferrer are executed on error
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
2015-08-16 12:29:04 +08:00
Michael Chase-Salerno 9bc81d1699 Fixing netlink build error on ppc64le with gccgo
Again. It looks like a build tag was somehow dropped between
the PR here: https://github.com/docker/libcontainer/pull/625
and the move to runc.

Signed-off-by: Christy Perez <clnperez@linux.vnet.ibm.com>
2015-08-13 17:52:47 -05:00
Matthew Heon a6b73dbc73 Remove Seccomp build tag to fix godep
Signed-off-by: Matthew Heon <mheon@redhat.com>
2015-08-13 15:23:43 -04:00
Matthew Heon 59264040bd Update tests to not error on library v2.2.0 and lower
As v2.1.0 is no longer required for successful testing, do not build it in the
Dockerfile - instead just use the version Ubuntu ships.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2015-08-13 09:36:21 -04:00
Matthew Heon 2ae581ae62 Convert Seccomp support to use Libseccomp
This removes the existing, native Go seccomp filter generation and replaces it
with Libseccomp. Libseccomp is a C library which provides architecture
independent generation of Seccomp filters for the Linux kernel.

This adds a dependency on v2.2.1 or above of Libseccomp.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2015-08-13 07:56:27 -04:00
Lai Jiangshan e8817e1104 Simplify the return on process wait
Simplify the code introduced by the commit d1f0d5705deb:
    Return actual ProcessState on Wait error

Cc: Alexander Morozov <lk4d4@docker.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
2015-08-12 22:37:34 +08:00
Alexander Morozov 2b28b3c276 Always use cgroup root of current process
Because for the host PID namespace, /proc/1/cgroup can point to a whole other
world of cgroups.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-11 18:04:59 -07:00
Alexander Morozov 5aa6005498 Revert "Fix cgroup parent searching"
This reverts commit 2f9052ca29.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-11 18:04:55 -07:00
Alexander Morozov 2f9052ca29 Fix cgroup parent searching
My input data happened to be convenient enough that I missed this bug.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-10 14:30:05 -07:00
rajasec 24f7a10a93 Adding securityfs mount
Signed-off-by: rajasec <rajasec79@gmail.com>
2015-08-05 16:50:08 +05:30
Mrunal Patel f3a3025933 Fix minor stylistic issues
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-08-04 17:44:45 -04:00
Mrunal Patel c9d5850629 Don't make modifications to /dev when there are no devices in the configuration
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-08-04 16:57:29 -04:00
Michael Crosby a5ef75b681 Add signal API to Container interface
This adds a `Signal()` method to the container interface so that the
initial process can be signaled after a Load or operation.  It also
implements signaling the init process from a nonChildProcess.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-08-03 17:07:29 -07:00
Mrunal Patel ce0a339632 Merge pull request #166 from gitido/fixes
Go1.5 compatibility fix
2015-08-03 13:51:26 -07:00
Michael Crosby 76e706f856 Merge pull request #151 from LK4D4/use_proc_exe
Use /proc/self/exe as default for InitPath
2015-08-03 16:15:33 -04:00
Michael Crosby b1821a4edc Merge pull request #150 from runcom/update-go-systemd-dbus-v3
Update go systemd dbus v3
2015-08-03 16:11:52 -04:00
Ido Yariv 86a85582d2 Don't set /proc/<PID>/setgroups to deny in Go1.5
A boolean field named GidMappingsEnableSetgroups was added to
SysProcAttr in Go1.5. This field determines the value of the process's
setgroups proc entry.

Since the default is to set the entry to 'deny', calling setgroups will
fail on systems running kernels 3.19+.

Set GidMappingsEnableSetgroups to true so setgroups won't be set to
'deny'.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-08-03 14:59:15 -04:00
Hui Kang 0f66ff921a Add debug message when unable to execute criu
Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
2015-08-03 17:09:45 +00:00