Commit Graph

79 Commits

Author SHA1 Message Date
Aleksa Sarai 2cd9c31b99
nsenter: guarantee correct user namespace ordering
Depending on your SELinux setup, the order in which you join namespaces
can be important. In general, user namespaces should *always* be joined
and unshared first because then the other namespaces are correctly
pinned and you have the right priviliges within them. This also is very
useful for rootless containers, as well as older kernels that had
essentially broken unshare(2) and clone(2) implementations.

This also includes huge refactorings in how we spawn processes for
complicated reasons that I don't want to get into because it will make
me spiral into a cloud of rage. The reasoning is in the giant comment in
clone_parent. Have fun.

In addition, because we now create multiple children with CLONE_PARENT,
we cannot wait for them to SIGCHLD us in the case of a death. Thus, we
have to resort to having a child kindly send us their exit code before
they die. Hopefully this all works okay, but at this point there's not
much more than we can do.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-10-04 16:17:55 +11:00
Aleksa Sarai ed053a740c
nsenter: specify namespace type in setns()
This avoids us from running into cases where libcontainer thinks that a
particular namespace file is a different type, and makes it a fatal
error rather than causing broken functionality.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-10-04 16:17:55 +11:00
Wang Long 59a241f647 update the comment for container.Pause() method on linux
if a container state is running or created, the container.Pause()
method can set the state to pausing, and then paused.

this patch update the comment, so it can be consistent with the code.

Signed-off-by: Wang Long <long.wanglong@huawei.com>
2016-09-20 10:49:04 +08:00
Qiang Huang 1e319efa36 Merge pull request #815 from rajasec/basecont-comments
Updated the libcontainer interface comments
2016-08-26 09:43:50 +08:00
Michael Crosby 46d9535096 Merge pull request #934 from macrosheep/fix-initargs
Fix and refactor init args
2016-08-24 10:06:01 -07:00
rajasec 1ea17d73fe Updated the libcontainer interface comments
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-08-23 19:14:27 +05:30
Phil Estes 85f4d20b44
Restored-from-checkpoint containers should have a start time
Set the start time similar to a brand new container.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2016-08-21 18:15:18 -04:00
Qiang Huang 41b12c095b Merge pull request #913 from cloudfoundry-incubator/addgroupsnocompatible
Let the user explicitly specify `additionalGids` on `runc exec`
2016-07-15 10:12:31 +08:00
Yang Hongyang a59d63c5d3 Fix and refactor init args
1. According to docs of Cmd.Path and Cmd.Args from package "os/exec":
   Path is the path of the command to run. Args holds command line
   arguments, including the command as Args[0]. We have mixed usage
   of args. In InitPath(), InitArgs only take arguments, in InitArgs(),
   InitArgs including the command as Args[0]. This is confusing.
2. InitArgs() already have the ability to configure a LinuxFactory
   with the provided absolute path to the init binary and arguements as
   InitPath() does.
3. exec.Command() will take care of serching executable path.
4. The default "/proc/self/exe" instead of os.Args[0] is passed to
   InitArgs in order to allow relative path for the runC binary.

Signed-off-by: Yang Hongyang <imhy.yang@gmail.com>
2016-07-06 23:21:02 -04:00
Qiang Huang 14e95b2aa9 Make state detection precise
Fixes: https://github.com/opencontainers/runc/issues/871

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-07-05 08:24:13 +08:00
Petar Petrov f9b72b1b46 Allow additional groups to be overridden in exec
Signed-off-by: Julian Friedman <julz.friedman@uk.ibm.com>
Signed-off-by: Petar Petrov <pppepito86@gmail.com>
Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>
2016-06-21 10:35:11 +03:00
Mrunal Patel f5b6ff23b8 Merge pull request #881 from rajasec/update-status
Update for stopped container
2016-06-13 16:05:25 -07:00
Michael Crosby 3aacff695d Use fifo for create/start
This removes the use of a signal handler and SIGCONT to signal the init
process to exec the users process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-06-13 11:26:53 -07:00
rajasec 12869604ca Update for stopped container
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-06-04 22:08:08 +05:30
Michael Crosby 1d61abea46 Allow delete of created container
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-06-02 12:26:12 -07:00
Michael Crosby 6eba9b8ffb Fix SystemError and env lookup
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:10:47 -07:00
Michael Crosby efcd73fb5b Fix signal handling for unit tests
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:10:47 -07:00
Michael Crosby 30f1006b33 Fix libcontainer states
Move initialized to created and destoryed to stopped.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:06:41 -07:00
Michael Crosby 3fe7d7f31e Add create and start command for container lifecycle
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:06:41 -07:00
Andrew Vagin c161e65ac6 cr: don't fill veth devices if netns is in EmptyNs
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-05-28 01:19:54 +03:00
Qiang Huang b6e23f8166 Add comments for error cases in status functions
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-05-16 18:24:07 +08:00
Michael Crosby 7dd87976ed Merge pull request #758 from rajasec/container-pause-comment
Update the comment for container pause
2016-04-19 16:16:41 -07:00
Michael Crosby 6978875298 Add cause to error messages
This is the inital port of the libcontainer.Error to added a cause to
all the existing error messages.  Going forward, when an error can be
wrapped because it is not being checked at the higher levels for
something like `os.IsNotExist` we can add more information to the error
message like cause and stack file/line information.  This will help
higher level tools to know what cause a container start or operation to
fail.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-04-18 11:37:26 -07:00
rajasec ccbd0a176f Update the comment for container pause
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-04-16 14:59:19 +05:30
Akihiro Suda 1829531241 Fix trivial style errors reported by `go vet` and `golint`
No substantial code change.
Note that some style errors reported by `golint` are not fixed due to possible compatibility issues.

Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>
2016-04-12 08:13:16 +00:00
George Lestaris f7ae27bfb7 HookState adhears to OCI
Signed-off-by: George Lestaris <glestaris@pivotal.io>
Signed-off-by: Ed King <eking@pivotal.io>
2016-04-06 16:57:59 +01:00
Peng Gao 3fa246609c Fix typo
Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
2016-03-27 12:44:16 +08:00
Jessica Frazelle 2c5b10189c
remove deadcode
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-17 13:36:28 -07:00
Michael Crosby 732a0fb440 Merge pull request #638 from hqhq/hq_fix_bootstrapData
Fix encoding gid mappings
2016-03-14 11:55:12 -07:00
Mrunal Patel 459efccb0a Merge pull request #576 from avagin/cr
Call Prestart hooks before restoring processes
2016-03-14 11:21:29 -07:00
Qiang Huang 2f2c83a2a0 Fix encoding gid mappings
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-03-12 13:18:42 +08:00
Michael Crosby 20422c9bd9 Update libcontainer to support rlimit per process
This updates runc and libcontainer to handle rlimits per process and set
them correctly for the container.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-10 14:35:16 -08:00
Michael Crosby 3cc90bd2d8 Add support for process overrides of settings
This commit adds support to libcontainer to allow caps, no new privs,
apparmor, and selinux process label to the process struct so that it can
be used together of override the base settings on the container config
per individual process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-03 11:41:33 -08:00
Ido Yariv 78f5148c67 Fix handling of unsupported namespaces
currentState() always adds all possible namespaces to the state,
regardless of whether they are supported.
If orderNamespacePaths detects an unsupported namespace, an error is
returned that results in initialization failure.

Fix this by only adding paths of supported namespaces to the state.

Signed-off-by: Ido Yariv <ido@wizery.com>
2016-03-02 10:16:51 -05:00
Daniel, Dao Quang Minh 42d5d04801 Sets custom namespaces for init processes
An init process can join other namespaces (pidns, ipc etc.). This leverages
C code defined in nsenter package to spawn a process with correct namespaces
and clone if necessary.

This moves all setns and cloneflags related code to nsenter layer, which mean
that we dont use Go os/exec to create process with cloneflags and set
uid/gid_map or setgroups anymore. The necessary data is passed from Go to C
using a netlink binary-encoding format.

With this change, setns and init processes are almost the same, which brings
some opportunity for refactoring.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
[mickael.laventure@docker.com: adapted to apply on master @ d97d5e]
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@docker.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh d6bf4049f8 OrderNamespacePaths gets correct order of ns
This adds orderNamespacePaths to get correct order of namespaces for the
bootstrap program to join.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
Stefan Berger 5fbf791e31 Create unique session key name for every container
Create a unique session key name for every container. Use the pattern
_ses.<postfix> with postfix being the container's Id.

This patch does not prevent containers from joining each other's session
keyring.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2016-02-24 08:39:52 -05:00
Andrew Vagin b8121e8998 checkpoint: call Prestart hooks on restore before restoring processes
Docker uses Prestart hooks to call a libnetwork hook to create
network devices and set addesses and routes.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:40:26 +03:00
Andrew Vagin 46c25be297 checkpoint: add support of the EmptyNs criu option
This options is set a namespace mask which will not be dumped and restored.
For example, we are going to use this option to restore network
for docker containers. CRIU will create a network namespace and
call a libnetwork hook to restore network devices, addresses and routes.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:40:26 +03:00
Andrew Vagin a2a771b8e2 libcontainer: update criurpc.proto
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:38:02 +03:00
Michael Crosby 1172a1e1e5 Update list command and created methods
We don't need a CreatedTime method on the container because it's not
part of the interface and can be received via the state.  We also do not
need to call it CreateTime because the type of this field is time.Time
so we know its time.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-28 13:32:24 -08:00
Michael Crosby 480e5f4416 Merge pull request #507 from mikebrow/runc-ls-command
adds list command
2016-01-28 13:20:07 -08:00
Mike Brown 4c871267db adds list command, and a timestamp in the container state
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2016-01-28 14:21:06 -06:00
Michael Crosby ddcee3cc2a Do not use stream encoders
Marshall the raw objects for the sync pipes so that no new line chars
are left behind in the pipe causing errors.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-26 11:22:05 -08:00
Michael Crosby 556f798a19 Fix various state bugs for pause and destroy
There were issues where a process could die before pausing completed
leaving the container in an inconsistent state and unable to be
destoryed.  This makes sure that if the container is paused and the
process is dead it will unfreeze the cgroup before removing them.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-21 16:43:33 -08:00
Doug Davis 49dfa1b62d Remove some hard coded strings
Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-01-19 19:02:31 -08:00
Mrunal Patel 6259f09e97 Merge pull request #426 from gitido/pressure_level
libcontainer: Add support for memcg pressure notifications
2016-01-14 16:23:07 -08:00
Jimmi Dyson 91c7024e52 Revert to non-recursive GetPids, add recursive GetAllPids
Signed-off-by: Jimmi Dyson <jimmidyson@gmail.com>
2016-01-08 19:42:25 +00:00
Ido Yariv 55a8d686a9 libcontainer: Add support for memcg pressure notifications
It may be desirable to receive memory pressure levels notifications
before the container depletes all memory. This may be useful for
handling cases where the system thrashes when reaching the container's
memory limits.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-12-28 13:36:55 -05:00
Michael Crosby 4415446c32 Add state pattern for container state transition
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add state status() method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Allow multiple checkpoint on restore

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handle leave-running state

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Fix state transitions for inprocess

Because the tests use libcontainer in process between the various states
we need to ensure that that usecase works as well as the out of process
one.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove isDestroyed method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handling Pausing from freezer state

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

freezer status

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Fixing review comments

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Added comment when freezer not available

Signed-off-by: Rajasekaran <rajasec79@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Conflicts:
	libcontainer/container_linux.go

Change checkFreezer logic to isPaused()

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove state base and factor out destroy func

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add unit test for state transitions

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-17 13:55:38 -08:00