Commit Graph

205 Commits

Author SHA1 Message Date
Qiang Huang 94cfb7955b Merge pull request #1387 from avagin/freezer
Don't try to read freezer.state from the current directory
2017-04-24 20:02:45 -05:00
Mrunal Patel 97db1eaad9 Merge pull request #1396 from harche/cstate
Set container state only once during start
2017-04-17 11:32:42 -07:00
Mrunal Patel 7814a0d14b Merge pull request #1399 from avagin/cr-cgroup
restore: apply resource limits
2017-04-13 11:28:28 -07:00
Andrei Vagin 57ef30a2ae restore: apply resource limits
When C/R was implemented, it was enough to call manager.Set to apply
limits and to move a task. Now .Set() and .Apply() have to be called
separately.

Fixes: 8a740d5391 ("libcontainer: cgroups: don't Set in Apply")
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-04-07 02:47:43 +03:00
Adrian Reber 273b7853c8 checkpoint: check if system supports pre-dumping
Instead of relying on version numbers it is possible to check if CRIU
actually supports certain features. This introduces an initial
implementation to check if CRIU and the underlying kernel actually
support dirty memory tracking for memory pre-dumping.

Upstream CRIU also supports the lazy-page migration feature check and
additional feature checks can be included in CRIU to reduce the version
number parsing. There are also certain CRIU features which depend on one
side on the CRIU version but also require certain kernel versions to
actually work. CRIU knows if it can do certain things on the kernel it
is running on and using the feature check RPC interface makes it easier
for runc to decide if the criu+kernel combination will support that
feature.

Feature checking was introduced with CRIU 1.8. Running with older CRIU
versions will ignore the feature check functionality and behave just
like it used to.

v2:
 - Do not use reflection to compare requested and responded
   features. Checking which feature is available is now hardcoded
   and needs to be adapted for every new feature check. The code
   is now much more readable and simpler.

v3:
 - Move the variable criuFeat out of the linuxContainer struct,
   as it is not container specific. Now it is a global variable.

Signed-off-by: Adrian Reber <areber@redhat.com>
2017-04-06 11:17:52 +00:00
Harshal Patil 1be5d31da2 Set container state only once during start
Signed-off-by: Harshal Patil <harshal.patil@in.ibm.com>
2017-04-04 15:08:04 +05:30
Aleksa Sarai f0876b0427
libcontainer: configs: add proper HostUID and HostGID
Previously Host{U,G}ID only gave you the root mapping, which isn't very
useful if you are trying to do other things with the IDMaps.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00
Aleksa Sarai baeef29858
rootless: add rootless cgroup manager
The rootless cgroup manager acts as a noop for all set and apply
operations. It is just used for rootless setups. Currently this is far
too simple (we need to add opportunistic cgroup management), but is good
enough as a first-pass at a noop cgroup manager.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00
Aleksa Sarai d2f49696b0
runc: add support for rootless containers
This enables the support for the rootless container mode. There are many
restrictions on what rootless containers can do, so many different runC
commands have been disabled:

* runc checkpoint
* runc events
* runc pause
* runc ps
* runc restore
* runc resume
* runc update

The following commands work:

* runc create
* runc delete
* runc exec
* runc kill
* runc list
* runc run
* runc spec
* runc state

In addition, any specification options that imply joining cgroups have
also been disabled. This is due to support for unprivileged subtree
management not being available from Linux upstream.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:45:24 +11:00
Aleksa Sarai 6bd4bd9030
*: handle unprivileged operations and !dumpable
Effectively, !dumpable makes implementing rootless containers quite
hard, due to a bunch of different operations on /proc/self no longer
being possible without reordering everything.

!dumpable only really makes sense when you are switching between
different security contexts, which is only the case when we are joining
namespaces. Unfortunately this means that !dumpable will still have
issues in this instance, and it should only be necessary to set
!dumpable if we are not joining USER namespaces (new kernels have
protections that make !dumpable no longer necessary). But that's a topic
for another time.

This also includes code to unset and then re-set dumpable when doing the
USER namespace mappings. This should also be safe because in principle
processes in a container can't see us until after we fork into the PID
namespace (which happens after the user mapping).

In rootless containers, it is not possible to set a non-dumpable
process's /proc/self/oom_score_adj (it's owned by root and thus not
writeable). Thus, it needs to be set inside nsexec before we set
ourselves as non-dumpable.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:45:19 +11:00
Andrei Vagin 88256d646d Don't try to read freezer.state from the current directory
If we try to pause a container on the system without freezer cgroups,
we can found that runc tries to open ./freezer.state. It is obviously wrong.

$ ./runc pause test
no such directory for freezer.state

$ echo FROZEN > freezer.state
$ ./runc pause test
container not running or created: paused

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-03-23 01:58:45 +03:00
Michael Crosby 00a0ecf554 Add separate console socket
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-03-16 10:23:59 -07:00
Mrunal Patel 4f9cb13b64 Update runtime spec to 1.0.0.rc5
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-15 11:38:37 -07:00
Qiang Huang b7932a2e07 Remove unused ExecFifoPath
In container process's Init function, we use
fd + execFifoFilename to open exec fifo, so this
field in init config is never used.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-09 10:58:16 +08:00
Qiang Huang 707dd48b2f Merge pull request #1001 from x1022as/predump
add pre-dump and parent-path to checkpoint
2017-02-24 10:55:06 -08:00
Qiang Huang 733563552e Fix state when _LIBCONTAINER in environment
Fixes: #1311

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-02-22 10:35:14 -08:00
Qiang Huang 805b8c73d3 Do not create exec fifo in factory.Create
It should not be binded to container creation, for
example, runc restore needs to create a
libcontainer.Container, but it won't need exec fifo.

So create exec fifo when container is started or run,
where we really need it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-02-22 10:34:48 -08:00
Deng Guangxing 98f004182b add pre-dump and parent-path to checkpoint
CRIU gets pre-dump to complete iterative migration.
pre-dump saves process memory info only. And it need parent-path
to specify the former memory files.

This patch add pre-dump and parent-path arguments to runc checkpoint

Signed-off-by: Deng Guangxing <dengguangxing@huawei.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
2017-02-14 19:45:07 +08:00
Aleksa Sarai e034cedce7
libcontainer: init: only pass stateDirFd when creating a container
If we pass a file descriptor to the host filesystem while joining a
container, there is a race condition where a process inside the
container can ptrace(2) the joining process and stop it from closing its
file descriptor to the stateDirFd. Then the process can access the
*host* filesystem from that file descriptor. This was fixed in part by
5d93fed3d2 ("Set init processes as non-dumpable"), but that fix is
more of a hail-mary than an actual fix for the underlying issue.

To fix this, don't open or pass the stateDirFd to the init process
unless we're creating a new container. A proper fix for this would be to
remove the need for even passing around directory file descriptors
(which are quite dangerous in the context of mount namespaces).

There is still an issue with containers that have CAP_SYS_PTRACE and are
using the setns(2)-style of joining a container namespace. Currently I'm
not really sure how to fix it without rampant layer violation.

Fixes: CVE-2016-9962
Fixes: 5d93fed3d2 ("Set init processes as non-dumpable")
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-02-02 00:41:11 +11:00
Qiang Huang db99936a0e Merge pull request #1110 from avagin/cpt-in-userns
checkpoint: handle config.Devices and config.MaskPaths
2017-01-10 00:34:40 -06:00
Zhang Wei a344b2d6a8 sync up `HookState` with OCI spec `State`
`HookState` struct should follow definition of `State` in runtime-spec:

* modify json name of `version` to `ociVersion`.
* Remove redundant `Rootfs` field as rootfs can be retrived from
`bundlePath/config.json`.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2016-12-20 00:00:43 +08:00
Mrunal Patel 34f23cb99c Merge pull request #1018 from cyphar/console-rewrite
Consoles, consoles, consoles.
2016-12-07 14:37:19 -08:00
Xianlu Bird e2e6f58e4e Fix typo
Fix typo
2016-12-01 15:23:58 +08:00
Aleksa Sarai 244c9fc426
*: console rewrite
This implements {createTTY, detach} and all of the combinations and
negations of the two that were previously implemented. There are some
valid questions about out-of-OCI-scope topics like !createTTY and how
things should be handled (why do we dup the current stdio to the
process, and how is that not a security issue). However, these will be
dealt with in a separate patchset.

In order to allow for late console setup, split setupRootfs into the
"preparation" section where all of the mounts are created and the
"finalize" section where we pivot_root and set things as ro. In between
the two we can set up all of the console mountpoints and symlinks we
need.

We use two-stage synchronisation to ensures that when the syscalls are
reordered in a suboptimal way, an out-of-place read() on the parentPipe
will not gobble the ancilliary information.

This patch is part of the console rewrite patchset.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-12-01 15:49:36 +11:00
Michael Crosby e58671e530 Add --all flag to kill
This allows a user to send a signal to all the processes in the
container within a single atomic action to avoid new processes being
forked off before the signal can be sent.

This is basically taking functionality that we already use being
`delete` and exposing it ok the `kill` command by adding a flag.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-11-08 09:35:02 -08:00
Andrei Vagin 040fb7311c checkpoint: handle config.Devices and config.MaskPaths
In user namespaces devices are bind-mounted from the host, so
we need to add them as external mounts for CRIU.

Reported-by: Ross Boucher <boucher@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2016-10-26 23:50:54 +03:00
Aleksa Sarai 2cd9c31b99
nsenter: guarantee correct user namespace ordering
Depending on your SELinux setup, the order in which you join namespaces
can be important. In general, user namespaces should *always* be joined
and unshared first because then the other namespaces are correctly
pinned and you have the right priviliges within them. This also is very
useful for rootless containers, as well as older kernels that had
essentially broken unshare(2) and clone(2) implementations.

This also includes huge refactorings in how we spawn processes for
complicated reasons that I don't want to get into because it will make
me spiral into a cloud of rage. The reasoning is in the giant comment in
clone_parent. Have fun.

In addition, because we now create multiple children with CLONE_PARENT,
we cannot wait for them to SIGCHLD us in the case of a death. Thus, we
have to resort to having a child kindly send us their exit code before
they die. Hopefully this all works okay, but at this point there's not
much more than we can do.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-10-04 16:17:55 +11:00
Aleksa Sarai ed053a740c
nsenter: specify namespace type in setns()
This avoids us from running into cases where libcontainer thinks that a
particular namespace file is a different type, and makes it a fatal
error rather than causing broken functionality.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-10-04 16:17:55 +11:00
Wang Long 59a241f647 update the comment for container.Pause() method on linux
if a container state is running or created, the container.Pause()
method can set the state to pausing, and then paused.

this patch update the comment, so it can be consistent with the code.

Signed-off-by: Wang Long <long.wanglong@huawei.com>
2016-09-20 10:49:04 +08:00
Qiang Huang 1e319efa36 Merge pull request #815 from rajasec/basecont-comments
Updated the libcontainer interface comments
2016-08-26 09:43:50 +08:00
Michael Crosby 46d9535096 Merge pull request #934 from macrosheep/fix-initargs
Fix and refactor init args
2016-08-24 10:06:01 -07:00
rajasec 1ea17d73fe Updated the libcontainer interface comments
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-08-23 19:14:27 +05:30
Phil Estes 85f4d20b44
Restored-from-checkpoint containers should have a start time
Set the start time similar to a brand new container.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2016-08-21 18:15:18 -04:00
Qiang Huang 41b12c095b Merge pull request #913 from cloudfoundry-incubator/addgroupsnocompatible
Let the user explicitly specify `additionalGids` on `runc exec`
2016-07-15 10:12:31 +08:00
Yang Hongyang a59d63c5d3 Fix and refactor init args
1. According to docs of Cmd.Path and Cmd.Args from package "os/exec":
   Path is the path of the command to run. Args holds command line
   arguments, including the command as Args[0]. We have mixed usage
   of args. In InitPath(), InitArgs only take arguments, in InitArgs(),
   InitArgs including the command as Args[0]. This is confusing.
2. InitArgs() already have the ability to configure a LinuxFactory
   with the provided absolute path to the init binary and arguements as
   InitPath() does.
3. exec.Command() will take care of serching executable path.
4. The default "/proc/self/exe" instead of os.Args[0] is passed to
   InitArgs in order to allow relative path for the runC binary.

Signed-off-by: Yang Hongyang <imhy.yang@gmail.com>
2016-07-06 23:21:02 -04:00
Qiang Huang 14e95b2aa9 Make state detection precise
Fixes: https://github.com/opencontainers/runc/issues/871

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-07-05 08:24:13 +08:00
Petar Petrov f9b72b1b46 Allow additional groups to be overridden in exec
Signed-off-by: Julian Friedman <julz.friedman@uk.ibm.com>
Signed-off-by: Petar Petrov <pppepito86@gmail.com>
Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>
2016-06-21 10:35:11 +03:00
Mrunal Patel f5b6ff23b8 Merge pull request #881 from rajasec/update-status
Update for stopped container
2016-06-13 16:05:25 -07:00
Michael Crosby 3aacff695d Use fifo for create/start
This removes the use of a signal handler and SIGCONT to signal the init
process to exec the users process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-06-13 11:26:53 -07:00
rajasec 12869604ca Update for stopped container
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-06-04 22:08:08 +05:30
Michael Crosby 1d61abea46 Allow delete of created container
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-06-02 12:26:12 -07:00
Michael Crosby 6eba9b8ffb Fix SystemError and env lookup
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:10:47 -07:00
Michael Crosby efcd73fb5b Fix signal handling for unit tests
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:10:47 -07:00
Michael Crosby 30f1006b33 Fix libcontainer states
Move initialized to created and destoryed to stopped.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:06:41 -07:00
Michael Crosby 3fe7d7f31e Add create and start command for container lifecycle
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-31 11:06:41 -07:00
Andrew Vagin c161e65ac6 cr: don't fill veth devices if netns is in EmptyNs
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-05-28 01:19:54 +03:00
Qiang Huang b6e23f8166 Add comments for error cases in status functions
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-05-16 18:24:07 +08:00
Michael Crosby 7dd87976ed Merge pull request #758 from rajasec/container-pause-comment
Update the comment for container pause
2016-04-19 16:16:41 -07:00
Michael Crosby 6978875298 Add cause to error messages
This is the inital port of the libcontainer.Error to added a cause to
all the existing error messages.  Going forward, when an error can be
wrapped because it is not being checked at the higher levels for
something like `os.IsNotExist` we can add more information to the error
message like cause and stack file/line information.  This will help
higher level tools to know what cause a container start or operation to
fail.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-04-18 11:37:26 -07:00
rajasec ccbd0a176f Update the comment for container pause
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-04-16 14:59:19 +05:30
Akihiro Suda 1829531241 Fix trivial style errors reported by `go vet` and `golint`
No substantial code change.
Note that some style errors reported by `golint` are not fixed due to possible compatibility issues.

Signed-off-by: Akihiro Suda <suda.kyoto@gmail.com>
2016-04-12 08:13:16 +00:00
George Lestaris f7ae27bfb7 HookState adhears to OCI
Signed-off-by: George Lestaris <glestaris@pivotal.io>
Signed-off-by: Ed King <eking@pivotal.io>
2016-04-06 16:57:59 +01:00
Peng Gao 3fa246609c Fix typo
Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
2016-03-27 12:44:16 +08:00
Jessica Frazelle 2c5b10189c
remove deadcode
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-17 13:36:28 -07:00
Michael Crosby 732a0fb440 Merge pull request #638 from hqhq/hq_fix_bootstrapData
Fix encoding gid mappings
2016-03-14 11:55:12 -07:00
Mrunal Patel 459efccb0a Merge pull request #576 from avagin/cr
Call Prestart hooks before restoring processes
2016-03-14 11:21:29 -07:00
Qiang Huang 2f2c83a2a0 Fix encoding gid mappings
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-03-12 13:18:42 +08:00
Michael Crosby 20422c9bd9 Update libcontainer to support rlimit per process
This updates runc and libcontainer to handle rlimits per process and set
them correctly for the container.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-10 14:35:16 -08:00
Michael Crosby 3cc90bd2d8 Add support for process overrides of settings
This commit adds support to libcontainer to allow caps, no new privs,
apparmor, and selinux process label to the process struct so that it can
be used together of override the base settings on the container config
per individual process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-03 11:41:33 -08:00
Ido Yariv 78f5148c67 Fix handling of unsupported namespaces
currentState() always adds all possible namespaces to the state,
regardless of whether they are supported.
If orderNamespacePaths detects an unsupported namespace, an error is
returned that results in initialization failure.

Fix this by only adding paths of supported namespaces to the state.

Signed-off-by: Ido Yariv <ido@wizery.com>
2016-03-02 10:16:51 -05:00
Daniel, Dao Quang Minh 42d5d04801 Sets custom namespaces for init processes
An init process can join other namespaces (pidns, ipc etc.). This leverages
C code defined in nsenter package to spawn a process with correct namespaces
and clone if necessary.

This moves all setns and cloneflags related code to nsenter layer, which mean
that we dont use Go os/exec to create process with cloneflags and set
uid/gid_map or setgroups anymore. The necessary data is passed from Go to C
using a netlink binary-encoding format.

With this change, setns and init processes are almost the same, which brings
some opportunity for refactoring.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
[mickael.laventure@docker.com: adapted to apply on master @ d97d5e]
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@docker.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh d6bf4049f8 OrderNamespacePaths gets correct order of ns
This adds orderNamespacePaths to get correct order of namespaces for the
bootstrap program to join.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
Stefan Berger 5fbf791e31 Create unique session key name for every container
Create a unique session key name for every container. Use the pattern
_ses.<postfix> with postfix being the container's Id.

This patch does not prevent containers from joining each other's session
keyring.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2016-02-24 08:39:52 -05:00
Andrew Vagin b8121e8998 checkpoint: call Prestart hooks on restore before restoring processes
Docker uses Prestart hooks to call a libnetwork hook to create
network devices and set addesses and routes.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:40:26 +03:00
Andrew Vagin 46c25be297 checkpoint: add support of the EmptyNs criu option
This options is set a namespace mask which will not be dumped and restored.
For example, we are going to use this option to restore network
for docker containers. CRIU will create a network namespace and
call a libnetwork hook to restore network devices, addresses and routes.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:40:26 +03:00
Andrew Vagin a2a771b8e2 libcontainer: update criurpc.proto
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:38:02 +03:00
Michael Crosby 1172a1e1e5 Update list command and created methods
We don't need a CreatedTime method on the container because it's not
part of the interface and can be received via the state.  We also do not
need to call it CreateTime because the type of this field is time.Time
so we know its time.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-28 13:32:24 -08:00
Michael Crosby 480e5f4416 Merge pull request #507 from mikebrow/runc-ls-command
adds list command
2016-01-28 13:20:07 -08:00
Mike Brown 4c871267db adds list command, and a timestamp in the container state
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2016-01-28 14:21:06 -06:00
Michael Crosby ddcee3cc2a Do not use stream encoders
Marshall the raw objects for the sync pipes so that no new line chars
are left behind in the pipe causing errors.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-26 11:22:05 -08:00
Michael Crosby 556f798a19 Fix various state bugs for pause and destroy
There were issues where a process could die before pausing completed
leaving the container in an inconsistent state and unable to be
destoryed.  This makes sure that if the container is paused and the
process is dead it will unfreeze the cgroup before removing them.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-01-21 16:43:33 -08:00
Doug Davis 49dfa1b62d Remove some hard coded strings
Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-01-19 19:02:31 -08:00
Mrunal Patel 6259f09e97 Merge pull request #426 from gitido/pressure_level
libcontainer: Add support for memcg pressure notifications
2016-01-14 16:23:07 -08:00
Jimmi Dyson 91c7024e52 Revert to non-recursive GetPids, add recursive GetAllPids
Signed-off-by: Jimmi Dyson <jimmidyson@gmail.com>
2016-01-08 19:42:25 +00:00
Ido Yariv 55a8d686a9 libcontainer: Add support for memcg pressure notifications
It may be desirable to receive memory pressure levels notifications
before the container depletes all memory. This may be useful for
handling cases where the system thrashes when reaching the container's
memory limits.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-12-28 13:36:55 -05:00
Michael Crosby 4415446c32 Add state pattern for container state transition
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add state status() method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Allow multiple checkpoint on restore

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handle leave-running state

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Fix state transitions for inprocess

Because the tests use libcontainer in process between the various states
we need to ensure that that usecase works as well as the out of process
one.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove isDestroyed method

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Handling Pausing from freezer state

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

freezer status

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Fixing review comments

Signed-off-by: Rajasekaran <rajasec79@gmail.com>

Added comment when freezer not available

Signed-off-by: Rajasekaran <rajasec79@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Conflicts:
	libcontainer/container_linux.go

Change checkFreezer logic to isPaused()

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Remove state base and factor out destroy func

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add unit test for state transitions

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-17 13:55:38 -08:00
Mrunal Patel 55a49f2110 Move the cgroups setting into a Resources struct
This allows us to distinguish cases where a container
needs to just join the paths or also additionally
set cgroups settings. This will help in implementing
cgroupsPath support in the spec.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-12-16 15:53:31 -05:00
Daniel, Dao Quang Minh 7d423cb7a1 setns: replace env with netlink for bootstrap data
replace passing of pid and console path via environment variable with passing
them with netlink message via an established pipe.

this change requires us to set _LIBCONTAINER_INITTYPE and
_LIBCONTAINER_INITPIPE as the env environment of the bootstrap process as we
only send the bootstrap data for setns process right now. When init and setns
bootstrap process are unified (i.e., init use nsexec instead of Go to clone new
process), we can remove _LIBCONTAINER_INITTYPE.

Note:
- we read nlmsghdr first before reading the content so we can get the total
  length of the payload and allocate buffer properly instead of allocating
  one large buffer.

- check read bytes vs the wanted number. It's an error if we failed to read
  the desired number of bytes from the pipe into the buffer.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2015-12-03 18:03:48 +00:00
Daniel, Dao Quang Minh d914bf7347 setns: add bootstrap data
add bootstrap data to setns process. If we have any bootstrap data then copy it
to the bootstrap process (i.e. nsexec) using the sync pipe. This will allow us
to eventually replace environment variable usage with more structured data
to setup namespaces, write pid/gid map, setgroup etc.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2015-11-22 11:36:58 +00:00
Michael Crosby 2be14dc963 Merge pull request #392 from mrunalp/poststart
Add poststart hooks
2015-11-12 16:34:38 -08:00
Michael Crosby 879dfdd980 Fix race setting process opts
When starting and quering for pids a container can start and exit before
this is set.  So set the opts after the process is started and while
libcontainer still has the container's process blocking on the pipe.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-11-06 16:51:59 -08:00
Mrunal Patel 452e8a73c5 Integrate poststart hooks with spec
* Call poststart hooks after the container is started
* Tie in with spec configuration

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-11-06 18:03:32 -05:00
John Howard a919bd3f67 Windows: Refactor Container interface
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-11-02 15:12:16 -08:00
Mrunal Patel 7caef5626b Merge pull request #359 from jhowardmsft/jjh/state_struct
Windows: Refactor state struct
2015-11-02 15:04:12 -08:00
John Howard fe1cce69b3 Windows: Refactor state struct
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-26 14:45:20 -07:00
Adrian Reber c42ef59bf9 Add criu related debug output
While testing different versions of criu it helps to know which
criu binary with which options is currently used. Therefore additional
debug output to display these information is added.

v2: increase readability of printed out criu options

Signed-off-by: Adrian Reber <adrian@lisas.de>
2015-10-13 10:41:00 +02:00
Hui Kang 25da513c4b Add option to support criu manage cgroups mode for dump and restore
CRIU supports cgroup-manage mode from v1.7

Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
2015-10-11 04:42:54 +00:00
Mrunal Patel dcafe48737 Add version to HookState to make it json-compatible with spec State
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-23 17:13:00 -07:00
Mrunal Patel ef9471fd5b Merge pull request #253 from avagin/cr-cgroups
c/r: create cgroups to restore a container
2015-09-11 18:03:40 -07:00
David Calavera 0f28592b35 Turn hook pointers into values.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-11 11:34:34 -07:00
Michael Crosby dd969cbacd Add test for function based hooks
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-10 18:15:00 -07:00
Michael Crosby 05567f2c94 Implement hooks in libcontainer
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-10 17:57:31 -07:00
Andrey Vagin df39686c93 c/r: create cgroups to restore a container
Here are two reasons:
* If we use systemd, we need to ask it to create cgroups
* If a container is restored with another ID, we need to
  change paths to cgroups.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
2015-09-10 21:00:27 +03:00
Mrunal Patel 2f4c229a8c Merge pull request #215 from boucher/huikang-patch
Add hooks for passing explicit veth pairs for forwarding to CRIU
2015-08-24 21:23:29 -07:00
Hui Kang 7f23085c82 Add hooks for passing explicit veth pairs for forwarding to CRIU.
Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
2015-08-24 09:26:39 -07:00
boucher 8c812d0f50 Add the criu log file path to the failure message.
Signed-off-by: Ross Boucher <rboucher@gmail.com>
2015-08-21 14:20:59 -07:00
Mrunal Patel f3a3025933 Fix minor stylistic issues
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-08-04 17:44:45 -04:00
Michael Crosby a5ef75b681 Add signal API to Container interface
This adds a `Signal()` method to the container interface so that the
initial process can be signaled after a Load or operation.  It also
implements signaling the init process from a nonChildProcess.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-08-03 17:07:29 -07:00
Ido Yariv 86a85582d2 Don't set /proc/<PID>/setgroups to deny in Go1.5
A boolean field named GidMappingsEnableSetgroups was added to
SysProcAttr in Go1.5. This field determines the value of the process's
setgroups proc entry.

Since the default is to set the entry to 'deny', calling setgroups will
fail on systems running kernels 3.19+.

Set GidMappingsEnableSetgroups to true so setgroups wont be set to
'deny'.

Signed-off-by: Ido Yariv <ido@wizery.com>
2015-08-03 14:59:15 -04:00
Hui Kang 0f66ff921a Add debug message when unable to execute criu
Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
2015-08-03 17:09:45 +00:00
Andrey Vagin af4a5e708a ct: give criu informations about cgroup mounts
Actually cgroup mounts are bind-mounts, so they should be
handled by the same way.

Reported-by: Ross Boucher <rboucher@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
2015-07-20 22:56:07 +03:00
mapk0y 986dc0f730 checkpoint/restore commands support 'file-locks' option.
Signed-off-by: mapk0y <mapk0y@gmail.com>
2015-06-27 18:56:24 +09:00
unclejack 9408c09d50 libcontainer: gofmt pass 2015-06-24 01:57:42 +03:00
Michael Crosby 080df7ab88 Update import paths for new repository
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-21 19:29:59 -07:00
Michael Crosby 8f97d39dd2 Move libcontainer into subdirectory
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-21 19:29:15 -07:00