2612 Commits

Author SHA1 Message Date
Daniel, Dao Quang Minh 42d5d04801 Sets custom namespaces for init processes
An init process can join other namespaces (pidns, ipc etc.). This leverages
C code defined in nsenter package to spawn a process with correct namespaces
and clone if necessary.

This moves all setns and cloneflags related code to nsenter layer, which means
that we don't use Go os/exec to create process with cloneflags and set
uid/gid_map or setgroups anymore. The necessary data is passed from Go to C
using a netlink binary-encoding format.

With this change, setns and init processes are almost the same, which brings
some opportunity for refactoring.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
[mickael.laventure@docker.com: adapted to apply on master @ d97d5e]
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@docker.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh d6bf4049f8 OrderNamespacePaths gets correct order of ns
This adds orderNamespacePaths to get correct order of namespaces for the
bootstrap program to join.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
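A minimal sketch of the ordering idea described in d6bf4049f8, not the commit's actual code. The `joinOrder` list, the `orderPaths` helper, and the string keys are hypothetical; the commit message does not state which order the bootstrap program requires.

```go
package main

import "fmt"

// joinOrder is an ASSUMED order used only for illustration. The real order
// expected by the bootstrap program is not given in the commit message.
var joinOrder = []string{"user", "ipc", "uts", "net", "pid", "mnt"}

// orderPaths returns the configured namespace paths sorted by joinOrder.
// Types without a configured path are skipped; unknown types are an error.
func orderPaths(paths map[string]string) ([]string, error) {
	known := make(map[string]bool, len(joinOrder))
	for _, t := range joinOrder {
		known[t] = true
	}
	for t := range paths {
		if !known[t] {
			return nil, fmt.Errorf("unknown namespace type %q", t)
		}
	}
	var ordered []string
	for _, t := range joinOrder {
		if p, ok := paths[t]; ok {
			ordered = append(ordered, p)
		}
	}
	return ordered, nil
}

func main() {
	ordered, err := orderPaths(map[string]string{
		"net":  "/proc/42/ns/net",
		"user": "/proc/42/ns/user",
	})
	fmt.Println(ordered, err)
}
```

Iterating over a fixed slice rather than the map keeps the result deterministic, since Go map iteration order is randomized.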
Daniel, Dao Quang Minh 2d32210620 Integration tests for joining namespaces
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 12:26:53 -08:00
Daniel, Dao Quang Minh 4217b9c121 Do not override the specified userns path
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 11:59:48 -08:00
Daniel, Dao Quang Minh f376cf84b9 Check if a namespace is supported
This adds `configs.IsNamespaceSupported(nsType)` to check if the host supports
a namespace type.

Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
2016-02-28 11:59:48 -08:00
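One way such a check could work is to test whether the kernel exposes the matching file under `/proc/self/ns`. The sketch below assumes that approach; `isSupportedIn` and `nsFiles` are hypothetical names and this is not the body of `configs.IsNamespaceSupported`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// nsFiles maps an illustrative namespace type name to its file name
// under /proc/<pid>/ns. The keys are assumptions made for this sketch.
var nsFiles = map[string]string{
	"net":   "net",
	"mount": "mnt",
	"pid":   "pid",
	"ipc":   "ipc",
	"user":  "user",
	"uts":   "uts",
}

// isSupportedIn reports whether nsDir contains the file for nsType.
// Unknown types are reported as unsupported.
func isSupportedIn(nsDir, nsType string) bool {
	file, ok := nsFiles[nsType]
	if !ok {
		return false
	}
	_, err := os.Stat(filepath.Join(nsDir, file))
	return err == nil
}

func main() {
	for _, t := range []string{"net", "user", "bogus"} {
		fmt.Println(t, isSupportedIn("/proc/self/ns", t))
	}
}
```

Taking the directory as a parameter keeps the helper usable against a fake directory tree in tests.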
Alexander Morozov d282265f72 Merge pull request #596 from hushan/decoder_fix
Use single decoder instance for one stream
2016-02-27 16:27:57 -08:00
Mrunal Patel 64d87ebdec Merge pull request #585 from crosbymichael/dev-remountro
Remount /dev as ro after it is populated
2016-02-27 00:31:40 -08:00
Alexander Morozov 4a12ff6e58 Merge pull request #443 from BenHall/build
Build runC binary via a Docker container
2016-02-26 20:17:31 -08:00
Alexander Morozov 52fcc65943 Merge pull request #587 from crosbymichael/labels
Add bundle to runc list
2016-02-26 20:08:00 -08:00
Alexander Morozov 9ae2ed1051 Merge pull request #591 from crosbymichael/exec-errors
Return proper exit code for exec errors
2016-02-26 19:58:47 -08:00
Alexander Morozov 67c3a21a05 Merge pull request #593 from crosbymichael/wait-pipes
Wait for pipes to write all data before exit
2016-02-26 19:56:21 -08:00
Alexander Morozov c3e997e2bb Merge pull request #594 from crosbymichael/mount-types
Allow extra mount types
2016-02-26 19:54:07 -08:00
Michael Crosby a12336eb3e Update masked and ro paths
This updates the current list to what we have now in docker and also
makes these always added so that these are masked out.  Privileged
containers can always unmount these if they want to read from kcore or
something like that.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-26 15:54:53 -08:00
Michael Crosby c5a34a6fe2 Allow extra mount types
This allows the mount syscall to validate the additional types where we
do not have to perform extra validation, and it is up to the consumer to
verify the functionality of the type of device they are trying to
mount.

Fixes #572

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-26 15:21:33 -08:00
Mrunal Patel 53e4dd65f5 Merge pull request #588 from rajasec/pivot-root
Removing pivot directory in defer
2016-02-26 12:48:29 -08:00
Michael Crosby 8d0a05b8dd Wait for pipes to write all data before exit
Add a waitgroup to wait for the io.Copy of stdout/err to finish before
exiting runc.  The problem happens more in exec because it is really
fast and the pipe has data buffered but not yet read after the process
has already exited.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-26 12:14:47 -08:00
Michael Crosby 6bb653a6e8 Return proper exit code for exec errors
Exec errors from the exec() syscall in the container's init should be
treated as if the container ran but couldn't execute the process for the
user instead of returning a libcontainer error as if it was an issue in
the library.

Before, specifying different commands like `/etc`, `asldfkjasdlfj`, or
`/alsdjfkasdlfj` would always return 1 on the command line with a
libcontainer specific error message.  Now they return the correct
message and exit status defined for unix processes.

Example:

```bash
root@deathstar:/containers/redis# runc start test
exec: "/asdlfkjasldkfj": file does not exist
root@deathstar:/containers/redis# echo $?
127
root@deathstar:/containers/redis# runc start test
exec: "asdlfkjasldkfj": executable file not found in $PATH
root@deathstar:/containers/redis# echo $?
127
root@deathstar:/containers/redis# runc start test
exec: "/etc": permission denied
root@deathstar:/containers/redis# echo $?
126
```

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-26 11:41:56 -08:00
rajasec 05905ab0a6 Updating swappiness value in README
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-02-26 22:53:28 +05:30
Hushan Jia 8597d5c969 Use single decoder instance for one stream
This avoids part of the stream being read and abandoned,
which results in decoding errors.

Signed-off-by: Hushan Jia <hushan.jia@gmail.com>
2016-02-26 19:40:35 +08:00
rajasec ff9e6adc2a Create pid file when not exist
Signed-off-by: rajasec <rajasec79@gmail.com>
2016-02-26 13:10:30 +05:30
Mrunal Patel 930dbb38a2 Merge pull request #328 from hqhq/hq_build_runc_everywhere
Make runc buildable everywhere
2016-02-25 23:22:00 -08:00
Mrunal Patel 4951f5821b Merge pull request #582 from stefanberger/new_session_keyring
Create unique session key name for every container
2016-02-25 17:54:14 -08:00
Tonis Tiigi 30534f979b Fix setting OomScoreAdj from OCI spec
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-02-25 12:01:18 -08:00
Michael Crosby fc8c8ed9da Merge pull request #303 from mrunalp/sysctl_validation
Add validation for sysctl
2016-02-25 11:24:41 -08:00
Michael Crosby 77d2793c77 Merge pull request #584 from rajasec/selinux-errcheck
Added error check in Getfilecon
2016-02-25 10:55:39 -08:00
Michael Crosby f23ff4d194 Fix bundle path for exec
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-25 10:29:48 -08:00
rajasec 1db7322ded Removing pivot directory in defer
Signed-off-by: rajasec <rajasec79@gmail.com>

Changing to name values for defer as per review comments

Signed-off-by: rajasec <rajasec79@gmail.com>

Fixed review comments

Signed-off-by: rajasec <rajasec79@gmail.com>
2016-02-25 13:12:40 +05:30
rajasec 3b2805834b Adding linux label to test file
Signed-off-by: rajasec <rajasec79@gmail.com>

Fixed review comments

Signed-off-by: rajasec <rajasec79@gmail.com>
2016-02-25 07:52:32 +05:30
Michael Crosby ac43d4a0ab Save bundle path in labels
This saves and returns the bundle path for the container in the
container's config and state.  It also returns the information via runc
list.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-24 11:11:10 -08:00
Michael Crosby e34b4fbcd3 Add labels to libcontainer config
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-24 10:45:20 -08:00
Alexander Morozov f94eb27013 Merge pull request #580 from estesp/swappiness-fix
Handle memory swappiness default properly
2016-02-24 10:33:50 -08:00
Phil Estes 0b5581fd28 Handle memory swappiness as a pointer to handle default/unset case
This prior fix to set "-1" explicitly was lost, and it is simpler to use
the same pointer type from the OCI spec to handle nil pointer == -1 ==
unset case.

Also, as a nearly humorous aside, there was a test for MemorySwappiness
that was actually setting Memory, and it was passing because of this
bug (as it was always setting everyone's MemorySwappiness to zero!)

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2016-02-24 09:02:06 -06:00
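The "nil pointer == -1 == unset" convention from 0b5581fd28 can be sketched as below. The `memory` struct, its `int64` field type, and `effectiveSwappiness` are hypothetical stand-ins, not the OCI spec or libcontainer types.

```go
package main

import "fmt"

// memory is a hypothetical config struct. A pointer field lets callers
// distinguish "unset" (nil) from an explicit zero.
type memory struct {
	Swappiness *int64
}

// effectiveSwappiness returns -1 when the value is unset, otherwise
// the configured value (including an explicit 0).
func effectiveSwappiness(m memory) int64 {
	if m.Swappiness == nil {
		return -1
	}
	return *m.Swappiness
}

func main() {
	zero := int64(0)
	fmt.Println(effectiveSwappiness(memory{}))                  // -1
	fmt.Println(effectiveSwappiness(memory{Swappiness: &zero})) // 0
}
```

With a plain integer field, the zero value would be indistinguishable from "unset", which is the kind of bug the commit message describes.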
Stefan Berger 5fbf791e31 Create unique session key name for every container
Create a unique session key name for every container. Use the pattern
_ses.<postfix> with postfix being the container's Id.

This patch does not prevent containers from joining each other's session
keyring.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2016-02-24 08:39:52 -05:00
rajasec 039d25c341 Added error check in Getfilecon
Signed-off-by: rajasec <rajasec79@gmail.com>

Fixed review comments

Signed-off-by: rajasec <rajasec79@gmail.com>

Fixed review comments for adding length check

Signed-off-by: rajasec <rajasec79@gmail.com>

Fixed review comment

Signed-off-by: rajasec <rajasec79@gmail.com>
2016-02-24 17:37:28 +05:30
Mrunal Patel 15b6b24413 Merge pull request #568 from mrunalp/move_hooks
Move pre-start hooks after container mounts
2016-02-24 10:07:32 +05:30
Michael Crosby fc98958321 Remount /dev as ro after it is populated
Because we more than likely control /dev and populate devices and files
inside of it, we need to make sure that we fulfil the user's request to
make it ro only after it has been populated.  This removes the need to
expose something like ReadonlyPaths in the config while still achieving
the same outcome, more seamlessly for the user.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-23 13:56:01 -08:00
Mrunal Patel 2f27649848 Move pre-start hooks after container mounts
Today mounts in pre-start hooks get overridden by the default mounts.
Moving the pre-start hooks to after the container mounts and before
the pivot/move root gives better flexibility in the hooks.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-02-23 02:50:35 -08:00
Mrunal Patel 2c3115481e Merge pull request #583 from crosbymichael/delete
Make sure container is destroyed on error
2016-02-23 13:30:19 +05:30
Michael Crosby 4bc25aaea1 Make sure container is destroyed on error
Even in the detach use case we need to make sure that the container is
destroyed if there is an error starting the container or anywhere in
that workflow.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-22 15:34:41 -08:00
Mrunal Patel 5204301d1b Merge pull request #571 from mikebrow/runc-list-format-options
adding --format json to list command
2016-02-20 10:16:06 +05:30
Michael Crosby ee6a72df4e Merge pull request #577 from crosbymichael/m-named-cgroup
Move the process outside of the systemd cgroup
2016-02-19 13:51:58 -08:00
Michael Crosby 47f16e89df Move the process outside of the systemd cgroup
If you don't move the process out of the named cgroup for systemd then
systemd will try to delete all the cgroups that the process is currently
in.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-02-19 11:26:46 -08:00
Mike Brown 160daf293e adding --format json to list command
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2016-02-18 22:07:04 -06:00
Mrunal Patel 0107c7fb6c Merge pull request #573 from LK4D4/fix_dash_cgroup
Look for " - " instead of just - as separator
2016-02-19 08:57:39 +05:30
Andrew Vagin b8121e8998 checkpoint: call Prestart hooks on restore before restoring processes
Docker uses Prestart hooks to call a libnetwork hook to create
network devices and set addresses and routes.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:40:26 +03:00
Andrew Vagin 46c25be297 checkpoint: add support of the EmptyNs criu option
This option sets a mask of namespaces that will not be dumped or restored.
For example, we are going to use this option to restore network
for docker containers. CRIU will create a network namespace and
call a libnetwork hook to restore network devices, addresses and routes.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:40:26 +03:00
Andrew Vagin a2a771b8e2 libcontainer: update criurpc.proto
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
2016-02-19 02:38:02 +03:00
Michael Crosby e28cfafa6d Merge pull request #567 from rajasec/tty-remove
Removing tty0 tty1 from allowed devices
2016-02-18 13:56:32 -08:00
Alexander Morozov 98cbce80fb Look for " - " instead of just - as separator
The - symbol can appear in any path

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-02-18 09:58:29 -08:00
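Why the three-character separator matters in 98cbce80fb can be shown on a mountinfo-style line, where a lone `-` may occur inside a path. The sketch assumes that line format; `splitOnSeparator` is a hypothetical helper and the sample line is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnSeparator splits line at the first " - " (space, dash, space).
// Searching for a bare "-" would split inside paths such as /a-b.
func splitOnSeparator(line string) (pre, post string, ok bool) {
	const sep = " - "
	i := strings.Index(line, sep)
	if i < 0 {
		return "", "", false
	}
	return line[:i], line[i+len(sep):], true
}

func main() {
	// Made-up sample line with dashes inside both paths.
	line := "36 35 98:0 /a-b /mnt/c-d rw - ext3 /dev/root rw"
	pre, post, ok := splitOnSeparator(line)
	fmt.Printf("%q\n%q\n%v\n", pre, post, ok)
}
```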
Alexander Morozov 488e315c21 Merge pull request #570 from crosbymichael/tty-nil
Check if tty is nil in handler
2016-02-17 13:37:21 -08:00