Commit Graph

783 Commits

Author SHA1 Message Date
Tianon Gravi 7212d61242 Revert "Always mount a /run tmpfs in the container"
This reverts commit 905795ece624675abe2ec2622b0bbafdb9d7f44c.

Docker-DCO-1.1-Signed-off-by: Andrew Page <admwiggin@gmail.com> (github: tianon)
2014-05-21 14:28:19 -06:00
Michael Crosby 095e9f3ba6 Update code post codereview
Add specific types for Required and Optional DeviceNodes
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-21 00:40:41 +00:00
Michael Crosby 5e5631af9b Update documentation for container struct in libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-20 23:34:46 +00:00
Michael Crosby df4cde7662 Mount /dev in tmpfs for privileged containers
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-20 22:51:24 +00:00
Alexander Larsson 6de35476ec cgroups: Allow mknod for any device in systemd cgroup backend
Without this any container startup fails:
2014/05/20 09:20:36 setup mount namespace copy additional dev nodes mknod fuse operation not permitted

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-05-20 09:29:32 +02:00
Michael Crosby d9973b74f6 Make sure dev/fuse is created in container
Fixes #5849

If the host system does not have fuse enabled in the kernel config we
will ignore the is not exist errors when trying to copy the device node
from the host system into the container.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-19 20:46:59 +00:00
Victor Marmol 8d83ef55af Merge pull request #5903 from alexlarsson/writable-proc
Make /proc writable, but not /proc/sys and /proc/sysrq-trigger
2014-05-19 12:21:15 -07:00
Alexander Larsson 73c607b7ad Make /proc writable, but not /proc/sys and /proc/sysrq-trigger
Some applications want to write to /proc. For instance:

docker run -it centos groupadd foo

Gives: groupadd: failure while writing changes to /etc/group

And strace reveals why:

open("/proc/self/task/13/attr/fscreate", O_RDWR) = -1 EROFS (Read-only file system)

I've looked at what other systems do, and systemd-nspawn makes /proc read-write
and /proc/sys readonly, while lxc allows "proc:mixed" which does the same,
plus it makes /proc/sysrq-trigger also readonly.

The later seems like a prudent idea, so we follows lxc proc:mixed.
Additionally we make /proc/irq and /proc/bus, as these seem to let
you control various hardware things.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-05-19 20:46:05 +02:00
Victor Marmol 7c1f13e897 Merge pull request #5792 from bernerdschaefer/nsinit-supports-pdeathsig
Add PDEATHSIG support to nsinit library
2014-05-19 11:13:23 -07:00
Michael Crosby 9d1742a048 Add the rest of the caps so that they are retained in privilged mode
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-19 16:43:31 +00:00
Victor Vieux ac55d2b1fc add support for CAP_FOWNER
Docker-DCO-1.1-Signed-off-by: Victor Vieux <vieux@docker.com> (github: vieux)
2014-05-17 01:16:07 +00:00
Victor Marmol 6ecb140618 Make libcontainer's CapabilitiesMask into a []string (Capabilities).
Docker-DCO-1.1-Signed-off-by: Victor Marmol <vmarmol@google.com> (github: vmarmol)
2014-05-17 00:44:10 +00:00
Michael Crosby afea8ea5d0 Merge pull request #5833 from ActiveState/fix_nsinit_env_panic
fix panic when passing empty environment
2014-05-16 12:03:26 -07:00
Sridhar Ratnakumar d636e8aa0a fix panic when passing empty environment
Docker-DCO-1.1-Signed-off-by: Sridhar Ratnakumar <github@srid.name> (github: srid)
2014-05-16 11:55:34 -07:00
Victor Marmol 8f778870f6 Merge pull request #5810 from vmarmol/drop-caps
Change libcontainer to drop all capabilities by default.
2014-05-16 11:51:41 -07:00
Bernerd Schaefer 98b3daa8dd nsinit.DefaultCreateCommand sets Pdeathsig to SIGKILL
Docker-DCO-1.1-Signed-off-by: Bernerd Schaefer <bj.schaefer@gmail.com> (github: bernerdschaefer)
2014-05-16 13:48:41 +02:00
Bernerd Schaefer 1c33652c66 nsinit.Init() restores parent death signal before exec
Docker-DCO-1.1-Signed-off-by: Bernerd Schaefer <bj.schaefer@gmail.com> (github: bernerdschaefer)
2014-05-16 13:48:41 +02:00
Victor Marmol e0f356a13a Change libcontainer to drop all capabilities by default. Only keeps
those that were specified in the config. This commit also explicitly
adds a set of capabilities that we were silently not dropping and were
assumed by the tests.

Docker-DCO-1.1-Signed-off-by: Victor Marmol <vmarmol@google.com> (github: vmarmol)
2014-05-16 00:57:58 +00:00
Michael Crosby 15e734a277 Remove the cgroups maintainer file
We don't need this because it is covered by the libcontainer MAINTAINERS
file
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-14 16:01:45 -07:00
Michael Crosby 3ce347c35f Move cgroups package into libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-14 15:21:44 -07:00
Bernerd Schaefer 32b89ef88c Setup standard /dev symlinks
After copying allowed device nodes, set up "/dev/fd", "/dev/stdin",
"/dev/stdout", and "/dev/stderr" symlinks.

Docker-DCO-1.1-Signed-off-by: Bernerd Schaefer <bj.schaefer@gmail.com> (github: bernerdschaefer)
[rebased by @crosbymichael]
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-14 13:59:13 -07:00
Michael Crosby b2ccfae5c4 Merge pull request #5791 from bernerdschaefer/nsinit-exec-forwards-signals
"nsinit exec ..." forwards signals to container
2014-05-14 11:05:27 -07:00
Victor Vieux 7e31653f91 Merge pull request #5781 from creack/remove_bind_console
Remove the bind mount for dev/console which override the mknod/label
2014-05-14 10:57:21 -07:00
Bernerd Schaefer edbe7cc547 "nsinit exec ..." forwards signals to container
Docker-DCO-1.1-Signed-off-by: Bernerd Schaefer <bj.schaefer@gmail.com> (github: bernerdschaefer)
2014-05-14 11:01:02 +02:00
Guillaume J. Charmes 7eea7dc6ba Remove the bind mount for dev/console which override the mknod/label
Docker-DCO-1.1-Signed-off-by: Guillaume J. Charmes <guillaume@charmes.net> (github: creack)
2014-05-13 11:59:27 -07:00
Michael Crosby 06b6ff0f39 Update code to handle new path to Follow Symlink func
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-13 10:54:08 -07:00
Alexander Larsson 3208a2737c libcontainer: Ensure bind mount target files are inside rootfs
Before we create any files to bind-mount on, make sure they are
inside the container rootfs, handling for instance absolute symbolic
links inside the container.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-05-13 10:24:52 -07:00
Alexander Larsson 4c0f535f69 Always mount a /run tmpfs in the container
All modern distros set up /run to be a tmpfs, see for instance:
https://wiki.debian.org/ReleaseGoals/RunDirectory

Its a very useful place to store pid-files, sockets and other things
that only live at runtime and that should not be stored in the image.

This is also useful when running systemd inside a container, as it
will try to mount /run if not already mounted, which will fail for
non-privileged container.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-05-12 21:41:04 +02:00
Michael Crosby 4e703db73c Merge pull request #5748 from crosbymichael/libcontainer-bindmounts
libcontainer: Create dirs/files as needed for bind mounts
2014-05-12 12:27:18 -07:00
Michael Crosby b8a8c7bfaf Remove newline char in error message
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-12 12:24:30 -07:00
Vishnu Kannan 779e5e2b31 Correct a comment in libcontainer Mount Namespace setup.
Docker-DCO-1.1-Signed-off-by: Vishnu Kannan <vishnuk@google.com> (github: vishh)
2014-05-12 19:01:36 +00:00
Alexander Larsson 9573cee2d1 libcontainer: Create dirs/files as needed for bind mounts
If you specify a bind mount in a place that doesn't have a file yet we
create that (and parent directories). This is needed because otherwise
you can't use volumes like e.g. /dev/log, as that gets covered by the
/dev tmpfs mounts.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-05-12 09:57:15 +02:00
Tianon Gravi b64310a3ce Update restrict.Restrict to both show the error message when failing to mount /dev/null over /proc/kcore, and to ignore "not exists" errors while doing so (for when CONFIG_PROC_KCORE=n in the kernel)
Docker-DCO-1.1-Signed-off-by: Andrew Page <admwiggin@gmail.com> (github: tianon)
2014-05-08 01:03:45 -06:00
Michael Crosby 68de553fca Merge pull request #5630 from rjnagal/libcontainer-fixes
Check supplied hostname before using it.
2014-05-06 09:49:52 -07:00
Michael Crosby 6c1bf629ef Improve libcontainer namespace and cap format
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-05 12:34:21 -07:00
Rohit Jnagal 2499d43325 Check supplied hostname before using it.
Docker-DCO-1.1-Signed-off-by: Rohit Jnagal <jnagal@google.com> (github: rjnagal)
2014-05-05 18:12:25 +00:00
Michael Crosby 60ba7d05e9 Merge pull request #5556 from crosbymichael/no-restrict-lxc
Don't restrict lxc because of apparmor
2014-05-02 17:20:27 -07:00
Guillaume J. Charmes 78a83e35aa Month devpts before mounting subdirs
Docker-DCO-1.1-Signed-off-by: Guillaume J. Charmes <guillaume@charmes.net> (github: creack)
2014-05-02 13:55:45 -07:00
Michael Crosby da71a207cb Don't restrict lxc because of apparmor
We don't have the flexibility to do extra things with lxc because it is
a black box and most fo the magic happens before we get a chance to
interact with it in dockerinit.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-02 11:14:24 -07:00
Guillaume J. Charmes 883af43a41 Merge pull request #5529 from crosbymichael/restrict-proc
Mount /proc and /sys read-only, except in privileged containers
2014-05-02 10:52:53 -07:00
Michael Crosby 08dbd53706 Apply apparmor before restrictions
There is not need for the remount hack, we use aa_change_onexec so the
apparmor profile is not applied until we exec the users app.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 19:09:12 -07:00
Victor Marmol 9bee7a2d73 Adding Rohit Jnagal and Victor Marmol to pkg/libcontainer maintainers.
Docker-DCO-1.1-Signed-off-by: Victor Marmol <vmarmol@google.com> (github: vmarmol)
2014-05-01 15:51:38 -07:00
Michael Crosby 3720dca241 Fix /proc/kcore mount of /dev/null
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 15:26:58 -07:00
Michael Crosby ab60f78e8b Mount attr and task as rw for selinux support
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 15:26:58 -07:00
Michael Crosby df78075a8e Update restrictions for better handling of mounts
This also cleans up some of the left over restriction paths code from
before.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 15:26:58 -07:00
Michael Crosby dcc4061db3 Update to enable cross compile
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 15:26:58 -07:00
Jérôme Petazzoni 5a6b042e53 Mount /proc and /sys read-only, except in privileged containers.
It has been pointed out that some files in /proc and /sys can be used
to break out of containers. However, if those filesystems are mounted
read-only, most of the known exploits are mitigated, since they rely
on writing some file in those filesystems.

This does not replace security modules (like SELinux or AppArmor), it
is just another layer of security. Likewise, it doesn't mean that the
other mitigations (shadowing parts of /proc or /sys with bind mounts)
are useless. Those measures are still useful. As such, the shadowing
of /proc/kcore is still enabled with both LXC and native drivers.

Special care has to be taken with /proc/1/attr, which still needs to
be mounted read-write in order to enable the AppArmor profile. It is
bind-mounted from a private read-write mount of procfs.

All that enforcement is done in dockerinit. The code doing the real
work is in libcontainer. The init function for the LXC driver calls
the function from libcontainer to avoid code duplication.

Docker-DCO-1.1-Signed-off-by: Jérôme Petazzoni <jerome@docker.com> (github: jpetazzo)
2014-05-01 15:26:58 -07:00
Eiichi Tsukata 439e010c59 drop CAP_SYSLOG capability
Kernel capabilities for privileged syslog operations are currently splitted into
CAP_SYS_ADMIN and CAP_SYSLOG since the following commit:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ce6ada35bdf710d16582cc4869c26722547e6f11

This patch drops CAP_SYSLOG to prevent containers from messing with
host's syslog (e.g. `dmesg -c` clears up host's printk ring buffer).

Closes #5491

Docker-DCO-1.1-Signed-off-by: Eiichi Tsukata <devel@etsukata.com> (github: Etsukata)
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 11:43:55 -07:00
Michael Crosby 181743d9d2 Remove container.json from readme
No need to duplicate this information when we already have a
container.json file in the root of libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 18:52:15 -07:00
Michael Crosby 568ee3ea63 Make native driver use Exec func with different CreateCommand
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 18:49:24 -07:00