The issue with doing a remount as ro with sysfs is that if a container
is still in one of the hosts namepsaces, commonly with the NET
namespace, the remount will cause the host's systems sysfs to be
remounted as ro also. We can fix this correctly by not doing the
remount and just mount sys as ro in the first place.
The other remounts are individual files within proc so they will not
have this issue.
For context please see:
https://github.com/dotcloud/docker/issues/7101
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@docker.com> (github: crosbymichael)
This moves the sync pipe into a separate package to help the changes
when moving the API around.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@docker.com> (github: crosbymichael)
Ensure that the command is killed if we receive an error from the child
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@docker.com> (github: crosbymichael)
We use a unix domain socketpair instead of a pipe for the sync pipe,
which allows us to use two-way shutdown. After sending the
context we shut down the write side which lets the child know
it finished reading.
We then block on a read in the parent for the child closing the file
(ensuring we close our version of it too) to sync for when the child
is finished initializing. If the read is non-empty we assume this
is an error report and fail with an error. Otherwise we continue as
before.
This also means we're now calling back the start callback later,
meaning at that point its more likely to have succeeded, as well as
having consumed all the container resources (like volume mounts,
making it safe to e.g. unmount them when the start callback is
called).
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Also add unit test for container json files to ensure that the mount
config is read and device nodes are validated.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@docker.com> (github: crosbymichael)
'namespaces' need to refactored a bit more to move the API part of it to 'libcontainer' package and keep the namespace specific code inside that package.
This change is not expected to break docker.
Docker-DCO-1.1-Signed-off-by: Vishnu Kannan <vishnuk@google.com> (github: vishh)
Sometimes I was getting:
2014/06/13 13:47:24 finalize namespace drop bounding set read /proc/1/status: bad file descriptor
This happens when applying the capabilities, and the code that
reads the current caps opens /proc/1/status and then reads some data from it.
But during this it gets a EBADFD error.
The problem is that FinalizeNamespace() closes all FDs before applying
the caps, and if a GC then happens after /proc/1/status is opened but
before reading from the fd, then an old os.File finalizer may close the
already closed-and-reused fd, wreaking havoc.
We fix this by instead of closing the FDs we mark them close-on-exec
which guarantees that they will be closed when we do the final
exec into the container.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)