replace passing of pid and console path via environment variable with passing
them with netlink message via an established pipe.
this change requires us to set _LIBCONTAINER_INITTYPE and
_LIBCONTAINER_INITPIPE as the env environment of the bootstrap process as we
only send the bootstrap data for setns process right now. When init and setns
bootstrap process are unified (i.e., init use nsexec instead of Go to clone new
process), we can remove _LIBCONTAINER_INITTYPE.
Note:
- we read nlmsghdr first before reading the content so we can get the total
length of the payload and allocate buffer properly instead of allocating
one large buffer.
- check read bytes vs the wanted number. It's an error if we failed to read
the desired number of bytes from the pipe into the buffer.
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
This patch fixes the following go vet warnings:
```
libcontainer/network_linux.go:96: github.com/vishvananda/netlink.Device
composite literal uses unkeyed fields
libcontainer/network_linux.go:114: github.com/vishvananda/netlink.Device
composite literal uses unkeyed fields
```
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
add bootstrap data to setns process. If we have any bootstrap data then copy it
to the bootstrap process (i.e. nsexec) using the sync pipe. This will allow us
to eventually replace environment variable usage with more structured data
to setup namespaces, write pid/gid map, setgroup etc.
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
Enables launching userns containers by catching EPERM errors for writing
to devices cgroups, and for mknod invocations.
Signed-off-by: Abin Shahab <ashahab@altiscale.com>
When starting and quering for pids a container can start and exit before
this is set. So set the opts after the process is started and while
libcontainer still has the container's process blocking on the pipe.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The former cgroup entry is confusing, separate it to parent
and name.
Rename entry `c` to `config`.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
'parent' function is confusing with parent cgroup, it's actually
parent path, so rename it to parentPath.
The name 'data' is too common to be identified, rename it to cgroupData
which is exactly what it is.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
The spec uses symlinks to "/proc/1/..." but the implementation uses
"/proc/self/...": see setupDevSymlinks (libcontainer/rootfs_linux.go).
The implementation is more correct, so I'm changing the spec to match
the implementation.
Signed-off-by: Alban Crequy <alban.crequy@coreos.com>
Minor fix, the former setupDev=true means not setup dev,
which is contrary to intuition, just correct it.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
We have a rule that for optional cgroups, don't fail if some
of them are not mounted, but we want it fail hard when a
user specifies an option and we are unable to fulfill the
request.
Memory cgroup should also follow this rule.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>