Refactor DeviceFromPath in order to get rid of package syscall and
directly use the functions from x/sys/unix. This also allows to get rid
of the conversion from the OS-independent file mode values (from the os
package) to Linux specific values and instead let's us use the raw
file mode value directly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use ParseSocketControlMessage and ParseUnixRights from
golang.org/x/sys/unix instead of their syscall equivalent.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
It appears as though these semantics were not fully thought out when
implementing them for rootless containers. It is not necessary (and
could be potentially dangerous) to set the owner of /run/ctr/$id to be
the root inside the container (if user namespaces are being used).
Instead, just use the e{g,u}id of runc to determine the owner.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Use IoctlGetInt and IoctlGetTermios/IoctlSetTermios instead of manually
reimplementing them.
Because of unlockpt, the ioctl wrapper is still needed as it needs to
pass a pointer to a value, which is not supported by any ioctl function
in x/sys/unix yet.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use unix.Prctl() instead of manually reimplementing it using
unix.RawSyscall. Also use unix.SECCOMP_MODE_FILTER instead of locally
defining it.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
replace #1492#1494fix#1422
Since https://github.com/opencontainers/runtime-spec/pull/876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Use unix.Eventfd() instead of calling manually reimplementing it using
the raw syscall. Also use the correct corresponding unix.EFD_CLOEXEC
flag instead of unix.FD_CLOEXEC (which can have a different value on
some architectures and thus might lead to unexpected behavior).
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
And Stat_t.PID and Stat_t.Name while we're at it. Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1]. With this change
we no longer claim Running status for zombie/dead processes.
I've also removed the kill(2) call from runType. It was originally
added in 13841ef3 (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2a (Make state detection precise,
2016-07-05, #930), and with the /stat access the kill(2) check is
redundant.
I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.
It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.
The Running -> Stopped change in checkpoint_test.go is because the
post-checkpoint process is a zombie, and with this commit zombie
processes are Stopped (and no longer Running).
[1]: https://github.com/opencontainers/runc/pull/1483#issuecomment-307527789
Signed-off-by: W. Trevor King <wking@tremily.us>
And convert the various start-time properties from strings to uint64s.
This removes all internal consumers of the deprecated
GetProcessStartTime function.
Signed-off-by: W. Trevor King <wking@tremily.us>
Use KeyctlJoinSessionKeyring, KeyctlString and KeyctlSetperm from
golang.org/x/sys/unix instead of manually reimplementing them.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
And use it only in local tooling that is forwarding the pseudoterminal
master. That way runC no longer has an opinion on the onlcr setting
for folks who are creating a terminal and detaching. They'll use
--console-socket and can setup the pseudoterminal however they like
without runC having an opinion. With this commit, the only cases
where runC still has applies SaneTerminal is when *it* is the process
consuming the master descriptor.
Signed-off-by: W. Trevor King <wking@tremily.us>
Use the NLA_ALIGNTO and NLA_HDRLEN constants from x/sys/unix instead of
syscall, as the syscall package shouldn't be used anymore (except for a
few exceptions).
This also makes the syscall_NLA_HDRLEN workaround for gccgo unnecessary.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use the symlink xattr syscall wrappers Lgetxattr, Llistxattr and
Lsetxattr from x/sys/unix (introduced in
golang/sys@b90f89a1e7) instead of
providing own wrappers. Leave the functionality of system.Lgetxattr
intact with respect to the retry with a larger buffer, but switch it to
use unix.Lgetxattr.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Follow commit 3d7cb4293c ("Move libcontainer to x/sys/unix") and also
move the examples in README.md from syscall to x/sys/unix.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Since syscall is outdated and broken for some architectures,
use x/sys/unix instead.
There are still some dependencies on the syscall package that will
remain in syscall for the forseeable future:
Errno
Signal
SysProcAttr
Additionally:
- os still uses syscall, so it needs to be kept for anything
returning *os.ProcessState, such as process.Wait.
Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
* User Case:
User could use prestart hook to add block devices to container. so the
hook should have a way to set the permissions of the devices.
Just move cgroup config operation before prestart hook will work.
Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
FreeBSD does not support cgroups or namespaces, which the code suggested, and is not supported
in runc anyway right now. So clean up the file naming to use `_linux` where appropriate.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
A freezer cgroup allows to dump processes faster.
If a user wants to checkpoint a container and its storage,
he has to pause a container, but in this case we need to pass
a path to its freezer cgroup to "criu dump".
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>