runc/libcontainer
Aleksa Sarai 1a5fdc1c5f
init: support setting -u with rootless containers
Now that rootless containers have support for multiple uid and gid
mappings, allow --user to work as expected. If the user is not mapped,
an error occurs (as usual).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-09 12:45:33 +10:00
apparmor Updating error condition in applying apparmor profile 2016-05-04 19:10:55 +05:30
cgroups Fix systemd cgroup after memory type changed 2017-08-25 01:14:16 -04:00
configs tests: fix for rootless multiple uids/gids 2017-09-09 12:45:32 +10:00
criurpc criurpc.proto: copy latest criurpc.proto from criu 3.3 2017-08-02 16:07:32 +00:00
devices Handle non-devices correctly in DeviceFromPath 2017-08-09 08:52:20 -07:00
integration Merge pull request #1537 from tklauser/staticcheck 2017-08-02 09:52:11 -04:00
intelrdt libcontainer: intelrdt: use init() to avoid race condition 2017-09-08 17:15:31 +08:00
keys Use keyctl wrappers from x/sys/unix 2017-06-09 15:55:18 +02:00
nsenter nsenter: do not resolve path in nsexec context 2017-09-09 12:45:33 +10:00
seccomp libcontainer: use Prctl() from x/sys/unix 2017-07-10 10:56:58 +02:00
specconv Merge pull request #1588 from s7v7nislands/delete_unused 2017-09-08 17:34:00 -07:00
stacktrace fix typos 2016-11-30 13:31:36 +08:00
system libcontainer: use ioctl wrappers from x/sys/unix 2017-07-10 10:56:58 +02:00
user Revert "Merge pull request #1450 from vrothberg/sgid-non-numeric" 2017-08-04 14:28:21 -07:00
utils Move libcontainer to x/sys/unix 2017-05-22 17:35:20 -05:00
xattr Use symlink xattr functions from x/sys/unix 2017-05-31 13:50:34 +02:00
README.md update READ.me for new struct configs.Config.Capabilities 2017-06-09 18:47:05 +08:00
SPEC.md libcontainer/SPEC.md: add documentation for Intel RDT/CAT 2017-09-01 14:26:33 +08:00
capabilities_linux.go Remove ambient build tag 2017-03-15 11:38:43 -07:00
compat_1.5_linux.go Don't set /proc/<PID>/setgroups to deny in Go1.5 2015-08-03 14:59:15 -04:00
console.go Remove terminal info 2017-03-16 10:23:59 -07:00
console_freebsd.go console: don't chown(2) the slave PTY 2016-12-01 15:49:36 +11:00
console_linux.go libcontainer: use ioctl wrappers from x/sys/unix 2017-07-10 10:56:58 +02:00
console_solaris.go console: don't chown(2) the slave PTY 2016-12-01 15:49:36 +11:00
console_windows.go console: don't chown(2) the slave PTY 2016-12-01 15:49:36 +11:00
container.go libcontainer: Replace GetProcessStartTime with Stat_t.StartTime 2017-06-20 16:26:55 -07:00
container_linux.go nsenter: correctly handle newgidmap path for rootless containers 2017-09-09 12:45:32 +10:00
container_linux_test.go libcontainer: intelrdt: use init() to avoid race condition 2017-09-08 17:15:31 +08:00
container_solaris.go Get runc to build clean on Solaris 2016-04-12 16:13:08 -07:00
container_windows.go Windows: Refactor Container interface 2015-11-02 15:12:16 -08:00
criu_opts_linux.go checkpoint: support lazy migration 2017-09-06 12:35:38 +00:00
criu_opts_windows.go Windows: Factor down criu_opts 2015-10-23 12:58:59 -07:00
error.go Fix the outdated comment for Error interface 2017-01-03 15:06:47 +08:00
error_test.go [unittest] add extra ErrorCode in TestErrorCode testcase 2016-09-22 20:15:54 +08:00
factory.go could load a stopped container. 2017-04-07 07:39:41 -04:00
factory_linux.go rootless: allow multiple user/group mappings 2017-09-09 12:45:32 +10:00
factory_linux_test.go libcontainer: add test cases for Intel RDT/CAT 2017-09-01 14:35:40 +08:00
generic_error.go libcontainer: refactor syncT handling 2016-12-01 15:46:04 +11:00
generic_error_test.go add testcase in generic_error_test.go 2017-04-18 08:56:02 +08:00
init_linux.go init: support setting -u with rootless containers 2017-09-09 12:45:33 +10:00
message_linux.go rootless: allow multiple user/group mappings 2017-09-09 12:45:32 +10:00
network_linux.go Revert "fix minor issue" 2017-03-20 12:28:43 +11:00
notify_linux.go Fix flaky test TestNotifyOnOOM 2017-08-14 15:18:59 +08:00
notify_linux_test.go Some fixes for testMemoryNotification 2017-08-14 15:28:03 +08:00
process.go Add separate console socket 2017-03-16 10:23:59 -07:00
process_linux.go libcontainer: add support for Intel RDT/CAT in runc 2017-09-01 14:26:33 +08:00
restored_process.go libcontainer: Replace GetProcessStartTime with Stat_t.StartTime 2017-06-20 16:26:55 -07:00
rootfs_linux.go fix --read-only containers under --userns-remap 2017-08-24 16:43:21 -06:00
rootfs_linux_test.go Remove check for binding to / 2016-09-29 15:26:09 -07:00
setgroups_linux.go Don't set /proc/<PID>/setgroups to deny in Go1.5 2015-08-03 14:59:15 -04:00
setns_init_linux.go setns init: delay seccomp as late as possible 2017-08-26 13:42:30 +10:00
standard_init_linux.go init: move close(stateDirFd) before seccomp apply 2017-08-26 13:42:26 +10:00
state_linux.go libcontainer: add support for Intel RDT/CAT in runc 2017-09-01 14:26:33 +08:00
state_linux_test.go add createdState and runningState status testcase 2017-04-19 16:28:03 +08:00
stats.go Move libcontainer into subdirectory 2015-06-21 19:29:15 -07:00
stats_freebsd.go Move libcontainer into subdirectory 2015-06-21 19:29:15 -07:00
stats_linux.go libcontainer: add support for Intel RDT/CAT in runc 2017-09-01 14:26:33 +08:00
stats_solaris.go Get runc to build clean on Solaris 2016-04-12 16:13:08 -07:00
stats_windows.go Move libcontainer into subdirectory 2015-06-21 19:29:15 -07:00
sync.go Add separate console socket 2017-03-16 10:23:59 -07:00

README.md

libcontainer


Libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container, performing additional operations after the container is created.

Container

A container is a self-contained execution environment that shares the kernel of the host system and is (optionally) isolated from other containers in the system.

Using libcontainer

Because containers are spawned in a two-step process, you need a binary that will be executed as the init process for the container. In libcontainer, the current binary (/proc/self/exe) is executed as the init process with the argument "init". We call this first step "bootstrap", so you always need an "init" function as the entry point of the bootstrap.

In addition to the Go init function, the early-stage bootstrap is handled by importing nsenter.

import (
	"os"
	"runtime"

	"github.com/opencontainers/runc/libcontainer"
	_ "github.com/opencontainers/runc/libcontainer/nsenter"
	"github.com/sirupsen/logrus"
)

func init() {
	if len(os.Args) > 1 && os.Args[1] == "init" {
		runtime.GOMAXPROCS(1)
		runtime.LockOSThread()
		factory, _ := libcontainer.New("")
		if err := factory.StartInitialization(); err != nil {
			logrus.Fatal(err)
		}
		panic("--this line should have never been executed, congratulations--")
	}
}
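
The two-step start described above can be illustrated without libcontainer. The following is a minimal sketch that uses only the standard library: the parent re-executes its own binary with an "init" argument, and the child detects that argument before doing anything else. The helpers isInitInvocation and bootstrapCommand are hypothetical names chosen for this illustration, not part of libcontainer, and the sketch performs no namespace or cgroup setup.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// isInitInvocation reports whether the binary was started as the
// bootstrap ("init") step, mirroring the os.Args check shown above.
func isInitInvocation(args []string) bool {
	return len(args) > 1 && args[1] == "init"
}

// bootstrapCommand builds a command that re-executes the given binary
// with "init" as its first argument. On Linux, self would typically be
// "/proc/self/exe".
func bootstrapCommand(self string) *exec.Cmd {
	cmd := exec.Command(self, "init")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	if isInitInvocation(os.Args) {
		// Child: a real runtime would set up the container here.
		fmt.Println("running as init, pid", os.Getpid())
		return
	}
	// Parent: spawn ourselves again as the init process.
	if err := bootstrapCommand("/proc/self/exe").Run(); err != nil {
		fmt.Fprintln(os.Stderr, "bootstrap failed:", err)
		os.Exit(1)
	}
}
```

Checking the argument as early as possible keeps the child from running any of the parent's normal startup logic, which is why the README places the check in an init function.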

Then to create a container you first have to initialize an instance of a factory that will handle the creation and initialization for a container.

factory, err := libcontainer.New("/var/lib/container", libcontainer.Cgroupfs, libcontainer.InitArgs(os.Args[0], "init"))
if err != nil {
	logrus.Fatal(err)
	return
}

Once you have an instance of the factory, you can create a configuration struct describing how the container is to be created. A sample would look similar to this:

defaultMountFlags := unix.MS_NOEXEC | unix.MS_NOSUID | unix.MS_NODEV
config := &configs.Config{
	Rootfs: "/your/path/to/rootfs",
	Capabilities: &configs.Capabilities{
                Bounding: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Effective: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Inheritable: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Permitted: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Ambient: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
        },
	Namespaces: configs.Namespaces([]configs.Namespace{
		{Type: configs.NEWNS},
		{Type: configs.NEWUTS},
		{Type: configs.NEWIPC},
		{Type: configs.NEWPID},
		{Type: configs.NEWUSER},
		{Type: configs.NEWNET},
	}),
	Cgroups: &configs.Cgroup{
		Name:   "test-container",
		Parent: "system",
		Resources: &configs.Resources{
			MemorySwappiness: nil,
			AllowAllDevices:  nil,
			AllowedDevices:   configs.DefaultAllowedDevices,
		},
	},
	MaskPaths: []string{
		"/proc/kcore",
		"/sys/firmware",
	},
	ReadonlyPaths: []string{
		"/proc/sys", "/proc/sysrq-trigger", "/proc/irq", "/proc/bus",
	},
	Devices:  configs.DefaultAutoCreatedDevices,
	Hostname: "testing",
	Mounts: []*configs.Mount{
		{
			Source:      "proc",
			Destination: "/proc",
			Device:      "proc",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "tmpfs",
			Destination: "/dev",
			Device:      "tmpfs",
			Flags:       unix.MS_NOSUID | unix.MS_STRICTATIME,
			Data:        "mode=755",
		},
		{
			Source:      "devpts",
			Destination: "/dev/pts",
			Device:      "devpts",
			Flags:       unix.MS_NOSUID | unix.MS_NOEXEC,
			Data:        "newinstance,ptmxmode=0666,mode=0620,gid=5",
		},
		{
			Device:      "tmpfs",
			Source:      "shm",
			Destination: "/dev/shm",
			Data:        "mode=1777,size=65536k",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "mqueue",
			Destination: "/dev/mqueue",
			Device:      "mqueue",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "sysfs",
			Destination: "/sys",
			Device:      "sysfs",
			Flags:       defaultMountFlags | unix.MS_RDONLY,
		},
	},
	UidMappings: []configs.IDMap{
		{
			ContainerID: 0,
			HostID:      1000,
			Size:        65536,
		},
	},
	GidMappings: []configs.IDMap{
		{
			ContainerID: 0,
			HostID:      1000,
			Size:        65536,
		},
	},
	Networks: []*configs.Network{
		{
			Type:    "loopback",
			Address: "127.0.0.1/0",
			Gateway: "localhost",
		},
	},
	Rlimits: []configs.Rlimit{
		{
			Type: unix.RLIMIT_NOFILE,
			Hard: uint64(1025),
			Soft: uint64(1025),
		},
	},
}
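
The UidMappings and GidMappings entries in the sample each describe a contiguous range of IDs: container ID 0 corresponds to host ID 1000, container ID 1 to host ID 1001, and so on for 65536 IDs. The following is a minimal sketch of that arithmetic using only the standard library. The idMap type is a local stand-in with the same field names as the sample's configs.IDMap entries, and hostID and errUnmapped are hypothetical names for this illustration; this is not libcontainer's own lookup code.

```go
package main

import (
	"errors"
	"fmt"
)

// idMap is a local stand-in with the same fields as the configs.IDMap
// entries in the sample configuration.
type idMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// errUnmapped is returned when no mapping range covers the requested ID.
var errUnmapped = errors.New("id is not mapped")

// hostID translates an ID inside the container to the corresponding
// host ID, or returns errUnmapped if no mapping range covers it.
func hostID(maps []idMap, containerID int) (int, error) {
	for _, m := range maps {
		if containerID >= m.ContainerID && containerID < m.ContainerID+m.Size {
			return m.HostID + (containerID - m.ContainerID), nil
		}
	}
	return 0, errUnmapped
}

func main() {
	// Same range as UidMappings in the sample configuration.
	maps := []idMap{{ContainerID: 0, HostID: 1000, Size: 65536}}
	for _, id := range []int{0, 1, 65535, 65536} {
		host, err := hostID(maps, id)
		if err != nil {
			fmt.Printf("container id %d: %v\n", id, err)
			continue
		}
		fmt.Printf("container id %d -> host id %d\n", id, host)
	}
}
```

An ID outside every range has no host equivalent, which is why requesting an unmapped user for a container results in an error.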

Once you have the configuration populated you can create a container:

container, err := factory.Create("container-id", config)
if err != nil {
	logrus.Fatal(err)
	return
}

To spawn bash as the initial process inside the container and have the process's pid returned in order to wait, signal, or kill the process:

process := &libcontainer.Process{
	Args:   []string{"/bin/bash"},
	Env:    []string{"PATH=/bin"},
	User:   "daemon",
	Stdin:  os.Stdin,
	Stdout: os.Stdout,
	Stderr: os.Stderr,
}

if err := container.Run(process); err != nil {
	container.Destroy()
	logrus.Fatal(err)
	return
}

// wait for the process to finish.
if _, err := process.Wait(); err != nil {
	logrus.Fatal(err)
}

// destroy the container.
container.Destroy()

Additional ways to interact with a running container are:

// return all the pids for all processes running inside the container.
processes, err := container.Processes()

// get detailed cpu, memory, io, and network statistics for the container and
// its processes.
stats, err := container.Stats()

// pause all processes inside the container.
container.Pause()

// resume all paused processes.
container.Resume()

// send signal to container's init process.
container.Signal(signal)

// update container resource constraints.
container.Set(config)

// get current status of the container.
status, err := container.Status()

// get current container's state information.
state, err := container.State()

Checkpoint & Restore

libcontainer now integrates CRIU for checkpointing and restoring containers. This lets you save the state of a process running inside a container to disk, and then restore that state into a new process, on the same machine or on another machine.

CRIU version 1.5.2 or higher is required to use checkpoint and restore. If you don't already have CRIU installed, you can build it from source, following the online instructions. CRIU is also installed in the Docker image generated when building libcontainer with Docker.
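
A caller that wants to fail early can compare an installed version string against the 1.5.2 minimum before attempting a checkpoint. The following is a minimal sketch of such a comparison using only the standard library. The helpers parseVersion and atLeast are hypothetical, obtaining the version string from the installed tool is left out, and this is not how libcontainer itself performs the check.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a dotted version such as "1.5.2" into integers.
func parseVersion(v string) ([]int, error) {
	fields := strings.Split(strings.TrimSpace(v), ".")
	nums := make([]int, len(fields))
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, fmt.Errorf("invalid version %q: %w", v, err)
		}
		nums[i] = n
	}
	return nums, nil
}

// atLeast reports whether version have is greater than or equal to want.
// Missing components are treated as zero, so "3.3" equals "3.3.0".
func atLeast(have, want string) (bool, error) {
	h, err := parseVersion(have)
	if err != nil {
		return false, err
	}
	w, err := parseVersion(want)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(h) || i < len(w); i++ {
		var a, b int
		if i < len(h) {
			a = h[i]
		}
		if i < len(w) {
			b = w[i]
		}
		if a != b {
			return a > b, nil
		}
	}
	return true, nil
}

func main() {
	const required = "1.5.2"
	for _, v := range []string{"1.4", "1.5.2", "3.3"} {
		ok, err := atLeast(v, required)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("version %s satisfies >= %s: %v\n", v, required, ok)
	}
}
```

Comparing component by component avoids the pitfall of plain string comparison, where "1.10" would sort before "1.5".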

Code and documentation copyright 2014 Docker, Inc. Code released under the Apache 2.0 license. Docs released under Creative Commons.