runc/libcontainer
Filipe Brandenburger cd41feb46b Remove detection for scope properties, which have always been broken
The detection for scope properties (whether scope units support
DefaultDependencies= or Delegate=) has always been broken, since systemd
refuses to create scopes unless at least one PID is attached to it (and
this has been so since scope units were introduced in systemd v205.)

This can be seen in journal logs whenever a container is started with
libpod:

  Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
  Feb 11 15:08:07 myhost systemd[1]: libcontainer-12345-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.

Since this logic never worked, just assume both attributes are supported
(which is what the code does when detection fails for this reason, since
it's looking for an "unknown attribute" or "read-only attribute" to mark
them as false) and skip the detection altogether.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
2019-02-11 16:05:37 -08:00
apparmor libcontainer: remove dependency on libapparmor 2017-12-15 09:59:58 +01:00
cgroups Remove detection for scope properties, which have always been broken 2019-02-11 16:05:37 -08:00
configs Merge pull request #1919 from xiaochenshen/rdt-mba-software-controller 2018-11-26 16:45:42 -05:00
criurpc Update criurpc definition for latest features 2018-12-21 07:42:12 +01:00
devices libcontainer: devices: fix mips builds 2018-06-17 11:22:01 +10:00
integration integration: fix mis-use of libcontainer.Factory 2019-01-24 23:12:48 +13:00
intelrdt libcontainer: intelrdt: fix null intelrdt path issue in Destroy() 2019-01-05 00:34:25 +08:00
keys various cleanups to address linter issues 2018-10-13 21:14:03 +02:00
mount remove placeholder for non-linux platforms 2017-11-24 18:14:51 +00:00
nsenter nsenter: clone /proc/self/exe to avoid exposing host binary to container 2019-02-08 18:57:59 +11:00
seccomp Fix breaking change in Seccomp profile behavior 2017-10-18 11:53:56 -04:00
specconv Merge pull request #1916 from crosbymichael/cgns 2018-11-13 12:21:38 -08:00
stacktrace doc: fix typo 2018-09-07 11:58:59 +08:00
system libcontainer: fix compilation on GOARCH=arm GOARM=6 (32 bits) 2018-06-14 18:33:14 +00:00
user libcontainer: CurrentGroupSubGIDs -> CurrentUserSubGIDs 2018-08-29 07:46:03 +09:00
utils test: add more test case for CleanPath 2018-09-14 21:37:12 +08:00
README.md Merge pull request #1916 from crosbymichael/cgns 2018-11-13 12:21:38 -08:00
SPEC.md Merge pull request #1919 from xiaochenshen/rdt-mba-software-controller 2018-11-26 16:45:42 -05:00
capabilities_linux.go libcontainer/capabilities_linux: Drop os.Getpid() call 2018-02-19 15:47:42 -08:00
console_linux.go tty: move IO of master pty to be done with epoll 2017-07-28 12:35:02 +01:00
container.go libcontainer: Set 'status' in hook stdin 2018-11-14 06:49:49 -08:00
container_linux.go Enable CRIU configuration files 2018-12-21 07:42:12 +01:00
container_linux_test.go Fix .Fatalf() error message 2018-12-19 20:22:48 +08:00
criu_opts_linux.go Update criu_opts_linux.go 2017-12-05 15:16:26 +08:00
error.go Fix the outdated comment for Error interface 2017-01-03 15:06:47 +08:00
error_test.go [unittest] add extra ErrorCode in TestErrorCode testcase 2016-09-22 20:15:54 +08:00
factory.go could load a stopped container. 2017-04-07 07:39:41 -04:00
factory_linux.go libcontainer: intelrdt: add support for Intel RDT/MBA in runc 2018-10-16 14:29:29 +08:00
factory_linux_test.go libcontainer: Set 'status' in hook stdin 2018-11-14 06:49:49 -08:00
generic_error.go libcontainer: refactor syncT handling 2016-12-01 15:46:04 +11:00
generic_error_test.go add testcase in generic_error_test.go 2017-04-18 08:56:02 +08:00
init_linux.go Merge pull request #1911 from theSuess/linter-fixes 2018-11-13 12:13:34 -05:00
message_linux.go Disable rootless mode except RootlessCgMgr when executed as the root in userns 2018-09-07 15:05:03 +09:00
network_linux.go Remove unused veth setup code 2018-08-24 15:41:52 -07:00
notify_linux.go Fix flaky test TestNotifyOnOOM 2017-08-14 15:18:59 +08:00
notify_linux_test.go Some fixes for testMemoryNotification 2017-08-14 15:28:03 +08:00
process.go Fix race in runc exec 2018-06-01 16:25:58 -07:00
process_linux.go libcontainer: Set 'status' in hook stdin 2018-11-14 06:49:49 -08:00
restored_process.go libcontainer: Replace GetProcessStartTime with Stat_t.StartTime 2017-06-20 16:26:55 -07:00
rootfs_linux.go rootfs: umount all procfs and sysfs with --no-pivot 2019-01-14 09:53:35 +01:00
rootfs_linux_test.go linux: drop check for /proc as invalid dest 2018-08-30 09:56:18 +02:00
setns_init_linux.go Merge pull request #1814 from rhatdan/selinux 2018-11-05 10:00:11 -05:00
standard_init_linux.go Merge pull request #1911 from theSuess/linter-fixes 2018-11-13 12:13:34 -05:00
state_linux.go libcontainer: Set 'status' in hook stdin 2018-11-14 06:49:49 -08:00
state_linux_test.go libcontainer/state_linux_test: Add a testTransitions helper 2018-01-25 11:18:45 -08:00
stats.go Move libcontainer into subdirectory 2015-06-21 19:29:15 -07:00
stats_linux.go libcontainer: add support for Intel RDT/CAT in runc 2017-09-01 14:26:33 +08:00
sync.go various cleanups to address linter issues 2018-10-13 21:14:03 +02:00

README.md

libcontainer

Libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It lets you manage the lifecycle of a container and perform additional operations after the container is created.

Container

A container is a self-contained execution environment that shares the kernel of the host system and is (optionally) isolated from other containers in the system.

Using libcontainer

Because containers are spawned in a two-step process, you will need a binary that will be executed as the init process for the container. In libcontainer, the current binary (/proc/self/exe) is executed as the init process with the argument "init". This first step is called "bootstrap", so you always need an "init" function as the entry point of "bootstrap".

In addition to the Go init function, the early-stage bootstrap is handled by importing nsenter.

import (
	"os"
	"runtime"

	"github.com/opencontainers/runc/libcontainer"
	_ "github.com/opencontainers/runc/libcontainer/nsenter"
	"github.com/sirupsen/logrus"
)

func init() {
	if len(os.Args) > 1 && os.Args[1] == "init" {
		runtime.GOMAXPROCS(1)
		runtime.LockOSThread()
		factory, _ := libcontainer.New("")
		if err := factory.StartInitialization(); err != nil {
			logrus.Fatal(err)
		}
		panic("--this line should have never been executed, congratulations--")
	}
}
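The two-step idea can be illustrated without libcontainer at all. The following is only a standard-library sketch of the re-exec pattern, not what libcontainer does internally: the helper names `isInitInvocation` and `bootstrapCommand` are hypothetical, and the "init" branch merely prints a message where a real init would set up the container.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// isInitInvocation reports whether the argument vector asks this binary to
// act as the init process, i.e. its first argument is "init".
func isInitInvocation(args []string) bool {
	return len(args) > 1 && args[1] == "init"
}

// bootstrapCommand builds (but does not start) a command that re-executes
// the current binary through /proc/self/exe with the "init" argument.
func bootstrapCommand() *exec.Cmd {
	cmd := exec.Command("/proc/self/exe", "init")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd
}

func main() {
	if isInitInvocation(os.Args) {
		// Second step: a real init would configure the container here.
		fmt.Println("running as init")
		return
	}
	// First step: the parent prepares the re-exec of itself.
	cmd := bootstrapCommand()
	fmt.Println("would re-exec:", cmd.Path, cmd.Args[1:])
}
```

The check mirrors the `os.Args[1] == "init"` test in the snippet above; everything else is illustrative.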

Then to create a container you first have to initialize an instance of a factory that will handle the creation and initialization for a container.

factory, err := libcontainer.New("/var/lib/container", libcontainer.Cgroupfs, libcontainer.InitArgs(os.Args[0], "init"))
if err != nil {
	logrus.Fatal(err)
	return
}

Once you have a factory instance, you can create a configuration struct describing how the container is to be created. A sample would look similar to this:

defaultMountFlags := unix.MS_NOEXEC | unix.MS_NOSUID | unix.MS_NODEV
config := &configs.Config{
	Rootfs: "/your/path/to/rootfs",
	Capabilities: &configs.Capabilities{
                Bounding: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Effective: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Inheritable: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Permitted: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
                Ambient: []string{
                        "CAP_CHOWN",
                        "CAP_DAC_OVERRIDE",
                        "CAP_FSETID",
                        "CAP_FOWNER",
                        "CAP_MKNOD",
                        "CAP_NET_RAW",
                        "CAP_SETGID",
                        "CAP_SETUID",
                        "CAP_SETFCAP",
                        "CAP_SETPCAP",
                        "CAP_NET_BIND_SERVICE",
                        "CAP_SYS_CHROOT",
                        "CAP_KILL",
                        "CAP_AUDIT_WRITE",
                },
        },
	Namespaces: configs.Namespaces([]configs.Namespace{
		{Type: configs.NEWNS},
		{Type: configs.NEWUTS},
		{Type: configs.NEWIPC},
		{Type: configs.NEWPID},
		{Type: configs.NEWUSER},
		{Type: configs.NEWNET},
		{Type: configs.NEWCGROUP},
	}),
	Cgroups: &configs.Cgroup{
		Name:   "test-container",
		Parent: "system",
		Resources: &configs.Resources{
			MemorySwappiness: nil,
			AllowAllDevices:  nil,
			AllowedDevices:   configs.DefaultAllowedDevices,
		},
	},
	MaskPaths: []string{
		"/proc/kcore",
		"/sys/firmware",
	},
	ReadonlyPaths: []string{
		"/proc/sys", "/proc/sysrq-trigger", "/proc/irq", "/proc/bus",
	},
	Devices:  configs.DefaultAutoCreatedDevices,
	Hostname: "testing",
	Mounts: []*configs.Mount{
		{
			Source:      "proc",
			Destination: "/proc",
			Device:      "proc",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "tmpfs",
			Destination: "/dev",
			Device:      "tmpfs",
			Flags:       unix.MS_NOSUID | unix.MS_STRICTATIME,
			Data:        "mode=755",
		},
		{
			Source:      "devpts",
			Destination: "/dev/pts",
			Device:      "devpts",
			Flags:       unix.MS_NOSUID | unix.MS_NOEXEC,
			Data:        "newinstance,ptmxmode=0666,mode=0620,gid=5",
		},
		{
			Device:      "tmpfs",
			Source:      "shm",
			Destination: "/dev/shm",
			Data:        "mode=1777,size=65536k",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "mqueue",
			Destination: "/dev/mqueue",
			Device:      "mqueue",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "sysfs",
			Destination: "/sys",
			Device:      "sysfs",
			Flags:       defaultMountFlags | unix.MS_RDONLY,
		},
	},
	UidMappings: []configs.IDMap{
		{
			ContainerID: 0,
			HostID: 1000,
			Size: 65536,
		},
	},
	GidMappings: []configs.IDMap{
		{
			ContainerID: 0,
			HostID: 1000,
			Size: 65536,
		},
	},
	Networks: []*configs.Network{
		{
			Type:    "loopback",
			Address: "127.0.0.1/0",
			Gateway: "localhost",
		},
	},
	Rlimits: []configs.Rlimit{
		{
			Type: unix.RLIMIT_NOFILE,
			Hard: uint64(1025),
			Soft: uint64(1025),
		},
	},
}
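The UidMappings and GidMappings entries above map a contiguous range of IDs inside the container onto a range on the host: with ContainerID 0, HostID 1000 and Size 65536, ID 0 in the container corresponds to ID 1000 on the host. A minimal sketch of that arithmetic follows. The `idMap` type and the `toHostID` helper are hypothetical stand-ins written for this example; they are not part of the configs package.

```go
package main

import "fmt"

// idMap is a local stand-in carrying the same three fields as the mapping
// entries in the sample config; it is not the configs.IDMap type.
type idMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// toHostID translates an ID as seen inside the container into the matching
// host ID. It returns false when no mapping range covers the ID.
func toHostID(maps []idMap, containerID int) (int, bool) {
	for _, m := range maps {
		if containerID >= m.ContainerID && containerID < m.ContainerID+m.Size {
			return m.HostID + (containerID - m.ContainerID), true
		}
	}
	return 0, false
}

func main() {
	maps := []idMap{{ContainerID: 0, HostID: 1000, Size: 65536}}
	for _, id := range []int{0, 33, 65536} {
		host, ok := toHostID(maps, id)
		fmt.Println(id, host, ok)
	}
}
```

IDs outside every range, such as 65536 in this example, have no host counterpart.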

Once you have the configuration populated you can create a container:

container, err := factory.Create("container-id", config)
if err != nil {
	logrus.Fatal(err)
	return
}

To spawn bash as the initial process inside the container, and then wait for that process to finish:

process := &libcontainer.Process{
	Args:   []string{"/bin/bash"},
	Env:    []string{"PATH=/bin"},
	User:   "daemon",
	Stdin:  os.Stdin,
	Stdout: os.Stdout,
	Stderr: os.Stderr,
}

err = container.Run(process)
if err != nil {
	container.Destroy()
	logrus.Fatal(err)
	return
}

// wait for the process to finish.
_, err = process.Wait()
if err != nil {
	logrus.Fatal(err)
}

// destroy the container.
container.Destroy()
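The snippet above calls container.Destroy() explicitly before logrus.Fatal because Fatal ends the process with os.Exit, so deferred calls would not run. If the calling code returns errors instead of exiting, the cleanup can be expressed with defer. The sketch below shows that pattern against a fake: the `runner` interface and `fakeContainer` type are hypothetical and exist only for this example. They are not the libcontainer Container interface.

```go
package main

import (
	"errors"
	"fmt"
)

// runner is a hypothetical, minimal stand-in for the two container methods
// used in the flow above.
type runner interface {
	Run() error
	Destroy() error
}

// fakeContainer records whether Destroy was called so the flow is observable.
type fakeContainer struct {
	failRun   bool
	destroyed bool
}

func (f *fakeContainer) Run() error {
	if f.failRun {
		return errors.New("run failed")
	}
	return nil
}

func (f *fakeContainer) Destroy() error {
	f.destroyed = true
	return nil
}

// runAndDestroy runs the container and always destroys it afterwards. It
// returns the Run error if there is one, otherwise the Destroy error.
func runAndDestroy(c runner) (err error) {
	defer func() {
		if derr := c.Destroy(); err == nil {
			err = derr
		}
	}()
	return c.Run()
}

func main() {
	c := &fakeContainer{failRun: true}
	err := runAndDestroy(c)
	fmt.Println(err, c.destroyed)
}
```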

Additional ways to interact with a running container are:

// return all the pids for all processes running inside the container.
processes, err := container.Processes()

// get detailed cpu, memory, io, and network statistics for the container and
// its processes.
stats, err := container.Stats()

// pause all processes inside the container.
container.Pause()

// resume all paused processes.
container.Resume()

// send signal to container's init process.
container.Signal(signal)

// update container resource constraints.
container.Set(config)

// get current status of the container.
status, err := container.Status()

// get current container's state information.
state, err := container.State()
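Pause and Resume only make sense in certain states, so callers often check the status first. The following is a deliberately simplified model of that idea, assuming just three states. The `status` type and the `pause` and `resume` helpers are hypothetical and do not reproduce libcontainer's actual state handling.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a toy state type for this sketch only.
type status int

const (
	running status = iota
	paused
	stopped
)

func (s status) String() string {
	switch s {
	case running:
		return "running"
	case paused:
		return "paused"
	case stopped:
		return "stopped"
	}
	return "unknown"
}

var errInvalidTransition = errors.New("invalid state transition")

// pause moves a running container to paused and rejects any other state.
func pause(s status) (status, error) {
	if s != running {
		return s, errInvalidTransition
	}
	return paused, nil
}

// resume moves a paused container back to running and rejects any other state.
func resume(s status) (status, error) {
	if s != paused {
		return s, errInvalidTransition
	}
	return running, nil
}

func main() {
	s := running
	s, _ = pause(s)
	fmt.Println(s)
	s, _ = resume(s)
	fmt.Println(s)
	_, err := resume(s)
	fmt.Println(err)
}
```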

Checkpoint & Restore

libcontainer now integrates CRIU for checkpointing and restoring containers. This lets you save the state of a process running inside a container to disk, and then restore that state into a new process, on the same machine or on another machine.

criu version 1.5.2 or higher is required to use checkpoint and restore. If you don't already have criu installed, you can build it from source, following the online instructions. criu is also installed in the Docker image generated when building libcontainer with Docker.
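A caller that wants to fail early can compare an installed version string against the 1.5.2 minimum. The sketch below shows one way to do that comparison. The `parseVersion` and `atLeast` helpers are hypothetical, and how the version string is obtained from criu is left out on purpose. This is not how libcontainer itself performs the check.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "major.minor" or "major.minor.patch" into three
// integers; a missing patch component is treated as 0.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimSpace(v), ".")
	if len(parts) < 2 || len(parts) > 3 {
		return out, fmt.Errorf("unexpected version format %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid version component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether version v is greater than or equal to minimum,
// comparing major, then minor, then patch.
func atLeast(v, minimum [3]int) bool {
	for i := range v {
		if v[i] != minimum[i] {
			return v[i] > minimum[i]
		}
	}
	return true
}

func main() {
	minimum := [3]int{1, 5, 2}
	for _, s := range []string{"1.5.1", "1.5.2", "3.11"} {
		v, err := parseVersion(s)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(s, atLeast(v, minimum))
	}
}
```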

Code and documentation copyright 2014 Docker, Inc. The code and documentation are released under the Apache 2.0 license. The documentation is also released under Creative Commons Attribution 4.0 International License. You may obtain a copy of the license, titled CC-BY-4.0, at http://creativecommons.org/licenses/by/4.0/.