Update spec version

This bump of the spec includes a change to the device type to be a
string so that it is more readable in the JSON serialization.

It also includes the change where caps, no new privs, and process
labeling features are moved from the container config onto the process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Michael Crosby 2016-03-03 10:26:38 -08:00
parent b86570a4d4
commit aa9660027b
10 changed files with 179 additions and 109 deletions
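
To illustrate the readability point in the commit message: Go's `encoding/json` encodes a `rune` field as its numeric code point, while a `string` field is emitted as text. The sketch below uses two hypothetical single-field structs (`runeDevice`, `stringDevice`) rather than the real `Device` type, so treat it as an illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runeDevice is a hypothetical stand-in for the old field type.
type runeDevice struct {
	Type rune `json:"type"`
}

// stringDevice is a hypothetical stand-in for the new field type.
type stringDevice struct {
	Type string `json:"type"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(runeDevice{Type: 'c'}))   // {"type":99}
	fmt.Println(marshal(stringDevice{Type: "c"})) // {"type":"c"}
}
```

A rune is an alias for `int32`, which is why the first form serializes as a number.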

4
Godeps/Godeps.json generated

@ -58,8 +58,8 @@
},
{
"ImportPath": "github.com/opencontainers/specs",
"Comment": "v0.2.0-49-g25cbfc4",
"Rev": "25cbfc427ba6154016f33c7ed1b8ed43b8b7b7ed"
"Comment": "v0.3.0-15-ga1e32a8",
"Rev": "a1e32a8ead2ba57adce3e36e956b4dc32c1b85c4"
},
{
"ImportPath": "github.com/seccomp/libseccomp-golang",


@ -67,7 +67,7 @@ When in doubt, start on the [mailing-list](#mailing-list).
## Weekly Call
The contributors and maintainers of the project have a weekly meeting Wednesdays at 10:00 AM PST.
Everyone is welcome to participate in the [BlueJeans call][BlueJeans].
Everyone is welcome to participate via [UberConference web][UberConference] or audio-only: 646-494-8704 (no PIN needed.)
An initial agenda will be posted to the [mailing list](#mailing-list) earlier in the week, and everyone is welcome to propose additional topics or suggest other agenda alterations there.
Minutes are posted to the [mailing list](#mailing-list) and minutes from past calls are archived to the [wiki](https://github.com/opencontainers/specs/wiki) for those who are unable to join the call.
@ -155,3 +155,4 @@ Read more on [How to Write a Git Commit Message](http://chris.beams.io/posts/git
8. When possible, one keyword to scope the change in the subject (i.e. "README: ...", "runtime: ...")
[BlueJeans]: https://bluejeans.com/1771332256/
[UberConference]: https://www.uberconference.com/ssaul


@ -3,19 +3,6 @@
The Linux container specification uses various kernel features like namespaces, cgroups, capabilities, LSM, and file system jails to fulfill the spec.
Additional information is needed for Linux over the [default spec configuration](config.md) in order to configure these various kernel features.
## Capabilities
Capabilities is an array that specifies Linux capabilities that can be provided to the process inside the container.
Valid values are the strings for capabilities defined in [the man page](http://man7.org/linux/man-pages/man7/capabilities.7.html)
```json
"capabilities": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
]
```
## Default File Systems
The Linux ABI includes both syscalls and several special file paths.
@ -112,7 +99,7 @@ The runtime may supply them however it likes (with [mknod][mknod.2], by bind mou
The following parameters can be specified:
* **`type`** *(char, required)* - type of device: `c`, `b`, `u` or `p`.
* **`type`** *(string, required)* - type of device: `c`, `b`, `u` or `p`.
More info in [mknod(1)][mknod.1].
* **`path`** *(string, required)* - full path to device inside container.
* **`major, minor`** *(int64, required unless **`type`** is `p`)* - [major, minor numbers][devices] for the device.
@ -130,7 +117,7 @@ The following parameters can be specified:
"type": "c",
"major": 10,
"minor": 229,
"fileMode": 0666,
"fileMode": 438,
"uid": 0,
"gid": 0
},
@ -139,7 +126,7 @@ The following parameters can be specified:
"type": "b",
"major": 8,
"minor": 0,
"fileMode": 0660,
"fileMode": 432,
"uid": 0,
"gid": 0
}
@ -194,7 +181,7 @@ The runtime MUST apply entries in the listed order.
The following parameters can be specified:
* **`allow`** *(boolean, required)* - whether the entry is allowed or denied.
* **`type`** *(char, optional)* - type of device: `a` (all), `c` (char), or `b` (block).
* **`type`** *(string, optional)* - type of device: `a` (all), `c` (char), or `b` (block).
`null` or unset values mean "all", mapping to `a`.
* **`major, minor`** *(int64, optional)* - [major, minor numbers][devices] for the device.
`null` or unset values mean "all", mapping to [`*` in the filesystem API][cgroup-v1-devices].
@ -486,28 +473,6 @@ The kernel enforces the `soft` limit for a resource while the `hard` limit acts
]
```
## SELinux process label
SELinux process label specifies the label with which the processes in a container are run.
For more information about SELinux, see [Selinux documentation](http://selinuxproject.org/page/Main_Page)
###### Example
```json
"selinuxProcessLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675"
```
## Apparmor profile
Apparmor profile specifies the name of the apparmor profile that will be used for the container.
For more information about Apparmor, see [Apparmor documentation](https://wiki.ubuntu.com/AppArmor)
###### Example
```json
"apparmorProfile": "acme_secure_profile"
```
## seccomp
Seccomp provides application sandboxing mechanism in the Linux kernel.
@ -574,17 +539,6 @@ Its value is either slave, private, or shared.
"rootfsPropagation": "slave",
```
## No new privileges
Setting `noNewPrivileges` to true prevents the processes in the container from gaining additional privileges.
[The kernel doc](https://www.kernel.org/doc/Documentation/prctl/no_new_privs.txt) has more information on how this is achieved using a prctl system call.
###### Example
```json
"noNewPrivileges": true,
```
[cgroup-v1]: https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
[cgroup-v1-blkio]: https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt
[cgroup-v1-cpusets]: https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt


@ -33,6 +33,14 @@ type Process struct {
// Cwd is the current working directory for the process and must be
// relative to the container's root.
Cwd string `json:"cwd"`
// Capabilities are linux capabilities that are kept for the container.
Capabilities []string `json:"capabilities,omitempty"`
// ApparmorProfile specifies the apparmor profile for the container.
ApparmorProfile string `json:"apparmorProfile,omitempty"`
// SelinuxLabel specifies the selinux context that the container process is run as.
SelinuxLabel string `json:"selinuxLabel,omitempty"`
// NoNewPrivileges controls whether additional privileges could be gained by processes in the container.
NoNewPrivileges bool `json:"noNewPrivileges,omitempty"`
}
// Root contains information about the container's root filesystem on the host.
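
As a rough illustration of where the moved fields now live, the sketch below decodes a process object carrying the four fields added in this hunk. The `process` type is a trimmed, hypothetical copy of the spec's `Process` struct (only the fields needed here), and `parseProcess` is a helper invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// process is a trimmed, hypothetical copy of the spec's Process type.
type process struct {
	Cwd             string   `json:"cwd"`
	Args            []string `json:"args"`
	Capabilities    []string `json:"capabilities,omitempty"`
	ApparmorProfile string   `json:"apparmorProfile,omitempty"`
	SelinuxLabel    string   `json:"selinuxLabel,omitempty"`
	NoNewPrivileges bool     `json:"noNewPrivileges,omitempty"`
}

// example uses the same values as the spec's process example.
const example = `{
	"cwd": "/root",
	"args": ["sh"],
	"apparmorProfile": "acme_secure_profile",
	"selinuxLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675",
	"noNewPrivileges": true,
	"capabilities": ["CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE"]
}`

// parseProcess decodes a JSON process object.
func parseProcess(data []byte) (process, error) {
	var p process
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	p, err := parseProcess([]byte(example))
	if err != nil {
		panic(err)
	}
	fmt.Println(p.NoNewPrivileges, p.ApparmorProfile, len(p.Capabilities))
}
```

Because every new field uses `omitempty`, a process object that sets none of them serializes without those keys.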


@ -90,6 +90,17 @@ See links for details about [mountvol](http://ss64.com/nt/mountvol.html) and [Se
* **`env`** (array of strings, optional) contains a list of variables that will be set in the process's environment prior to execution. Elements in the array are specified as Strings in the form "KEY=value". The left hand side must consist solely of letters, digits, and underscores `_` as outlined in [IEEE Std 1003.1-2001](http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html).
* **`args`** (array of strings, required) executable to launch and any flags as an array. The executable is the first element and must be available at the given path inside of the rootfs. If the executable path is not an absolute path then the search $PATH is interpreted to find the executable.
For Linux-based systems the process structure supports the following process specific fields:
* **`capabilities`** (array of strings, optional) capabilities is an array that specifies Linux capabilities that can be provided to the process inside the container.
Valid values are the strings for capabilities defined in [the man page](http://man7.org/linux/man-pages/man7/capabilities.7.html)
* **`apparmorProfile`** (string, optional) apparmor profile specifies the name of the apparmor profile that will be used for the container.
For more information about Apparmor, see [Apparmor documentation](https://wiki.ubuntu.com/AppArmor)
* **`selinuxLabel`** (string, optional) SELinux process label specifies the label with which the processes in a container are run.
For more information about SELinux, see [Selinux documentation](http://selinuxproject.org/page/Main_Page)
* **`noNewPrivileges`** (bool, optional) setting `noNewPrivileges` to true prevents the processes in the container from gaining additional privileges.
[The kernel doc](https://www.kernel.org/doc/Documentation/prctl/no_new_privs.txt) has more information on how this is achieved using a prctl system call.
The user for the process is a platform-specific structure that allows specific control over which user the process runs as.
For Linux-based systems the user structure has the following fields:
@ -114,6 +125,14 @@ For Linux-based systems the user structure has the following fields:
"cwd": "/root",
"args": [
"sh"
],
"apparmorProfile": "acme_secure_profile",
"selinuxLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675",
"noNewPrivileges": true,
"capabilities": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
]
}
```


@ -14,8 +14,6 @@ type LinuxSpec struct {
// Linux contains platform specific configuration for linux based containers.
type Linux struct {
// Capabilities are linux capabilities that are kept for the container.
Capabilities []string `json:"capabilities"`
// UIDMapping specifies user mappings for supporting user namespaces on linux.
UIDMappings []IDMapping `json:"uidMappings,omitempty"`
// GIDMapping specifies group mappings for supporting user namespaces on linux.
@ -35,16 +33,10 @@ type Linux struct {
Namespaces []Namespace `json:"namespaces"`
// Devices are a list of device nodes that are created for the container
Devices []Device `json:"devices"`
// ApparmorProfile specifies the apparmor profile for the container.
ApparmorProfile string `json:"apparmorProfile"`
// SelinuxProcessLabel specifies the selinux context that the container process is run as.
SelinuxProcessLabel string `json:"selinuxProcessLabel"`
// Seccomp specifies the seccomp security settings for the container.
Seccomp Seccomp `json:"seccomp"`
// RootfsPropagation is the rootfs mount propagation mode for the container.
RootfsPropagation string `json:"rootfsPropagation,omitempty"`
// NoNewPrivileges controls whether additional privileges could be gained by processes in the container.
NoNewPrivileges bool `json:"noNewPrivileges,omitempty"`
}
// User specifies linux specific user and group information for the container's
@ -238,7 +230,7 @@ type Device struct {
// Path to the device.
Path string `json:"path"`
// Device type, block, char, etc.
Type rune `json:"type"`
Type string `json:"type"`
// Major is the device's major number.
Major int64 `json:"major"`
// Minor is the device's minor number.
@ -256,7 +248,7 @@ type DeviceCgroup struct {
// Allow or deny
Allow bool `json:"allow"`
// Device type, block, char, etc.
Type *rune `json:"type,omitempty"`
Type *string `json:"type,omitempty"`
// Major is the device's major number.
Major *int64 `json:"major,omitempty"`
// Minor is the device's minor number.


@ -1,19 +1,15 @@
# Runtime and Lifecycle
## Scope of a Container
Barring access control concerns, the entity using a runtime to create a container MUST be able to use the operations defined in this specification against that same container.
Whether other entities using the same, or other, instance of the runtime can see that container is out of scope of this specification.
## State
Runtime MUST store container metadata on disk so that external tools can consume and act on this information.
It is recommended that this data be stored in a temporary filesystem so that it can be removed on a system reboot.
On Linux/Unix based systems the metadata MUST be stored under `/run/opencontainer/containers`.
For non-Linux/Unix based systems the location of the root metadata directory is currently undefined.
Within that directory there MUST be one directory for each container created, where the name of the directory MUST be the ID of the container.
For example: for a Linux container with an ID of `173975398351`, there will be a corresponding directory: `/run/opencontainer/containers/173975398351`.
Within each container's directory, there MUST be a JSON encoded file called `state.json` that contains the runtime state of the container.
For example: `/run/opencontainer/containers/173975398351/state.json`.
The state of a container MUST include, at least, the following properties:
The `state.json` file MUST contain all of the following properties:
* **`version`**: (string) is the OCF specification version used when creating the container.
* **`ociVersion`**: (string) is the OCI specification version used when creating the container.
* **`id`**: (string) is the container's ID.
This MUST be unique across all containers on this host.
There is no requirement that it be unique across hosts.
@ -23,37 +19,111 @@ This allows the hooks to perform cleanup and teardown logic after the runtime de
* **`bundlePath`**: (string) is the absolute path to the container's bundle directory.
This is provided so that consumers can find the container's configuration and root filesystem on the host.
*Example*
When serialized in JSON, the format MUST adhere to the following pattern:
```json
{
"version": "0.2.0",
"id": "oc-container",
"ociVersion": "0.2.0",
"id": "oci-container1",
"pid": 4422,
"bundlePath": "/containers/redis"
}
```
See [Query State](#query-state) for information on retrieving the state of a container.
## Lifecycle
The lifecycle describes the timeline of events that happen from when a container is created to when it ceases to exist.
1. OCI compliant runtime is invoked by passing the bundle path as argument.
2. The container's runtime environment is created according to the configuration in [`config.json`](config.md).
Any updates to `config.json` after container is running do not affect the container.
3. The container's state.json file is written to the filesystem.
4. The prestart hooks are invoked by the runtime.
If any prestart hook fails, then the container is stopped and the lifecycle continues at step 8.
5. The user specified process is executed in the container.
6. The poststart hooks are invoked by the runtime.
If any poststart hook fails, then the container is stopped and the lifecycle continues at step 8.
7. Additional actions such as pausing the container, resuming the container or signaling the container may be performed using the runtime interface.
The container could also error out or crash.
8. The container is destroyed by undoing the steps performed during create phase (step 2).
9. The poststop hooks are invoked by the runtime and errors, if any, are logged.
10. The state.json file associated with the container is removed and the return code of the container's user specified process is returned or logged.
1. OCI compliant runtime is invoked with a reference to the location of the bundle.
How this reference is passed to the runtime is an implementation detail.
2. The container's runtime environment MUST be created according to the configuration in [`config.json`](config.md).
Any updates to `config.json` after container is running MUST NOT affect the container.
3. The prestart hooks MUST be invoked by the runtime.
If any prestart hook fails, then the container MUST be stopped and the lifecycle continues at step 8.
4. The user specified process MUST be executed in the container.
5. The poststart hooks MUST be invoked by the runtime.
If any poststart hook fails, then the container MUST be stopped and the lifecycle continues at step 8.
6. Additional actions such as pausing the container, resuming the container or signaling the container MAY be performed using the runtime interface.
The container MAY also error out, exit or crash.
7. The container MUST be destroyed by undoing the steps performed during create phase (step 2).
8. The poststop hooks MUST be invoked by the runtime and errors, if any, MAY be logged.
Note: The lifecycle is a WIP and it will evolve as we have more use cases and more information on the viability of a separate create phase.
## Operations
OCI compliant runtimes MUST support the following operations, unless the operation is not supported by the base operating system.
### Errors
In cases where the specified operation generates an error, this specification does not mandate how, or even if, that error is returned or exposed to the user of an implementation.
Unless otherwise stated, generating an error MUST leave the state of the environment as if the operation were never attempted - modulo any possible trivial ancillary changes such as logging.
### Query State
`state <container-id>`
This operation MUST generate an error if it is not provided the ID of a container.
This operation MUST return the state of a container as specified in the [State](#state) section.
In particular, the state MUST be serialized as JSON.
### Start
`start <container-id> <path-to-bundle>`
This operation MUST generate an error if it is not provided a path to the bundle and the container ID to associate with the container.
If the ID provided is not unique across all containers within the scope of the runtime, or is not valid in any other way, the implementation MUST generate an error.
Using the data in `config.json`, that are in the bundle's directory, this operation MUST create a new container.
This includes creating the relevant namespaces, resource limits, etc and configuring the appropriate capabilities for the container.
A new process within the scope of the container MUST be created as specified by the `config.json` file otherwise an error MUST be generated.
Attempting to start an already running container MUST have no effect on the container and MUST generate an error.
### Stop
`stop <container-id>`
This operation MUST generate an error if it is not provided the container ID.
This operation MUST stop and delete a running container.
Stopping a container MUST stop all of the processes running within the scope of the container.
Deleting a container MUST delete the associated namespaces and resources associated with the container.
Once a container is deleted, its `id` MAY be used by subsequent containers.
Attempting to stop a container that is not running MUST have no effect on the container and MUST generate an error.
### Exec
`exec <container-id> <path-to-json>`
This operation MUST generate an error if it is not provided the container ID and a path to the JSON describing the process to start.
The JSON describing the new process MUST adhere to the [Process configuration](config.md#process-configuration) definition.
This operation MUST create a new process within the scope of the container.
If the container is not running then this operation MUST have no effect on the container and MUST generate an error.
Executing this operation multiple times MUST result in a new process each time.
Example:
```
{
"terminal": true,
"user": {
"uid": 0,
"gid": 0,
"additionalGids": null
},
"args": [
"/bin/sleep",
"60"
],
"env": [
"version=1.0"
],
"cwd": "..."
}
```
This specification does not mandate the name of this JSON file.
See the specification of the `config.json` file for the definition of these fields.
The stopping, or exiting, of these secondary processes MUST have no effect on the state of the container.
In other words, a container (and its PID 1 process) MUST NOT be stopped due to the exiting of a secondary process.
## Hooks
See [runtime configuration for hooks](./config.md#hooks)
Many of the operations specified in this specification have "hooks" that allow for additional actions to be taken before or after each operation.
See [runtime configuration for hooks](./config.md#hooks) for more information.
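
The Query State operation requires the state to be serialized as JSON. A minimal sketch of that serialization is shown below, assuming the `ociVersion` property name from the state example; the `state` type and the `encodeState` helper are hypothetical and are not taken from any runtime's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// state is a hypothetical struct holding the four properties listed in the
// State section.
type state struct {
	OCIVersion string `json:"ociVersion"`
	ID         string `json:"id"`
	Pid        int    `json:"pid"`
	BundlePath string `json:"bundlePath"`
}

// encodeState serializes a container state as compact JSON.
func encodeState(s state) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeState(state{
		OCIVersion: "0.2.0",
		ID:         "oci-container1",
		Pid:        4422,
		BundlePath: "/containers/redis",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```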


@ -13,9 +13,14 @@ The redundancy reduction from removing the namespacing prefix is not useful enou
## Optional settings should have pointer Go types
So we have a consistent way to identify unset values ([source][optional-pointer]).
The exceptions are entries where the Go default for the type is a no-op in the spec, in which case `omitempty` is sufficient and no pointer is needed (sources [here][no-pointer-for-slices], [here][no-pointer-for-boolean], and [here][pointer-when-updates-require-changes]).
[capabilities]: config-linux.md#capabilities
[class-id]: config-linux.md#network
[integer-over-hex]: https://github.com/opencontainers/specs/pull/267#discussion_r48360013
[keep-prefix]: https://github.com/opencontainers/specs/pull/159#issuecomment-138728337
[no-pointer-for-boolean]: https://github.com/opencontainers/specs/pull/290#discussion_r50296396
[no-pointer-for-slices]: https://github.com/opencontainers/specs/pull/316/files#r50782982
[optional-pointer]: https://github.com/opencontainers/specs/pull/233#discussion_r47829711
[pointer-when-updates-require-changes]: https://github.com/opencontainers/specs/pull/317/files#r50932706


@ -6,12 +6,12 @@ const (
// VersionMajor is for API-incompatible changes
VersionMajor = 0
// VersionMinor is for functionality in a backwards-compatible manner
VersionMinor = 3
VersionMinor = 4
// VersionPatch is for backwards-compatible bug fixes
VersionPatch = 0
// VersionDev indicates development branch. Releases will be empty string.
VersionDev = ""
VersionDev = "-dev"
)
// Version is the specification version that the package types support.
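
The hunk ends before the `Version` definition itself. As a sketch only, and assuming the string is built by joining the four constants in `major.minor.patch` order followed by the dev suffix, the new values would combine as follows. The `version` function here is a hypothetical stand-in for that definition.

```go
package main

import "fmt"

const (
	// Values after this change, as shown in the hunk above.
	VersionMajor = 0
	VersionMinor = 4
	VersionPatch = 0
	VersionDev   = "-dev"
)

// version joins the constants; the exact format is an assumption.
func version() string {
	return fmt.Sprintf("%d.%d.%d%s", VersionMajor, VersionMinor, VersionPatch, VersionDev)
}

func main() {
	fmt.Println(version()) // 0.4.0-dev
}
```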

49
spec.go

@ -56,7 +56,13 @@ var specCommand = cli.Command{
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm",
},
Cwd: "/",
Cwd: "/",
NoNewPrivileges: true,
Capabilities: []string{
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE",
},
},
Hostname: "shell",
Mounts: []specs.Mount{
@ -105,11 +111,6 @@ var specCommand = cli.Command{
},
},
Linux: specs.Linux{
Capabilities: []string{
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE",
},
Resources: &specs.Resources{
Devices: []specs.DeviceCgroup{
{
@ -142,7 +143,6 @@ var specCommand = cli.Command{
Soft: uint64(1024),
},
},
NoNewPrivileges: true,
},
}
@ -246,7 +246,7 @@ func createLibcontainerConfig(cgroupName string, spec *specs.LinuxSpec) (*config
}
config := &configs.Config{
Rootfs: rootfsPath,
Capabilities: spec.Linux.Capabilities,
Capabilities: spec.Process.Capabilities,
Readonlyfs: spec.Root.Readonly,
Hostname: spec.Hostname,
Labels: []string{
@ -303,9 +303,9 @@ func createLibcontainerConfig(cgroupName string, spec *specs.LinuxSpec) (*config
}
config.Seccomp = seccomp
config.Sysctl = spec.Linux.Sysctl
config.ProcessLabel = spec.Linux.SelinuxProcessLabel
config.AppArmorProfile = spec.Linux.ApparmorProfile
config.NoNewPrivileges = spec.Linux.NoNewPrivileges
config.ProcessLabel = spec.Process.SelinuxLabel
config.AppArmorProfile = spec.Process.ApparmorProfile
config.NoNewPrivileges = spec.Process.NoNewPrivileges
if oomScoreAdj := spec.Linux.Resources.OOMScoreAdj; oomScoreAdj != nil {
config.OomScoreAdj = *oomScoreAdj
}
@ -362,7 +362,7 @@ func createCgroupConfig(name string, spec *specs.LinuxSpec) (*configs.Cgroup, er
}
for i, d := range spec.Linux.Resources.Devices {
var (
t = 'a'
t = "a"
major = int64(-1)
minor = int64(-1)
)
@ -378,8 +378,12 @@ func createCgroupConfig(name string, spec *specs.LinuxSpec) (*configs.Cgroup, er
if d.Access == nil || *d.Access == "" {
return nil, fmt.Errorf("device access at %d field cannot be empty", i)
}
dt, err := stringToDeviceRune(t)
if err != nil {
return nil, err
}
dd := &configs.Device{
Type: t,
Type: dt,
Major: major,
Minor: minor,
Permissions: *d.Access,
@ -494,6 +498,19 @@ func createCgroupConfig(name string, spec *specs.LinuxSpec) (*configs.Cgroup, er
return c, nil
}
func stringToDeviceRune(s string) (rune, error) {
switch s {
case "a":
return 'a', nil
case "b":
return 'b', nil
case "c":
return 'c', nil
default:
return 0, fmt.Errorf("invalid device type %q", s)
}
}
func createDevices(spec *specs.LinuxSpec, config *configs.Config) error {
// add whitelisted devices
config.Devices = []*configs.Device{
@ -561,8 +578,12 @@ func createDevices(spec *specs.LinuxSpec, config *configs.Config) error {
if d.GID != nil {
gid = *d.GID
}
dt, err := stringToDeviceRune(d.Type)
if err != nil {
return err
}
device := &configs.Device{
Type: d.Type,
Type: dt,
Path: d.Path,
Major: d.Major,
Minor: d.Minor,
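
The `stringToDeviceRune` helper added in this file can be exercised on its own. The sketch below copies the function from the hunk above and runs it over a few inputs. Note that, as written, it accepts only `a`, `b`, and `c`, so the `u` and `p` device types listed in the spec's device section would be rejected when passed through `createDevices`.

```go
package main

import "fmt"

// stringToDeviceRune is copied from the hunk above: it maps the spec's
// string device type onto the rune that libcontainer's Device expects.
func stringToDeviceRune(s string) (rune, error) {
	switch s {
	case "a":
		return 'a', nil
	case "b":
		return 'b', nil
	case "c":
		return 'c', nil
	default:
		return 0, fmt.Errorf("invalid device type %q", s)
	}
}

func main() {
	for _, s := range []string{"a", "b", "c", "p", "u", ""} {
		r, err := stringToDeviceRune(s)
		if err != nil {
			fmt.Printf("%q: %v\n", s, err)
			continue
		}
		fmt.Printf("%q maps to %q\n", s, r)
	}
}
```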