# Runtime and Lifecycle
## Scope of a Container
Barring access control concerns, the entity using a runtime to create a container MUST be able to use the operations defined in this specification against that same container.
Whether other entities using the same, or other, instance of the runtime can see that container is out of scope of this specification.
## State
The state of a container includes the following properties:
* **`ociVersion`** (string, REQUIRED) is the OCI specification version used when creating the container.
* **`id`** (string, REQUIRED) is the container's ID.
This MUST be unique across all containers on this host.
There is no requirement that it be unique across hosts.
* **`status`** (string, REQUIRED) is the runtime state of the container.
    The value MAY be one of:

    * `created`: the container process has neither exited nor executed the user-specified program
    * `running`: the container process has executed the user-specified program but has not exited
    * `stopped`: the container process has exited

    Additional values MAY be defined by the runtime; however, they MUST be used to represent new runtime states not defined above.
* **`pid`** (int, REQUIRED when `status` is `created` or `running`) is the ID of the container process, as seen by the host.
* **`bundlePath`** (string, REQUIRED) is the absolute path to the container's bundle directory.
This is provided so that consumers can find the container's configuration and root filesystem on the host.
* **`annotations`** (map, OPTIONAL) contains the list of annotations associated with the container.
If no annotations were provided then this property MAY either be absent or an empty map.
The state MAY include additional properties.
When serialized in JSON, the format MUST adhere to the following pattern:
```json
{
    "ociVersion": "0.2.0",
    "id": "oci-container1",
    "status": "running",
    "pid": 4422,
    "bundlePath": "/containers/redis",
    "annotations": {
        "myKey": "myValue"
    }
}
```
See [Query State](#query-state) for information on retrieving the state of a container.
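
As a rough illustration of the property rules in this section, the sketch below checks a serialized state document. The function name `validate_state` and its messages are hypothetical and not part of this specification; it covers only the REQUIRED and OPTIONAL properties listed above and ignores any additional ones.

```python
import json


def validate_state(text):
    """Return a list of problems found in a serialized state document.

    Hypothetical helper: checks only the properties listed in this section.
    """
    state = json.loads(text)
    problems = []
    # String properties that are always REQUIRED.
    for key in ("ociVersion", "id", "status", "bundlePath"):
        if not isinstance(state.get(key), str):
            problems.append(f"{key} must be a string")
    # pid is REQUIRED only when status is created or running.
    if state.get("status") in ("created", "running"):
        pid = state.get("pid")
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(pid, int) or isinstance(pid, bool):
            problems.append("pid must be an int when status is created or running")
    # annotations is OPTIONAL, but must be a map when present.
    if "annotations" in state and not isinstance(state["annotations"], dict):
        problems.append("annotations must be a map")
    return problems
```

An empty list means that no problem was found by these particular checks; it does not establish full conformance.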
## Lifecycle
The lifecycle describes the timeline of events that happen from when a container is created to when it ceases to exist.
1. An OCI-compliant runtime's [`create`](runtime.md#create) command is invoked with a reference to the location of the bundle and a unique identifier.
2. The container's runtime environment MUST be created according to the configuration in [`config.json`](config.md).
If the runtime is unable to create the environment specified in the [`config.json`](config.md), it MUST generate an error.
While the resources requested in the [`config.json`](config.md) MUST be created, the user-specified program (from [`process`](config.md#process)) MUST NOT be run at this time.
Any updates to [`config.json`](config.md) after this step MUST NOT affect the container.
3. Once the container is created, additional actions MAY be performed based on the features the runtime chooses to support.
However, some actions might only be available based on the current state of the container (e.g. only available while it is started).
4. The runtime's [`start`](runtime.md#start) command is invoked with the unique identifier of the container.
The runtime MUST run the user-specified program, as specified by [`process`](config.md#process).
5. The container process exits.
This MAY happen due to erroring out, exiting, crashing or the runtime's [`kill`](runtime.md#kill) operation being invoked.
6. The runtime's [`delete`](runtime.md#delete) command is invoked with the unique identifier of the container.
The container MUST be destroyed by undoing the steps performed during the create phase (step 2).
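
The `status` changes implied by the steps above can be sketched as a small transition table. This is one reading of the status definitions in the [State](#state) section, not normative text; the event names `start` and `exit` and the function `next_status` are hypothetical.

```python
# Allowed status changes, keyed by (current status, event).
# "exit" covers any reason the container process ends (step 5).
TRANSITIONS = {
    ("created", "start"): "running",
    ("created", "exit"): "stopped",
    ("running", "exit"): "stopped",
}


def next_status(status, event):
    """Return the status that follows `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid while {status!r}") from None
```

Keeping the table explicit makes invalid requests, such as starting a container that is already `running`, fail in one place.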
## Errors
In cases where the specified operation generates an error, this specification does not mandate how, or even if, that error is returned or exposed to the user of an implementation.
Unless otherwise stated, generating an error MUST leave the state of the environment as if the operation were never attempted - modulo any possible trivial ancillary changes such as logging.
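
One way an implementation might meet the "as if the operation were never attempted" requirement is to record an undo action for each completed step and run those actions in reverse when a later step fails. The sketch below uses `contextlib.ExitStack` from the Python standard library; `run_atomically` and the `(do, undo)` pairs are hypothetical names used only for illustration.

```python
from contextlib import ExitStack


def run_atomically(steps):
    """Run (do, undo) pairs in order; undo completed steps if one fails.

    `steps` is a hypothetical list of (do, undo) callables.
    """
    with ExitStack() as stack:
        for do, undo in steps:
            do()
            # Registered callbacks run in reverse order when the stack closes.
            stack.callback(undo)
        # Every step succeeded: detach the undo callbacks so they never run.
        stack.pop_all()
```

If any `do` raises, the exception still propagates to the caller after the completed steps have been undone.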
## Operations
OCI-compliant runtimes MUST support the following operations, unless the operation is not supported by the base operating system.
Note: these operations are not specifying any command-line APIs, and the parameters are inputs for general operations.
### Query State
`state <container-id>`
This operation MUST generate an error if it is not provided the ID of a container.
Attempting to query a container that does not exist MUST generate an error.
This operation MUST return the state of a container as specified in the [State](#state) section.
### Create
`create <container-id> <path-to-bundle>`
This operation MUST generate an error if it is not provided a path to the bundle and the container ID to associate with the container.
If the ID provided is not unique across all containers within the scope of the runtime, or is not valid in any other way, the implementation MUST generate an error and a new container MUST NOT be created.
Using the data in [`config.json`](config.md), this operation MUST create a new container.
This means that all of the resources associated with the container MUST be created; however, the user-specified program MUST NOT be run at this time.
If the runtime cannot create the container as specified in [`config.json`](config.md), it MUST generate an error and a new container MUST NOT be created.
Upon successful completion of this operation, the `status` property of this container MUST be `created`.
The runtime MAY validate `config.json` against this spec, either generically or with respect to the local system capabilities, before creating the container ([step 2](#lifecycle)).
Runtime callers who are interested in pre-create validation can run [bundle-validation tools](implementations.md#testing--tools) before invoking the create operation.
Any changes made to the [`config.json`](config.md) file after this operation will not have an effect on the container.
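
A minimal in-memory sketch of the create rules follows. It creates no real resources; `containers` (a dict keyed by container ID), `config` (standing in for the parsed `config.json`), and `CreateError` are all assumptions made for illustration.

```python
import copy


class CreateError(Exception):
    """Raised when create cannot produce a new container."""


def create(containers, container_id, config):
    """Register a new container in `containers`, a dict keyed by ID.

    Hypothetical model: `config` stands in for the parsed config.json.
    """
    if not container_id or config is None:
        raise CreateError("a container ID and a bundle configuration are required")
    if container_id in containers:
        # Nothing has been modified yet, so no new container exists.
        raise CreateError(f"container ID {container_id!r} is already in use")
    containers[container_id] = {
        "id": container_id,
        "status": "created",
        # Snapshot the configuration so later edits cannot affect the container.
        "config": copy.deepcopy(config),
    }
    return containers[container_id]
```

The deep copy is one simple way to model the rule that later changes to `config.json` have no effect on an existing container.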
### Start
`start <container-id>`
This operation MUST generate an error if it is not provided the container ID.
Attempting to start a container that does not exist MUST generate an error.
Attempting to start an already started container MUST have no effect on the container and MUST generate an error.
This operation MUST run the user-specified program as specified by [`process`](config.md#process).
Upon successful completion of this operation, the `status` property of this container MUST be `running`.
### Kill
`kill <container-id> <signal>`
This operation MUST generate an error if it is not provided the container ID.
Attempting to send a signal to a container that is not running MUST have no effect on the container and MUST generate an error.
This operation MUST send the specified signal to the process in the container.
When the process in the container is stopped, irrespective of it being as a result of a `kill` operation or any other reason, the `status` property of this container MUST be `stopped`.
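
The precondition on `kill` can be sketched as follows. The `container` dict with `status` and `pid` keys, the `KillError` type, and the injectable `send` parameter are assumptions for illustration; `os.kill` and `signal.Signals` are the standard-library calls used to resolve the signal name and deliver it.

```python
import os
import signal


class KillError(Exception):
    """Raised when a signal cannot be sent to the container."""


def kill(container, signal_name, send=os.kill):
    """Send `signal_name` (for example "SIGTERM") to the container process.

    `container` is a hypothetical dict with `status` and `pid` keys;
    `send` defaults to os.kill and can be replaced in tests.
    """
    if container.get("status") != "running":
        # No signal is sent, so the container is unaffected.
        raise KillError("container is not running")
    try:
        signum = signal.Signals[signal_name]
    except KeyError:
        raise KillError(f"unknown signal {signal_name!r}") from None
    send(container["pid"], signum)
```

Accepting `send` as a parameter keeps the check testable without signaling a real process.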
### Delete
`delete <container-id>`
This operation MUST generate an error if it is not provided the container ID.
Attempting to delete a container that does not exist MUST generate an error.
Attempting to delete a container whose process is still running MUST generate an error.
Deleting a container MUST delete the resources that were created during the `create` step.
Note that resources associated with the container, but not created by this container, MUST NOT be deleted.
Once a container is deleted, its ID MAY be used by a subsequent container.
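
The delete rules can be sketched with the same kind of in-memory model. Here each container entry carries a hypothetical `cleanups` list holding one callable per resource made during create; resources that were only associated with the container are not in that list and are therefore left alone. Treating only the `running` status as "still running" is an assumption of this sketch.

```python
class DeleteError(Exception):
    """Raised when a container cannot be deleted."""


def delete(containers, container_id):
    """Remove a container from `containers`, a dict keyed by ID.

    Hypothetical model: `cleanups` lists one callable per resource that
    create made for this container.
    """
    if container_id not in containers:
        raise DeleteError(f"no such container: {container_id!r}")
    container = containers[container_id]
    if container["status"] == "running":
        raise DeleteError("container process is still running")
    # Undo the create steps in reverse order.
    for cleanup in reversed(container["cleanups"]):
        cleanup()
    # The ID is now free for a subsequent container.
    del containers[container_id]
```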
## Hooks
Many of the operations defined in this specification have "hooks" that allow for additional actions to be taken before or after each operation.
See [runtime configuration for hooks](./config.md#hooks) for more information.