Commit Graph

43 Commits

Author SHA1 Message Date
Michael Crosby adcbe530a9 Add masked and readonly paths
Fixes #320

This adds the maskedPaths and readonlyPaths fields to the spec so that
proper masking and setting of files in /proc can be configured.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-04-01 10:46:41 -07:00
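As a rough illustration of the entry above (adcbe530a9), the two new fields could be decoded as follows. This is a minimal sketch, not the spec's Go code: the struct name `linuxPaths` and the example paths are assumptions, and only the `maskedPaths` and `readonlyPaths` keys come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// linuxPaths is a hypothetical, trimmed-down struct for illustration only.
// The JSON keys come from the commit message; everything else is assumed.
type linuxPaths struct {
	MaskedPaths   []string `json:"maskedPaths,omitempty"`
	ReadonlyPaths []string `json:"readonlyPaths,omitempty"`
}

// parsePaths decodes a config fragment into linuxPaths.
func parsePaths(raw string) (linuxPaths, error) {
	var p linuxPaths
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	// Example paths are illustrative choices, not values from the commit.
	raw := `{"maskedPaths": ["/proc/kcore"], "readonlyPaths": ["/proc/sys"]}`
	p, err := parsePaths(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("masked:", p.MaskedPaths, "read-only:", p.ReadonlyPaths)
}
```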
Antonio Murdaca 5ded78475c *: fix typos
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-03-21 11:51:19 +01:00
W. Trevor King 5dad125595 config-linux: Specify host mount namespace for namespace paths
Avoid trouble with situations like:

  # mount --bind /mnt/test /mnt/test
  # mount --make-rprivate /mnt/test
  # touch /mnt/test/mnt /mnt/test/user
  # mount --bind /proc/123/ns/mnt /mnt/test/mnt
  # mount --bind /proc/123/ns/user /mnt/test/user
  # nsenter --mount=/proc/123/ns/mnt --user=/proc/123/ns/user sh

which uses the required private mount for binding mount namespace
references [1,2,3].  We want to avoid:

1. Runtime opens /mnt/test/mnt as fd 3.
2. Runtime joins the mount namespace referenced by fd 3.
3. Runtime fails to open /mnt/test/user, because /mnt/test is not
   visible in the current mount namespace.

and instead get runtime authors to set up flows like:

1. Runtime opens /mnt/test/mnt as fd 3.
2. Runtime opens /mnt/test/user as fd 4.
3. Runtime joins the mount namespace referenced by fd 3.
4. Runtime joins the user namespace referenced by fd 4.

This also applies to new namespace creation.  We want to avoid:

1. Runtime clones a container process with a new mount namespace.
2c. Container process fails to open /mnt/test/user, because /mnt/test
    is not visible in the current mount namespace.

in favor of something like:

1. Runtime opens /mnt/test/user as fd 3.
2. Runtime clones a container process with a new mount namespace.
3h. Host process closes unneeded fd 3.
3c. Container process joins the user namespace referenced by fd 3.

I also define runtime and container namespaces, so we have consistent
terminology.  I prefer:

* host namespace: a namespace you are in when you invoke the runtime
* host process: the runtime process invoked by the user
* container process: the process created by a clone call in the host
  process which will eventually execute the user-configured process.

Both the host and container processes are running runtime code
(although the container process eventually transitions to
user-configured code), so I find "runtime process", "runtime
namespace", etc. to be imprecise.  However, the maintainer consensus
is for "runtime namespace" [4,5], so that's what we're going with
here.

[1]: http://karelzak.blogspot.com/2015/04/persistent-namespaces.html
[2]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4ce5d2b1a8fde84c0eebe70652cf28b9beda6b4e
[3]: http://mid.gmane.org/87haeahkzc.fsf@xmission.com
[4]: https://github.com/opencontainers/specs/pull/275#discussion_r48057211
[5]: https://github.com/opencontainers/specs/pull/275#discussion_r48324264

Signed-off-by: W. Trevor King <wking@tremily.us>
2016-03-16 14:47:29 -07:00
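The ordering argued for in the entry above (5dad125595) is "open every namespace reference first, join afterwards". A minimal sketch of the first half follows. The helper name `openAll` is hypothetical, and the actual join step (setns(2)) is deliberately left out because it needs privileges and a non-stdlib or raw syscall wrapper.

```go
package main

import (
	"fmt"
	"os"
)

// openAll opens every namespace reference path up front, so that a later
// namespace change cannot hide a path that has not been opened yet.
// On any failure it closes the files opened so far.
func openAll(paths []string) ([]*os.File, error) {
	files := make([]*os.File, 0, len(paths))
	for _, p := range paths {
		f, err := os.Open(p)
		if err != nil {
			for _, opened := range files {
				opened.Close()
			}
			return nil, fmt.Errorf("open %s: %w", p, err)
		}
		files = append(files, f)
	}
	return files, nil
}

func main() {
	// Paths taken from the commit message's example; they will not exist on most hosts.
	files, err := openAll([]string{"/mnt/test/mnt", "/mnt/test/user"})
	if err != nil {
		fmt.Println("cannot open namespace references:", err)
		return
	}
	for _, f := range files {
		// A real runtime would join each namespace here (setns(2) on f.Fd())
		// and only then close the descriptor. That step is omitted.
		fmt.Println("opened", f.Name(), "as fd", f.Fd())
		f.Close()
	}
}
```

Because all descriptors exist before any namespace is joined, joining the mount namespace first no longer matters: the user namespace reference is already held open.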
Julian Friedman 9d9ed06d5e Move rlimits to process
Signed-off-by: Julian Friedman <julz.friedman@uk.ibm.com>
2016-03-10 09:44:43 +00:00
Michael Crosby 5a8a779fb0 Move process specific settings to process
This moves process specific settings like caps, apparmor, and selinux
process label onto the process structure to allow the same settings to
be changed at exec time.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-03-02 11:40:09 -08:00
Qiang Huang ccf3a246ca Fix fileMode json example
In JSON, os.FileMode is represented as a uint32, which is written in
decimal. An octal-style literal fails with the error:
`invalid character '6' after object key:value pair`
when unmarshalling the JSON file.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-02-23 13:34:20 +08:00
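The behaviour described in the entry above (ccf3a246ca) can be reproduced with the standard library alone. This is a small sketch: the `device` struct and the helper names are hypothetical, and only the `fileMode` key and the use of os.FileMode come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// device is a hypothetical struct holding only the field under discussion.
type device struct {
	FileMode os.FileMode `json:"fileMode"`
}

// encode marshals a device to its JSON text.
func encode(d device) string {
	b, err := json.Marshal(d)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decode unmarshals raw into a device and returns any error.
func decode(raw string) (device, error) {
	var d device
	err := json.Unmarshal([]byte(raw), &d)
	return d, err
}

func main() {
	// Octal 0666 in Go source is written out as a decimal JSON number.
	fmt.Println(encode(device{FileMode: 0666}))

	// JSON numbers cannot have a leading zero, so an octal-looking literal is rejected.
	_, err := decode(`{"fileMode": 0666}`)
	fmt.Println("octal-style literal:", err)

	// The decimal form round-trips.
	d, err := decode(`{"fileMode": 438}`)
	fmt.Printf("decimal literal: %o %v\n", d.FileMode, err)
}
```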
Qiang Huang 9bab930044 Fix type of devices type
Fixes: opencontainers/runc#566

For type rune, we can assign a char such as 'c' in the struct, but after
marshalling it is represented as an int32. So in the JSON config it has
to be written as a number, which is hard to recognize.

Change it to string so that you can actually write "b", "c" in the JSON
spec and easily tell what type of device it is.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-02-23 13:33:57 +08:00
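The difference behind the entry above (9bab930044) is easy to see by marshalling both variants. The struct names here are hypothetical and exist only for this sketch; the `type` key is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runeDevice and stringDevice are hypothetical structs that differ only in
// the Go type of the device-type field.
type runeDevice struct {
	Type rune `json:"type"`
}

type stringDevice struct {
	Type string `json:"type"`
}

// mustJSON marshals v and panics on failure, which should not happen for these types.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// rune is an alias for int32, so the JSON carries a code point number.
	fmt.Println(mustJSON(runeDevice{Type: 'c'}))
	// A string field keeps the device type readable in the config.
	fmt.Println(mustJSON(stringDevice{Type: "c"}))
}
```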
W. Trevor King 1b0056cbff config-linux: Update links to cgroups documentation
With 34a9304a (Merge branch 'for-4.5' of
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup, 2016-01-13,
[1]), Linux restructured their cgroups documentation.  This updates
all of our Documentation/cgroups references to match the new layout,
using reference-style links [2] which let us collect link label
definitions at the bottom of the file.  That makes the spec source
easier to read (no distracting URLs in the middle of a sentence) and
makes the URLs easier to update (only one place to check / fix).

[1]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34a9304a96d6351c2d35dcdc9293258378fc0bd8
[2]: http://daringfireball.net/projects/markdown/syntax#link

Signed-off-by: W. Trevor King <wking@tremily.us>
2016-01-27 20:14:33 -08:00
W. Trevor King 7d5b027673 runtime-config-linux: Separate mknod from cgroups
With mknod entries in linux.devices and cgroups entries in
linux.resources.devices.  Background discussion in [1].

For specifying device cgroups independent of device creation.  This
makes it easy to distinguish configs that call for cgroup
adjustments (which have linux.resources entries) from those that
don't.  Without this split, folks interested in making that
distinction would have to parse the device section to determine if it
included cgroup changes.  This will also make it easy to drop either
portion (mknod [2] or cgroups [3]) independently of the other if the
project decides to do so.

Using separate sections for mknod and cgroups also allows us to avoid
the complicated validation rules needed for the combined format
mknod/cgroup [4].

Now that there is a section specific to supplying devices, I shifted
the default device listing over from config-linux [5].  The /dev/ptmx
entry is a bit awkward, since it's not a device, but it seemed to fit
better over here.  But I would also be fine leaving it with the other
mounts in config-linux.

fileMode, uid, and gid are optional, because mknod(2) doesn't need
them and specifies the handling when they aren't set [6,7].
Similarly, major/minor numbers are only required for S_IFCHR and
S_IFBLK [6].  I've left off wording about required runtime behavior
for unset values, because I'd rather address that with a blanket rule
[8].

For the cgroup, access is optional because the kernel docs show an
example that doesn't write an access field to the devices.deny file
[9].  The current kernel docs don't go into much detail on this
behavior (I expect unset and 'rwm' are equivalent), but if the kernel
doesn't need a value written, the spec should get out of the way and
allow users to not specify a value.

The reference links are sorted into two blocks, with kernel-doc links
sorted alphabetically followed by man pages sorted alphabetically by
section.  The cgroup link is new since 2016-01-13 [10].

[1]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/y_Fsa2_jJaM
     Subject: Separate config entries for device mknod and cgroups?
     Date: Mon, 5 Oct 2015 12:46:55 -0700
     Message-ID: <20151005194655.GN28418@odin.tremily.us>
[2]: https://github.com/opencontainers/specs/pull/98
[3]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/qWHoKs8Fsrk
     Subject: removal of cgroups from the OCI Linux spec
     Date: Wed, 28 Oct 2015 17:01:59 +0000
     Message-ID: <CAD2oYtO1RMCcUp52w-xXemzDTs+J6t4hS5Mm4mX+uBnVONGDfA@mail.gmail.com>
[4]: https://github.com/opencontainers/specs/pull/101
[5]: https://github.com/opencontainers/specs/pull/171#discussion_r41190655
[6]: http://man7.org/linux/man-pages/man2/mknod.2.html#DESCRIPTION
[7]: https://github.com/opencontainers/specs/pull/298/files#r51053835
[8]: https://github.com/opencontainers/specs/pull/285#issuecomment-167823651
[9]: https://kernel.org/doc/Documentation/cgroup-v1/devices.txt
[10]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34a9304a96d6351c2d35dcdc9293258378fc0bd8

Signed-off-by: W. Trevor King <wking@tremily.us>
2016-01-27 13:52:15 -08:00
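One way to picture the split described in the entry above (7d5b027673) is with two independent sections, where a consumer checks for cgroup changes without parsing the mknod entries. This is only a sketch under stated assumptions: all type names and the `path`, `type`, and `allow` keys are hypothetical. Only `fileMode`, `uid`, `gid`, major/minor, `access`, `linux.devices`, and `linux.resources.devices` are named in the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// device describes a node to create with mknod(2). Pointer fields are optional.
// The "path" and "type" keys are assumptions for this sketch.
type device struct {
	Path     string  `json:"path"`
	Type     string  `json:"type"`
	Major    *int64  `json:"major,omitempty"`
	Minor    *int64  `json:"minor,omitempty"`
	FileMode *uint32 `json:"fileMode,omitempty"`
	UID      *uint32 `json:"uid,omitempty"`
	GID      *uint32 `json:"gid,omitempty"`
}

// deviceCgroup describes one device cgroup rule. Access is optional.
// The "allow" key is an assumption for this sketch.
type deviceCgroup struct {
	Allow  bool    `json:"allow"`
	Access *string `json:"access,omitempty"`
}

type resources struct {
	Devices []deviceCgroup `json:"devices,omitempty"`
}

type linux struct {
	Devices   []device   `json:"devices,omitempty"`
	Resources *resources `json:"resources,omitempty"`
}

// wantsCgroupChanges reports whether the config has a linux.resources entry,
// without inspecting linux.devices at all.
func wantsCgroupChanges(raw string) (bool, error) {
	var cfg struct {
		Linux linux `json:"linux"`
	}
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return false, err
	}
	return cfg.Linux.Resources != nil, nil
}

func main() {
	mknodOnly := `{"linux": {"devices": [{"path": "/dev/fifo0", "type": "p"}]}}`
	withCgroup := `{"linux": {"resources": {"devices": [{"allow": false}]}}}`
	for _, raw := range []string{mknodOnly, withCgroup} {
		ok, err := wantsCgroupChanges(raw)
		fmt.Println(ok, err)
	}
}
```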
W. Trevor King cb2da5430a config: Single, unified config file
Reverting 7232e4b1 (specs: introduce the concept of a runtime.json,
2015-07-30, #88) after discussion on the mailing list [1].  The main
reason is that it's hard to draw a clear line around "inherently
runtime-specific" or "non-portable", so we shouldn't try to do that in
the spec.  Folks who want to flag settings as non-portable for their
own system (e.g. "we will clobber 'hooks' in bundles we run") are
welcome to do so, but we don't have to split the config into multiple
files to do that.

There have been a number of additional changes since #88, so this
isn't a pure Git reversion.  Besides copy-pasting and the associated
link-target updates, I've:

* Restored path -> destination, now that the mount type contains both
  source and target paths again.  I'd prefer 'target' to 'destination'
  to match mount(2), but the pre-7232e4b1 phrasing was 'destination'
  (possibly due to Windows using 'target' for the source?).

* Restored the Windows mount example to its pre-7232e4b1 content.

* Removed required mounts from the config example (requirements landed
  in 3848a238, config-linux: specify the default devices/filesystems
  available, 2015-09-09, #164), because specifying those mounts in the
  config is now redundant.

* Used headers (vs. bold paragraphs) to set off mount examples so we
  get link anchors in the rendered Markdown.

* Replaced references to runtime.json with references to config.json.

[1]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/0QbyJDM9fWY
     Subject: Single, unified config file (i.e. rolling back specs#88)
     Date: Wed, 4 Nov 2015 09:53:20 -0800
     Message-ID: <20151104175320.GC24652@odin.tremily.us>

Signed-off-by: W. Trevor King <wking@tremily.us>
2016-01-27 09:51:54 -08:00
Gao feng 053f05933b move the description of user ns mapping to proper file
They should stay in runtime, not config.

Signed-off-by: Gao feng <omarapazanadi@gmail.com>
2016-01-05 14:19:45 +08:00
Vincent Batts 70372d3880 *.md: update TOC and links
Some of the docs were not even linked to, and did not have a logical
outline for their grouping.

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-09-25 11:47:16 -04:00
Vincent Batts 712a7467d1 Merge remote-tracking branch 'origin/pr/163' 2015-09-10 10:07:40 -04:00
Vincent Batts 9a8748cad4 Merge pull request #160 from mrunalp/cap_fix
Modify the capabilities constants to match header files like other constants
2015-09-09 18:59:48 -04:00
Mrunal Patel 663be9d677 Modify the capabilities constants to match header files like other constants
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-09 12:43:17 -04:00
Brandon Philips 3848a23819 config-linux: specify the default devices/filesystems available
Fixes #95

Signed-off-by: Brandon Philips <brandon.philips@coreos.com>
2015-09-09 09:36:59 -07:00
Lai Jiangshan 339e038400 Deduplicate the field of RootfsPropagation
There are two RootfsPropagation fields: Linux.RootfsPropagation and
LinuxRuntime.RootfsPropagation. They are duplicates, so one of them
should be removed.

RootfsPropagation is definitely runtime-specific configuration, so we
remove Linux.RootfsPropagation.

Its description is moved from config-linux.md to
runtime-config-linux.md.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
2015-09-09 23:27:37 +08:00
Vincent Batts 6cab2747d9 *.md: markdown formatting
Closes https://github.com/opencontainers/specs/issues/83

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-09-09 10:17:06 -04:00
Brandon Philips 7232e4b137 specs: introduce the concept of a runtime.json
Based on our discussion in-person yesterday it seems necessary to
separate the concept of runtime configuration from application
configuration. There are a few motivators:

- To support runtime updates of things like cgroups, rlimits, etc we
  should separate things that are inherently runtime specific from
  things that are static to the application running in the container.

- To support the goal of being able to move a bundle between hosts we
  should make it clear what parts of the spec are and are not portable
  between hosts so that upon landing on a new host the non-portable
  options may be rewritten or removed.

- In order to attach a cryptographic identity to a bundle we must not
  include details in the bundle that are host specific.
2015-08-26 09:44:09 -07:00
Tiesheng 45ae53d4db Fix typos in the "Namespace types" section
Signed-off-by: ChengTiesheng <chengtiesheng@huawei.com>
2015-08-20 11:08:40 +08:00
Mrunal Patel af36d746ba Add Apparmor, Selinux and Seccomp sections
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-08-07 14:19:10 -04:00
Alexander Morozov 5273b3d785 Replace Linux.Device with more specific config
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-06 10:26:29 -07:00
Michael Crosby 55912bd676 Merge pull request #79 from laijs/json-notation-in-md
specs: add json notation
2015-07-27 09:06:50 -07:00
Lai Jiangshan d485f77fbd specs: fix the description for the [ug]idMappings
The fields in the [ug]idMappings have changed, so we should fix
the description correspondingly.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
2015-07-26 16:30:59 +08:00
Lai Jiangshan 2e186c62c3 specs: add json notation
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
2015-07-26 16:27:20 +08:00
Huamin Chen c53bf87ac2 make rootfs mount propagation mode settable
Signed-off-by: Huamin Chen <hchen@redhat.com>
2015-07-16 08:50:11 -04:00
W. Trevor King 0887300359 spec_linux.go: Rename IDMapping fields to follow syscall.SysProcIDMap
'From' and 'To' are potentially ambiguous for a one-to-one map like
this, and there's already an established name convention in
SysProcIDMap [1].  This commit removes the mental overhead of two
separate naming schemes for the same information.  I'd like to drop
IDMapping entirely in favor of SysProcIDMap, but SysProcIDMap doesn't
give the JSON hints we need for (de)serializing.

[1]: https://golang.org/pkg/syscall/#SysProcIDMap
2015-07-08 10:48:51 -07:00
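The point of the entry above (0887300359) is a struct that reuses the field names of syscall.SysProcIDMap (ContainerID, HostID, Size) while adding the JSON hints that type lacks. A minimal sketch follows. The JSON tag spellings, the uint32 field types, and the `parseMappings` helper are assumptions, not taken from spec_linux.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// IDMapping mirrors the field names of syscall.SysProcIDMap and adds JSON tags.
// The tag spellings and field types are assumptions for this sketch.
type IDMapping struct {
	ContainerID uint32 `json:"containerID"`
	HostID      uint32 `json:"hostID"`
	Size        uint32 `json:"size"`
}

// parseMappings decodes a JSON array of mappings.
func parseMappings(raw string) ([]IDMapping, error) {
	var m []IDMapping
	err := json.Unmarshal([]byte(raw), &m)
	return m, err
}

func main() {
	// Map container IDs 0..9 onto host IDs 1000..1009 (illustrative values).
	m, err := parseMappings(`[{"containerID": 0, "hostID": 1000, "size": 10}]`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", m)
}
```

Unlike 'From' and 'To', these names state which side of the one-to-one map each number belongs to.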
Michael Crosby e8990d65d1 Merge pull request #50 from mrunalp/userns_section
Adds a section for user namespace mappings
2015-07-08 09:28:18 -07:00
Jonathan Boulle 625798536e config: minor cleanup
- link to official SemVer page
- link between config.md and config-linux.md and explain relationship
- fix typo (arch -> os)
- tweak formatting of some special characters
2015-07-06 17:37:01 -07:00
Mrunal Patel d8237f1899 Adds a section for user namespace mappings
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-07-06 16:05:05 -04:00
Jonathan Boulle 1937c009ea *: small spelling fixes 2015-07-01 10:20:43 -07:00
lizf-os a402b7ae4e Fix typos in the rlimits section
Signed-off-by: Zefan Li <lizefan@huawei.com>
2015-07-01 10:25:46 +08:00
Brandon Philips aa7e14306b Merge pull request #35 from mrunalp/rlimits
Adds section for Linux Rlimits
2015-06-30 16:04:05 -07:00
Mrunal Patel a4df2e4ad5 Adds link to kernel cgroups documentation
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-06-30 18:45:10 -04:00
Mrunal Patel 7f9d7d30bd Adds section for Linux Rlimits
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-06-30 18:35:38 -04:00
Michael Crosby 92b590a760 Add linux spec description
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-30 15:19:06 -07:00
Michael Crosby f2569d17b4 Update config-linux for better formatting on values
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-30 15:13:30 -07:00
Michael Crosby 377213e01c Merge pull request #29 from mrunalp/linux-sysctl
Adds section for Linux Sysctl.
2015-06-30 14:32:24 -07:00
Mrunal Patel 328aba4468 Adds section for Linux Sysctl.
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-06-30 15:03:16 -04:00
Mrunal Patel 144e9719f5 Makes namespaces description linux specific
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-06-30 15:01:09 -04:00
Michael Crosby 9eb09f9593 Move linux specific options to linux spec
This moves some of the linux specific options like namespaces and
devices to the linux config document.  It also removes processes as an
array and replaces it with a single process.

It adds the "platform" struct for OS and Arch and updates many of the
examples to match the changes.  I also remove some of the redundant
windows examples on the portable spec document because they did not add
any extra value and many values were the same.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-29 14:15:33 -07:00
Mrunal Patel d5c2670df6 Adds user namespace to the list of namespaces
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-06-24 21:14:35 -07:00
Brandon Philips 5d2eb180f6 *: re-org the spec
We had an in-person spec discussion; let's separate the spec into some
high-level sections to clarify future discussion.

Crosby agreed to let me merge to master :)
2015-06-24 17:15:48 -07:00