125 Commits

Author SHA1 Message Date
W. Trevor King 10ab597ee5 config-linux: Remove explicit 'null' from device cgroup values
Catch the Markdown spec up with the JSON Schema change in 09274372
(schema: Drop pointers and nulls, 2017-01-18, #662).  The Markdown is
canonical, so we could restore the explicit-null handling to the JSON
Schema instead, but the maintainers feel (and I agree) that there's no
point in explicitly allowing a null value when callers can simply
leave the property unset [1].

[1]: https://github.com/opencontainers/runtime-spec/pull/555#issuecomment-272020515

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-11 01:29:56 -07:00
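As a rough illustration of the point in 10ab597ee5 that callers can leave a property unset instead of passing an explicit null, the sketch below drops null-valued keys from a device cgroup entry before serializing it. The helper name `drop_nulls` and the sample entry are hypothetical and are not taken from the spec.

```python
import json


def drop_nulls(entry):
    """Return a copy of entry without keys whose value is None (JSON null)."""
    return {key: value for key, value in entry.items() if value is not None}


# Hypothetical device cgroup entry with explicit nulls for major and minor.
entry = {"allow": False, "type": "c", "major": None, "minor": None, "access": "rwm"}
print(json.dumps(drop_nulls(entry), sort_keys=True))
```

Note that `False` is kept: only `None` is treated as unset.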
Michael Crosby 27064b8336 Merge pull request #767 from wking/rfc2119-namespaces
config-linux: RFC 2119 tightening for namespaces
2017-05-10 14:13:22 -07:00
Mrunal Patel cde4b6624f Merge pull request #799 from wking/inline-internal-links
*: Use inline links for remaining internal references
2017-05-10 13:58:40 -07:00
W. Trevor King fae94dbab0 config-linux: Remove redundant MUST for minimum cgroup controllers
Any runtime which violated that constraint would necessarily violate
some more specific constraint on a 'resources' setting.

This also removes a non-spec-requirement "required" to avoid any
confusion with the spec-requirement REQUIRED [1].

[1]: https://github.com/opencontainers/runtime-spec/pull/729#issue-214550260

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-10 13:41:54 -07:00
Tianon Gravi cd92a0e385 Merge pull request #713 from Mashimiao/config-linux-fix-network-interface
config-linux: make interface name clear
2017-05-10 13:12:25 -07:00
W. Trevor King 65cb135df8 *: Use inline links for remaining internal references
Since f9dc90b0 (make link usage consistent across the specification,
2017-02-09, #687), the official style is to only use reference-style
links for external links.  I expect the remaining three entries just
slipped through.  This commit adjusts everything found with:

  $ git grep ']: [a-z]' | grep -v http

It also fixes the underscore -> hyphen in the
glossary.md#container-namespace target and updates the capabilities
location to catch up with 5a8a779f (Move process specific settings to
process, 2016-03-02, #329).

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-10 11:26:14 -07:00
W. Trevor King 4b49c64a88 config: Shift oomScoreAdj from linux.resources to process
The only discussion related to this is in [1,2], where the
relationship between oomScoreAdj and disableOOMKiller is raised. But
since 429f936 (Adding cgroups path to the Spec, 2015-09-02, #137)
resources has been tied to cgroups, and oomScoreAdj is not about
cgroups.  For example, we currently have (in config-linux.md):

  You can configure a container's cgroups via the resources field of
  the Linux configuration.

I suggested we move the property from linux.resources.oomScoreAdj to
linux.oomScoreAdj so config authors and runtimes don't have to worry
about what cgroupsPath means if the only entry in resources is
oomScoreAdj.  Michael responded with [4]:

  If anything it should probably go on the process

So that's what this commit does.

I've gone with the four-space indents here to keep Pandoc happy (see
7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, #495)),
but have left the existing entries in this list unchanged to reduce
churn.

[1]: https://github.com/opencontainers/runtime-spec/pull/236
[2]: https://github.com/opencontainers/runtime-spec/pull/292
[3]: https://github.com/opencontainers/runtime-spec/pull/137
[4]: https://github.com/opencontainers/runtime-spec/issues/782#issuecomment-299990075

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-09 16:46:30 -07:00
W. Trevor King b644395e96 config-linux: RFC 2119 tightening for namespaces
Previously we had no MUST-level runtime requirements for namespace
entries in valid configs.  This commit attempts to pin those down.

I think we want more wording about new namespace creation (what
namespace is the seed/parent?  Which user namespace owns a runtime
namespace?  For more background on hierarchical namespaces, see [1].
For more background on the owning user namespace idea, see [2,3,4]),
but that wording proved contentious [5,6], so I punted it to [7].

The "'path' not associated with a namespace of type 'type'" condition
ensures that runtimes don't blindly call setns(2) on the path without
setting nstype nonzero.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788
     nsfs: add ioctl to get a parent namespace, 2016-09-06
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b
     nsfs: add ioctl to get owning user namespace for ns file
     descriptor, 2016-09-06
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36
     nsfs: Add an ioctl() to return the namespace type, 2017-01-25
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c
     nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25
[5]: https://github.com/opencontainers/runtime-spec/pull/767#discussion_r115591844
[6]: https://github.com/opencontainers/runtime-spec/pull/767#discussion_r115592437
[7]: https://github.com/opencontainers/runtime-spec/pull/795

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-09 15:16:17 -07:00
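The setns(2) remark in b644395e96 can be sketched as follows: a runtime passes a nonzero nstype derived from the configured 'type', so the kernel itself rejects a 'path' that refers to a different kind of namespace. This is a minimal sketch, not how any particular runtime does it. The names `NSTYPE_FLAGS`, `nstype_for`, and `join_namespace` are hypothetical, the flag values are the CLONE_* constants from linux/sched.h, and actually joining a namespace needs Linux and suitable privileges.

```python
import ctypes
import os

# CLONE_* flag values from <linux/sched.h>, keyed by namespace type name.
NSTYPE_FLAGS = {
    "mount": 0x00020000,    # CLONE_NEWNS
    "cgroup": 0x02000000,   # CLONE_NEWCGROUP
    "uts": 0x04000000,      # CLONE_NEWUTS
    "ipc": 0x08000000,      # CLONE_NEWIPC
    "user": 0x10000000,     # CLONE_NEWUSER
    "pid": 0x20000000,      # CLONE_NEWPID
    "network": 0x40000000,  # CLONE_NEWNET
}


def nstype_for(ns_type):
    """Map a namespace type name to a nonzero nstype for setns(2)."""
    try:
        return NSTYPE_FLAGS[ns_type]
    except KeyError:
        raise ValueError(f"unknown namespace type: {ns_type!r}") from None


def join_namespace(path, ns_type):
    """Join the namespace at path, failing if it is not of type ns_type."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = os.open(path, os.O_RDONLY)
    try:
        # With a nonzero nstype, a type mismatch fails with EINVAL.
        if libc.setns(fd, nstype_for(ns_type)) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), path)
    finally:
        os.close(fd)
```

Passing 0 as nstype would let the call succeed for any namespace type, which is the blind call the commit message warns about.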
Mrunal Patel 01ec62d3e4 Merge pull request #781 from wking/oomScoreAdj-rfc-2119
config-linux: RFC 2119 wording for oomScoreAdj
2017-05-09 13:13:45 -07:00
W. Trevor King b11ade4616 config-linux: RFC 2119 wording for intelRdt
So we can compliance-test runtimes for these settings.

Also remove the tutorial, since the kernel docs should provide
sufficient documentation on that front.  The kernel can be patched if
they do not, and we do not include tutorials for other config-linux
settings in this spec.

The updated example was recommended by Xiaochen to compensate for the
removed inline tutorial [1].

[1]: https://github.com/opencontainers/runtime-spec/pull/787#discussion_r114254422

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-09 09:29:49 -07:00
W. Trevor King e9a39e76f4 config-linux: RFC 2119 wording for oomScoreAdj
The previous wording hinted at, but did not require, this setting to
be implemented via oom_score_adj.  With the new wording, when proc is
mounted at /proc, the container process can check this value by
looking at /proc/self/oom_score_adj.

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-05-09 09:28:25 -07:00
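The check described in e9a39e76f4 can be sketched from inside the container process, assuming proc is mounted at /proc. The function names and the `path` parameter are hypothetical; the parameter exists only so the reader can point the sketch at a different file when experimenting.

```python
def read_oom_score_adj(path="/proc/self/oom_score_adj"):
    """Read the calling process's OOM score adjustment as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


def matches_config(expected, path="/proc/self/oom_score_adj"):
    """Return True if the kernel-reported value equals the configured one."""
    return read_oom_score_adj(path) == expected
```

A compliance test could call `matches_config` with the oomScoreAdj value from the config it launched the container with.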
Tianon Gravi 138ad89ca8 Merge pull request #768 from wking/optional-syscalls
config-linux: Make linux.seccomp.syscalls OPTIONAL
2017-04-26 08:29:52 -07:00
v1.0.0.batts c6bff91450 Merge pull request #769 from wking/require-syscall-names
config-linux: Require at least one entry in linux.seccomp.syscalls[].names
2017-04-26 11:26:05 -04:00
v1.0.0.batts 482fe6bf1c Merge pull request #773 from q384566678/device-up
config-linux.md: Update the link to the devices
2017-04-26 11:16:05 -04:00
Qiang Huang ce55de2517 Remove range limit which depends on kernel
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-04-26 22:46:02 +08:00
W. Trevor King 42984e8d3c config-linux: Make linux.seccomp.syscalls OPTIONAL
Before this commit, linux.seccomp.syscalls was required, but we didn't
require an entry in the array.  That means '"syscalls": []' would be
technically valid, and I'm pretty sure that's not what we want.

If it makes sense to have a seccomp property that does not need
syscalls entries, then syscalls should be optional (which is what this
commit is doing).

If it does not make sense to have an empty/unset syscalls then it
should be required and have a minimum length of one.

Before 652323c (improve seccomp format to be more expressive,
2017-01-13, #657), syscalls was omitempty (and therefore more
optional-feeling, although there was no real Markdown spec for seccomp
before 3ca5c6c, config-linux.md: fix seccomp, 2017-03-02, #706, so
it's hard to know).  This commit has gone with OPTIONAL, because a
seccomp config which only sets defaultAction seems potentially valid.

The SCMP_ACT_KILL example is prompted by:

On Tue, Apr 25, 2017 at 01:32:26PM -0700, David Lyle wrote [1]:
> Technically, OPTIONAL is the right value, but unless you specify the
> default action for seccomp to be SCMP_ACT_ALLOW the result will be
> an error at run time.
>
> I would suggest an additional clarification to this fact in
> config-linux.md would be very helpful if marking syscall as
> OPTIONAL.

I've phrased the example more conservatively, because I'm not sure
that SCMP_ACT_ALLOW is the only possible value to avoid an error.  For
example, perhaps a SCMP_ACT_TRACE default with an empty syscalls array
would not die on the first syscall.  The point of the example is to
remind config authors that without a useful syscalls array, the
default value is very important ;).

Also add the previously-missing 'required' property to the seccomp
JSON Schema entry.

[1]: https://github.com/opencontainers/runtime-spec/pull/768#issuecomment-297156102

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-04-25 15:06:57 -07:00
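The reminder in 42984e8d3c, that the default action matters a great deal once syscalls is empty or unset, could be turned into a validation warning along these lines. This is a sketch under assumptions: the function name `seccomp_warnings` is hypothetical, and it flags only the SCMP_ACT_KILL case mentioned in the commit message, since the message itself is unsure which other defaults are safe.

```python
def seccomp_warnings(seccomp):
    """Return warnings for a linux.seccomp object (a dict parsed from JSON)."""
    warnings = []
    # Treat a missing, null, or empty syscalls array the same way.
    syscalls = seccomp.get("syscalls") or []
    if not syscalls and seccomp.get("defaultAction") == "SCMP_ACT_KILL":
        warnings.append(
            "defaultAction is SCMP_ACT_KILL and syscalls is empty or unset; "
            "the container process will likely be killed on its first syscall"
        )
    return warnings
```

The config stays valid either way; the sketch only reports something a config author probably wants to know.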
Michael Crosby f2276206b3 Merge pull request #770 from q384566678/rootfsPropagation-test
config-linux.md: Increase the valid value of rootfsPropagation
2017-04-25 11:18:19 -07:00
W. Trevor King 4c33c9e041 config-linux: Fix 'file' POSIX link
This was broken by f9dc90b0 (make link usage consistent across the
specification, 2017-02-09, #687), which updated the link label, but
not this link.  Now that the link label matches the link text, we can
use the implicit link name shortcut [1].

[1]: https://daringfireball.net/projects/markdown/syntax#link

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-04-21 09:40:45 -07:00
zhouhao 9d5ff350b4 config-linux.md: Update the link to the devices
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-04-20 13:38:05 +08:00
zhouhao e3d8d10e05 config-linux.md: Increase the valid value of rootfsPropagation
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-04-13 09:33:07 +08:00
W. Trevor King 5c62f9b839 config-linux: Require at least one entry in linux.seccomp.syscalls[].names
I expect the (undocumented) intention here is to iterate through
'names' and call seccomp_rule_add(3) or similar for each name.  In
that case, an empty 'names' makes the whole syscall entry a no-op, and
with this commit we can warn users who are validating such configs.

If, on the other hand, we were comfortable with no-op syscall entries,
we'd want to make 'names' OPTIONAL.

Warning folks who accidentally empty (or don't set) 'names' seems more
useful to me, and doesn't restrict the useful config space, so that's
what I've gone with in this commit.

minItems is documented in [1], and there is an example of its use in
[2]:

  "options": {
    "type": "array",
    "minItems": 1,
    "items": { "type": "string" },
    "uniqueItems": true
  },

[1]: https://tools.ietf.org/html/draft-wright-json-schema-validation-00#section-5.11
[2]: http://json-schema.org/example2.html

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-04-12 10:17:13 -07:00
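Without a JSON Schema validator at hand, the constraint from 5c62f9b839 can be approximated by hand. The sketch below reports the syscalls entries whose 'names' is missing or empty, which is what `minItems: 1` on a required property would catch. The function name `empty_names_entries` is hypothetical.

```python
def empty_names_entries(seccomp):
    """Return indexes of syscalls entries whose 'names' is missing or empty."""
    return [
        index
        for index, entry in enumerate(seccomp.get("syscalls", []))
        if not entry.get("names")
    ]
```

A validating tool could warn for each returned index, since such an entry would add no rules.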
Michael Crosby 18f4f18955 Merge pull request #751 from hqhq/use_MUST_for_weight
Use MUST and MAY for weight and leafWeight
2017-04-03 14:18:18 -07:00
Qiang Huang 018c5f20b0 Use MUST and MAY for weight and leafWeight
Carry: #728

Signed-off-by: Rob Dolin <robdolin@microsoft.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-30 13:01:38 +08:00
W. Trevor King ff207496ab *: Replace "array" type with "array of objects"
We have a few different element types in our arrays, so it's useful to
clarify the element type for the property being specified.  Before
this commit:

  $ sed -n 's|.*\*\*`\([^`]*\)`\*\*[^(]*(\([^,]*\),.*|\2|p' *.md | sort | uniq -c | grep array
        7 array
        1 array of ints
        8 array of objects
       13 array of strings

All of the bare 'array' instances turned out to be arrays of objects.

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-03-29 11:17:32 -07:00
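For readers who find the sed expression in ff207496ab hard to parse, the same count can be sketched in Python. The regular expression mirrors the sed pattern (a bolded, backticked property name followed by a parenthesized type and a comma). The sample lines are made up for illustration and are not copied from the spec.

```python
import re
from collections import Counter

# Same shape as the sed expression: **`name`** ... (type, ...
PROPERTY = re.compile(r".*\*\*`([^`]*)`\*\*[^(]*\(([^,]*),.*")


def count_array_types(lines):
    """Count property types containing 'array', keyed by the type text."""
    counts = Counter()
    for line in lines:
        match = PROPERTY.match(line)
        if match and "array" in match.group(2):
            counts[match.group(2)] += 1
    return counts


# Hypothetical Markdown property lines in the spec's general style.
sample = [
    "* **`devices`** *(array, OPTIONAL)* - lists devices.",
    "* **`args`** *(array of strings, REQUIRED)* - arguments.",
    "* **`path`** *(string, REQUIRED)* - a path.",
]
print(count_array_types(sample))
```

The `"array" in` test plays the role of the trailing `grep array`, and the `Counter` replaces `sort | uniq -c`.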
Mrunal Patel 71366eecb5 Merge pull request #741 from q384566678/fix-info
config-linux.md: fix info
2017-03-28 16:30:59 -07:00
Michael Crosby 3adac26772 Merge pull request #706 from q384566678/fix-seecomp
config-linux.md: fix seccomp
2017-03-27 10:24:44 -07:00
zhouhao 8c12f6038c config-linux.md: fix info
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-03-24 14:02:02 +08:00
zhouhao 3ca5c6c58e config-linux.md: fix seccomp
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-03-20 13:32:30 +08:00
Vincent Batts 55e1a84c1f Merge pull request #720 from Mashimiao/config-linux-fix-namespace-path
config-linux.md: clearly require absolute path for namespace
2017-03-10 18:06:17 -05:00
Mrunal Patel 76159da8ca Merge pull request #630 from xiaochenshen/rdt-cat-resctrl-cgroup-v1
specs-go/config: add Intel RDT/CAT Linux support
2017-03-10 09:41:16 -08:00
Xiaochen Shen 73a6002bf3 specs-go/config: add Intel RDT/CAT Linux support
Add support for Intel Resource Director Technology (RDT) / Cache Allocation
Technology (CAT). Add L3 cache resource constraints in Linux-specific
configuration.

This is the prerequisite of this runc proposal:
https://github.com/opencontainers/runc/issues/433

For more information about Intel RDT/CAT, please refer to:
https://github.com/opencontainers/runc/issues/433

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2017-03-10 17:29:08 +08:00
Ma Shimiao 72cbff6786 config-linux.md: clearly require absolute path for namespace
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-03-10 12:00:16 +08:00
zhouhao 90427c9345 remove comment
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-03-10 09:19:28 +08:00
Ma Shimiao 9f5ed02bc4 config-linux: make interface name clear
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-03-07 14:04:19 +08:00
Jesse Butler f9dc90b05a make link usage consistent across the specification
Signed-off-by: Jesse Butler <jesse.butler@oracle.com>
2017-03-03 14:43:09 -05:00
Mrunal Patel f47e43c643 Merge pull request #705 from q384566678/test-seecomp
Add new architectures from libseccomp 2.3.2
2017-03-03 11:36:27 -08:00
Mrunal Patel d01ef9a806 Add anchors to config and config linux
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-02 11:00:31 -08:00
zhouhao 513ab686e9 Add new architectures from libseccomp 2.3.2
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-03-02 14:33:06 +08:00
Qiang Huang ec9449187b Set specs value the same as kernel API input
This partially reverts #648. After a second thought, I think we
should make the spec values match the kernel API input; see:
https://github.com/opencontainers/runtime-spec/issues/692#issuecomment-281889852

For memory and hugetlb limits *.limit_in_bytes, cgroup APIs take the values
as string, but the parsed values are unsigned long, see:
https://github.com/torvalds/linux/blob/v4.10/mm/page_counter.c#L175-L193

For `cpu.cfs_quota_us` and `cpu.rt_runtime_us`, cgroup APIs take the input
value as signed long long, while `cpu.cfs_period_us` and `cpu.rt_period_us`
take the input value as unsigned long long.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-03-01 09:10:43 +08:00
zhouhao 5a470213e7 config-linux.md: fix info
Signed-off-by: zhouhao <zhouhao@cn.fujitsu.com>
2017-02-27 16:07:52 +08:00
Mrunal Patel ae7a541930 Merge pull request #657 from GrantSeltzer/improve-seccomp-spec
config: Improve seccomp format to be more expressive
2017-02-24 18:59:49 -08:00
grantseltzer 652323cd77 improve seccomp format to be more expressive
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
2017-02-22 18:17:16 -05:00
Qiang Huang a5c4e91dae Remove uid/gid mapping limit that depends on kernel
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-02-22 14:43:18 -08:00
Daniel Dao 279c3c095c linux: relax filesystem requirements for container
change MUST to SHOULD so containers are not required to have all these
filesystems mounted.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2017-01-23 12:44:36 +00:00
Rob Dolin (MSFT) 646826658d [Config Linux] Clarify: App --> Container
Replaces #577

Signed-off-by: Rob Dolin (MSFT) <robdolin@microsoft.com>
2017-01-18 10:29:13 -08:00
Mrunal Patel c0206be451 Merge pull request #647 from Mashimiao/config-linux-fix-device-path
config-linux: Add restriction for duplicated device path
2017-01-12 09:57:11 -08:00
Ma Shimiao 1fc1464dbc config-linux: Add restriction for duplicated device path
I think the runtime should generate an error if devices contains
duplicated device paths, because we don't know which one is really
needed.

Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-01-12 14:24:52 +08:00
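The restriction proposed in 1fc1464dbc could be checked along these lines. This is a minimal sketch: the function names are hypothetical, and the input is assumed to be the linux.devices array parsed from config.json, with each entry carrying a 'path'.

```python
from collections import Counter


def duplicated_device_paths(devices):
    """Return sorted device paths that appear more than once."""
    counts = Counter(device["path"] for device in devices)
    return sorted(path for path, count in counts.items() if count > 1)


def check_devices(devices):
    """Raise ValueError when any device path is duplicated."""
    duplicates = duplicated_device_paths(devices)
    if duplicates:
        raise ValueError(f"duplicated device paths: {', '.join(duplicates)}")
```

Raising an error rather than picking one entry matches the commit's reasoning that there is no way to know which entry the config author meant.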
W. Trevor King d43fc428aa config-linux: Lift no-tweaking namespace restriction
This restriction originally landed via 02b456e9 (Clarify behavior
around namespaces paths, 2015-09-08, #158).  The hostname case landed
via 66a0543e (config: Require a new UTS namespace for config.json's
hostname, 2015-10-05, #214) citing the namespace restriction.  The
restriction extended to runtime namespaces in 01c2d55f (config-linux:
Extend no-tweak requirement to runtime namespaces, 2016-08-24, #538).
There was a proposal in-flight to get config-wide consistency around
the no-tweaking concept [1].

In today's meeting, the maintainer consensus was to strike the
no-tweaking restriction [2], which is what I've done here.  I've
removed the ROADMAP entry because this gives folks a way to adjust
existing containers (launch a new container which joins and tweaks the
original).

The hostname entry still mentions the UTS namespace to provide a guard
against accidental foot-gunning.  There was no no-tweaking language
for properties related to other namespaces (e.g. 'mounts').
Maybe the other namespaces have more obvious names.

[1]: https://github.com/opencontainers/runtime-spec/pull/540
[2]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2017/opencontainers.2017-01-11-22.04.log.html#l-117

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-01-11 15:16:54 -08:00
Qiang Huang 082e93a2bd Allow negative value for some resource fields
Carry #499

For these values, the cgroup kernel APIs accept -1 to set them as
unlimited. Since docker and runc both support updating resources,
we should not impose this limitation in the spec.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2017-01-05 19:03:57 +08:00
Doug Davis e7be40f0c3 Cleanup the spec a bit to remove WG/git text that's not really part of the spec
renamed an href to "container-namespace2" to avoid a dup-warning msg from
the PDF generator

Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-11-16 09:50:03 -08:00