jasder/runc - runc - 军科 Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Alban Crequy	3321aa1af7	Fix regression with mounts with non-absolute source path PR #1753 introduced a test on the mount flags but the binary operator was wrong, see https://github.com/opencontainers/runc/pull/1753#discussion_r203445652 This was noticed when investigating https://github.com/opencontainers/runtime-tools/issues/651 Symptoms: in the container, /proc/self/mountinfo displays some mounts as follows: 296 279 0:67 / /tmp rw,nosuid - tmpfs /home/dpark/go/src/github.com/opencontainers/runc/tmpfs rw,size=65536k,mode=755 Signed-off-by: Alban Crequy <alban@kinvolk.io>	2018-07-18 18:30:49 +02:00
Qiang Huang	dd67ab10d7	Merge pull request #1759 from cyphar/rootless-erofs-as-eperm rootless: cgroup: treat EROFS as a skippable error	2018-05-25 09:24:16 +08:00
dlorenc	40680b2d37	Make the setupSeccomp function public. This function is useful for converting from the OCI spec format to the one used by runC/libcontainer. Signed-off-by: dlorenc <lorenc.d@gmail.com>	2018-04-17 10:47:22 -07:00
Michael Crosby	d56f6cc202	Merge pull request #1753 from wking/do-not-require-bind-mount-type libcontainer/specconv/spec_linux: Support empty 'type' for bind mounts	2018-04-16 11:01:53 -04:00
Michael Crosby	9f0eca2a94	Merge pull request #1777 from nalind/no-config-for-extant-netns Only configure networking when creating a net ns	2018-04-12 10:55:02 -04:00
Nalin Dahyabhai	4521d4b19c	Only configure networking when creating a net ns When joining an existing namespace, don't default to configuring a loopback interface in that namespace. Its creator should have done that, and we don't want to fail to create the container when we don't have sufficient privileges to configure the network namespace. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>	2018-04-11 13:28:19 -04:00
Aleksa Sarai	fd3a6e6c83	libcontainer: handle unset oomScoreAdj correctly Previously if oomScoreAdj was not set in config.json we would implicitly set oom_score_adj to 0. This is not allowed according to the spec: > If oomScoreAdj is not set, the runtime MUST NOT change the value of > oom_score_adj. Change this so that we do not modify oom_score_adj if oomScoreAdj is not present in the configuration. While this modifies our internal configuration types, the on-disk format is still compatible. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-03-17 13:53:42 +11:00
W. Trevor King	0aa6e4e5d3	libcontainer/specconv/spec_linux: Support empty 'type' for bind mounts From the "Creating a bind mount" section of mount(2) [1]: > If mountflags includes MS_BIND (available since Linux 2.4), then > perform a bind mount... > > The filesystemtype and data arguments are ignored. This commit adds support for configurations that leave the OPTIONAL type [2] unset for bind mounts. There's a related spec-example change in flight with [3], although my personal preference would be a more explicit spec for the whole mount structure [4]. [1]: http://man7.org/linux/man-pages/man2/mount.2.html [2]: https://github.com/opencontainers/runtime-spec/blame/v1.0.1/config.md#L102 [3]: https://github.com/opencontainers/runtime-spec/pull/954 [4]: https://github.com/opencontainers/runtime-spec/pull/771 Signed-off-by: W. Trevor King <wking@tremily.us>	2018-03-07 10:23:42 -08:00
Aleksa Sarai	757e78bebd	merge branch 'pr-1743' The setupUserNamespace function is always called. LGTMs: @crosbymichael @mrunalp @cyphar Closes #1743	2018-02-27 12:22:52 +11:00
ynirk	2420eb1f4d	The setupUserNamespace function is always called. The function is called even if the usernamespace is not set. This results in the wrong uid/gid being set on devices. This fix adds a test to check if usernamespace is set before calling setupUserNamespace. Fixes #1742 Signed-off-by: Julien Lavesque <julien.lavesque@gmail.com>	2018-02-26 14:27:11 +01:00
Allen Sun	3f32e72963	fix lint error in specconv Signed-off-by: Allen Sun <allensun.shl@alibaba-inc.com>	2018-02-26 15:39:54 +08:00
Mrunal Patel	c6e4a1ebeb	Merge pull request #1665 from Mashimiao/gidmapping-valid-fix specconv: avoid skipping gidmappings applied when uidmappings is empty	2017-12-11 09:50:54 -08:00
Ma Shimiao	57edfbbaf2	specconv: avoid skipping gidmappings applied when uidmappings is empty Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>	2017-11-30 16:24:36 +08:00
Ma Shimiao	17db6560be	support unbindable,runbindable for rootfs propagation Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>	2017-11-17 16:14:15 +08:00
Akihiro Suda	0aac2368e4	specconv.Example(): add /proc/scsi to masked paths Port over https://github.com/moby/moby/pull/35399 Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2017-11-04 17:38:14 +00:00
Lorenzo Fontana	780f8ef567	Specconv: Test create command hooks and seccomp setup Signed-off-by: Lorenzo Fontana <lo@linux.com>	2017-10-28 21:46:46 +02:00
Lorenzo Fontana	c0e6e12f9d	Test Cgroup creation and memory allocations Signed-off-by: Lorenzo Fontana <lo@linux.com>	2017-10-25 01:58:10 +02:00
Aleksa Sarai	d4f0f9a52b	specconv: emit an error when using MS_PRIVATE with --no-pivot Due to the semantics of chroot(2) when it comes to mount namespaces, it is not generally safe to use MS_PRIVATE as a mount propagation when using chroot(2). The reason for this is that this effectively results in a set of mount references being held by the chroot'd namespace which the namespace cannot free. pivot_root(2) does not have this issue because the @old_root can be unmounted by the process. Ultimately, --no-pivot is not really necessary anymore as a commonly used option since `f8e6b5af5e` ("rootfs: make pivot_root not use a temporary directory") resolved the read-only issue. But if someone really needs to use it, MS_PRIVATE is never a good idea. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-10-08 17:50:55 +11:00
Qiang Huang	79ad714374	Merge pull request #1598 from euank/ragent libcontainer: default mount propagation correctly	2017-09-25 11:55:29 +08:00
Euan Kemp	4301b440d6	libcontainer: default mount propagation correctly The code in prepareRoot (`e385f67a0e/libcontainer/rootfs_linux.go (L599-L605)`) attempts to default the rootfs mount to `rslave`. However, since the spec conversion has already defaulted it to `rprivate`, that code doesn't actually ever do anything. This changes the spec conversion code to accept "" and treat it as 0. Implicitly, this makes rootfs propagation default to `rslave`, which is a part of fixing the moby bug https://github.com/moby/moby/issues/34672 Alternate implementations include changing this defaulting to be `rslave` and removing the defaulting code in prepareRoot, or skipping the mapping entirely for "", but I think this change is the cleanest of those options. Signed-off-by: Euan Kemp <euan.kemp@coreos.com>	2017-09-22 13:36:23 -07:00
Mrunal Patel	13fa5d2953	Merge pull request #1588 from s7v7nislands/delete_unused Delete unused function	2017-09-08 17:34:00 -07:00
s7v7nislands	c795b8690b	Delete unused function Signed-off-by: Xiaobing Jiang <s7v7nislands@gmail.com>	2017-09-08 10:35:46 +08:00
Ma Shimiao	c3d20e7817	Fixes #1585 config.Namespaces is empty when accessed Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>	2017-09-08 09:30:07 +08:00
Xiaochen Shen	692f6e1e27	libcontainer: add support for Intel RDT/CAT in runc About Intel RDT/CAT feature: Intel platforms with new Xeon CPU support Intel Resource Director Technology (RDT). Cache Allocation Technology (CAT) is a sub-feature of RDT, which currently supports L3 cache resource allocation. This feature provides a way for the software to restrict cache allocation to a defined 'subset' of L3 cache which may be overlapping with other 'subsets'. The different subsets are identified by class of service (CLOS) and each CLOS has a capacity bitmask (CBM). More information about Intel RDT/CAT can be found in section 17.17 of the Intel Software Developer Manual. About Intel RDT/CAT kernel interface: In Linux 4.10 kernel or newer, the interface is defined and exposed via the "resource control" filesystem, which is a "cgroup-like" interface. Compared with cgroups, it has a similar process management lifecycle and similar interfaces in a container. But unlike the cgroups hierarchy, it has a single-level filesystem layout. Intel RDT "resource control" filesystem hierarchy: mount -t resctrl resctrl /sys/fs/resctrl tree /sys/fs/resctrl /sys/fs/resctrl/ |-- info | |-- L3 | |-- cbm_mask | |-- min_cbm_bits | |-- num_closids |-- cpus |-- schemata |-- tasks |-- <container_id> |-- cpus |-- schemata |-- tasks For runc, we can make use of the `tasks` and `schemata` configuration for L3 cache resource constraints. The file `tasks` has a list of tasks that belong to this group (e.g., the "<container_id>" group). Tasks can be added to a group by writing the task ID to the "tasks" file (which will automatically remove them from the previous group to which they belonged). New tasks created by fork(2) and clone(2) are added to the same group as their parent. If a pid is not in any sub group, it is in the root group. The file `schemata` has allocation bitmasks/values for L3 cache on each socket, which contains L3 cache id and capacity bitmask (CBM). Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..." For example, on a two-socket machine, L3's schema line could be `L3:0=ff;1=c0` which means L3 cache id 0's CBM is 0xff, and L3 cache id 1's CBM is 0xc0. A valid L3 cache CBM is a contiguous set of bits, and the number of bits that can be set is less than the max bit. The max bits in the CBM varies among supported Intel Xeon platforms. In the Intel RDT "resource control" filesystem layout, the CBM in a group should be a subset of the CBM in root. The kernel will check if it is valid when writing. E.g., 0xfffff in root indicates the max bits of CBM is 20 bits, which maps to the entire L3 cache capacity. Some valid CBM values to set in a group: 0xf, 0xf0, 0x3ff, 0x1f00, etc. For more information about the Intel RDT/CAT kernel interface: https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt An example for runc: Consider a two-socket machine with two L3 caches where the default CBM is 0xfffff and the max CBM length is 20 bits. With this configuration, tasks inside the container only have access to the "upper" 80% of L3 cache id 0 and the "lower" 50% of L3 cache id 1: "linux": { "intelRdt": { "l3CacheSchema": "L3:0=ffff0;1=3ff" } } Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>	2017-09-01 14:26:33 +08:00
Ma Shimiao	2333e7dc67	fix panic when Linux is nil for rootless case config.Sysctl setting is duplicated. When the container is rootless and Linux is nil, runc will panic. Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>	2017-08-16 09:11:13 +08:00
Ma Shimiao	527dc5acbb	fix panic when Linux is nil Linux is not always non-nil. If Linux is nil, a panic will occur. Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com> Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2017-08-10 15:57:49 -04:00
Michael Crosby	eb70c213ba	Update runtime-spec to rc6 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2017-07-12 16:24:04 -07:00
Michael Crosby	fef3aced0e	Merge pull request #1460 from wking/mount-option-lazytime libcontainer/specconv/spec_linux: Add support for (no)lazytime	2017-06-29 10:06:23 -07:00
Justin Cormack	e1146182a8	Remove Platform as no longer in OCI spec This was never used, just validated, so was removed from spec. Signed-off-by: Justin Cormack <justin.cormack@docker.com>	2017-06-27 12:16:07 +01:00
W. Trevor King	4f81337e95	libcontainer/specconv/spec_linux: Add support for (no)lazytime And also silent, loud, (no)iversion, and (no)acl. This is part of catching runC up with the spec, which punts valid options to mount(8) [1,2]. (no)acl is a filesystem-specific entry in mount(8), but it's represented by a MS_* flag in mount(2) so we need an entry in the translation table. [1]: https://github.com/opencontainers/runtime-spec/blame/v1.0.0-rc5/config.md#L68 [2]: https://github.com/opencontainers/runtime-spec/pull/771 Signed-off-by: W. Trevor King <wking@tremily.us>	2017-06-01 20:43:35 -07:00
Michael Crosby	854b41d81e	Update spec to `239c4e44f2` This provides updates to runc for the spec changes with *Process and OOMScoreAdj Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2017-06-01 16:29:47 -07:00
Christy Perez	3d7cb4293c	Move libcontainer to x/sys/unix Since syscall is outdated and broken for some architectures, use x/sys/unix instead. There are still some dependencies on the syscall package that will remain in syscall for the foreseeable future: Errno Signal SysProcAttr Additionally: - os still uses syscall, so it needs to be kept for anything returning *os.ProcessState, such as process.Wait. Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>	2017-05-22 17:35:20 -05:00
Aleksa Sarai	d04cbc49d2	rootless: add autogenerated rootless config from `runc spec` Since this is a runC-specific feature, this belongs here over in opencontainers/ocitools (which is for generic OCI runtimes). In addition, we don't create a new network namespace. This is because currently if you want to set up a veth bridge you need CAP_NET_ADMIN in both network namespaces' pinned user namespace to create the necessary interfaces in each network namespace. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:46:21 +11:00
Aleksa Sarai	f0876b0427	libcontainer: configs: add proper HostUID and HostGID Previously Host{U,G}ID only gave you the root mapping, which isn't very useful if you are trying to do other things with the IDMaps. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:46:20 +11:00
Aleksa Sarai	d2f49696b0	runc: add support for rootless containers This enables the support for the rootless container mode. There are many restrictions on what rootless containers can do, so many different runC commands have been disabled: * runc checkpoint * runc events * runc pause * runc ps * runc restore * runc resume * runc update The following commands work: * runc create * runc delete * runc exec * runc kill * runc list * runc run * runc spec * runc state In addition, any specification options that imply joining cgroups have also been disabled. This is due to support for unprivileged subtree management not being available from Linux upstream. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:45:24 +11:00
Qiang Huang	8430cc4f48	Use uint64 for resources to keep consistency with runtime-spec Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2017-03-20 18:51:39 +08:00
Mrunal Patel	4f9cb13b64	Update runtime spec to 1.0.0.rc5 Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-03-15 11:38:37 -07:00
Ma Shimiao	06e27471bb	support create device with type p and u Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>	2017-02-10 14:45:15 +08:00
Zhang Wei	8eea644ccc	Bump runtime-spec to v1.0.0-rc3 * Bump underlying runtime-spec to version 1.0.0-rc3 * Fix related changed struct names in config.go Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2016-12-17 14:02:35 +08:00
Zhang Wei	a0f7977f0f	Detect and forbid duplicated namespace in spec When spec file contains duplicated namespaces, e.g. specs: specs.Spec{ Linux: &specs.Linux{ Namespaces: []specs.Namespace{ { Type: "pid", }, { Type: "pid", Path: "/proc/1/ns/pid", }, }, }, } runc should report malformed spec instead of using latest one by default, because this spec could be quite confusing. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2016-10-27 00:44:36 +08:00
Alexander Morozov	1ab9d5e6f4	Merge pull request #845 from mrunalp/cp_tmpfs Add support for copying up directories into tmpfs when a tmpfs is mounted over them	2016-10-21 13:47:16 -07:00
rajasec	034cba6af0	Fixing runc panic for missing file mode Signed-off-by: rajasec <rajasec79@gmail.com>	2016-10-16 20:39:44 +05:30
rajasec	4b263c9594	Fixing runc panic during hugetlb pages Signed-off-by: rajasec <rajasec79@gmail.com>	2016-10-15 19:47:33 +05:30
Shukui Yang	affc105264	tiny fix, add a null check for specs.Resources.Pids.Limit Signed-off-by: Shukui Yang <yangshukui@huawei.com>	2016-10-13 15:55:30 +08:00
Mrunal Patel	4356468f49	Parse the new extension flags Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-09-30 09:48:03 -07:00
Adam Thomason	83cbdbd64c	Add checks for nil spec.Linux Signed-off-by: Adam Thomason <ad@mthomason.net>	2016-09-11 16:31:34 -07:00
Zhang Wei	7303a9a720	Tiny refactor: remove unused local variables Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2016-09-06 23:41:40 +08:00
Qiang Huang	aa2dd02f5a	Fix null pointer reference panic Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-09-01 08:34:22 +08:00
Qiang Huang	220e5098a8	Fix default cgroup path Alternative of #895 , part of #892 The intention of the current behavior is to create the cgroup in the parent cgroup of the current process, but we did this in a wrong way: we used the devices cgroup path of the current process as the default parent path for all subsystems. This is wrong because we don't always have the same cgroup path for all subsystems. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-08-30 14:12:15 +08:00
Mrunal Patel	4dedd09396	Merge pull request #937 from hushan/net_cls-classid fix setting net_cls classid	2016-07-18 17:18:23 -04:00
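
The `l3CacheSchema` format and the CBM rules described in commit 692f6e1e27 (contiguous bits, subset of the root CBM) can be sketched in Go roughly as follows. This is an illustration only, not runc's code: `parseL3Schema` and `validCBM` are hypothetical helper names, and rejecting a zero mask is an assumption rather than something the commit message states. The kernel performs the real validation when the schemata file is written.

```go
package main

import (
	"fmt"
	"math/bits"
	"strconv"
	"strings"
)

// parseL3Schema parses a line such as "L3:0=ffff0;1=3ff" into a map from
// L3 cache id to capacity bitmask (CBM). Hypothetical helper, not runc's API.
func parseL3Schema(schema string) (map[int]uint64, error) {
	if !strings.HasPrefix(schema, "L3:") {
		return nil, fmt.Errorf("schema %q does not start with \"L3:\"", schema)
	}
	masks := make(map[int]uint64)
	for _, entry := range strings.Split(strings.TrimPrefix(schema, "L3:"), ";") {
		parts := strings.SplitN(entry, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed entry %q", entry)
		}
		id, err := strconv.Atoi(parts[0])
		if err != nil {
			return nil, fmt.Errorf("bad cache id in %q: %v", entry, err)
		}
		// CBMs are written in hexadecimal without a 0x prefix.
		cbm, err := strconv.ParseUint(parts[1], 16, 64)
		if err != nil {
			return nil, fmt.Errorf("bad CBM in %q: %v", entry, err)
		}
		masks[id] = cbm
	}
	return masks, nil
}

// validCBM reports whether cbm is non-zero (an assumption of this sketch),
// a subset of the root CBM, and a single contiguous run of set bits.
func validCBM(cbm, root uint64) bool {
	if cbm == 0 || cbm&^root != 0 {
		return false
	}
	// Drop trailing zeros; a contiguous run then looks like 0b0...011...1,
	// and adding 1 to such a value clears every set bit.
	shifted := cbm >> uint(bits.TrailingZeros64(cbm))
	return shifted&(shifted+1) == 0
}

func main() {
	const root = 0xfffff // default 20-bit CBM from the commit's example
	masks, err := parseL3Schema("L3:0=ffff0;1=3ff")
	if err != nil {
		panic(err)
	}
	for id, cbm := range masks {
		share := 100 * bits.OnesCount64(cbm) / bits.OnesCount64(root)
		fmt.Printf("cache %d: cbm=%#x valid=%v share=%d%%\n", id, cbm, validCBM(cbm, root), share)
	}
}
```

With the example schema from the commit message, the bit counts are 16 of 20 for cache id 0 and 10 of 20 for cache id 1, which matches the 80% and 50% figures quoted there.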


69 Commits
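
The duplicate-namespace check described in commit a0f7977f0f could look something like the sketch below. It assumes a simplified `Namespace` type with only the two fields shown in that commit message; the type, the `checkDuplicateNamespaces` name, and the error text are hypothetical stand-ins, not the code that was merged.

```go
package main

import "fmt"

// Namespace is a simplified, hypothetical stand-in for the spec's
// namespace entry, carrying only the fields used in the commit's example.
type Namespace struct {
	Type string
	Path string
}

// checkDuplicateNamespaces returns an error if the same namespace type
// appears more than once, instead of silently using the last entry.
func checkDuplicateNamespaces(nss []Namespace) error {
	seen := make(map[string]bool, len(nss))
	for _, ns := range nss {
		if seen[ns.Type] {
			return fmt.Errorf("malformed spec: duplicated namespace %q", ns.Type)
		}
		seen[ns.Type] = true
	}
	return nil
}

func main() {
	// The duplicated "pid" entries from the commit message.
	nss := []Namespace{
		{Type: "pid"},
		{Type: "pid", Path: "/proc/1/ns/pid"},
	}
	if err := checkDuplicateNamespaces(nss); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Keying the check on `Type` alone means two entries of the same type are rejected even when their `Path` values differ, which is the confusing case the commit message calls out.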