Seting CPU quota and period independently does not make much sense,
but historically runc allowed it and this needs to be supported
to not break compatibility.
For systemd cgroup drivers to set CPU quota/period correctly,
it needs to know both values. For fs2 cgroup driver to be compatible
with the fs driver, it also needs to know both values.
Here in update, previously set values are available from config.
If only one of {quota,period} is set and the other is not, leave
the unset parameter at the old value (don't overwrite config).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This (and the converting function) is only used by one of the four
cgroup drivers. The other three do some checking and conversion in
place, so let the fs2 do the same.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Memory Bandwidth Allocation (MBA) is a resource allocation sub-feature
of Intel Resource Director Technology (RDT) which is supported on some
Intel Xeon platforms. Intel RDT/MBA provides indirect and approximate
throttle over memory bandwidth for the software. A user controls the
resource by indicating the percentage of maximum memory bandwidth.
Hardware details of Intel RDT/MBA can be found in section 17.18 of
Intel Software Developer Manual:
https://software.intel.com/en-us/articles/intel-sdm
In Linux 4.12 kernel and newer, Intel RDT/MBA is enabled by kernel
config CONFIG_INTEL_RDT. If hardware support, CPU flags `rdt_a` and
`mba` will be set in /proc/cpuinfo.
Intel RDT "resource control" filesystem hierarchy:
mount -t resctrl resctrl /sys/fs/resctrl
tree /sys/fs/resctrl
/sys/fs/resctrl/
|-- info
| |-- L3
| | |-- cbm_mask
| | |-- min_cbm_bits
| | |-- num_closids
| |-- MB
| |-- bandwidth_gran
| |-- delay_linear
| |-- min_bandwidth
| |-- num_closids
|-- ...
|-- schemata
|-- tasks
|-- <container_id>
|-- ...
|-- schemata
|-- tasks
For MBA support for `runc`, we will reuse the infrastructure and code
base of Intel RDT/CAT which implemented in #1279. We could also make
use of `tasks` and `schemata` configuration for memory bandwidth
resource constraints.
The file `tasks` has a list of tasks that belongs to this group (e.g.,
<container_id>" group). Tasks can be added to a group by writing the
task ID to the "tasks" file (which will automatically remove them from
the previous group to which they belonged). New tasks created by
fork(2) and clone(2) are added to the same group as their parent.
The file `schemata` has a list of all the resources available to this
group. Each resource (L3 cache, memory bandwidth) has its own line and
format.
Memory bandwidth schema:
It has allocation values for memory bandwidth on each socket, which
contains L3 cache id and memory bandwidth percentage.
Format: "MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;..."
The minimum bandwidth percentage value for each CPU model is predefined
and can be looked up through "info/MB/min_bandwidth". The bandwidth
granularity that is allocated is also dependent on the CPU model and
can be looked up at "info/MB/bandwidth_gran". The available bandwidth
control steps are: min_bw + N * bw_gran. Intermediate values are
rounded to the next control step available on the hardware.
For more information about Intel RDT kernel interface:
https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
An example for runc:
Consider a two-socket machine with two L3 caches where the minimum
memory bandwidth of 10% with a memory bandwidth granularity of 10%.
Tasks inside the container may use a maximum memory bandwidth of 20%
on socket 0 and 70% on socket 1.
"linux": {
"intelRdt": {
"memBwSchema": "MB:0=20;1=70"
}
}
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
replace #1492#1494fix#1422
Since https://github.com/opencontainers/runtime-spec/pull/876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Currently runc already supports setting realtime runtime and period
before container processes start, this commit will add update support
for realtime scheduler resources.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
grep -r "range map" showw 3 parts use map to
range enum types, use slice instead can get
better performance and less memory usage.
Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
Back quotes are the placeholder feature described here:
https://github.com/urfave/cli#placeholder-values
Without this, cli will take `-1` as default value as:
```
--memory-swap -1 Total memory usage (memory + swap); set `-1` to enable unlimited swap
```
After this patch, it'll act correctly
```
--memory-swap value Total memory usage (memory + swap); set '-1' to enable unlimited swap
```
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
The major change is the description of options, change
it as the latest cli help message shows, which specify
a "value" after an option if it takes value, and add
(default: xxx) if the option has a default value.
This also includes some other minor consistency fixes.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: rajasec <rajasec79@gmail.com>
Adding kernel mem tcp for update command
Signed-off-by: rajasec <rajasec79@gmail.com>
Fixing update.bats to reduce the TCP value
Signed-off-by: rajasec <rajasec79@gmail.com>
Updated the kernelTCP in bats as per json
Signed-off-by: rajasec <rajasec79@gmail.com>
Fixed some minor issue in bats file
Signed-off-by: rajasec <rajasec79@gmail.com>
Rounded off to right bytes for kernel TCP
Signed-off-by: rajasec <rajasec79@gmail.com>
Updating man file for update command
Signed-off-by: rajasec <rajasec79@gmail.com>