1. The command `runc checkpoint --lazy-pages --status-fd $FD` actually
accepts a file name as $FD. Make it accept a file descriptor,
as its name implies and the documentation states.
In addition, since runc itself does not use what CRIU writes to the
status fd, remove the code which relays it, and pass the FD directly
to CRIU.
Note 1: runc should close this file descriptor itself after passing it
to criu, otherwise whoever waits on it might wait forever.
Note 2: due to the way criu swrk consumes the fd (it reopens
/proc/$SENDER_PID/fd/$FD), runc can't close it as soon as criu swrk has
started. There is no good way to know when criu swrk has reopened the
fd, so we assume that as soon as we have received something back, the
fd is already reopened.
2. Since the meaning of --status-fd has changed, the test case using
it needs to be fixed as well.
Modify the lazy migration test to replace "sleep 2" with actually
waiting for the lazy page server to be ready.
While at it,
- remove the double fork (using shell's background process is
sufficient here);
- check the exit code for "runc checkpoint" and "criu lazy-pages";
- remove the check for no errors in dump.log after restore, as we
are already checking its exit code.
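A minimal sketch of how a test can wait on the status fd instead of
sleeping. Here fake_checkpoint is a hypothetical stand-in for
`runc checkpoint --lazy-pages --status-fd 3`, and the FIFO is just one
convenient way to get a pipe in shell; the real test may be structured
differently.

```shell
#!/bin/sh
# fake_checkpoint is a hypothetical stand-in for
#   runc checkpoint --lazy-pages --status-fd "$fd" ...
fake_checkpoint() {
    local fd=$1
    printf '\0' >&"$fd"   # what CRIU writes once the page server is ready
    sleep 0.2             # keep "serving pages" after signalling readiness
}

# Start the stand-in in the background with fd 3 attached to a FIFO, then
# block until one byte arrives. Prints the byte in hex and returns the
# exit code of the background job.
wait_until_ready() {
    local dir fifo pid byte rc
    dir=$(mktemp -d)
    fifo=$dir/status
    mkfifo "$fifo"

    fake_checkpoint 3 3>"$fifo" &
    pid=$!

    # Opening the FIFO for reading unblocks the writer; head returns
    # as soon as a single byte has been read.
    byte=$(head -c 1 "$fifo" | od -An -tx1 | tr -d ' \n')
    echo "status byte: $byte"

    rc=0
    wait "$pid" || rc=$?
    rm -rf "$dir"
    return "$rc"
}

wait_until_ready
```

Using a shell background job plus `wait` is enough to collect the exit
code, which is why no double fork is needed in this sketch.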
[v2: properly close status fd after spawning criu]
[v3: move close status fd to after the first read]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
With the help of userfaultfd CRIU supports lazy migration. Lazy
migration means that memory pages are only transferred from the
migration source to the migration destination on page fault.
This makes it possible to reduce the downtime during process or
container migration to a minimum, as the memory does not need to be
transferred before the process is resumed on the destination.
Lazy migration currently depends on userfaultfd being available in the
running Linux kernel and on the CRIU version in use supporting lazy
migration. Both dependencies can be checked by asking CRIU via RPC
whether the lazy migration feature is available. Using feature checking
instead of version comparison enables runC to use CRIU features from the
criu-dev branch. This way the user can decide if lazy migration should
be available by choosing the right kernel and CRIU branch.
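A rough command-line counterpart of such a feature check, as a sketch
only. runC performs the query via RPC; the example below instead assumes
that the criu CLI provides `criu check --feature <name>` and that the
relevant feature is called "uffd-noncoop". The function name
lazy_pages_supported is made up for illustration.

```shell
#!/bin/sh
# Probe for lazy migration support instead of comparing version numbers.
# Assumption: `criu check --feature uffd-noncoop` exists in the installed
# CRIU and exits 0 only when kernel and CRIU both support it.
lazy_pages_supported() {
    command -v criu >/dev/null 2>&1 || return 1
    criu check --feature uffd-noncoop >/dev/null 2>&1
}

if lazy_pages_supported; then
    echo "lazy migration available"
else
    echo "lazy migration not available"
fi
```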
To use lazy migration, CRIU must dump everything except the memory
pages during checkpointing and then open a network port on which it
waits for remote page fault requests:
# runc checkpoint httpd --lazy-pages --page-server 0.0.0.0:27 \
--status-fd /tmp/postcopy-pipe
In this example CRIU will block once it has opened the network port,
waiting for a network connection. As runC waits for CRIU to finish, it
will also block until the lazy migration has finished. To know when the
restore on the destination side can start, the '--status-fd' parameter
is used:
# runc checkpoint --help | grep status
--status-fd value criu writes \0 to this FD once lazy-pages is ready
The parameter '--status-fd' is passed straight through to CRIU. This
way the process outside of runC which controls the migration knows
exactly when to transfer the checkpoint (without memory pages) to the
destination and when the restore can be started.
On the destination side it is necessary to start CRIU in 'lazy-pages'
mode like this:
# criu lazy-pages --page-server --address 192.168.122.3 --port 27 \
-D checkpoint
and tell runC to do a lazy restore:
# runc restore -d --image-path checkpoint --work-path checkpoint \
--lazy-pages httpd
If both processes on the restore side have the same working directory,
'criu lazy-pages' creates a unix domain socket on which it waits for
requests from the actual restore. runC starts CRIU restore in lazy
restore mode, which tells 'criu lazy-pages' that it wants to restore
memory pages on demand. CRIU continues to restore the process, and once
the process is running and accesses the first missing memory page,
the 'criu lazy-pages' server requests that page from the source
system. In this way all pages are eventually transferred from the source
system to the destination system. Once all pages have been transferred,
runC on the source system exits and the migration of the container is
complete.
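The destination-side steps can be sketched as a small script. The two
commands are the ones shown above; run(), DRY_RUN and restore_lazy are
hypothetical helpers added for illustration, and by default the script
only prints what it would execute.

```shell
#!/bin/sh
# Destination side of a lazy migration (sketch).
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# restore_lazy IMAGE_DIR SOURCE_ADDR PORT CONTAINER
restore_lazy() {
    local images=$1 addr=$2 port=$3 ctr=$4 lp_pid rc=0

    # The lazy-pages daemon fetches pages from the source on demand;
    # a plain background job is enough to collect its exit code later.
    run criu lazy-pages --page-server --address "$addr" --port "$port" \
        -D "$images" &
    lp_pid=$!

    # Same image and work directory as the daemon, so both sides find
    # the unix domain socket.
    run runc restore -d --image-path "$images" --work-path "$images" \
        --lazy-pages "$ctr" || rc=$?

    # Fail if either the restore or the lazy-pages daemon failed.
    wait "$lp_pid" || rc=$?
    return "$rc"
}

restore_lazy checkpoint 192.168.122.3 27 httpd
```

Because the daemon runs in the background, the order of the two printed
lines in dry-run mode is not guaranteed.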
This can also be combined with CRIU's pre-copy support. The combination
of pre-copy and post-copy (lazy migration) provides the possibility to
migrate containers with minimal downtimes.
Some additional background about post-copy migration can be found in
these articles:
https://lisas.de/~adrian/?p=1253
https://lisas.de/~adrian/?p=1183
Signed-off-by: Adrian Reber <areber@redhat.com>
FreeBSD does not support cgroups or namespaces, although the code
suggested otherwise, and it is not currently supported by runc anyway.
So clean up the file naming to use `_linux` where appropriate.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>