CRIU Doesn't Restore TCP Socket I/O between Remote Host Machines #2457

Open
h2wonS opened this issue Jul 31, 2024 · 4 comments

h2wonS commented Jul 31, 2024

Description

Hi, All

I was wondering whether CRIU can restore a process on a remote host, i.e. a different host from the one where it was dumped.
Some research claims that CRIU can only restore a process, especially one with TCP connections, on the same machine (host) where it was dumped. (ref: test)

As far as I know, CRIU provides the --tcp-established option for restoring TCP connections.
To evaluate checkpoint/restore (c/r) of a TCP connection and socket I/O, I tried the test code from the CRIU homepage, 'tcp_howto.c'.
(ref: simple_tcp_pair_CRIU)
These are the c/r commands that I used:

  • $ sudo criu dump -t ${PID} -D ${img_dir} --tcp-established --shell-job --ext-unix-sk -v4 -o output_dump.log
  • $ sudo criu restore -t ${PID} -D ${img_dir} --tcp-established --shell-job -o output_restore.log

It worked well on a single host machine. However, although I used the same options in the same way, I couldn't reproduce your demo across different hosts.

  • e.g. Dump the client process on Server 1, then restore the client process image on Server 2.
    => The process restores successfully, but the client and server cannot communicate.

Do I need to modify the 'tcp_howto.c' code? It would be great if you could clarify where and how I should modify it.

Steps to reproduce the issue:

  1. Run the tcp_howto.c server and client processes on a single host machine.
  2. Dump the client process into an image and copy it to another host machine.
  3. Restore the copied client image on the second host machine.

Describe the results you received:
According to the log files, the dump and restore commands succeed, but the restored process behaves unexpectedly.
For example, the restored client should keep writing to and reading from the socket and print the values, preserving its context. Instead, the client just prints '-1'.
Here is the relevant part of the source code (tcp_howto.c):
```c
while (1) {
    write(sk, &val, sizeof(val));
    rval = -1;
    read(sk, &rval, sizeof(rval));
    printf("PP %d -> %d\n", val, rval);
    sleep(2);
    val++;
}
```

Describe the results you expected:
=> If the process was dumped when the count was 150, the output after restore should look like this:
PP 151 -> 151
=> However, when we restore the process on the other machine, it prints this:
PP 151 -> -1
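
Editor's note: in the loop above, rval is set to -1 before read() and neither return value is checked, so a printed '-1' only tells us that read() did not store a value (it failed or hit end-of-file). A minimal sketch of a checked round trip that reports the actual error is shown here. The helper name ping_pong is hypothetical and not part of tcp_howto.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not in tcp_howto.c): one write/read round trip on
 * socket sk with both return values checked.
 * Returns 0 on success and -1 on any failure, after reporting the cause. */
static int ping_pong(int sk, int val, int *rval)
{
    assert(rval != NULL);

    ssize_t n = write(sk, &val, sizeof(val));
    if (n != (ssize_t)sizeof(val)) {
        fprintf(stderr, "write: %s\n", n < 0 ? strerror(errno) : "short write");
        return -1;
    }

    n = read(sk, rval, sizeof(*rval));
    if (n == 0) {
        /* read() returning 0 on a stream socket means the peer closed it. */
        fprintf(stderr, "read: peer closed the connection\n");
        return -1;
    }
    if (n != (ssize_t)sizeof(*rval)) {
        fprintf(stderr, "read: %s\n", n < 0 ? strerror(errno) : "short read");
        return -1;
    }
    return 0;
}
```

Calling `ping_pong(sk, val, &rval)` inside the loop and printing "PP ..." only when it returns 0 would show which call fails after the restore and with which errno, instead of a bare '-1'.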

Additional environment details:
We use CRIU v3.19 and saw the same results in v3.15.
Our host machine OS is Ubuntu 20.04.6 LTS.

@adrianreber
Member

This works for me. Your client and server need to use the same IP addresses after migration. In the client, you need to call bind() to bind to your IP address, and that address also has to be migrated to the destination host. I am using Fedora 40.
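
Editor's note: the bind() advice above can be sketched as follows. This is only an illustration under stated assumptions, not the code Adrian used: the helper name connect_from is hypothetical, and it assumes IPv4 and a source address that is configured on the host where the client runs (and again on the destination host before the restore).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to dst_ip:dst_port whose local
 * end is pinned to src_ip via bind(). Returns the socket fd, or -1 on error. */
static int connect_from(const char *src_ip, const char *dst_ip,
                        unsigned short dst_port)
{
    assert(src_ip != NULL && dst_ip != NULL);

    struct sockaddr_in src, dst;

    memset(&src, 0, sizeof(src));
    src.sin_family = AF_INET;
    src.sin_port = htons(0); /* port 0: the kernel picks an ephemeral port */
    if (inet_pton(AF_INET, src_ip, &src.sin_addr) != 1) {
        fprintf(stderr, "invalid source address: %s\n", src_ip);
        return -1;
    }

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(dst_port);
    if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1) {
        fprintf(stderr, "invalid destination address: %s\n", dst_ip);
        return -1;
    }

    int sk = socket(AF_INET, SOCK_STREAM, 0);
    if (sk < 0) {
        perror("socket");
        return -1;
    }

    /* Pin the local address before connecting, so the connection's source IP
     * is the one that will be moved along with the process. */
    if (bind(sk, (struct sockaddr *)&src, sizeof(src)) < 0) {
        perror("bind");
        close(sk);
        return -1;
    }

    if (connect(sk, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        perror("connect");
        close(sk);
        return -1;
    }
    return sk;
}
```

With the addresses that appear in the restore log later in this thread, the client would call `connect_from("10.10.10.2", "10.10.10.1", 1234)`. The bound address must exist on the second host before running criu restore.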

@h2wonS
Author

h2wonS commented Aug 2, 2024

Thanks for your help, Adrian.

I modified the code so that the client and the server each bind to a static IP address.
(I also checked that the server and client connect.)

However, I now get a new error in the restore phase:

(00.019254)     Running iptables [iptables -w -t filter -D INPUT --protocol tcp -m mark ! --mark 0xC114 --source 10.10.10.1 --sport 1234 --destination 10.10.10.2 --dport 40124 -j DROP]
iptables: Bad rule (does a matching rule exist in that chain?).
(00.025411) Error (criu/util.c:642): exited, status=1
(00.025447) Error (criu/netfilter.c:105): Iptables configuration failed
(00.025457)     Running iptables [iptables -w -t filter -D OUTPUT --protocol tcp -m mark ! --mark 0xC114 --source 10.10.10.2 --sport 40124 --destination 10.10.10.1 --dport 1234 -j DROP]
iptables: Bad rule (does a matching rule exist in that chain?).
(00.031632) Error (criu/util.c:642): exited, status=1
(00.031669) Error (criu/netfilter.c:105): Iptables configuration failed

Although I added new rules to iptables, the result is still the same. I don't know why...

BTW, do the host machines need to have exactly the same environment for c/r? For example, I was wondering whether the OS or kernel version must be identical across the host machines.

@adrianreber
Member

Are the iptables errors fatal? I also see them in the log, but they do not stop the restore. I am using CRIU 3.19. I think a recent change made iptables errors non-fatal. CRIU tries to lock and unlock the firewall to ensure that no packets such as RST are transmitted. This logic does not really work if you migrate the process to another machine, but it should not abort the restore.

> BTW, do the host machines need to have exactly the same environment for c/r? For example, I was wondering whether the OS or kernel version must be identical across the host machines.

All linked libraries, binaries and open files need to be the same. The kernel version can differ, but I would recommend using similar kernels to avoid problems. Different distributions will most likely not work because of different libraries. Static linking or containers can work around this problem.


github-actions bot commented Sep 2, 2024

A friendly reminder that this issue had no activity for 30 days.
