This repository has been archived by the owner on Oct 29, 2021. It is now read-only.

Improved test in examples/load-balancer #83

Open
wants to merge 1 commit into
base: master

uablrek
Contributor

@uablrek uablrek commented Feb 13, 2020

Tests load-balancing from the NSE-pod.

Fixes #82

The VIP address is accessed 30 times, the load-balancing data is collected, and the frequency of each target is computed. Example of test printout;

application-server-584d4c94b6-2nh5n:11
application-server-584d4c94b6-jwl72:9
application-server-584d4c94b6-s7mhz:10

If only one target is found there is no load-balancing and the test fails.
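The check described above could be sketched roughly as follows. This is not the test code from the PR; the function names, the `nc -w 2` timeout and the default port are assumptions made for illustration. It assumes each application-server answers a connection with its own hostname, as in the `nc` examples later in this thread.

```shell
#!/bin/sh
# Sketch only: function names, timeout and default port are assumptions.

# Connect to VIP $1 $2 times and print one reply (target hostname) per line.
collect_targets() {
    vip=$1; count=$2; port=${3:-5001}
    i=0
    while [ "$i" -lt "$count" ]; do
        # -w 2 bounds each attempt so a hanging VIP cannot block the test.
        nc -w 2 "$vip" "$port" < /dev/null
        i=$((i + 1))
    done
}

# Read hostnames on stdin and print "hostname:hits" per distinct target.
target_frequency() {
    sort | uniq -c | awk '{ print $2 ":" $1 }'
}

# Succeed only if stdin holds at least two non-empty lines, i.e. at
# least two distinct targets when fed from target_frequency.
check_load_balancing() {
    targets=$(awk 'NF { n++ } END { print n + 0 }')
    [ "$targets" -ge 2 ]
}
```

A run against the VIP might then look like `collect_targets 10.2.2.22 30 | target_frequency | check_load_balancing`, where a non-zero exit status means that only one target (or none) answered.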

@uablrek
Contributor Author

uablrek commented Feb 13, 2020

@nickolaev Is there a way to trigger a retest?
The first failed run was my fault; the access to the VIP address seemed to hang. When I added more printouts, an unrelated test failed.

@nickolaev
Member

I am not aware of any way to trigger tests when you do not have admin rights, except for pushing to the PR. I guess this is due to security limitations.

@uablrek
Contributor Author

uablrek commented Feb 13, 2020

Ok thanks. Pushing works for me. I will squash all commits when it works anyway.

@uablrek
Contributor Author

uablrek commented Feb 14, 2020

There is some problem that I don't see when I run this locally. Access through the load-balancer does not work. Direct access to the application-servers (via the bridge-domain) works. The LB seems to be configured correctly, and I am currently troubleshooting access to the application-servers via a GRE tunnel, which is the only difference I can find between direct and LB access to the application-servers.

@uablrek
Contributor Author

uablrek commented Feb 14, 2020

It seems that traffic through GRE tunnels does not work in the CI environment. I added a test that uses a GRE tunnel without the load-balancer, and it fails;

==== Check GRE access
Cmd [ip tunnel add foo4 mode gre remote 10.60.1.4]
Cmd [ip addr add 10.70.0.5/32 dev foo4]
Cmd [ip link set up dev foo4]
Cmd [ip ro add 10.2.2.3/32 dev foo4]
Cmd [ping -c1 -W1 10.2.2.3]
PING 10.2.2.3 (10.2.2.3) 56(84) bytes of data.

--- 10.2.2.3 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

command terminated with exit code 1

The same test works in my environment.

Any ideas why GRE traffic wouldn't work? Is GRE already used in the CI environment, so that this becomes tunnel-in-tunnel (which normally does not work)?

@uablrek
Contributor Author

uablrek commented Feb 14, 2020

Could it be that the ICMP echo goes via GRE, i.e. has a GRE header, but the ICMP echo reply (pong) does not? Some networks have stateful firewalls in place that do not permit this.
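One way to check this hypothesis would be to capture on the underlying interface at both tunnel endpoints while the ping runs, and compare what arrives. A minimal sketch, assuming the underlying interface is `eth0` (the function names and the interface name are illustrative and not taken from the test):

```shell
# Sketch only: function names and the default interface are assumptions.

# Print a pcap filter that matches GRE packets (IP protocol 47)
# as well as plain, unencapsulated ICMP.
gre_or_icmp_filter() {
    printf '%s\n' 'ip proto 47 or icmp'
}

# Capture matching packets on the given interface (default: eth0).
capture_gre_and_icmp() {
    tcpdump -ni "${1:-eth0}" "$(gre_or_icmp_filter)"
}
```

If the echo request shows up as GRE on the remote side while the reply leaves as plain ICMP, or does not leave at all, that would support the asymmetry theory.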

@nickolaev
Member

The tests run in a kind environment on top of a single-host Docker, meaning that everything happens on the same host. I can't tell whether there is a firewall in the CircleCI environment, but I doubt it could influence this test.

@nickolaev
Member

Can you please rebase it and see how it goes? We are using VPP v3 now.

@uablrek
Contributor Author

uablrek commented Mar 9, 2020

I got a new, very weird fault: if the VIP address, e.g. 10.2.2.22, is accessed with TCP from the K8s node where the load-balancer NSE process is running, then the source address gets mangled. When accessed from another machine everything works!

On the K8s node where the load-balancer NSE POD is running;

$ ip ro add 10.2.2.0/24 via 11.0.2.3
$ nc -s 192.168.1.4 10.2.2.22 5001 < /dev/null

Tcpdump from within the load-balancer NSE POD;

$ tcpdump -eni vpp1host
12:40:12.688326 0a:71:16:a0:75:88 > 02:fe:70:2b:c7:88, ethertype IPv4 (0x0800), length 74: 192.168.1.4.41749 > 10.2.2.22.5001: Flags [S], seq 1542741591, win 64240, options [mss 1460,sackOK,TS val 1249805913 ecr 0,nop,wscale 7], length 0
12:40:13.711290 0a:71:16:a0:75:88 > 02:fe:70:2b:c7:88, ethertype IPv4 (0x0800), length 74: 192.168.1.4.41749 > 10.2.2.22.5001: Flags [S], seq 1542741591, win 64240, options [mss 1460,sackOK,TS val 1249806936 ecr 0,nop,wscale 7], length 0

Same stream on the receiving application-server;

$ tcpdump -ni gre0
12:40:12.690678 IP 173.58.1.4.41749 > 10.2.2.22.5001: Flags [S], seq 1542741591, win 64240, options [mss 1460,sackOK,TS val 1249805913 ecr 0,nop,wscale 7], length 0
12:40:13.718245 IP 169.59.1.4.41749 > 10.2.2.22.5001: Flags [S], seq 1542741591, win 64240, options [mss 1460,sackOK,TS val 1249806936 ecr 0,nop,wscale 7], length 0
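In the two captures above the port, sequence number and timestamps match, but the source address changes from 192.168.1.4 to 173.58.1.4 and then to 169.59.1.4 on the retransmission, so only the last two octets survive. To compare source addresses over a longer capture, something like the following could be used. The helper name is made up for this sketch, and it only handles IPv4 lines in the `tcpdump -n` / `-en` formats shown here.

```shell
# Sketch only: tcpdump_src is a hypothetical helper, not part of the test.

# Read tcpdump text output on stdin and print the IPv4 source address
# of each packet. It matches "a.b.c.d.port >" and then strips the port.
tcpdump_src() {
    grep -oE '([0-9]{1,3}\.){4}[0-9]+ >' | sed -E 's/\.[0-9]+ >$//'
}
```

For example, `tcpdump -ni gre0 -c 10 | tcpdump_src | sort | uniq -c` on the application-server would list every source address seen and how often it occurred.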

On another machine (external or another K8s node);

$ ip ro add 10.2.2.0/24 via 192.168.1.4
$ nc 10.2.2.22 5001 < /dev/null
application-server-79cf4f5f66-mv9q4
$ nc 10.2.2.22 5001 < /dev/null
application-server-79cf4f5f66-mv9q4
$ nc 10.2.2.22 5001 < /dev/null
application-server-79cf4f5f66-qpc8w
$ nc 10.2.2.22 5001 < /dev/null
application-server-79cf4f5f66-g6l5g

@uablrek
Contributor Author

uablrek commented Mar 9, 2020

I squashed the commits. And BTW tcpdump must be installed;

apt update; apt install -y tcpdump   # On the lb POD
apk add tcpdump                      # On the application-servers

Development

Successfully merging this pull request may close these issues.

Load balancer example check improvement
2 participants