Low throughput + NetEm delay creates gaps in upload data #265

Open
upnix opened this issue Apr 22, 2022 · 1 comment

upnix commented Apr 22, 2022

The problem: In Mininet, when limiting link speed to 10Mbit/s (via TBF or NetEm) and adding any amount of delay with NetEm, Flent using Netperf+TCP_STREAM returns large gaps in upload data, both in the CSV output and in the resulting charts. While Netperf acts strangely in this scenario (which I'll describe below), I believe it is Flent, and specifically the use of apply_to in the DATA_SETS data structure, that causes this problem.

The setup:

  • Mininet 2.3.0, installed from GitHub
  • Flent 2.0.1 installed with pip3 install flent
  • All on the same Ubuntu 20.04.4 install, directly on hardware (no VM).

With a network configuration of 1 router, 2 subnets, and 2 hosts (h1, h2), I use TBF to rate limit all links to 10Mbit/s, and NetEm to add ~28ms of delay between hosts (7ms on each link, but any amount of delay will do). I run Netserver on host h2, and the Flent test on h1, with traffic crossing the router. I'll attach my configuration files.

Commands:

$ sudo python3 ~/mininet_networks/1Router_2Networks_3Hosts.py
mininet> h2 pkill netserver
mininet> h2 netserver
mininet> h1 ethtool -K h1-eth0 tso off gso off gro off
mininet> h2 ethtool -K h2-eth0 tso off gso off gro off
mininet> h3 ethtool -K h3-eth0 tso off gso off gro off
mininet> r0 ethtool -K r0-eth1 tso off gso off gro off
mininet> r0 ethtool -K r0-eth2 tso off gso off gro off
mininet> r0 tc qdisc add dev r0-eth1 root tbf rate 10mbit burst 4096kbit latency 5ms
mininet> r0 tc qdisc add dev r0-eth2 root tbf rate 10mbit burst 4096kbit latency 5ms
mininet> r0 tc qdisc add dev r0-eth1 parent 8001: netem delay 7ms
mininet> r0 tc qdisc add dev r0-eth2 parent 8002: netem delay 7ms
mininet> h1 tc qdisc add dev h1-eth0 root tbf rate 10mbit burst 4096kbit latency 5ms
mininet> h1 tc qdisc add dev h1-eth0 parent 8005: netem delay 7ms
mininet> h2 tc qdisc add dev h2-eth0 root tbf rate 10mbit burst 4096kbit latency 5ms
mininet> h2 tc qdisc add dev h2-eth0 parent 8007: netem delay 7ms
mininet> h1 flent -H 10.0.0.100 -x --socket-stats -d 0 -l 60 tcp_2up -f csv -D ~chris/ -t 'TCP 2 Up ' -o ~chris/tcp_2up.csv
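The TBF and NetEm commands above repeat the same pattern per interface, and the NetEm lines depend on auto-assigned handles (8001:, 8002:, 8005:, 8007:) that may differ between runs. As a minimal sketch, the pair of commands for each interface could be generated like this. The function name shaping_commands and the explicit handles (1: for TBF, 10: for NetEm attached at parent 1:1) are assumptions of this sketch, not part of the original setup; the rate, burst, latency and delay values are copied from the commands above.

```python
def shaping_commands(dev, rate="10mbit", burst="4096kbit", latency="5ms", delay="7ms"):
    """Return the two tc commands for one interface: TBF at the root, NetEm beneath it.

    Explicit handles (an assumption of this sketch) replace the auto-assigned
    ones, so the NetEm parent does not have to be looked up after adding TBF.
    """
    return [
        f"tc qdisc add dev {dev} root handle 1: tbf rate {rate} burst {burst} latency {latency}",
        f"tc qdisc add dev {dev} parent 1:1 handle 10: netem delay {delay}",
    ]


# Interfaces shaped in the original command list.
interfaces = ["r0-eth1", "r0-eth2", "h1-eth0", "h2-eth0"]
commands = {dev: shaping_commands(dev) for dev in interfaces}

for dev, cmds in commands.items():
    node = dev.split("-")[0]  # Mininet CLI commands are prefixed with the node name
    for cmd in cmds:
        print(f"{node} {cmd}")
```

Each printed line can be pasted at the mininet> prompt. The sketch only builds strings; it does not run tc itself.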

The result:
There are large gaps in the results reported by Flent.
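To put numbers on the gaps rather than read them off a chart, a small helper along these lines might help. It is only a sketch: it assumes the upload series has already been reduced to (time, value) pairs with None for missing values, and the name find_gaps, the 1-second threshold and the sample data are all made up for illustration. Reading the pairs out of the CSV is left out because the column layout is not shown here.

```python
def find_gaps(samples, min_gap=1.0):
    """Return (start, end) time pairs where valid samples are more than min_gap seconds apart."""
    valid_times = [t for t, value in samples if value is not None]
    return [
        (earlier, later)
        for earlier, later in zip(valid_times, valid_times[1:])
        if later - earlier > min_gap
    ]


# Hypothetical series: regular samples, then a hole, then regular samples again.
samples = [(0.0, 9.4), (0.2, 9.5), (0.4, None), (0.6, None), (4.4, 9.6), (4.6, 9.5)]
gaps = find_gaps(samples)
print(gaps)  # [(0.2, 4.4)]
```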

Narrowing the problem down
Above, I showed the problem with the Flent-included tcp_2up test, but because I believe the issue lies with the use of apply_to, I had to retool the test to exclude it. So I have two new test configurations:

  1. tcp_nup_2.conf - This is the Flent-included tcp_nup.conf, modified by commenting out the function add_stream, the call to for_stream_config() and the DATA_SETS entry "TCP upload avg". I then hard-code in what is essentially a single "TCP upload::1" test.
  2. tcp_1up_from_nup_2.conf - This is tcp_2up.conf, but it includes tcp_nup_2.conf instead of tcp_nup.conf.

Now, running the Flent test tcp_1up_from_nup_2.conf, upload data is shown as continuous, as you'd expect.

Why?
I don't know. What I do know is that the Flent test tcp_2down has no problems, and when I run the related Netperf command directly, TCP_MAERTS returns results with the expected regularity (NETPERF_INTERVAL[xx]=0.2, more or less). However, the Netperf test TCP_STREAM, which tcp_2up uses, leaves about 4 seconds between results (NETPERF_INTERVAL[xx]=4, more or less). The results returned still seem accurate to me; there are just longer pauses between reports.
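The difference between the two Netperf tests can be checked mechanically by scanning the output for NETPERF_INTERVAL lines. The following is a rough sketch: the function name long_intervals, the 1-second threshold and the sample text are assumptions, and the sample values are invented to mirror the 0.2 s versus 4 s pattern described above, not copied from a real run.

```python
import re

# Matches lines of the form NETPERF_INTERVAL[12]=0.20
INTERVAL_RE = re.compile(r"NETPERF_INTERVAL\[(\d+)\]=([0-9.]+)")


def long_intervals(output, threshold=1.0):
    """Return (index, seconds) for every reporting interval longer than threshold."""
    found = []
    for index, seconds in INTERVAL_RE.findall(output):
        if float(seconds) > threshold:
            found.append((int(index), float(seconds)))
    return found


# Invented sample: two regular reports followed by one slow report.
sample_output = """\
NETPERF_INTERVAL[0]=0.20
NETPERF_INTERVAL[1]=0.21
NETPERF_INTERVAL[2]=4.02
"""

slow = long_intervals(sample_output)
print(slow)  # [(2, 4.02)]
```

Running this over the output of both the TCP_MAERTS and TCP_STREAM commands would show whether only the latter produces long intervals.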

But this can't be the entire story, because Flent tests that don't use apply_to when building DATA_SETS use the exact same Netperf command, gaps and all, yet don't have this problem.

So it would seem to me that somehow Flent isn't properly handling gaps in reporting when apply_to is used for DATA_SETS.
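To make that hypothesis concrete, here is a toy illustration of one way a derived series could end up with holes while its inputs look fine. This is not Flent's code and makes no claim about how apply_to is implemented; the function average_on_grid, the tolerance and the data are all hypothetical. The sketch averages several series on a shared time grid and emits a value only where every input has a sample close to the grid point.

```python
def average_on_grid(series_list, grid, tolerance=0.1):
    """Average (time, value) series on a shared grid.

    A grid point gets a value only if every series has a sample within
    `tolerance` seconds of it; otherwise the result for that point is None.
    """
    result = []
    for t in grid:
        picked = []
        for series in series_list:
            near = [value for ts, value in series if abs(ts - t) <= tolerance]
            if not near:
                picked = None  # one input is missing here, so no average
                break
            picked.append(near[0])
        result.append(None if picked is None else sum(picked) / len(picked))
    return result


grid = [0.0, 1.0, 2.0, 3.0, 4.0]
dense = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0), (3.0, 5.0), (4.0, 5.0)]
sparse = [(0.0, 4.0), (4.0, 4.0)]  # reports only every 4 seconds

averaged = average_on_grid([dense, sparse], grid)
print(averaged)  # [4.5, None, None, None, 4.5]
```

In this toy model a single sparsely reporting input is enough to blank out the combined series between reports, which is the kind of behaviour the gaps would be consistent with. Whether Flent does anything similar would need to be confirmed in its source.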

What else fixes the problem?

  • Removing any delay on the link, whether from removing NetEm or going to a hardware switch.
  • Increasing the link speed, but keeping the delay.
  • Changing the TCP CCA to BBR, rather than CUBIC.

Note that these are probably things that just make Netperf return results every 0.2 seconds (I haven't checked, though), so they're probably not directly related to Flent.

Files of interest
Flent results when running the included tcp_2up test:
tcp_2up-2022-04-22T095700.876743.TCP_2_Up.flent.gz

My Flent test that avoids gaps in upload data:
tcp_1up_from_nup_2.txt
tcp_nup_2.txt

The Mininet network used:
1Router_2Networks_3Hosts.txt

tohojo (Owner) commented Apr 25, 2022 via email
