Configure TCP socket to reduce latency #1683

Merged

Conversation

faisal-shah
Contributor

TCP_NODELAY disables Nagle's algorithm. This reduces latency but can worsen overall throughput. For the purpose of bridging a CAN bus over a network connection to socketcand (and given the relatively low overall bandwidth of CAN), optimizing for latency is more important.

TCP_QUICKACK disables the default delayed-ACK timer, which is roughly 40 ms on Linux (the Windows value is unclear). The catch is that TCP_QUICKACK is reset whenever you send or receive on the socket, so it needs re-enabling after each operation. Also, TCP_QUICKACK does not appear to be available on Windows.
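A minimal sketch of what this looks like with Python's socket module (the tune_socket helper name is mine, not from the PR; the hasattr guard covers platforms without TCP_QUICKACK, such as Windows and macOS):

```python
import socket

def tune_socket(sock: socket.socket) -> None:
    """Disable Nagle's algorithm and, where supported, delayed ACKs."""
    # TCP_NODELAY: send small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # TCP_QUICKACK is Linux-only, and the kernel clears it again after
    # socket activity, so this helper must be called after every send/recv.
    if hasattr(socket, "TCP_QUICKACK"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
```

Because of the re-arming behaviour, the call belongs in the send/receive paths, not just at connection setup.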

Here's a comment by John Nagle himself that some may find useful: https://news.ycombinator.com/item?id=10608356

"That still irks me. The real problem is not tinygram prevention. It's ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful. Unfortunately by the time I found about delayed ACKs, I had changed jobs, was out of networking, and doing a product for Autodesk on non-networked PCs. Delayed ACKs are a win only in certain circumstances - mostly character echo for Telnet. (When Berkeley installed delayed ACKs, they were doing a lot of Telnet from terminal concentrators in student terminal rooms to host VAX machines doing the work. For that particular situation, it made sense.) The delayed ACK timer is scaled to expected human response time. A delayed ACK is a bet that the other end will reply to what you just sent almost immediately. Except for some RPC protocols, this is unlikely. So the ACK delay mechanism loses the bet, over and over, delaying the ACK, waiting for a packet on which the ACK can be piggybacked, not getting it, and then sending the ACK, delayed. There's nothing in TCP to automatically turn this off. However, Linux (and I think Windows) now have a TCP_QUICKACK socket option. Turn that on unless you have a very unusual application.

"Turning on TCP_NODELAY has similar effects, but can make throughput worse for small writes. If you write a loop which sends just a few bytes (worst case, one byte) to a socket with "write()", and the Nagle algorithm is disabled with TCP_NODELAY, each write becomes one IP packet. This increases traffic by a factor of 40, with IP and TCP headers for each payload. Tinygram prevention won't let you send a second packet if you have one in flight, unless you have enough data to fill the maximum sized packet. It accumulates bytes for one round trip time, then sends everything in the queue. That's almost always what you want. If you have TCP_NODELAY set, you need to be much more aware of buffering and flushing issues.

"None of this matters for bulk one-way transfers, which is most HTTP today. (I've never looked at the impact of this on the SSL handshake, where it might matter.)

"Short version: set TCP_QUICKACK. If you find a case where that makes things worse, let me know.

John Nagle"

@faisal-shah faisal-shah force-pushed the feat/socketcand-improve-tcp-latency branch from 5dcb016 to b73fc87 Compare October 19, 2023 21:43
@zariiii9003
Collaborator

Can you provide any performance comparison? What's the benefit? Are there any side effects?

@faisal-shah
Contributor Author

faisal-shah commented Oct 23, 2023

Can you provide any performance comparison? What's the benefit? Are there any side effects?

TCP_NODELAY is pretty much a no-brainer. Whenever you require low latency on every packet, this option must be enabled. You can see socketcand itself enabling this on the host socket:
https://github.com/linux-can/socketcand/blob/331a6d0fc6fdfc87d02ddf3f0dc4e7b4ca02f4af/socketcand.c#L451
(I have an open PR over there to add TCP_QUICKACK as well: linux-can/socketcand#28)

With regards to TCP_QUICKACK, I wrote a Windows COM port to Linux tty TCP bridge (among other things in https://github.com/faisal-shah/python-ttybus). I noticed in Wireshark that responses were being delayed by up to 40ms. Upon further digging I came across this:
https://ntrs.nasa.gov/api/citations/20200002393/downloads/20200002393.pdf
... which led me to other resources and the comments from John Nagle. Once TCP_QUICKACK was enabled, the latency caused by TCP delayed ACKs was eliminated and the connection was solid.

The side effects of these changes are exactly the reasons TCP delays sends and ACKs in the first place: to increase overall network throughput (by reducing packet overhead) at the cost of latency. For a socketcand client and server, it makes sense to make that tradeoff in the direction of lower latency.

@zariiii9003
Collaborator

ping @hartkopp, any objections?

@hartkopp
Collaborator

Thanks @faisal-shah for providing the additional documentation - especially the NASA summary.
By the way, both improvements, TCP_NODELAY and (on Linux) TCP_QUICKACK, change the TCP behaviour in a way that increases the number of TCP packets and removes the ACK delay on Linux systems.
The socketcand is a 10+ year old hack, and so far no one has complained about latency issues.

I like this improvement, but it should be an opt-in option for 'power users' who explicitly focus on latency and know what they are doing (since the system behaviour changes).

So I would suggest adding this feature, but making it an option,
both for socketcand.py and for the socketcand in can-utils (linux-can/socketcand#28).

@faisal-shah
Contributor Author

Alrighty, any strong opinions on how to handle this option and what to call it?

ISO-TP is probably one of the few popular segmented data transfer protocols over CAN - and its timeouts are typically 1000ms if I remember correctly. I'm using a proprietary CAN block transfer protocol which has timeouts on the order of 10ms, and with my other serial bridging project more like 5ms. So I ran into the limitations of delayed ack pretty quickly.

@@ -173,9 +180,15 @@ def _recv_internal(self, timeout):
def _tcp_send(self, msg: str):
log.debug(f"Sending TCP Message: '{msg}'")
self.__socket.sendall(msg.encode("ascii"))
if os.name != "nt":
Collaborator

You could move the OS check into __init__ and warn the user if the parameter was set on Windows. Set self.__tcp_tune = False on Windows and ignore the user input.

@@ -75,10 +76,13 @@ def connect_to_server(s, host, port):


class SocketCanDaemonBus(can.BusABC):
def __init__(self, channel, host, port, can_filters=None, **kwargs):
def __init__(self, channel, host, port, tcp_tune=False, can_filters=None, **kwargs):
Collaborator

It would be nice if you could add a proper docstring like

"""Creates a new socketcan bus.

The socketcand interface API is not really well documented in doc/interfaces/socketcand.rst

Contributor Author

Alrighty, added docstrings to public methods of the Bus object.

Collaborator

Awesome, thanks! You can see the docs for your PR here, but your changes do not show up yet.

You need to update socketcand.rst.

Add something like

.. autoclass:: can.interfaces.socketcand.SocketCanDaemonBus
   :show-inheritance:
   :member-order: bysource
   :members:

You can build the docs locally with python -m sphinx -Wan --keep-going doc build.

@faisal-shah faisal-shah force-pushed the feat/socketcand-improve-tcp-latency branch from 1c8941d to 306051c Compare October 27, 2023 19:55
@faisal-shah
Contributor Author

This test is failing:
test (macos-latest, false, pypy-3.8): FAILED test/test_util.py::TestCheckAdjustTimingClock::test_adjust_timing_fd - AssertionError: unclosed file

Doesn't look relevant to my changes. Any idea what's going on? Was this platform-test recently added/changed?

@faisal-shah
Contributor Author

Ok, all tests passing. This looks ready to go.

While I have your attention here, would you have appetite for another PR which adds functionality to listen for socketcand UDP broadcasts and automagically connect to it (say, if host/port were not specified when creating the Bus object)?

See here for implementation:
https://github.com/faisal-shah/python-can-gateway/blob/ad161e020f6349c1cb2a9e4bbb2e45b268930baa/socketcand_utils.py#L14
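For context, socketcand announces itself with a UDP broadcast beacon containing a URL of the form can://host:port. The parse_beacon helper below is an illustrative assumption, not code from the linked implementation; see that link for the real details:

```python
import re
from typing import Optional, Tuple

def parse_beacon(beacon: str) -> Optional[Tuple[str, int]]:
    """Extract (host, port) from a socketcand discovery beacon.

    Assumes the beacon contains a <URL>can://host:port</URL> element;
    verify against the broadcasts your socketcand version actually sends.
    """
    match = re.search(r"<URL>\s*can://([^:<\s]+):(\d+)\s*</URL>", beacon)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

A discovery routine would bind a UDP socket to the beacon port, receive a datagram, and feed its payload through a parser like this before opening the TCP connection.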

@zariiii9003
Collaborator

Ok, all tests passing. This looks ready to go.

One last thing: please add an entry to the changelog. We'll include this in 4.3.0

While I have your attention here, would you have appetite for another PR which adds functionality to listen for socketcand UDP broadcasts and automagically connect to it (say, if host/port were not specified when creating the Bus object)?

See here for implementation: https://github.com/faisal-shah/python-can-gateway/blob/ad161e020f6349c1cb2a9e4bbb2e45b268930baa/socketcand_utils.py#L14

As long as new features are documented and tested, PRs are always welcome 😄

@zariiii9003 zariiii9003 merged commit 5c1c46f into hardbyte:develop Oct 30, 2023
30 checks passed
@faisal-shah faisal-shah deleted the feat/socketcand-improve-tcp-latency branch December 1, 2023 17:42