Configure TCP socket to reduce latency #1683
Conversation
TCP_NODELAY disables Nagle's algorithm. This improves (reduces) latency but worsens overall throughput. For the purpose of bridging a CAN bus over a network connection to socketcand (and given the relatively low overall bandwidth of CAN), optimizing for latency is more important.

TCP_QUICKACK disables the default delayed-ACK timer, which is ~40ms on Linux (not sure about Windows). The catch is that TCP_QUICKACK is reset whenever you send or receive on the socket, so it needs re-enabling each time. Also, TCP_QUICKACK doesn't seem to be available on Windows.

Here's a comment by John Nagle himself that some may find useful: https://news.ycombinator.com/item?id=10608356

"That still irks me. The real problem is not tinygram prevention. It's ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful. Unfortunately by the time I found about delayed ACKs, I had changed jobs, was out of networking, and doing a product for Autodesk on non-networked PCs. Delayed ACKs are a win only in certain circumstances - mostly character echo for Telnet. (When Berkeley installed delayed ACKs, they were doing a lot of Telnet from terminal concentrators in student terminal rooms to host VAX machines doing the work. For that particular situation, it made sense.) The delayed ACK timer is scaled to expected human response time. A delayed ACK is a bet that the other end will reply to what you just sent almost immediately. Except for some RPC protocols, this is unlikely. So the ACK delay mechanism loses the bet, over and over, delaying the ACK, waiting for a packet on which the ACK can be piggybacked, not getting it, and then sending the ACK, delayed. There's nothing in TCP to automatically turn this off. However, Linux (and I think Windows) now have a TCP_QUICKACK socket option. Turn that on unless you have a very unusual application.

"Turning on TCP_NODELAY has similar effects, but can make throughput worse for small writes. If you write a loop which sends just a few bytes (worst case, one byte) to a socket with "write()", and the Nagle algorithm is disabled with TCP_NODELAY, each write becomes one IP packet. This increases traffic by a factor of 40, with IP and TCP headers for each payload. Tinygram prevention won't let you send a second packet if you have one in flight, unless you have enough data to fill the maximum sized packet. It accumulates bytes for one round trip time, then sends everything in the queue. That's almost always what you want. If you have TCP_NODELAY set, you need to be much more aware of buffering and flushing issues.

"None of this matters for bulk one-way transfers, which is most HTTP today. (I've never looked at the impact of this on the SSL handshake, where it might matter.)

"Short version: set TCP_QUICKACK. If you find a case where that makes things worse, let me know.

John Nagle"
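For illustration, here's a minimal sketch of applying both options to a plain Python socket (not the PR's actual code; the helper name is made up):

    import socket
    import sys

    def tune_for_latency(sock: socket.socket) -> None:
        # Disable Nagle's algorithm so small writes go out immediately
        # instead of being coalesced while an un-ACKed packet is in flight.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        # Disable the delayed-ACK timer. Linux-only, and the kernel clears
        # it again after socket activity, so it must be re-applied after
        # every send/receive.
        if sys.platform == "linux":
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)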
(force-pushed from 5dcb016 to b73fc87)
Can you provide any performance comparison? What's the benefit? Are there any side effects?
TCP_NODELAY is pretty much a no-brainer: whenever you require low latency on every packet, this option must be enabled. You can see socketcand itself enabling this on the host socket:

Regarding TCP_QUICKACK, I wrote a Windows COM port to Linux tty TCP bridge (among other things, in https://github.com/faisal-shah/python-ttybus). I noticed in Wireshark that responses were being delayed by up to 40ms. Upon further digging I came across this:

The side effects of the above changes are exactly the reasons why TCP delays sends and ACKs in the first place: to increase overall network throughput (by reducing packet overhead) at the cost of latency. For the purposes of the socketcand client and server, it makes sense to make that tradeoff in the direction of lower latency.
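To make the re-enabling requirement concrete, a sketch of the pattern (helper name made up): since Linux clears TCP_QUICKACK after socket activity, it gets set again around every receive:

    import socket

    def recv_quickack(sock: socket.socket, bufsize: int = 4096) -> bytes:
        data = sock.recv(bufsize)
        # Re-arm immediate ACKs; the kernel resets TCP_QUICKACK after I/O.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
        return data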
ping @hartkopp, any objections?
Thanks @faisal-shah for providing the additional documentation, especially the NASA summary. I like this improvement, but it should be an opt-in option for 'power users' who explicitly focus on latency and know what they are doing (since the system behaviour changes). So I would suggest adding this feature, but making it an option.
Alrighty, any strong opinions on how to handle this option and what to call it? ISO-TP is probably one of the few popular segmented data-transfer protocols over CAN, and its timeouts are typically 1000ms if I remember correctly. I'm using a proprietary CAN block-transfer protocol which has timeouts on the order of 10ms, and my other serial bridging project uses more like 5ms, so I ran into the limitations of delayed ACKs pretty quickly.
@@ -173,9 +180,15 @@ def _recv_internal(self, timeout):
    def _tcp_send(self, msg: str):
        log.debug(f"Sending TCP Message: '{msg}'")
        self.__socket.sendall(msg.encode("ascii"))
        if os.name != "nt":
You could move the OS check into __init__ and warn the user if the parameter was set on Windows. Set self.__tcp_tune = False on Windows and ignore the user input.
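A sketch of that suggestion (parameter and attribute names taken from the diff; the warning text is mine):

    import os
    import warnings

    import can

    class SocketCanDaemonBus(can.BusABC):
        def __init__(self, channel, host, port, tcp_tune=False, can_filters=None, **kwargs):
            if tcp_tune and os.name == "nt":
                warnings.warn("tcp_tune is not available on Windows and will be ignored")
                tcp_tune = False
            self.__tcp_tune = tcp_tune
            # ... rest of the constructor unchanged ...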
@@ -75,10 +76,13 @@ def connect_to_server(s, host, port):

 class SocketCanDaemonBus(can.BusABC):
-    def __init__(self, channel, host, port, can_filters=None, **kwargs):
+    def __init__(self, channel, host, port, tcp_tune=False, can_filters=None, **kwargs):
It would be nice if you could add a proper docstring like
"""Creates a new socketcan bus.
The socketcand interface API is not really well documented in doc/interfaces/socketcand.rst
Alrighty, added docstrings to public methods of the Bus object.
Awesome, thanks! You can see the docs for your PR here, but your changes do not show up yet: you need to update socketcand.rst.
Add something like
.. autoclass:: can.interfaces.socketcand.SocketCanDaemonBus
:show-inheritance:
:member-order: bysource
:members:
You can build the docs locally with python -m sphinx -Wan --keep-going doc build.
(force-pushed from 1c8941d to 306051c)
This test is failing: it doesn't look relevant to my changes. Any idea what's going on? Was this platform test recently added/changed?
Ok, all tests are passing. This looks ready to go. While I have your attention here, would you have appetite for another PR which adds functionality to listen for socketcand UDP broadcasts and automagically connect (say, if host/port were not specified when creating the Bus object)? See here for the implementation:
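For reference, a rough sketch of what such discovery could look like (the beacon port 42000 and default TCP port 29536 are assumptions based on the socketcand project; beacon parsing is omitted):

    import socket

    def discover_socketcand(timeout: float = 5.0) -> tuple[str, int]:
        # socketcand periodically broadcasts an XML "CANBeacon" datagram.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            s.settimeout(timeout)
            s.bind(("", 42000))  # assumed beacon port
            _beacon, (host, _port) = s.recvfrom(2048)
            # A full implementation would parse the advertised TCP port out
            # of the beacon XML; 29536 is socketcand's default.
            return host, 29536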
One last thing: please add an entry to the changelog. We'll include this in 4.3.0
As long as new features are documented and tested, PRs are always welcome 😄