
Enable dynamic TCP receive window resizing #933

Open · wants to merge 10 commits into main

Conversation

@XOR-op (Contributor) commented May 23, 2024

Closes #927.

This PR introduces the ability to resize the TCP receive buffer after the connection is established, as well as helper getter functions. This feature helps implement "rwnd auto-tuning" (as done in the Linux kernel, for example), which can reduce the memory footprint when establishing a large number of connections without sacrificing the maximum achievable throughput.

Current requirements include:

  1. The new buffer must be larger than the length of the data remaining in the current buffer.
  2. The new buffer size must be a multiple of (1 << self.remote_win_shift), the window scale negotiated in the handshake (see the sketch below).
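
A self-contained sketch of these two requirements (the helper and its names are illustrative only, not the PR's API):

```rust
// Illustrative check of the two requirements above; `remote_win_shift` is the
// window-scale shift negotiated during the handshake. Names are hypothetical.
fn replacement_is_valid(new_capacity: usize, buffered_len: usize, remote_win_shift: u8) -> bool {
    let granularity = 1usize << remote_win_shift;
    new_capacity > buffered_len && new_capacity % granularity == 0
}

fn main() {
    // With a shift of 6 (granularity 64), a 128-byte buffer holding 100 bytes
    // of unread data is an acceptable replacement, but a 96-byte one is not.
    assert!(replacement_is_valid(128, 100, 6));
    assert!(!replacement_is_valid(96, 100, 6));
}
```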

src/socket/tcp.rs — outdated review thread (resolved)
@XOR-op changed the title from "Enable resizing receive buffer after TCP connection is established" to "Enable dymanic TCP receive window resizing" on Jul 12, 2024
@XOR-op (Contributor, Author) commented Jul 12, 2024

@whitequark / @thvdveld Would you mind giving some form of review? It has been quite some time, but I haven't received any feedback from maintainers yet.

@whitequark (Contributor) commented:
Ping me again on a workday on UK time and I'll see if I can do it. Ideally Mon/Tue.

@XOR-op (Contributor, Author) commented Jul 15, 2024

@whitequark Thanks. It's a workday today.

@whitequark changed the title from "Enable dymanic TCP receive window resizing" to "Enable dynamic TCP receive window resizing" on Jul 17, 2024
/// Error returned by set_*
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
#[cfg_attr(feature = "defmt", derive(defmt::Format))]
pub enum ArgumentError {
Contributor:

I don't think the creation of a new error type that is used only in this API follows our API design principles.

Contributor Author:

I don't want to introduce a new error type just for this API either. However, I didn't find an appropriate existing error type to add enum variants to. Do you have any suggestion for which of the existing {Send, Recv, Listen, Connect}Error types I should add variants to?

/// Return the local receive window scaling factor defined in [RFC 1323].
///
/// The value will become constant after the connection is established.
/// It may be reset to 0 during the handshake if the remote side does not support window scaling.
Contributor:

I don't like that this API has a bidirectional data flow, where a value set by the consumer is read by the TCP/IP stack but is then also updated by the TCP/IP stack. Is there any precedent (whether in smoltcp or in BSD TCP/IP) for an option like this? I would prefer there to be two API entry points: one to request an option, and one to see whether it was applied.

Contributor Author:

Google's gVisor userspace TCP/IP stack has similar options. I believe the issue is that smoltcp has its own default, and we want to override that default; that's why we have such a bidirectional data flow. Our TCP/IP stack reads the value (after connect/accept), and sets the value if no appropriate value has been configured (before connect/accept).

As for the "two API entry points", I don't quite get that. Could you give some examples?
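
As an editorial aside, one shape the "two API entry points" suggestion could take is sketched below as a minimal standalone mock; the type and method names are hypothetical and are not part of smoltcp or of this PR:

```rust
/// Minimal mock of the "request vs. applied" split: the consumer writes only
/// the request, and the stack alone writes the negotiated result.
struct WinScaleOption {
    requested: u8,          // written by the consumer before connect/listen
    negotiated: Option<u8>, // written by the stack once the handshake completes
}

impl WinScaleOption {
    fn request(&mut self, shift: u8) {
        self.requested = shift;
    }
    fn applied(&self) -> Option<u8> {
        self.negotiated
    }
}

fn main() {
    let mut opt = WinScaleOption { requested: 0, negotiated: None };
    opt.request(7);           // consumer asks for a window-scale shift of 7
    opt.negotiated = Some(7); // stands in for the stack recording the handshake outcome
    assert_eq!(opt.applied(), Some(7));
}
```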

/// otherwise, the old buffer is returned as an Ok value.
///
/// See also the [local_recv_win_scale](struct.Socket.html#method.local_recv_win_scale) methods.
pub fn replace_recv_buffer<T: Into<SocketBuffer<'a>>>(
Contributor:

This looks like a generic method that has no direct correspondence to TCP receive window resizing and as such I don't see why it should be a part of this PR at all.

Contributor Author:

This method is what enables TCP receive window resizing. Smoltcp uses rx_buffer.capacity() - rx_buffer.len() to compute the current receive window. Without this method, there is no way to adjust rx_buffer.capacity(). By replacing the buffer with a larger one, smoltcp knows it can safely advertise a larger receive window without correctness issues.
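
A minimal sketch of the relationship described above, with plain integers standing in for smoltcp's SocketBuffer (illustrative only):

```rust
// The advertised receive window is bounded by the free space in the receive
// buffer, so only a larger buffer can back a larger advertised window.
fn receive_window(capacity: usize, len: usize) -> usize {
    capacity - len
}

fn main() {
    // A 64 KiB buffer holding 16 KiB of unread data leaves 48 KiB to advertise.
    assert_eq!(receive_window(64 * 1024, 16 * 1024), 48 * 1024);
    // Swapping in a 256 KiB buffer with the same 16 KiB of data lets the stack
    // advertise 240 KiB instead.
    assert_eq!(receive_window(256 * 1024, 16 * 1024), 240 * 1024);
}
```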

@whitequark (Contributor) left a comment:

There are a number of design issues with this API, and I think it needs to be discussed further before being implemented. As it is, I don't think it is a good fit for smoltcp.

@XOR-op (Contributor, Author) commented Jul 17, 2024

I believe some explanation is needed for this PR: its original purpose is to enable using a larger (or smaller) receive window after a TCP connection is established, for better resource utilization. However, in the current implementation of smoltcp, the window scale is computed automatically as the minimum value that can fully utilize the initial receive buffer. Assuming the initial buffer is 4 MB, the auto-computed window scale is log2(4 MB / 64 KB) = 6, which means we cannot replace it with a buffer larger than 4 MB. Thus we need the ability to set the window scale manually.

TL;DR:

  1. We want to replace the receive buffer dynamically (controlled by users).
  2. To make sure we can use a larger buffer than the initial one, it must be possible to set the window scale manually (see the sketch below).
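
The window-scale arithmetic above, as a standalone sketch; it mirrors the log2(4 MB / 64 KB) reasoning in the comment rather than smoltcp's exact computation:

```rust
// Smallest shift such that a 64 KiB unscaled window, scaled by 1 << shift,
// covers the whole buffer; RFC 1323 caps the shift at 14.
fn min_window_shift(buffer_size: usize) -> u32 {
    let mut shift = 0;
    while ((64 * 1024usize) << shift) < buffer_size && shift < 14 {
        shift += 1;
    }
    shift
}

fn main() {
    // For a 4 MiB initial buffer: 4 MiB / 64 KiB = 64, so the shift is 6, and
    // the advertisable window tops out around 4 MiB; a larger replacement
    // buffer cannot be exploited without choosing a larger shift up front.
    assert_eq!(min_window_shift(4 * 1024 * 1024), 6);
}
```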

@whitequark (Contributor) commented:
I feel like these two goals shouldn't be addressed in the same PR. For example, given that a buffer is replaced (which I'm still not sure is the right way to handle this, but I'm not saying it's not) why not recompute the window size for the buffer size automatically?

@XOR-op (Contributor, Author) commented Jul 17, 2024

> I feel like these two goals shouldn't be addressed in the same PR. For example, given that a buffer is replaced (which I'm still not sure is the right way to handle this, but I'm not saying it's not) why not recompute the window size for the buffer size automatically?

If the new buffer is always smaller than the initial buffer, then the window is already recomputed automatically. But that would be less useful, since in most situations we want to increase the buffer size rather than reduce it. To be able to use a larger buffer after establishment, we need to allow specifying the window scale manually. (I didn't add code for automatic re-computation, since that is already implemented even without this PR.)

Also, to ensure the correctness of the replacement, we need access to the window scale anyway; that's why I see the two goals as related. And I'm afraid a smaller-only replacement wouldn't be accepted as a standalone PR, given its limited real-world usefulness. But I'll respect your opinion if you still think two PRs are needed.

@XOR-op (Contributor, Author) commented Aug 8, 2024

@whitequark Could you share your current opinion on our previous discussion (the API design and whether to split the PR)? If you insist, I can split this change into two separate PRs (the first adding the ability to replace with a smaller buffer, the second adding the ability to replace with a larger buffer and to change the window scale before establishing connections). Without your comments it's difficult for me to advance this PR. Thanks in advance for your time and effort.

@whitequark (Contributor) commented:
I'm overwhelmed at work currently, so I'll only be able to get to this in 1-2 weeks. Feel free to ping me.

@XOR-op (Contributor, Author) commented Aug 26, 2024

@whitequark It has been two weeks now. Are you less busy these days?

@whitequark (Contributor) commented:
I'm on vacation currently, so I might be able to look at this PR in depth soon.

codecov bot commented Sep 22, 2024

Codecov Report

Attention: Patch coverage is 86.20690% with 16 lines in your changes missing coverage. Please review.

Project coverage is 80.75%. Comparing base (336e7e5) to head (175a1ef).
Report is 1 commit behind head on main.

Files with missing lines: src/socket/tcp.rs — patch coverage 86.20%, 16 lines missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #933      +/-   ##
==========================================
+ Coverage   80.73%   80.75%   +0.02%     
==========================================
  Files          81       81              
  Lines       28335    28450     +115     
==========================================
+ Hits        22877    22976      +99     
- Misses       5458     5474      +16     

@XOR-op (Contributor, Author) commented Oct 7, 2024

@whitequark It has been another month since our last activity on this PR; I hope you are well. I think the two unresolved concerns are 1) whether we should add a new error type, and 2) how the default window scaling should be overridden in the API design. I would like to hear the maintainers' thoughts on the design challenges I faced, since they are important for deciding how to change my implementation.

@whitequark (Contributor) commented:
I'll do my best to address this PR soon.

Successfully merging this pull request may close these issues:

[Feature Request] Resize TCP socket buffer dynamically after TCP socket has been created