Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoids SET NAMES commands if not needed #1494

Merged
merged 1 commit into from
Jul 17, 2024
Merged

Conversation

rusher
Copy link
Contributor

@rusher rusher commented Jul 12, 2024

For MariaDB since 15.1 (https://jira.mariadb.org/browse/MDEV-31609) session variables tracked (session_track_system_variables) are sent in initial OK_Packet. This PR permits to avoids initial query SET NAMES command, since charset is really know, defaulting to server corresponding default collation.

using MySqlConnector.Protocol.Serialization;
using MySqlConnector.Utilities;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This using is required to bring in backwards-compatibility extension methods (for older frameworks).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@bgrainger
Copy link
Member

The SET NAMES is only there to work around a MySQL Server bug: #1236. It's possible that if we detect MariaDB we could just assume that its implementation of COM_RESET_CONNECTION actually works, and skip the SET NAMES without the unnecessary effort of parsing system variables just to confirm that it's actually doing the right thing?

@rusher
Copy link
Contributor Author

rusher commented Jul 15, 2024

i think, this is a good default command. Yes, MariaDB doesn't have the com reset issue, but ensuring using utf8 exchanges in all connector exchanges is a good thing (avoiding errors and for security)

for MariaDB since 15.1 (https://jira.mariadb.org/browse/MDEV-31609) followed session variables are sent in initial OK_Packet. This permits to avoids  initial query SET NAMES command, since charset is really know, defaulting to server corresponding default collation.

Signed-off-by: rusher <[email protected]>
@bgrainger
Copy link
Member

I misunderstood the point of this change; I now see that it handles system variables in the initial handshake, and thus avoids a SET NAMES query if the server already has the right client charset. 👍

I'll leave a little more PR feedback later, but hope to get this merged in soon.

public bool SupportsCachedPreparedMetadata { get; }
public string? ClientCharset { get; set; }

public string? Database { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the appeal in moving the connection-context-specific data into this new class, but I think it makes it harder to understand to have state spread across ServerSession and now Context. I'd rather this class be an immutable collection of data that's passed into methods that need to query the current capabilities.

I also think it's a little "unpredictable" to have these properties updated without any notification that it's happening. (The old way wasn't great, either, in that code had to read the OkPayload.NewSchema and act on it, and if it didn't, that was a bug.) I also don't like the OkPayload parsing code having the side-effect of updating this mutable state, instead of just being a property-bag/DTO of all the data that was read out of the OK packet.

Made it immutable here: 77ad191

@@ -47,7 +47,7 @@ internal static partial class Log
public static partial void AutoDetectedAurora57(ILogger logger, string sessionId, string hostName);

[LoggerMessage(EventIds.SessionMadeConnection, LogLevel.Debug, "Session {SessionId} made connection; server version {ServerVersion}; connection ID {ConnectionId}; supports: compression {SupportsCompression}, attributes {SupportsAttributes}, deprecate EOF {SupportsDeprecateEof}, cached metadata {SupportsCachedMetadata}, SSL {SupportsSsl}, session track {SupportsSessionTrack}, pipelining {SupportsPipelining}, query attributes {SupportsQueryAttributes}")]
public static partial void SessionMadeConnection(ILogger logger, string sessionId, string serverVersion, int connectionId, bool supportsCompression, bool supportsAttributes, bool supportsDeprecateEof, bool supportsCachedMetadata, bool supportsSsl, bool supportsSessionTrack, bool supportsPipelining, bool supportsQueryAttributes);
public static partial void SessionMadeConnection(ILogger logger, string sessionId, string serverVersion, long connectionId, bool supportsCompression, bool supportsAttributes, bool supportsDeprecateEof, bool supportsCachedMetadata, bool supportsSsl, bool supportsSessionTrack, bool supportsPipelining, bool supportsQueryAttributes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a plan to make this variable a long throughout the rest of the code? Does MariaDB use 64-bit connection IDs?

https://github.com/mysql-net/MySqlConnector/pull/1494/files#diff-abcf6fc970e193538504a795eb54ea931b52864f45ef4fa0e8040ee1273dac9aR96 still converts to an Int32.

Reverted to int here: b19f1ee.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my fault, i began changing that and revert it, but not completly.
technically, mariadb use internally long, but using the integer truncated value on initial handshake works ok (see https://jira.mariadb.org/browse/MDEV-15089). To be honest, i've only heard about wikipedia having an issue with that, creating more than 2147483647 connections without restart is an achievement !

@@ -38,7 +38,7 @@ public async Task ReadResultSetHeaderAsync(IOBehavior ioBehavior)
var firstByte = payload.HeaderByte;
if (firstByte == OkPayload.Signature)
{
var ok = OkPayload.Create(payload.Span, Session.SupportsDeprecateEof, Session.SupportsSessionTrack);
var ok = OkPayload.Create(payload.Span, Session.Context);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like bundling these together, but it could be done using a new interface that ServerSession implements. That would avoid allocating a second object every time a ServerSession is created. Changed here: 8466783.

@@ -454,30 +452,26 @@ public async Task DisposeAsync(IOBehavior ioBehavior, CancellationToken cancella
}

ServerVersion = new(initialHandshake.ServerVersion);
ConnectionId = initialHandshake.ConnectionId;
Context = new Context(initialHandshake.ProtocolCapabilities, cs.Database, initialHandshake.ConnectionId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By implementing an interface on ServerSession we also avoid this second Context allocation every time a session is connected (so reduced from three back down to one object).

var systemVariableOffset = reader.Offset + dataLength;
do
{
var variableSv = Encoding.ASCII.GetString(reader.ReadLengthEncodedByteString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is potentially a lot of string allocations for server variables we don't even read. We can perform a string comparison on the UTF-8 data by using ReadOnlySpan<byte> and ""u8 string literals: 3326a0e.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

: Encoding.ASCII.GetString(reader.ReadByteString(lenSv));
switch (variableSv)
{
case "character_set_client":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just one of the three possible connection-level character set variables, including character_set_connection and character_set_results. It seems prudent to verify that all three have the expected value (since the server will send them to us). Updated here: 3f50c14.

context.ClientCharset = valueSv;
break;
case "connection_id":
context.ConnectionId = Convert.ToInt32(valueSv, CultureInfo.InvariantCulture);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also use Utf8Parser to convert UTF-8 bytes directly to an int without allocating a temporary string.

@bgrainger bgrainger merged commit ef40088 into mysql-net:master Jul 17, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants