-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MDEV-31585] Stop trusting or relying on client identifying information sent prior to the TLS handshake #2684
base: 10.4
Are you sure you want to change the base?
Conversation
|
if (pkt_len < 32) | ||
return packet_error; | ||
client_capabilities|= ((ulong) uint2korr(net->read_pos+2)) << 16; | ||
if (!(client_capabilities & CLIENT_MYSQL)) | ||
{ | ||
// it is client with mariadb extensions | ||
ulonglong ext_client_capabilities= | ||
(((ulonglong)uint4korr(net->read_pos + 28)) << 32); | ||
client_capabilities|= ext_client_capabilities; | ||
} | ||
} | ||
|
||
/* Disable those bits which are not supported by the client. */ | ||
compile_time_assert(sizeof(thd->client_capabilities) >= 8); | ||
thd->client_capabilities&= client_capabilities; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff makes this look more complicated than it really is.
All that is really changing is that these few lines move from before the TLS handshake to after the TLS handshake.
These line tells the server to read capability bits sent by the client and to trust them: the server alters its view of how the connection should work (thd->client_capabilities&=
) based on them.
- If the server reads these bits from the pre-TLS handshake plaintext version of the login request packet (https://mariadb.com/kb/en/connection/#sslrequest-packet), it forces the client to send this juicy identifying information in the clear.
- Making the server instead read these bits from the post-TLS handshake version of the login request packet (https://mariadb.com/kb/en/connection/#handshake-response-packet), it allows the client to only send its complete capability bits (and other identifying information) over the TLS-secured connection.
- … and the corresponding Connector/C change allows the client to do just that: [CONC-654] Stop leaking client identifying information to the server before the TLS handshake mariadb-corporation/mariadb-connector-c#227
7550af8
to
84aa88a
Compare
I've rebased this on top of 10.2, since it's a bugfix. This required an additional cherry-pick commit, e54c24c, to add some support code ( Merging this correctly into all versions ranging from 10.2-11.2 is non-trivial. The version of this branch targeted at 11.2 is available at: https://github.com/dlenski/mariadb-server/tree/dlenski-github/MDEV-31585_targeting_11.2 |
Looks rather risky to me. The server was ignoring capability flags from the after-ssl client packet for >20 years. It is totally possible that some third-party clients don't even send legitimate flags in that packet. The part with |
@vuvova wrote:
Indeed, I would consider that highly probable… It's essentially a consequence of Hyrum's Law: "with enough users, all the observable behaviors of a system are relied on by at least one user". And indeed, it is an observable behavior of MariaDB servers that they do not pay attention to to the client capability flags sent after TLS handshake. That is precisely what this PR is addressing. But so what 🤷♂️? To the extent that the client/server protocol is documented, the protocol documentation clearly states that the post-handshake packet SHOULD INCLUDE the capability flags, so any client that doesn't do this is not conforming to the documented protocol. Unless you think that the server must continue indefinitely to support buggy and insecure behaviors by default in the name of backwards compatibility, with an unknown universe of client implementations? That basically means these vulnerabilities will never get fixed. A reasonable approach, followed by much other software, would be to aggressively fix the TLS vulnerabilities in both the client and server in a backwards-incompatible way… but to allow the client and/or server to "opt in" to the old/buggy/insecure behavior ( |
I've already implemented this. See:
|
That's pretty much what I'd suggest. But don't forget, after your mariadb-corporation/mariadb-connector-c#227 this new mode will be enabled by default, so only old third-party clients will still have the issue and it'll be their vulnerability, not MariaDB's — we'll update all our connectors by the next release. |
@vuvova, I've added an additional commit 056d083 which will make the server:
This won't prevent an active MITM attacker from stripping the bit to gather information on the client, but it will cause a connection MITM'ed in this way to promptly fail, which should discourage attackers from using it. Lines 12793 to 12811 in 056d083
|
Could you document any changes to the MariaDB protocol on the MariaDB KB when the pull request gets accepted? This makes it easier to refer to the protocol without having to dig through the server source code. |
@markus456 wrote:
Yes, I'm definitely planning on it. There's a structural problem with the MariaDB KB, though, which is that it's just a big unversioned wiki. There's no consistent way to indicate substantial differences between individual versions. Compare this with, say, the MySQL docs which at least have separate 5.7 and 8.0 docs: |
6d4da44
to
6e43035
Compare
The CI won't run on this because it is against 10.2, which is way past EOL. If you make this against 10.4, it should run. |
This additional logging is intended to demonstrate exactly when/where/how the switch from a plaintext TCP/IP socket to a TLS/SSL-wrapped socket occurs. The short summary is that the TLS-ification of the connection is completely entangled with the authentication process: - The mechanism for ensuring confidentiality and authenticity of the client/server communications at the TRANSPORT-layer is TLS (which literally stands for Transport Layer Security) - The APPLICATION-layer authentication mechanisms involve traffic exchanged above the transport layer, and support multiple plugins, such as the default "native" authentication plugin using usernames and passwords. - The transport-layer security mechanism is logically distinct from the application-layer authentication mechanism, but in the MariaDB server codebase these are thoroughly entangled and interdependent. This is a network-mislayering design problem with significant consequences for code complexity and flexibility. It leads to several potential and actual security vulnerabilities whereby information is improperly transmitted and accepted in plaintext prior to the TLS handshake. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
@LinuxJedi wrote:
Done, rebased on Are you actually planning to merge this and mariadb-corporation/mariadb-connector-c#227? The Connector/C PR has received absolutely no feedback from its maintainer. |
This one I would like to see included in some form, it won't be my final decision to make. But I don't see the intention being difficult to argue. That being said, the submodule update is breaking tests and I suspect this would either need to go or be done differently. The Connector/C one I have zero influence over. This is a decision for @9E0R9 to make. I agree that "something" needs to be done. I have not looked at the current state of the protocol on the MySQL side to see whether this is the correct way of resolving the problem. The spare bit appears to be contentious. In an ideal world that would be a separate port that does a TLS handshake first, but I can not see a way of doing that in a MySQL compatible way. At which point it might as well be part of a new protocol implementation. |
@LinuxJedi wrote:
What does "be done differently" mean? There is absolutely nothing that I as an internal contributor can do to remedy this situation.
The substantial flaws in the protocol appears to have been ignored for many years, which is why we're in this situation. The server and client implementation are tightly entangled, in terms of how they implement the {MySQL,MariaDB}-over-TLS protocol. How can either side possibly make a decision about how to proceed him if the developers aren't actively communicating around such issues?
I don't know what this means. Are you referring to the discussion (starting with mariadb-corporation/mariadb-connector-c#227 (comment)) about exactly where we should place the "SSL_V2" capability bit? Or something else?
Indeed, that might be ideal, but it is also not necessary. There is a ton of low-hanging fruit in terms of poor design of the {MySQL,MariaDB}-over-TLS protocol, and poor implementation. All of this can be simplified, improved, and made more secure without switching to an entirely separate TCP port. |
Sorry, I should have expanded on that. I meant that it appears the submodule is pushing Connector/C quite far ahead of its current position, to a different major version. Which is not something we normally do in an older release. Usually, such changes would be in the major version of Connector/C paired with this MariaDB version.
Is it not possible to make the server rely on post-TLS handshake without a Connector/C change?
It breaks in buildbot because it is switching major versions which have different error messages.
I suspect it would still break the regression suite.
I know this situation is frustrating, and things in the background are actively changing to improve this situation. The Unconference in particular will help with trying to put better processes in place.
Yes, this.
I agree, but we also have to be careful not to break compatibility with MySQL applications and connectors. This is why we are particularly cautious with protocol changes. |
…on sent prior to the TLS handshake The server has heretofore improperly mishandled—and TRUSTED—information sent in the plaintext login request packet sent prior to the TLS handshake. As a result of this, the client is *forced* to send excessive and exploitable identifying information in the pre-TLS-handshake plaintext login packet. That client-side vulnerability is CONC-654. This modifies the server to stop relying on any of the information in the pre-TLS-handshake plaintext login packet EXCEPT for the single bit that tells it that a TLS handshake will follow. It furthermore adds a capability bit to the server greeting packet, which informs the client that it is safe to send a bare-bones dummy packet containing ONLY the instruction that a TLS handshake will follow: /* This capability is set if: * * - The CLIENT knows how to send a truncated 2-byte SSLRequest * packet, containing no information other than the CLIENT_SSL flag * which is necessary to trigger the TLS handshake, and to send its * complete capability flags and other identifying information after * the TLS handshake. * - The SERVER knows how to receive this truncated 2-byte SSLRequest * packet, and to receive the client's complete capability bits * after the TLS handshake. * */ #define CLIENT_CAN_SSL_V2 (1ULL << 37) Because the client cannot safely send the SSL_V2 SSLRequest packet unless the server has advertised support for it in its (plaintext) Server Greeting packet, an active MITM could strip the CLIENT_CAN_SSL_V2 bit from that Server Greeting packet. This downgrade attack will force the client to continue exhibiting the CONC-654 vulnerability. The server is also modified to detect this case and abort the connection; this won't fix the one-time client information leakage of the CONC-654 vulnerability, but it is intended to discourage the MITM attack by making it highly visible. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
…iners Often there are small "nitpicky" changes that need to be done to a PR in order to get it merged. To speed up the merge process, we will ask contributors if they are ok with the reviewer making those changes. Other fixups: Improve PR template rendering in browser's textarea field. Line wrapping at 80 characters provides a bad user experience within the browser, which handles word wrapping separately. Thus, prefer long lines in this markdown file. Remove the "backwards compatibility section". While the contributor should ideally care about the impact of their patch on the server's backwards compatibility, this is an advanced topic that is better presented in a separate document.
Indeed, it is not possible to do so in a way that would be guaranteed to be backwards-compatible with existing clients.
Personally, I think the value of such bug-for-bug backwards-compatibility is very low, but if the project insists on maintaining it, then there's very little wiggle room here. 🤷♂️
What is the regression suite? How could I run or check this? |
I'll also comment the same thing here that I commented on the MariaDB Zulip. Instead of reserving a bit in the MariaDB extended capabilities, you could reserve the first byte of the filler space for this capability. This way the patch could be applied to all MariaDB and MySQL compatible databases without forcing them to adapt the MariaDB extended capability bits. This would allow you to offer the patch to all implementations which would make it easier to adopt into use. |
So far I have proposed…
@markus456 wrote:
That would be fine with me but… do you think that Oracle/MySQL is more likely to adopt a protocol change that uses a bit in the "unused" area than they are to agree with using a bit in the area that MariaDB currently uses for its extended capabilities? From the perspective of Oracle/MySQL servers and clients, the "unused" bits and the "MariaDB extended capabilities" are all unused bits 🥲 |
Based on the discussion on Zulip, @markus456 thinks this will have a better chance of adoption by MySQL if we pick a new part of the unused bits and declare that to be something like the "cross-implementation extended capabilities". I'm all for it. 👍 I assume it makes sense to start using the bits at the beginning, rather than the end, of the space that's currently unused by any implementation. @LinuxJedi @vuvova, are you on board with proceeding in this direction? |
I am a little cautious that it is possible for MySQL to use those bits for anything in the future and could in-theory break other connectors from talking to us. I'm wondering if this is something we could somehow work together with the MySQL community on, to ensure that whatever change is made will be supported both sides in the future. My personal ideal solution (which @markus456 mentioned) is to have a MARIADBS port that does an immediate initial TLS handshake. Then a vast majority of problems immediately go away. This will require adoption over time by other connectors. But with some upstream community work for things such as PHP, as long as the rest of the protocol doesn't deviate too much, I think this is doable. |
Alternatively, if enough of the community group together, we can build a protocol definition working group (suggested by @grooverdan) |
From what I've seen, one of the major arguments against the changes in pull request has been that it doesn't improve the security enough to be worth the effort of getting this change accepted in all connectors and server implementations. As for whether the changes are "big enough" is not something that can be logically proven and the discussion around it is, in my opinion, going in circles. I think that nobody can deny the fact that a separate TLS port is more secure and would immediately improve the security situation. It also improves the performance of the connection creation by removing the redundant SSLRequest packet. Performance is always something that people care about even if security is ranked above it. In the TLS port case, you get the best of both worlds: better security with less data sent over the wire. Something which has not been discussed is the need for documenting how the protocol is used and what sorts of combinations of packets might be sent. This pull request introduces a new packet type that requires its own documentation and it changes the "state machine" required to implement the MariaDB/MySQL protocol. As someone who works on a project that will eventually have to implement both the client and the server side implementations of this, simplicity of the protocol extensions is a factor that affects us the most. The more complex a packet is to parse, the more open it is for errors and misinterpretations of the protocol. The TLS-first approach, on the other hand, simplifies the protocol by eventually removing the need for the SSLRequest packet. The TLS-first approach also simplifies the server-side implementation compared to this pull request by removing the need to check if the capabilities have been downgraded. The big problem with the TLS port is the need to add a new option flag for all client programs and any connector that implements it. The practical effort of having to then configure two ports for nearly all installations is something that cannot be ignored. This is a heroic amount of effort from the community but it allows all MariaDB/MySQL compatible client and server programs to implement it rather trivially. The only other approach is a new network protocol but I believe that this is an idea that's not going to be picked up simply because it requires several competing server implementations to implement a new protocol in unison and a large change on the client side for all connectors. The current protocol also offers no backwards compatible way of downgrading to an earlier protocol which requires that the client side sends some form of a "I want the new protocol" packet to the server's handshake response that further slows down and complicates the implementation. This is kind of what this pull request implements by adding a new packet to the network protocol. @dlenski also pointed out after I asked on Zulip about the importance of the server capabilities and the version string being "leaked" in the handshake packet: neither the current suggested protocol change nor the TLS port approach solve this problem as the server still sends the exact version in the first packet. The TLS port would make it encrypted so it's not broadcast over the network but it would still be trivial to find out if one is actively scanning for ports. I think that @dlenski 's suggestion to "freeze" the version to something like Like @LinuxJedi mentioned (and @grooverdan suggested), perhaps it would be smart to first reach out to the other involved parties (Oracle, community connectors, hosting providers etc.) and see if a modified version of this patch (the "shared capability bits" thing) is acceptable for them. In addition, asking how willing they'd be to adapt to a separate TLS port would be a good idea. |
To clarify the comment
A separate port, while, indeed, "nobody can deny the fact that a separate TLS port is more secure", would mean
So, I suspect that this approach, while, indeed, undeniably more secure, has a rather slim chance of being adopted by other connectors and clients. Not unless it'll provide distinct benefits clearly outweighing the efforts of implementing (and explaining/documenting) it. Reaching out first, as @markus456 suggested, could be a sensible way forward. |
To give an humble opinion on this an extra port that only response to TLS from the first handshake looks better for any layer 4 proxies like nginx and proxysql that can route TCP traffic base on the Certificate SNI extension and this could really help adoption in the cloud and with security pentests |
While separate SSL port would require a couple of server options, there is no substantial effort for the client. Client only has to specify if he connects to SSL port, that's one bit of information. Many popular connectors have an idea of "connection string,URL, or DNS. and adding useSSLPort=true to that is not a huge deal. Doing SSL handshake before first packet, based on that bit, is not that huge deal either. |
Description
The server has heretofore improperly mishandled—and TRUSTED—information sent
in the plaintext login request packet sent prior to the TLS handshake.
As a result of this, the client is forced to send excessive and
exploitable identifying information in the pre-TLS-handshake plaintext login
packet. That client-side vulnerability is CONC-654.
This modifies the server to stop relying on any of the information in the
pre-TLS-handshake plaintext login packet EXCEPT for the single bit that
tells it that a TLS handshake will follow. It furthermore adds a
capability bit to the server greeting packet, which informs the client that
it is safe to send a bare-bones dummy packet containing ONLY the instruction
that a TLS handshake will follow:
Because the client cannot safely send the SSL_V2 SSLRequest packet unless
the server has advertised support for it in its (plaintext) Server Greeting
packet, an active MITM could strip the
CLIENT_CAN_SSL_V2
bit from thatServer Greeting packet. This downgrade attack will force the client to
continue exhibiting the CONC-654 vulnerability. The server is also modified
to detect this case and abort the connection; this won't fix the one-time
client information leakage of the CONC-654 vulnerability, but it is intended
to discourage the MITM attack by making it highly visible.
How can this PR be tested?
CLIENT_SSL
bit, but the connection continues to work correctly.Basing the PR against the correct MariaDB version
This is a new feature and the PR is based against the latest MariaDB development branch.
This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.
This is a bugfix and it is based on
10.210.4, but in fact the bug goes back much further in MariaDB's history; as far as I can tell, there has never been a version of MariaDB that supports TLS and does not have this bug.PR quality check