Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly close connection on EOF #1111

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

cabarnes
Copy link

I ran into #1029 . The workaround in that issue works for most clients but fails with PuTTY scp (pscp). Looking into it further, it seems that these clients send CHANNEL_EOF when exit, bye, or quit is executed. When the server does not close the connection, the session hangs. This PR addresses the issue by handling the eof signal and properly closing the connection. This change has been tested to work for sftp, pscp, WinSCP, and paramiko.

I'm not completely sure that this is the correct place, but hopefully if it is not, it is helpful in pointing to the correct place to fix this.

@mscdex
Copy link
Owner

mscdex commented Dec 28, 2021

I'm not sure that this is the correct behavior, especially if the local side still wants to send data. The EOF from the server just means it will send no more data.

If we send a channel close request to the server, it should send a close event back.... AFAIK that's what ssh2 should already be doing.

@cabarnes
Copy link
Author

@mscdex This is changing the server response to the client. The client sends CHANNEL_EOF and the server responds with closing the connection. Clients such as sftp send CHANNEL_EOF when the user runs exit, quit, or bye. Currently, the server implementation causes sftp to just hang. That's what was happening in the issue that I referenced. This change causes that client to exit successfully as well as many other clients that I tested. If the client has signaled it will no longer send data, I believe it would be correct for the server to close the connection. That also seems to be what all those clients are expecting.

@cabarnes
Copy link
Author

cabarnes commented Jan 4, 2022

There is definitely a bug in the server implementation. It is easy to reproduce as #1029 detailed. It's possible that this change is negatively affecting the client implementation as I have only been using this as a server. Perhaps there is a better place to handle this @mscdex?

This is the relevant part of the spec: https://datatracker.ietf.org/doc/html/rfc4254#section-5.3
I do not know why clients like sftp send SSH_MSG_CHANNEL_EOF instead of SSH_MSG_CHANNEL_CLOSE. But, the client sending SSH_MSG_CHANNEL_EOF specifically means that the party will no longer send more data to a channel. At that point, the server initiating the close is valid according to the spec and all the clients I have tested with seem to expect that behavior and exit successfully.

@cabarnes
Copy link
Author

cabarnes commented Jan 5, 2022

@mscdex The change has been updated to only operate this way when the channel is SFTP. That has been the only instance identified where this issue exists. The tests pass in this configuration

@cabarnes
Copy link
Author

cabarnes commented Jan 5, 2022

The master branch currently has the lint error that these checks are indicating

@mscdex
Copy link
Owner

mscdex commented Jan 5, 2022

Fixed.

@kolban-google
Copy link

On of my users of a project that I manage that is based on SSH2 just reported the same symptoms. This is just a "+1" on this issue. Will be watching here to review. Assuming that the code changes will be published as a new "npm" distribution when ready.

@jogoldberg
Copy link

I'm the afformentioned user. I've manually applied the fix and will let you know if this is indeed the solution I needed.

@jogoldberg
Copy link

jogoldberg commented Jan 6, 2022

@cabarnes - Lint says you've got an unused variable here: https://github.com/smartfile/ssh2/blob/fix-disconnect/lib/protocol/SFTP.js#L2596 which came from f763271

@cabarnes
Copy link
Author

cabarnes commented Jan 6, 2022

@jogoldberg I did not change that file in this PR. It was a lint error that was in master when the tests were run. The error has since been fixed in master, but the tests haven't been re-run

@jogoldberg
Copy link

jogoldberg commented Jan 6, 2022

Ah. If you do

git fetch upstream
git merge upstream/master
git push

that should sort you.

@jogoldberg
Copy link

If those commands don't work, you might first need to do:

git remote add upstream https://github.com/mscdex/ssh2.git

@cabarnes
Copy link
Author

cabarnes commented Jan 6, 2022

@jogoldberg
Copy link

The tests won't run again until @mscdex runs them due to this: https://github.blog/changelog/2021-04-22-github-actions-maintainers-must-approve-first-time-contributor-workflow-runs/

Yes, but when he does re-run the tests, the tests will be re-run against your fork. And your fork is 5 commits behind upstream master.

@cabarnes
Copy link
Author

cabarnes commented Jan 6, 2022

Done

@jogoldberg
Copy link

@mscdex - I can now confirm that this solved my issue. Can you please merge/release this?

@jogoldberg
Copy link

@mscdex - Just to add to the details, this solves the issue being experienced by client using OpenSSH_7.4p1

@jogoldberg
Copy link

@mscdex - Can you please merge/release this? It is urgently needed for a production issue we're facing

@jogoldberg
Copy link

@mscdex - This is urgently needed. Can you please have a look?

@mscdex
Copy link
Owner

mscdex commented Jan 18, 2022

@jogoldberg I'd like to have a test for this first, I just haven't had time to write one. If @cabarnes can include one, that will help get this merged quicker. AFAIK this should be easily achievable using ssh2 SSH/SFTP implementations on both ends.

@cabarnes
Copy link
Author

@mscdex I have added a test for this condition. It fails (hangs) without the change in this PR

@rk-coles
Copy link

@mscdex Requesting you to kindly approve. We too are in dire need of this in production

@mscdex
Copy link
Owner

mscdex commented Jan 27, 2022

I should be able to take a look at it this weekend.

@cabarnes
Copy link
Author

@mscdex I have it skipping the integration test on Windows like the other integration tests. I don't know why the Linux tests are failing. They pass for me on my Linux system with the same version of node and OpenSSH:

$ npm test

> [email protected] test
> node test/test.js

> Running test-exec.js ...
> Running test-integration-openssh-sftp.js ...
Testing with OpenSSH version: 8.2
> Running test-integration-openssh.js ...
Testing with OpenSSH version: 8.2
> Running test-misc-client-server.js ...
> Running test-openssh.js ...
> Running test-protocol-crypto.js ...
Crypto binding available
Testing cipher: null, mac: <none> (native encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, binding decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, binding decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, binding decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, binding decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (native encrypt, native decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (binding encrypt, native decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (native encrypt, binding decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (binding encrypt, binding decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (native encrypt, native decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (binding encrypt, native decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (native encrypt, binding decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (binding encrypt, binding decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (native encrypt, native decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (binding encrypt, native decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (native encrypt, binding decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (binding encrypt, binding decrypt) ...
> Running test-protocol-keyparser.js ...
> Running test-server-hostkeys.js ...
> Running test-sftp.js ...
> Running test-shell.js ...
> Running test-userauth-agent-openssh.js ...
> Running test-userauth-agent.js ...
> Running test-userauth.js ...

@@ -207,6 +207,11 @@ class Session extends EventEmitter {
state: 'open'
}
};

this.on('eof', () => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might include a comment here about this being necessary to properly close the connection for some SFTP clients.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems to be addressed

@cabarnes
Copy link
Author

@mscdex I have added the comment and the test. Let me know what you think

@jogoldberg
Copy link

Hi @mscdex - cabarnes made your requested changes a little over a week ago. Do you think you might have time to review this today? :)

@jogoldberg
Copy link

Hi @mscdex - Looks like the latest revision passed all your automated checks now. Can you please merge/release this fix? We've been operating based on a manual patching of the library now for quite some time. Thanks in advance! :)

@@ -207,6 +207,11 @@ class Session extends EventEmitter {
state: 'open'
}
};

this.on('eof', () => {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems to be addressed

@rk-coles
Copy link

@mscdex Kindly approve the pull request. We have been impacted by this in production

@mscdex
Copy link
Owner

mscdex commented Feb 28, 2022

The test in this PR passes for me without the runtime changes with various node versions. Is there something missing?

@cabarnes
Copy link
Author

@mscdex It's probably because there isn't much that I could figure out to assert on. I did manually verify that it covered the new code. I can keep looking into it. Do you have any ideas for an assertion?

@cabarnes
Copy link
Author

@mscdex I can't reproduce what you're seeing with node v16. The test hangs and times out when I remove the runtime changes. The test is relying on the client receiving the close from the server for it to complete successfully. Unless there's some way to add an assertion that this._channel.end() is being called, I don't know what else to add to it

@rk-coles
Copy link

@mscdex There have been a lot of noise from the customers as we are using this library for sftp connections in production. Can you please make this available soon?

@mscdex
Copy link
Owner

mscdex commented Mar 22, 2022

@rk-coles I haven't had time to investigate a working test that actually fails before the change and succeeds after the change.

@jogoldberg
Copy link

Hi @mscdex : This PR has been opened for quite some time now.
We have multiple package authors who use SSH2 confirming the behavior before/after the fix.
Any chance of getting it merged in sometime soon?
This is a real show-stopper.

@WafflesMcDuff
Copy link

@mscdex - can we please get this merged?

@edwhittle
Copy link

@mscdex Bumping this, any updates on the blockers?

@jogoldberg
Copy link

@mscdex - This has been waiting 6 months now. This is a 1 line code change that multiple developers have confirmed fixes a major show-stopper issue. This change is badly needed. Can you please prioritize the merge?

@ed-whittle
Copy link

@mscdex Bumping this again, we've had another client unable to close their connection as they send SSH_MSG_CHANNEL_EOF

@jogoldberg
Copy link

@mscdex - This PR is now waiting 1 year since the comment and test you requested were added. Can you please merge this?

@slifty
Copy link

slifty commented Feb 8, 2023

Thank you all for the amazing work!

I wanted to bump this as well as I'm facing this exact issue.

The only workaround I've found is to add the following to my server implementation:

    session.on('eof', () => {
      session._channel.end());
    });

But this is not safe for the long term as it requires use of the the private _channel attribute.

slifty added a commit to PermanentOrg/sftp-service that referenced this pull request Feb 8, 2023
There is an open PR in the upstream / ssh2 library related to properly
closing SFTP channels when a `CHANNEL_EOF` message is received[1].
Unfortunately that PR has not yet been merged.

Until that happens we need to handle the signal ourselves.
Unfortunately this requires us to access "private" attributes
(JavaScript doesn't support the concept of private attributes, so we are
able to do this...). This is, of course a terrible and horrible idea and
all of our tooling gets very upset about it.  As a result I had to
disable the tooling for the relevant line.

Once the PR is merged in upstream we should upgrade to it and delete the
contents of this commit.

Issue #45 rclone hangs after completion

[1] mscdex/ssh2#1111
slifty added a commit to PermanentOrg/sftp-service that referenced this pull request Feb 8, 2023
There is an open PR in the upstream / ssh2 library related to properly
closing SFTP channels when a `CHANNEL_EOF` message is received[1].
Unfortunately that PR has not yet been merged.

Until that happens we need to handle the signal ourselves.
Unfortunately this requires us to access "private" attributes
(JavaScript doesn't support the concept of private attributes, so we are
able to do this...). This is, of course a terrible and horrible idea and
all of our tooling gets very upset about it.  As a result I had to
disable the tooling for the relevant line.

Once the PR is merged in upstream we should upgrade to it and delete the
contents of this commit.

Issue #45 rclone hangs after completion

[1] mscdex/ssh2#1111
slifty added a commit to PermanentOrg/sftp-service that referenced this pull request Feb 8, 2023
There is an open PR in the upstream / ssh2 library related to properly
closing SFTP channels when a `CHANNEL_EOF` message is received[1].
Unfortunately that PR has not yet been merged.

Until that happens we need to handle the signal ourselves.
Unfortunately this requires us to access "private" attributes
(JavaScript doesn't support the concept of private attributes, so we are
able to do this...). This is, of course a terrible and horrible idea and
all of our tooling gets very upset about it.  As a result I had to
disable the tooling for the relevant line.

Once the PR is merged in upstream we should upgrade to it and delete the
contents of this commit.

Issue #45 rclone hangs after completion

[1] mscdex/ssh2#1111
slifty added a commit to PermanentOrg/sftp-service that referenced this pull request Feb 8, 2023
There is an open PR in the upstream / ssh2 library related to properly
closing SFTP channels when a `CHANNEL_EOF` message is received[1].
Unfortunately that PR has not yet been merged.

Until that happens we need to handle the signal ourselves.
Unfortunately this requires us to access "private" attributes
(JavaScript doesn't support the concept of private attributes, so we are
able to do this...). This is, of course a terrible and horrible idea and
all of our tooling gets very upset about it.  As a result I had to
disable the tooling for the relevant line.

Once the PR is merged in upstream we should upgrade to it and delete the
contents of this commit.

Issue #45 rclone hangs after completion

[1] mscdex/ssh2#1111
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants