Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connections are not terminated after ending the session #379

Open
tunahanertekin opened this issue Mar 25, 2024 · 5 comments
Open

Connections are not terminated after ending the session #379

tunahanertekin opened this issue Mar 25, 2024 · 5 comments
Labels
bug Something isn't working
Milestone

Comments

@tunahanertekin
Copy link

tunahanertekin commented Mar 25, 2024

Hey there,

I recently discovered that neko possibly fails to terminate UDP connections when the user ended session. I have tested the situation with the following configuration:

version: "3.4"
services:
  neko:
    image: "m1k1o/neko:xfce"
    restart: "unless-stopped"
    shm_size: "2gb"
    ports:
      - "8080:8080"
      - "52000-52003:52000-52003/udp"
    environment:
      NEKO_SCREEN: 1920x1080@30
      NEKO_PASSWORD: neko
      NEKO_PASSWORD_ADMIN: admin
      NEKO_EPR: 52000-52003
      NEKO_ICELITE: 0
      NEKO_NAT1TO1: 172.16.44.194

When running the container using docker-compose up, and go to the localhost:8080, neko and XFCE window opens successfully. However, after two or three refreshes in browser, neko fails to deliver XFCE screen and gives black screen with an error message saying "Disconnected peer failed".

Logs

Here are the logs between two browser refreshes:

neko_1  | 2024-03-25 11:54:24,207 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF creating pipeline codec=opus module=capture src="pulsesrc device=audio_output.monitor ! audio/x-raw,channels=2 ! audioconvert ! opusenc inband-fec=true bitrate=128000 ! appsink name=appsinkaudio" submodule=stream-sink video_id=audio
neko_1  | 
neko_1  | 2024-03-25 11:54:24,208 DEBG 'pulseaudio' stdout output:
neko_1  | I: [pulseaudio] client.c: Created 5 "Native client (UNIX socket client)"
neko_1  | I: [pulseaudio] protocol-native.c: Client authenticated anonymously.
neko_1  | 
neko_1  | 2024-03-25 11:54:24,208 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF first listener, starting module=capture submodule=stream-sink video_id=audio
neko_1  | 
neko_1  | 2024-03-25 11:54:24,209 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF creating pipeline codec=vp8 module=capture src="ximagesrc display-name=:99.0 show-pointer=true use-damage=false ! video/x-raw,framerate=25/1 ! videoconvert ! queue !  vp8enc target-bitrate=1996800 cpu-used=4 end-usage=cbr threads=4 deadline=1 undershoot=95 buffer-size=12288 buffer-initial-size=6144 buffer-optimal-size=9216 keyframe-max-dist=25 min-quantizer=4 max-quantizer=20  ! appsink name=appsinkvideo" submodule=stream-sink video_id=video
neko_1  | 
neko_1  | 2024-03-25 11:54:24,209 DEBG 'pulseaudio' stdout output:
neko_1  | I: [pulseaudio] source-output.c: Created output 3 "Record Stream" on audio_output.monitor with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
neko_1  | I: [pulseaudio] source-output.c:     media.name = "Record Stream"
neko_1  | I: [pulseaudio] source-output.c:     application.name = "neko"
neko_1  | I: [pulseaudio] source-output.c:     native-protocol.peer = "UNIX socket client"
neko_1  | I: [pulseaudio] source-output.c:     native-protocol.version = "34"
neko_1  | I: [pulseaudio] source-output.c:     application.process.id = "12"
neko_1  | I: [pulseaudio] source-output.c:     application.process.user = "neko"
neko_1  | I: [pulseaudio] source-output.c:     application.process.host = "790ae454c872"
neko_1  | I: [pulseaudio] source-output.c:     application.process.binary = "neko"
neko_1  | I: [pulseaudio] source-output.c:     application.language = "C"
neko_1  | I: [pulseaudio] source-output.c:     window.x11.display = ":99.0"
neko_1  | I: [pulseaudio] source-output.c:     application.process.machine_id = "7e362915756081a70c3ca0126600e578"
neko_1  | 
neko_1  | 2024-03-25 11:54:24,209 DEBG 'pulseaudio' stdout output:
neko_1  | I: [pulseaudio] protocol-native.c: Final latency 20.00 ms = 10.00 ms + 10.00 ms
neko_1  | 
neko_1  | 2024-03-25 11:54:24,210 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF first listener, starting module=capture submodule=stream-sink video_id=video
neko_1  | 
neko_1  | 2024-03-25 11:54:24,211 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF signaling state changed to have-local-offer module=webrtc submodule=pion subsystem=pc
neko_1  | 
neko_1  | 2024-03-25 11:54:24,212 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN could not listen udp 172.23.0.2 module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,212 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN Conn is not allocated (Failed to listen for 142.251.27.127:19302: invalid port) module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,212 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN could not get server reflexive address udp6 stun:stun.l.google.com:19302: write udp6 [::]:52000->[2a00:1450:4025:1801::7f]:19302: sendto: network is unreachable module=webrtc submodule=pion subsystem=ice
neko_1  | 11:54AM INF sent all ICECandidates module=webrtc
neko_1  | 
neko_1  | 2024-03-25 11:54:24,251 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF signal update - RemoteAnswer id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C module=session
neko_1  | 11:54AM INF signaling state changed to stable module=webrtc submodule=pion subsystem=pc
neko_1  | 
neko_1  | 2024-03-25 11:54:24,251 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF Setting new connection state: Checking module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,252 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF ICE connection state changed: checking module=webrtc submodule=pion subsystem=pc
neko_1  | 11:54AM INF connection state has changed connection_state=checking module=webrtc
neko_1  | 
neko_1  | 2024-03-25 11:54:24,252 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,254 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF signal update - RemoteCandidate id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C module=session
neko_1  | 
neko_1  | 2024-03-25 11:54:24,254 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF signal update - RemoteCandidate id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C module=session
neko_1  | 11:54AM INF Ignoring remote candidate with tcpType active: udp4 host bafacdb2-187c-4ec4-a99d-cf7cc49db4d3.local:9 module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,358 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF signal update - RemoteCandidate id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C module=session
neko_1  | 
neko_1  | 2024-03-25 11:54:24,358 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 11:54AM INF signal update - RemoteCandidate id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C module=session
neko_1  | 
neko_1  | 2024-03-25 11:54:24,558 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,759 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:24,960 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:25,160 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:25,360 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:25,561 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:25,762 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:25,963 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:26,165 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:26,365 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:26,566 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:26,767 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:26,967 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:27,168 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:27,368 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:27,569 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:27,770 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:27,971 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:28,172 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:28,372 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:28,573 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:28,774 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:28,975 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:29,176 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN pingAllCandidates called with no candidate pairs. Connection is not possible yet. module=webrtc submodule=pion subsystem=ice
neko_1  | 
neko_1  | 2024-03-25 11:54:29,254 DEBG 'neko' stdout output:
neko_1  | 11:54AM WRN read message error error="websocket: close 1005 (no status)" module=websocket
neko_1  | 
neko_1  | 2024-03-25 11:54:29,254 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF Setting new connection state: Closed module=webrtc submodule=pion subsystem=ice
neko_1  | 11:54AM WRN Failed to discover mDNS candidate bafacdb2-187c-4ec4-a99d-cf7cc49db4d3.local: mDNS: connection is closed module=webrtc submodule=pion subsystem=ice
neko_1  | 11:54AM INF peer connection state changed: closed module=webrtc submodule=pion subsystem=pc
neko_1  | 11:54AM INF ICE connection state changed: closed module=webrtc submodule=pion subsystem=pc
neko_1  | 11:54AM INF connection state has changed connection_state=closed module=webrtc
neko_1  | 11:54AM WRN Failed to start manager: connecting canceled by caller module=webrtc submodule=pion subsystem=pc
neko_1  | 11:54AM WRN Failed to start SCTP: DTLS not established module=webrtc submodule=pion subsystem=pc
neko_1  | 
neko_1  | 2024-03-25 11:54:29,254 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF peer closed id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C module=webrtc
neko_1  | 11:54AM WRN undeclaredMediaProcessor failed to open SrtcpSession: the DTLS transport has not started yet module=webrtc submodule=pion subsystem=pc
neko_1  | 11:54AM WRN undeclaredMediaProcessor failed to open SrtpSession: the DTLS transport has not started yet module=webrtc submodule=pion subsystem=pc
neko_1  | 
neko_1  | 2024-03-25 11:54:29,260 DEBG 'pulseaudio' stdout output:
neko_1  | I: [pulseaudio] source-output.c: Freeing output 3 "Record Stream"
neko_1  | 
neko_1  | 2024-03-25 11:54:29,260 DEBG 'pulseaudio' stdout output:
neko_1  | I: [pulseaudio] client.c: Freed 5 "neko"
neko_1  | I: [pulseaudio] protocol-native.c: Connection died.
neko_1  | 
neko_1  | 2024-03-25 11:54:29,260 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF destroying pipeline module=capture submodule=stream-sink video_id=audio
neko_1  | 11:54AM INF last listener, stopping module=capture submodule=stream-sink video_id=audio
neko_1  | 
neko_1  | 2024-03-25 11:54:29,281 DEBG 'neko' stdout output:
neko_1  | 11:54AM INF destroying pipeline module=capture submodule=stream-sink video_id=video
neko_1  | 11:54AM INF last listener, stopping module=capture submodule=stream-sink video_id=video
neko_1  | 11:54AM INF destroying session module=session session_id=2wByQAtLiPwb8bHhvDrGPcNoPJ1cra6C
neko_1  |

Further Observations

I have managed to reproduce the issue in AWS and on-premise environments as well, here are my notes:

  • [AWS] When NEKO_ICELITE is true, NEKO_EPR is 52000-52007 and NEKO_NAT1TO1 is unset, neko fails after ~8-10 refreshes.
  • [AWS] When NEKO_ICELITE is false, NEKO_EPR is 52000-52007 and NEKO_NAT1TO1 is the public IP of the EC2 instance, neko fails after ~4-5 refreshes.
  • [on-premise] When NEKO_ICELITE is true, NEKO_EPR is 52000-52007 and NEKO_NAT1TO1 is unset, neko fails after ~8-10 refreshes.
  • [on-premise] When NEKO_ICELITE is false, NEKO_EPR is 52000-52007 and NEKO_NAT1TO1 is the internal IP of the server, neko fails after ~4-5 refreshes.

Additionally, in AWS and on-premise tests, neko produced multiple logs for each UDP port in failure (black screen) as below:

2:55PM DBG failed to listen :30013: listen udp4 :30013: bind: address already in use module=webrtc submodule=pion subsystem=ice

I guess that the refresh count before failure is close to the max allowed sessions in neko configuration. (since 8 ports correspond to 4 user sessions when NAT1TO1 is used)

Any kind of help is appreciated. Thanks!

@m1k1o
Copy link
Owner

m1k1o commented Mar 25, 2024

It is normal that with ice lite turned off it uses twice as much ports, because it needs to establish bidirectional communication - to allocate one port for sending and one port for receiving data, to be able to work around various NATs.

UDP is stateless and there is no handshake, therefore the lifecycle of UDP connection must be closed explicitly by the app when the user has been detected to be disconnected. There are as well various timeouts that handle this connection, in our case 6 seconds. See in library source code.

I assume you refreshed the page quicky, and probably the connection was not yet closed. From your logs i see that the connection was closed 11:54:29 what is actually the last time entry in your log. That's when your ports were freed.

Could you please join on the browser in neko, wait until it comes online and then close the browser. Look into logs and wait until you see INF connection state has changed connection_state=closed module=webrtc. It might not happen immediately but ports should be claimed back within 6 seconds, worst case 12sec - when the connection is pronounced as failed.

Reason for this behavior is: When the user is unexpectedly disconnected, because this could happen in real use case when your computer/mobile would switch networks or you would be in a train and your cellular network would switch transmitters.

However, I agree, that page reload could be probably somehow detected and the ports could be claimed quicky without waiting for the timeout.

@tunahanertekin
Copy link
Author

tunahanertekin commented Mar 26, 2024

Thanks for your reply @m1k1o!

I noticed the logs about destroying session and pipeline after waiting a couple of seconds without refreshing, however the ports still do not seem like they are freed since I am not able to reconnect again even after waiting one hour. Once the connection is gone like I stated in my first post, I never succeeded to connect neko regardless of the amount of time I waited. Is it still the expected behavior from neko or is there a possibility that ports remain busy unexpectedly after terminating UDP connection?

@m1k1o
Copy link
Owner

m1k1o commented Mar 26, 2024

No. That's definitely unexpected. Ports should be freed after timeout. I'll take a look.

@m1k1o m1k1o added the bug Something isn't working label Mar 26, 2024
@tunahanertekin
Copy link
Author

Any updates on this?

@m1k1o
Copy link
Owner

m1k1o commented Apr 2, 2024

I was able to reproduce this one. Simply setting EPR port tange to two ports and the third refresh is not working anymore.

When using NEKO_UDPMUX this issue is not present, since it only listens to one port.

I also checked demodesk fork and there the issue does not exists, therefore I would address this as part of v3 rewrite #371.

@m1k1o m1k1o added this to the n.eko v3 milestone Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: No status
Development

No branches or pull requests

2 participants