Task Runner needs timeout on socket operations #32

Open
mthuurne opened this issue Mar 28, 2020 · 0 comments
Labels
bug Something isn't working

Comments

@mthuurne
Member

We had an issue where a Task Runner had an established socket (checked with netstat -tpn) and was waiting forever for traffic on that socket. However, the other side had no corresponding socket and therefore could never send anything. My guess is that the router had dropped the connection from its NAT table.

After killing the socket using ss -K sport = <port>, the Task Runner resumed normal operation. So the Task Runner itself was still working, just waiting forever for a reply that would never come. To make the Task Runner robust against situations like this, we should put a timeout on socket operations: if an operation makes no progress for a long time, it fails, and a new socket can be opened on the next try.

This issue is very rare: I've had three Task Runners running on the same machine behind the same router for over half a year and it happened only once. So we can set the timeout value relatively high, for example one minute.

Note that the Task Runner doesn't do low-level socket operations directly: it uses java.net.HttpURLConnection instead. That class has setConnectTimeout() and setReadTimeout() methods (inherited from java.net.URLConnection) that we can use. But those were added in Java 1.5, while the Task Runner was originally written in Java 1.4, so using them means requiring Java 1.5 or later.
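
As a rough sketch of what this could look like, the snippet below sets both timeouts on a connection before it is used. The class name, the helper method, the example URL, and the one-minute value are placeholders for illustration, not taken from the Task Runner code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

class TimeoutExample {
    // Assumed value: one minute, as suggested above. Both setters take milliseconds.
    static final int TIMEOUT_MILLIS = 60 * 1000;

    /**
     * Hypothetical helper: creates an HTTP connection object with a connect
     * timeout and a read timeout set. No network traffic happens here; the
     * connection is only established later, e.g. by connect() or getInputStream().
     */
    static HttpURLConnection openWithTimeouts(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(TIMEOUT_MILLIS); // limit on establishing the connection
        conn.setReadTimeout(TIMEOUT_MILLIS);    // limit on waiting for data during a read
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; nothing is contacted because we never connect.
        URL url = URI.create("http://localhost:8080/").toURL();
        HttpURLConnection conn = openWithTimeouts(url);
        System.out.println("connect timeout (ms): " + conn.getConnectTimeout());
        System.out.println("read timeout (ms): " + conn.getReadTimeout());
    }
}
```

When either timeout expires, the blocked call throws java.net.SocketTimeoutException, which is an IOException. If the existing code already treats an IOException as a failed attempt and retries with a new connection, that path would likely cover this case too, but that needs checking against the actual Task Runner code.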

@mthuurne mthuurne added the bug Something isn't working label Mar 28, 2020