Worker autorestarts cause worker-dependent duplicate logs #3218
Comments
About asking cherrypy to disable its own autoreloading and clean up its "bus" on shutdown: we should probably improve the documentation, but I don't expect that to be related.
Thanks. I still believe the most obvious smoking gun is your mention of "up to the current max workers number of times". I do not understand those smoke signals, so for now I keep asking:
For reliable reproduction, you may need not just a significant request count (relative to the max_requests configuration) to trigger regular restarts, but also hanging requests, caused either by client behaviour or by a processing/DB bottleneck. Try firing a benchmarking tool and a manually issued "slow" request at the same (multi-process) test instance.
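One way to stage that combination is sketched below; the URLs, request counts, and the existence of a deliberately slow endpoint are assumptions for illustration, not anything from this issue:

```python
# Hypothetical reproduction driver: hold one request open while a burst of
# cheap requests pushes the workers past max_requests under load.
import concurrent.futures
import urllib.request

BASE = "http://127.0.0.1:8000"   # placeholder test instance

def hit(path):
    with urllib.request.urlopen(BASE + path, timeout=120) as resp:
        return resp.status

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    # One long-running request that ties up a thread in some worker...
    slow = pool.submit(hit, "/slow")      # assumes an endpoint that sleeps
    # ...while a flood of fast requests triggers the periodic auto-restarts.
    fast = [pool.submit(hit, "/health") for _ in range(20_000)]
    concurrent.futures.wait(fast)
    slow.result()
```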
I'm currently having an issue where, when a worker hits its max requests and auto-restarts as usual, there is a traffic-dependent chance that the new worker will be initialized into a state where every log record sent to a WatchedFileHandler is duplicated up to as many times as the current number of workers.
Our logs are currently configured to go to a different file per worker, so it is clear that the issue is contained within a single worker after its auto-restart. This per-worker setup was added recently; all workers used to log to a single file. The hope was that switching to per-worker files would resolve the issue (which had been going on long before), but it made no difference.
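A minimal sketch of how such a per-worker file could be wired up through gunicorn's post_fork server hook; the log path, numbering scheme, and format below are assumptions rather than the actual setup:

```python
# gunicorn.conf.py (sketch) -- hypothetical per-worker log file wiring.
import logging
import logging.handlers

def post_fork(server, worker):
    # Runs in the freshly forked child. worker.age increases monotonically,
    # so age % num_workers gives a stable 0..N-1 worker number.
    worker_num = worker.age % server.num_workers
    handler = logging.handlers.WatchedFileHandler(
        f"/var/log/app/worker-{worker_num}.log"   # illustrative path
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s [pid %(process)d] %(message)s")
    )
    logging.getLogger().addHandler(handler)
    server.log.info("post_fork: worker %s logging to worker-%s.log",
                    worker.pid, worker_num)
```

Note that if anything along these lines runs more than once in the same process (for example both here and in the application's own logging setup), handlers stack, which is exactly the kind of duplication described further down.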
This is a fairly high-traffic application (around 10k requests per minute at peak), so we are currently configured with 3 workers and 50 threads per worker. Ideally we would like to raise the number of workers and lower the number of threads to be closer to the gunicorn documentation's recommendations, but after testing a worker increase, the maximum number of duplicate logs that can appear increased to match the new worker count.
We have some code that assigns numbered worker IDs within gunicorn server hooks, so those are reflected in the pre_fork and post_fork logging here, but this is what the Docker logs show during an auto-restart:
I've searched around for a while and have had a hard time finding information on how to resolve the RuntimeWarning shown, so it's entirely possible that that is the problem, in which case this issue becomes a question of how to resolve that warning.
This is our gunicorn configuration at the moment:
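A minimal sketch of a configuration along those lines; only the worker and thread counts come from the description above, while the bind address, max_requests values, and worker class are placeholder assumptions:

```python
# gunicorn.conf.py (illustrative sketch; only workers/threads match the description)
bind = "0.0.0.0:8000"        # placeholder
workers = 3                  # as described above
threads = 50                 # as described above
worker_class = "gthread"     # assumption: threaded workers
max_requests = 1000          # placeholder value that drives the auto-restarts
max_requests_jitter = 100    # stagger restarts so workers don't recycle together
```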
For example, if the application is deployed in the morning (low traffic), none of the workers will show any duplicate logs. The max-request value might be hit around mid-afternoon and an auto-restart triggered. After re-initialization, ALL logs from that worker could begin going to its appropriate log file with anywhere from 0 to 3 duplicates. Once a worker enters this state, every single log originating from it is duplicated the same number of times. This possibly points to the WatchedFileHandler being added multiple times, but there is no indication as to why a gunicorn worker restart would trigger this, or why it is inconsistent and traffic-dependent. The worker continues like this until its next auto-restart, at which point it could again be initialized with anywhere from 0 to 3 duplicate logs. All other workers behave the same way, but independently of each other.
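If the theory that the WatchedFileHandler is being attached repeatedly is right, one low-risk mitigation to try (a sketch, not a confirmed fix for this issue) is to make the attachment idempotent, so a second initialization pass in the same process cannot stack handlers:

```python
import logging
import logging.handlers
import os

def attach_worker_handler(logfile: str) -> logging.Handler:
    """Attach a WatchedFileHandler for `logfile` exactly once per process."""
    root = logging.getLogger()
    target = os.path.abspath(logfile)
    # Idempotency guard: reuse an equivalent handler if one is already attached.
    for existing in root.handlers:
        if isinstance(existing, logging.handlers.WatchedFileHandler) \
                and getattr(existing, "baseFilename", None) == target:
            return existing
    handler = logging.handlers.WatchedFileHandler(logfile)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    root.addHandler(handler)
    return handler
```

That does not explain why the restart path re-runs the logging setup, but it would at least cap the duplication at one handler per file.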
The reason I'm posting this here is that it only ever happens when a worker is initialized, and it seems to occur more frequently under high traffic, so some process/thread unsafety is possibly to blame. We recently implemented similar logging in another, much lower-traffic application and haven't seen this issue there yet.
If any more information is needed I can definitely provide it, and any help would be greatly appreciated, even if that help is "{other location} might be a better place to ask this question".