forked from rails/solid_queue
Fixes rails#262: Automatic worker process recycling
This PR adds two new configuration parameters:

* recycle_on_oom on the Worker (via queue.yml)
* calc_memory_usage as a global parameter (via application.rb, environment/*.rb, or an initializer)

Neither parameter requires a specific unit. What matters is that both use the same unit so that they can be compared. For example, if the calc_memory_usage proc returns 300 MB as 300 (as in megabytes), then recycle_on_oom on the worker should be set to 300 too.

Any worker without recycle_on_oom is not impacted in any way. If calc_memory_usage is nil (the default), then OOM checking is off for workers under the control of this Supervisor.

The check for OOM is made after the Job has run to completion and before the SolidQueue worker does any additional processing.

The single biggest change to SolidQueue, and probably the one that requires the most review, is moving job.unblock_next_blocked_job out of ClaimedExecution and up one level into Pool. The rationale for this change is that the ensure block on the Job execution is not guaranteed to run if the system / thread is forcibly shut down while the job is in flight. However, the Thread.ensure *does* seem to get called reliably on forced shutdowns.

Given my almost assuredly incomplete understanding of the concurrency implementation, despite Rosa working very hard to help me grok it, there is some risk that this change is wrong. My logic for this change is as follows:

* A job that completes successfully would have released its lock: no change.
* A job that completes by way of an unhandled exception would have released its lock: no change.
* A job that was killed in flight because of a worker recycle_on_oom (or an ugly restart out of the user's control; again, looking at you Heroku) needs to release its lock, because there is no guarantee that it is going to be the job that starts on the worker restart. If it doesn't release its lock in this case, that worker could find itself waiting on the dispatcher (I think) to expire Semaphores before it is able to take on new work.

Small fix
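To illustrate the unit-matching rule from the description, a memory provider might look roughly like the sketch below. It is not code from this PR: `rss_in_megabytes`, `PS_READER`, and the `reader:` parameter are hypothetical names, the default reader assumes a POSIX `ps` that reports resident set size in kilobytes, and the wiring shown in the comments is an assumption based on the description.

```ruby
# Hypothetical reader: asks `ps` for the resident set size of a pid.
# Assumes a POSIX `ps` that prints RSS in kilobytes.
PS_READER = ->(pid) { `ps -o rss= -p #{Integer(pid)}` }

# Hypothetical provider: memory used by +pid+, in whole megabytes.
# The reader is injectable so the conversion can be exercised without `ps`.
def rss_in_megabytes(pid, reader: PS_READER)
  kilobytes = reader.call(pid).to_s.strip.to_i
  kilobytes / 1024
end

# Assumed wiring (setter name not verified against the PR):
#   SolidQueue.calc_memory_usage = ->(pid) { rss_in_megabytes(pid) }
# and, in queue.yml, a worker entry with the threshold in the same unit:
#   recycle_on_oom: 300
```

Because the provider returns megabytes, `recycle_on_oom: 300` here would mean 300 MB. Returning bytes from the proc while leaving the threshold at 300 would trip the check almost immediately.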
Showing 10 changed files with 422 additions and 11 deletions.
@@ -65,8 +65,6 @@ def perform
     else
       failed_with(result.error)
     end
-  ensure
-    job.unblock_next_blocked_job
   end

   def release
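The description's claim that a thread-level ensure still runs on a forced shutdown can be checked in isolation for the `Thread#kill` case. The following is a minimal sketch, not the PR's Pool code: the `released` array and the `:unblock_next_blocked_job` symbol are stand-ins for the real lock release, and it says nothing about a process that is killed outright (for example with SIGKILL), where no Ruby code gets to run.

```ruby
# Simulates a pool thread that is killed while a "job" is in flight.
released = []
started = Queue.new

worker = Thread.new do
  begin
    started << :running # signal that the job body has been entered
    sleep               # stand-in for a long-running job
  ensure
    # Stand-in for job.unblock_next_blocked_job
    released << :unblock_next_blocked_job
  end
end

started.pop # wait until the thread is inside the begin block
worker.kill # forced shutdown of the thread
worker.join

puts released.inspect
```

The `started` queue is there only to make sure the thread has entered the `begin` block before it is killed; a thread killed before that point would never reach the ensure clause.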
@@ -0,0 +1,69 @@
+# frozen_string_literal: true
+
+require "active_support/concern"
+
+module SolidQueue::Processes
+  module Recyclable
+    extend ActiveSupport::Concern
+
+    included do
+      attr_reader :max_memory, :calc_memory_usage
+    end
+
+    def recyclable_setup(**options)
+      return unless configured?(options)
+
+      set_max_memory(options[:recycle_on_oom])
+      set_calc_memory_usage if max_memory
+      SolidQueue.logger.error { "Recycle on OOM is disabled for worker #{pid}" } unless oom_configured?
+    end
+
+    def recycle(execution = nil)
+      return false if !oom_configured? || stopped?
+
+      memory_used = calc_memory_usage.call(pid)
+      return false unless memory_exceeded?(memory_used)
+
+      SolidQueue.instrument(:recycle_worker, process: self, memory_used: memory_used, class_name: execution&.job&.class_name) do
+        pool.shutdown
+        stop
+      end
+
+      true
+    end
+
+    def oom?
+      oom_configured? && calc_memory_usage.call(pid) > max_memory
+    end
+
+    private
+
+    def configured?(options)
+      options.key?(:recycle_on_oom)
+    end
+
+    def oom_configured?
+      @oom_configured ||= max_memory.present? && calc_memory_usage.present?
+    end
+
+    def memory_exceeded?(memory_used)
+      memory_used > max_memory
+    end
+
+    def set_max_memory(max_memory)
+      if max_memory > 0
+        @max_memory = max_memory
+      else
+        SolidQueue.logger.error { "Invalid value for recycle_on_oom: #{max_memory}." }
+      end
+    end
+
+    def set_calc_memory_usage
+      if SolidQueue.calc_memory_usage.respond_to?(:call)
+        @calc_memory_usage = SolidQueue.calc_memory_usage
+      else
+        SolidQueue.logger.error { "SolidQueue.calc_memory_usage provider not configured." }
+      end
+    end
+  end
+end
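The decision made by `recycle` in the module above can be modelled in a few lines, which may make the edge cases easier to see. This is a simplified stand-in under stated assumptions, not the PR's class: `RecycleCheck` and `recycle?` are hypothetical names, and the model leaves out the `stopped?` guard, the instrumentation, and the pool shutdown.

```ruby
# Simplified stand-in for the comparison in Recyclable#recycle.
class RecycleCheck
  def initialize(max_memory:, calc_memory_usage:)
    @max_memory = max_memory
    @calc_memory_usage = calc_memory_usage
  end

  # True only when both settings are present and usage is strictly above the limit.
  def recycle?(pid)
    return false unless @max_memory && @calc_memory_usage.respond_to?(:call)

    @calc_memory_usage.call(pid) > @max_memory
  end
end

# Pretend the worker uses 301 (megabytes, by the convention chosen for both values).
usage = ->(_pid) { 301 }

over   = RecycleCheck.new(max_memory: 300, calc_memory_usage: usage).recycle?(Process.pid) # true
at_cap = RecycleCheck.new(max_memory: 301, calc_memory_usage: usage).recycle?(Process.pid) # false, not strictly above
off    = RecycleCheck.new(max_memory: 300, calc_memory_usage: nil).recycle?(Process.pid)   # false, checking is off
```

The comparison is strict, so a worker sitting exactly at the threshold is left alone, and a missing provider turns the check off rather than raising.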
@@ -0,0 +1,7 @@
+# frozen_string_literal: true
+
+class RecycleJob < ApplicationJob
+  def perform(nap = nil)
+    sleep(nap) unless nap.nil?
+  end
+end
@@ -0,0 +1,9 @@
+# frozen_string_literal: true
+
+class RecycleWithConcurrencyJob < ApplicationJob
+  limits_concurrency key: ->(nap = nil) { }
+
+  def perform(nap = nil)
+    sleep(nap) unless nap.nil?
+  end
+end