Releases: oban-bg/oban
v2.13.2
Bug Fixes
- [Oban] Fix insert/3 and insert_all/3 when using options.
  Multiple default arguments caused a conflict for function calls with options but without an Oban instance name, e.g. Oban.insert(changeset, timeout: 500)
- [Reindexer] Fix the unused index repair query and correctly report errors.
  Reindexing and deindexing would fail silently because the results weren't checked, and no exceptions were raised.
v2.13.1
Bug Fixes
- [Oban] Expand insert/insert_all typespecs for multi arity
  This fixes dialyzer issues from the introduction of opts to Oban.insert and Oban.insert_all functions.
- [Reindexer] Allow specifying timeouts for all queries
  In some cases, applying REINDEX INDEX CONCURRENTLY on the indexes oban_jobs_args_index and oban_jobs_meta_index takes more than the default value (15 seconds). This new option allows clients to specify other values than the default.
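As a minimal sketch of raising the Reindexer timeout, the fragment below assumes the option is named timeout and is given in milliseconds; the :my_app application and MyApp.Repo module are placeholders, not names from these notes.

```elixir
# config/config.exs
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  plugins: [
    # Assumption: `timeout` is the option name; this allows each reindex
    # query up to 60 seconds instead of the 15 second default.
    {Oban.Plugins.Reindexer, timeout: :timer.seconds(60)}
  ]
```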
v2.13.0
Cancel Directly from Job Execution
Discard was initially intended to mean "a job exhausted all retries." Later, it was added as a return type for perform/1, and it came to mean either "stop retrying" or "exhausted retries" ambiguously, with no clear way to differentiate. Even later, we introduced cancel with a cancelled state as a way to stop jobs at runtime.
To repair this dichotomy, we're introducing a new {:cancel, reason} return type that transitions jobs to the cancelled state:
  case do_some_work(job) do
    {:ok, _result} = ok ->
      ok

    {:error, :invalid} ->
-     {:discard, :invalid}
+     {:cancel, :invalid}

    {:error, _reason} = error ->
      error
  end
With this change we're also deprecating the use of discard from perform/1 entirely! The meaning of each action/state is now:
- cancel - this job was purposefully stopped from retrying, either from a return value or the cancel command triggered by a human
- discard - this job has exhausted all retries and was transitioned by the system
You're encouraged to replace usage of :discard with :cancel throughout your application's workers, but :discard is only soft-deprecated and undocumented now.
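For context, here is a sketch of a complete worker that uses the new return value. The module name, queue, and the "path" argument are hypothetical; only the {:cancel, reason} tuple comes from the notes above.

```elixir
defmodule MyApp.Workers.ImportWorker do
  # Hypothetical worker; the queue name and max_attempts are illustrative.
  use Oban.Worker, queue: :default, max_attempts: 5

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"path" => path}}) do
    case File.read(path) do
      {:ok, _contents} ->
        :ok

      # A missing file will not appear on retry, so stop on purpose.
      # The job moves to the cancelled state rather than discarded.
      {:error, :enoent} ->
        {:cancel, :enoent}

      # Any other error consumes an attempt and is retried with backoff.
      {:error, _reason} = error ->
        error
    end
  end
end
```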
Public Engine Behaviour
Engines are responsible for all non-plugin database interaction, from inserting through executing jobs. They're also the intermediate layer that makes Pro's SmartEngine possible.
Along with documenting the Engine this also flattens its name for parity with other "extension" modules. For the sake of consistency with notifiers and peers, the Basic and Inline engines are now Oban.Engines.Basic and Oban.Engines.Inline, respectively.
v2.13.0 — 2022-07-22
Enhancements
- [Telemetry] Add encode option to make JSON encoding optional for attach_default_logger/1.
  Now it's possible to use the default logger in applications that prefer structured logging or use a standard JSON log formatter.
- [Oban] Accept a DateTime for the :with_scheduled option when draining.
  When a DateTime is provided, drains all jobs scheduled up to, and including, that point in time.
- [Oban] Accept extra options for insert/2,4 and insert_all/2,4.
  These are typically Ecto's standard "Shared Options" such as log and timeout. Other engines, such as Pro's SmartEngine, may support additional options.
- [Repo] Add aggregate/4 wrapper to facilitate aggregates from plugins or other extensions that use Oban.Repo.
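The enhancements above might be used roughly as follows. This is a sketch under assumptions: MyApp.Worker is a hypothetical worker module, the :default queue is illustrative, and the snippet expects a running Oban instance with the default name.

```elixir
# Log plain maps instead of JSON-encoded strings (assumes `encode: false`
# is how encoding is switched off).
Oban.Telemetry.attach_default_logger(encode: false)

# Drain every job scheduled up to, and including, one hour from now.
one_hour_from_now = DateTime.add(DateTime.utc_now(), 3600, :second)
Oban.drain_queue(queue: :default, with_scheduled: one_hour_from_now)

# Pass Ecto shared options through on insert. The instance name is given
# explicitly so the call shape is unambiguous.
changeset = MyApp.Worker.new(%{id: 1})
Oban.insert(Oban, changeset, timeout: 500, log: false)
```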
Bug Fixes
- [Oban] Prevent empty maps from matching non-empty maps during uniqueness checks.
- [Oban] Handle discarded and exhausted states for inline testing mode.
  Previously, returning a :discard tuple or exhausting attempts would cause an error.
- [Peer] Default leader? check to false on peer timeout.
  Timeouts should be rare, as they're symptoms of application/database overload. If leadership can't be established it's safe to assume an instance isn't leader and log a warning.
- [Peer] Use node-specific lock requester id for Global peers.
  Occasionally a peer module may hang while establishing leadership. In this case the peer isn't yet a leader, and we can fall back to false.
- [Config] Validate options only after applying normalizations.
- [Migrations] Allow any viable prefix in migrations.
- [Reindexer] Drop invalid Oban indexes before reindexing again.
  Table contention that occurs during concurrent reindexing may leave indexes in an invalid and unusable state. Those indexes aren't used by Postgres and they take up disk space. Now the Reindexer will drop any invalid indexes before attempting to reindex.
- [Reindexer] Only concurrently rebuild args and meta GIN indexes.
  The new indexes option can override the reindexed indexes rather than the defaults.
  The other two standard indexes (primary key and compound fields) are BTREE based and not as subject to bloat.
- [Testing] Fix testing mode for perform_job and alt engines, e.g. Inline
  A couple of changes enabled this compound fix:
  - Removing the engine override within config and exposing a centralized engine lookup instead.
  - Controlling post-execution db interaction with a new ack option for the Executor module.
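To illustrate the Reindexer indexes option mentioned in the list above, this sketch assumes it takes a list of index names as strings; the index name comes from the v2.13.1 notes, while :my_app and MyApp.Repo are placeholders.

```elixir
# config/config.exs
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  plugins: [
    # Assumption: `indexes` takes index names as strings. Here only the
    # args GIN index is rebuilt, skipping the meta index.
    {Oban.Plugins.Reindexer, indexes: ["oban_jobs_args_index"]}
  ]
```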
Deprecations
- [Oban] Soft replace discard with cancel return value (#730) [Parker Selbert]
v2.12.1
Bug Fixes
- [BasicEngine] Never fetch jobs that have reached max attempts
  This adds a safeguard to the fetch_jobs function to prevent ever hitting the attempt <= max_attempts check constraint. Hitting the constraint causes the query to fail, which crashes the producer and starts an infinite loop of crashes. The previous commit should prevent this situation from occurring at the "staging" level, but to be absolutely safe this change prevents it at the "fetching" level too.
  There is a very minor performance hit from this change because the query can no longer run as an index only scan. For systems with a modest number of available jobs the performance impact is indistinguishable.
- [Plugins] Prevent unexpectedly modifying jobs selected by subqueries
  Most applications don't run at a serializable isolation level. That allows subqueries to run within a transaction without having the conditions rechecked: only predicates on UPDATE or DELETE are re-checked, not on subqueries. That allows a race condition where rows may be updated without another evaluation.
- [Repo] Set query_opts in Repo.transaction options to prevent logging begin and commit events in development loggers.
- [BasicEngine] Remove the ORDER BY clause from unique queries
  The previous ORDER BY id DESC significantly hurt unique query performance when there are a lot of potential jobs to check. The ordering was originally added to make test cases predictable and isn't important for the actual behavior of the unique check.
v2.12.0
Oban v2.12 was dedicated to enriching the testing experience and expanding config, plugin, and queue validation across all environments.
Testing Modes
Testing modes bring a new, vastly improved way to configure Oban for testing. The new testing option makes it explicit that Oban should operate in a restricted mode for the given environment.
Behind the scenes, the new testing modes rely on layers of validation within Oban's Config module. Now production configuration is validated automatically during test runs. Even though queues and plugins aren't started in the test environment, their configuration is still validated.
To switch, stop overriding plugins and queues and enable a testing mode in your test.exs config:
config :my_app, Oban, testing: :manual
Testing in :manual mode is identical to testing in older versions of Oban: jobs won't run automatically so you can use helpers like assert_enqueued and execute them manually with Oban.drain_queue/2.
An alternate :inline mode allows Oban to bypass all database interaction and run jobs immediately in the process that enqueued them.
config :my_app, Oban, testing: :inline
Finally, new testing guides cover test setup, unit testing workers, integration testing queues, and testing dynamic configuration.
Global Peer Module
Oban v2.11 introduced centralized leadership via Postgres tables. However, Postgres based leadership isn't always a good fit. For example, an ephemeral leadership mechanism is preferred for integration testing.
In that case, you can make use of the new :global powered peer module for leadership:
config :my_app, Oban,
  peer: Oban.Peers.Global,
  ...
2.12.0 — 2022-04-21
Enhancements
- [Oban] Replace queue, plugin, and peer test configuration with a single :testing option. Now configuring Oban for testing only requires one change, setting the test mode to either :inline or :manual.
  - :inline - jobs execute immediately within the calling process and without touching the database. This mode is simple and may not be suitable for apps with complex jobs.
  - :manual - jobs are inserted into the database where they can be verified and executed when desired. This mode is more advanced and trades simplicity for flexibility.
- [Testing] Add with_testing_mode/2 to temporarily change testing modes within the context of a function.
  Once the application starts in a particular testing mode it can't be changed. That's inconvenient if you're running in :inline mode and don't want a particular job to execute inline.
- [Config] Add validate/1 to aid in testing dynamic Oban configuration.
- [Config] Validate full plugin and queue options on init, without the need to start plugins or queues.
- [Peers.Global] Add an alternate :global powered peer module.
- [Plugin] A new Oban.Plugin behaviour formalizes starting and validating plugins. The behaviour is implemented by all plugins and is the foundation of enhanced config validation.
- [Plugin] Emit [:oban, :plugin, :init] event on init from every plugin.
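The two testing helpers from the list above could be exercised along these lines. This is only a sketch: MyApp.DataCase (a case template that checks out the Ecto sandbox), MyApp.Repo, and MyApp.Worker are hypothetical names, and the application is assumed to start Oban with testing: :inline.

```elixir
defmodule MyApp.ObanTestingModeTest do
  # Hypothetical case template that handles Ecto sandbox checkout.
  use MyApp.DataCase
  use Oban.Testing, repo: MyApp.Repo

  test "a job is enqueued instead of running inline" do
    # Switch to :manual only for the duration of this function.
    Oban.Testing.with_testing_mode(:manual, fn ->
      {:ok, _job} = %{id: 1} |> MyApp.Worker.new() |> Oban.insert()

      assert_enqueued worker: MyApp.Worker, args: %{id: 1}
    end)
  end

  test "dynamically built configuration is valid" do
    # Validates options without starting queues or plugins.
    opts = [repo: MyApp.Repo, queues: [default: 10]]

    assert :ok = Oban.Config.validate(opts)
  end
end
```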
Bug Fixes
- [Executor] Skip timeout check with an unknown worker
  When the worker can't be resolved we don't need to check the timeout. Doing so prevents returning a helpful "unknown worker" message and instead causes a function error for nil.timeout/1.
- [Testing] Include log and prefix in generated conf for perform_job.
  The opts, and subsequent conf, built for perform_job didn't include the prefix or log options. That prevented functions that depend on a job's conf within perform/1 from running with the correct options.
- [Drainer] Retain the currently configured engine while draining a queue.
- [Watchman] Skip pausing queues when shutdown is immediate. This prevents queues from interacting with the database during short test runs.
v2.11.0
Oban v2.11 focused on reducing database load, bolstering telemetry-powered introspection, and improving the production experience for all users. To that end, we've extracted functionality from Oban Pro and switched to a new global coordination model.
Leadership
Coordination between nodes running Oban is crucial to how many plugins operate. Staging jobs once a second from multiple nodes is wasteful, as is pruning, rescuing, or scheduling cron jobs. Prior Oban versions used transactional advisory locks to prevent plugins from running concurrently, but there were some issues:
- Plugins don't know if they'll take the advisory lock, so they still need to run a query periodically.
- Nodes don't usually start simultaneously, and time drifts between machines. There's no guarantee that the top of the minute for one node is the same as another's; chances are, they don't match.
Oban 2.11 introduces a table-based leadership mechanism that guarantees only one node in a cluster, where "cluster" means a bunch of nodes connected to the same Postgres database, will run plugins. Leadership is transparent and designed for resiliency with minimum chatter between nodes.
See the Upgrade Guide for instructions on how to create the peers table and get started with leadership. If you're curious about the implementation details or want to use leadership in your application, take a look at docs for Oban.Peer.
Alternative PG (Process Groups) Notifier
Oban relies heavily on PubSub, and until now it only provided a Postgres adapter. Postgres is amazing, and has a highly performant PubSub option, but it doesn't work in every environment (we're looking at you, PgBouncer).
Fortunately, many Elixir applications run in a cluster connected by distributed Erlang. That means Process Groups, aka PG, is available for many applications.
So, we pulled Oban Pro's PG notifier into Oban to make it available for everyone! If your app runs in a proper cluster, you can switch over to the PG notifier:
config :my_app, Oban,
  notifier: Oban.Notifiers.PG,
  ...
Now there are two notifiers to choose from, each with their own strengths and weaknesses:
- Oban.Notifiers.Postgres - Pros: Doesn't require distributed Erlang, publishes insert events to trigger queues; Cons: Doesn't work with PgBouncer in transaction mode, doesn't work in tests because of the sandbox.
- Oban.Notifiers.PG - Pros: Works with PgBouncer in transaction mode, works in tests; Cons: Requires distributed Erlang, doesn't publish insert events.
Basic Lifeline Plugin
When a queue's producer crashes or a node shuts down before a job finishes executing, the job may be left in an executing state. The worst part is that these jobs, which we call "orphans", are completely invisible until you go searching through the jobs table.
Oban Pro has always had a "Lifeline" plugin for just this occasion, and now we've brought a basic Lifeline plugin to Oban.
To automatically rescue orphaned jobs that are still executing, include the Oban.Plugins.Lifeline in your configuration:
config :my_app, Oban,
  plugins: [Oban.Plugins.Lifeline],
  ...
Now the plugin will search and rescue orphans after they've lingered for 60 minutes.
🌟 Note: The Lifeline plugin may transition jobs that are genuinely executing and cause duplicate execution. For more accurate rescuing or to rescue jobs that have exhausted retry attempts see the DynamicLifeline plugin in Oban Pro.
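If 60 minutes doesn't fit your workload, the rescue window can be tuned. The sketch below assumes the option is named rescue_after and takes milliseconds; :my_app and MyApp.Repo are placeholders.

```elixir
# config/config.exs
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  plugins: [
    # Assumption: `rescue_after` is in milliseconds. Keep it comfortably
    # longer than your slowest job to limit duplicate execution.
    {Oban.Plugins.Lifeline, rescue_after: :timer.minutes(30)}
  ]
```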
Reindexer Plugin
Over time various Oban indexes (heck, any indexes) may grow without VACUUM cleaning them up properly. When this happens, rebuilding the indexes will release bloat and free up space in your Postgres instance.
The new Reindexer plugin makes index maintenance painless and automatic by periodically rebuilding all of your Oban indexes concurrently, without any locks.
By default, reindexing happens once a day at midnight UTC, but it's configurable with a standard cron expression (and timezone).
config :my_app, Oban,
  plugins: [Oban.Plugins.Reindexer],
  ...
See Oban.Plugins.Reindexer for complete options and implementation details.
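A custom schedule might look like the fragment below. It assumes the options are named schedule and timezone; the cron expression and placeholder names (:my_app, MyApp.Repo) are illustrative only.

```elixir
# config/config.exs
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  plugins: [
    # "0 3 * * 0" is 03:00 every Sunday. A zone other than "Etc/UTC"
    # would need a time zone database configured for Elixir.
    {Oban.Plugins.Reindexer, schedule: "0 3 * * 0", timezone: "Etc/UTC"}
  ]
```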
Improved Telemetry and Logging
The default telemetry backed logger includes more job fields and metadata about execution. Most notably, the execution state and formatted error reports when jobs fail.
Here's an example of the default output for a successful job:
{
  "args":{"action":"OK","ref":1},
  "attempt":1,
  "duration":4327295,
  "event":"job:stop",
  "id":123,
  "max_attempts":20,
  "meta":{},
  "queue":"alpha",
  "queue_time":3127905,
  "source":"oban",
  "state":"success",
  "tags":[],
  "worker":"Oban.Integration.Worker"
}
Now, here's a sample where the job has encountered an error:
{
  "attempt": 1,
  "duration": 5432,
  "error": "** (Oban.PerformError) Oban.Integration.Worker failed with {:error, \"ERROR\"}",
  "event": "job:exception",
  "state": "failure",
  "worker": "Oban.Integration.Worker"
}
2.11.0 — 2022-02-13
Enhancements
- [Migration] Change the order of fields in the base index used for the primary Oban queries.
  The new order is much faster for frequent queries such as scheduled job staging. Check the v2.11 upgrade guide for instructions on swapping the index in existing applications.
- [Worker] Avoid spawning a separate task for workers that use timeouts.
- [Engine] Add insert_job, insert_all_jobs, retry_job, and retry_all_jobs as required callbacks for all engines.
- [Oban] Raise more informative error messages for missing or malformed plugins.
  Now missing plugins have a different error from invalid plugins or invalid options.
- [Telemetry] Normalize telemetry metadata for all engine operations:
  - Include changeset for insert
  - Include changesets for insert_all
  - Include job for complete_job, discard_job, etc
- [Repo] Include [oban_conf: conf] in telemetry_options for all Repo operations.
  With this change it's possible to differentiate between database calls made by Oban versus the rest of your application.
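One way to use the oban_conf telemetry option is to filter Oban's queries out of application query logging. This sketch assumes an Ecto repo whose telemetry prefix is [:my_app, :repo] and relies on Ecto exposing telemetry_options as metadata.options; the module name and handler id are hypothetical.

```elixir
defmodule MyApp.QueryLogger do
  require Logger

  # Attach once at application start, e.g. from Application.start/2.
  def attach do
    :telemetry.attach(
      "my-app-query-logger",
      [:my_app, :repo, :query],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event(_event, _measurements, metadata, _config) do
    options = Map.get(metadata, :options, [])

    if Keyword.has_key?(options, :oban_conf) do
      # The query was issued by Oban, so leave it out of the log.
      :ok
    else
      Logger.debug("app query on source #{inspect(metadata[:source])}")
    end
  end
end
```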
Bug Fixes
- [Telemetry] Emit discard rather than error events when a job exhausts all retries.
  Previously discard_job was only called for manual discards, i.e., when a job returned :discard or {:discard, reason}. Discarding for exhausted attempts was done within error_job in error cases.
- [Cron] Respect the current timezone for @reboot jobs. Previously, @reboot expressions were evaluated on boot without the timezone applied. In that case the expression may not match the calculated time and jobs wouldn't trigger.
- [Cron] Delay CRON evaluation until the next minute after initialization. Now all cron scheduling occurs reliably at the top of the minute.
- [Drainer] Introduce discard accumulator for draining results. Now exhausted jobs along with manual discards count as a discard rather than a failure or success.
- [Oban] Expand changeset wrapper within multi function.
  Previously, Oban.insert_all could handle a list of changesets, a wrapper map with a :changesets key, or a function. However, the function had to return a list of changesets rather than a changeset wrapper. This was unexpected and made some multis awkward.
- [Testing] Preserve attempted_at/scheduled_at in perform_job/3 rather than overwriting them with the current time.
- [Oban] Include false as a viable queue or plugin option in typespecs
Deprecations
- [Telemetry] Hard deprecate Telemetry.span/3; previously it was soft-deprecated.
Removals
- [Telemetry] Remove circuit breaker event documentation because :circuit events aren't emitted anymore.
v2.10.1
The previous release, v2.10.0, was immediately retired in favor of this version.
Removed
- [Oban.Telemetry] Remove the customizable prefix for telemetry events in favor of workarounds such as keep/drop in Telemetry Metrics.
v2.10.0
Added
- [Oban.Telemetry] Add customizable prefix for all telemetry events.
  For example, a telemetry prefix of [:my_app, :oban] would span job start telemetry events as [:my_app, :oban, :job, :start]. The default is [:oban], which matches the existing functionality.
Fixed
- [Oban.Plugins.Stager] Use the notifier to broadcast inserted and available jobs rather than inlining them into a Postgres query.
  With this change the notifier is entirely swappable and there isn't any reason to use the Repeater plugin in production.
- [Oban.Plugins.Cron] Validate job options on init.
  Providing invalid job args in the cron tab, e.g. priority: 5 or unique: [], wasn't caught until runtime. At that point each insert attempt would fail, crashing the plugin.
- [Oban.Queue.Producer] Prevent crashing on exception formatting when a job exits without a stacktrace, most notably with {:EXIT, pid}.
- [Oban.Testing] Return invalid results from perform_job, rather than always returning nil.
- [Oban] Validate that a queue exists when controlling or checking locally, e.g. calls to Oban.check_queue or Oban.scale_queue.
- [Oban.Telemetry] Use module capture for telemetry logging to prevent warnings.