feat!: add app.state
#80
base: main
Conversation
Took a quick initial pass. Kind of wish "state" wasn't used for so many different things, it's getting confusing.
Force-pushed the app.state branch from 20ee2b1 to 0af18b8, from 0af18b8 to 532306b, and from 532306b to d5d44d0.
So given that distributed TaskIQ sequencing currently operates as multi-threading (and not multi-processing), this approach of worker-managed application state is okay as long as there aren't multiple workers trying to write the same state at the same time. We may need to integrate a lock to the …

TODO: …
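To make the concern concrete, here is a minimal sketch of what such a lock could look like, assuming the workers share one process under multi-threaded TaskIQ sequencing (the names here are illustrative, not part of this PR):

```python
import threading

# Illustrative: one process-wide lock guarding shared worker state.
_state_lock = threading.Lock()


def write_state(state: dict, key: str, value) -> None:
    # Serialize writes so two worker threads never interleave
    # updates to the same key.
    with _state_lock:
        state[key] = value
```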
Force-pushed from d625e68 to 09332ea.
I still don't really understand why the datastore is on the worker side. I'm going to have to basically re-invent that whole thing for the platform side, more or less reverting or duplicating some of this design. But I think a bunch of this was already touched on in previous discussions, so it's probably pointless to mention.
Will try and figure out how to make this work on the cluster side when I get a chance.
Originally I don't think I actually understood what you were saying, but yes, it is very clear to me now that this needs to be inverted. The application state being worker-side is what I wanted here, but the state snapshot loading and backup should be performed entirely by the runner client. My mistake. I hope you agree the solution should be to update this system function and add …
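A rough sketch of that inverted flow, purely for illustration (the task names and the `datastore`/`run_system_task` attributes are assumptions, since the truncated comment above does not name them):

```python
# Worker owns the live application state; the runner client owns
# snapshot persistence and triggers both system tasks.

async def on_startup(runner):
    # The runner loads the last snapshot from its own datastore
    # and pushes it into the worker's state.
    snapshot = await runner.datastore.load_latest()
    await runner.run_system_task("system:load_snapshot", snapshot)


async def on_checkpoint(runner):
    # The runner asks the worker for a fresh snapshot and backs it up.
    snapshot = await runner.run_system_task("system:create_snapshot")
    await runner.datastore.save(snapshot)
```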
Adds 2 new system tasks to load and store a snapshot of state. Currently this is just `last_block_processed` and `last_block_seen`. This paves the way for the `Parameter` feature, but does not include that just yet. BREAKING CHANGE: state snapshotting migrates from runner to worker
Also updated docstrings within `silverback/application.py`
Force-pushed from c108d89 to 38c747f.
@mikeshultz resolved in 07dd9f9
One question on design inline. I'm good with it as is though.
```python
    # TODO: Migrate these to parameters (remove explicitly from state)
    last_block_seen=self.state.get("system:last_block_seen", -1),
    last_block_processed=self.state.get("system:last_block_processed", -1),
)
```
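For context, a self-contained sketch of the function this fragment closes might look like the following (the `StateSnapshot` model and `create_snapshot` name are assumptions for illustration):

```python
from pydantic import BaseModel


class StateSnapshot(BaseModel):
    # Only these two internal values are snapshotted for now; the
    # Parameter feature is expected to extend this later.
    last_block_seen: int
    last_block_processed: int


def create_snapshot(state: dict) -> StateSnapshot:
    # -1 signals the worker has not seen/processed any block yet.
    return StateSnapshot(
        last_block_seen=state.get("system:last_block_seen", -1),
        last_block_processed=state.get("system:last_block_processed", -1),
    )
```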
question: Do we have future plans for this handler? This more or less is just the runner sending these two values to the worker to send back to the runner.
Or maybe we intend to make this available to user code since it's called every checkpoint?
State snapshots in general will be created using this mechanism (parameters will make use of this), but I added `system:` to the front of these specific internal state variables because then they cannot conflict with any attribute added to state directly by users (e.g. `app.state.something = ...` by definition cannot start with `system:`).
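A small sketch of why that guarantee holds, assuming attribute-style access maps one-to-one onto string keys (this `AppState` class is illustrative, not the PR's implementation):

```python
class AppState:
    """Illustrative attribute-to-key mapping for shared state."""

    def __init__(self):
        object.__setattr__(self, "_store", {})

    def __setattr__(self, name, value):
        # User keys are Python identifiers, so they can never
        # contain ":" and never collide with "system:" keys.
        self._store[name] = value

    def __getattr__(self, name):
        try:
            return object.__getattribute__(self, "_store")[name]
        except KeyError:
            raise AttributeError(name)


state = AppState()
state.something = 42                            # stored under key "something"
state._store["system:last_block_seen"] = 100    # internal namespace
assert state.something == 42
```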
One of my ideas when adding parameters is that those are only allowed to be specific types, and only parameters will end up in state snapshots that get communicated back to the cluster services. All other non-parameter state will not be backed up in snapshots.
There may exist other features that make use of this, like nonce tracking, although they may need better concurrency guarantees (like using a lock).
What I did
This PR adds worker shared state (accessible via `app.state`) and adds 2 new system tasks to support loading and storing a snapshot of that managed state. Currently this is just `last_block_processed` and `last_block_seen`. This paves the way for the `Parameter` feature, but does not include it just yet.

BREAKING CHANGE: the state snapshot feature migrates from runner to worker and must be triggered by the runner.
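As a usage sketch from the application side (the handler pattern follows silverback's documented `SilverbackApp` style; the `blocks_handled` attribute and the `getattr` default behavior are assumptions):

```python
from ape import chain
from silverback import SilverbackApp

app = SilverbackApp()


@app.on_(chain.blocks)
def handle_block(block):
    # User-defined shared state lives in app.state, namespaced apart
    # from the internal "system:" snapshot values.
    count = getattr(app.state, "blocks_handled", 0)
    app.state.blocks_handled = count + 1
    return {"blocks_handled": app.state.blocks_handled}
```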
How I did it
How to verify it
Checklist
- [ ] New test cases have been added and are passing