This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

statetrack_plugin implementation #6321

Closed · wants to merge 56 commits

Conversation

andresberrios

@andresberrios andresberrios commented Nov 15, 2018

Change Description

This PR is for reviewing the implementation of the statetrack_plugin and related code changes to chainbase.

This is an early version of the code and it still requires more testing. There are various open questions on which we wanted to request feedback from the Block.one developers.
The details on the plugin and motivation are explained in the plugin's readme file: https://github.com/mmcs85/eos/tree/master/plugins/statetrack_plugin

The associated pull request for the chainbase changes is here: EOSIO/chainbase#28

The associated receiver for the operations sent by this plugin is here: https://github.com/andresberrios/statemirror

Consensus Changes

This plugin doesn't directly introduce changes to consensus, but there might be some implications regarding transaction processing time limits that should be evaluated. The details are explained in the above-referenced plugin readme file.

API Changes

No API changes are introduced, but new hooks would be available for plugins to utilize if the chainbase pull request is merged.

Documentation Additions

Most of the documentation related to the current implementation of the plugin is detailed in the above referenced readme file.

Contributor

@wanderingbort wanderingbort left a comment


@andresberrios Thanks for taking on such an ambitious addition to nodeos

I have three high-level concerns with the signal model as presented, and I would like to hear your thoughts on them before digging in more.

  1. The Signals as spec'd do not seem to provide enough information for a downstream consumer to get a consistent view of the state at a particular point in time. For instance, if I wanted to query a few different bits of information at "the end of block A" or "after applying transaction X", there doesn't seem to be enough information produced to achieve this. The information can be used to create a replicated copy of the consensus state that follows the database pretty closely in realtime, but it lacks the contextual information needed to understand whether a create/update/delete is associated with a certain higher-level construct (like a transaction) or a low-level construct (like undoing a speculatively executed transaction).
  2. Pulling a page from dmux: if a downstream consumer wanted to "reduce" state into meaningful metadata, perhaps deriving information at state transitions, this level of signaling does not adequately represent micro forks as orphaned bits outside of consensus. For instance, if Node 1 sees transitions A -> B -> (C' -> C'-rollback ->) C -> D, where C' is a microfork, and Node 2 sees A -> B -> C -> D, it is hard to guarantee that their reduced meta state is equivalent. Put another way, this doesn't abstract away any of the complexity of consensus, and (1) implies that you cannot reconstruct a method of handling it downstream.
  3. A more meta concern is that this produces signaling outside of the atomicity of actions. This may create a false value proposition for contract authors to write more often than necessary (multiple times to the same row in a single action). While that is valid, it may present a bad incentive to place more burden on the shared, scarce resources of on-chain processing to facilitate off-chain side effects. I'm not opposed to allowing things like this, but I do want to have a discussion of the trade-offs it creates.
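To make concern 2 concrete, here is a small sketch (hypothetical op streams and a deliberately simple reducer, not the plugin's actual output format) showing that two nodes' reductions only agree if a microfork rollback emits compensating inverse operations:

```python
# Hypothetical illustration of concern 2: a reducer that folds raw
# create/update/delete ops in arrival order, with no microfork context,
# only converges across nodes if rollbacks emit inverse ops.

def reduce_ops(ops):
    """Naive reducer: net row count per table, folded in arrival order."""
    counts = {}
    for kind, table, delta in ops:
        counts[table] = counts.get(table, 0) + delta
    return counts

# Node 1 sees a microfork C' that is later rolled back.
node1 = [
    ("emplace", "accounts", +1),   # block A
    ("emplace", "accounts", +1),   # block B
    ("emplace", "accounts", +1),   # block C' (microfork)
    ("erase",   "accounts", -1),   # C' rollback streamed as inverse op
    ("emplace", "accounts", +1),   # block C
]
# Node 2 never sees C'.
node2 = [
    ("emplace", "accounts", +1),   # block A
    ("emplace", "accounts", +1),   # block B
    ("emplace", "accounts", +1),   # block C
]

assert reduce_ops(node1) == reduce_ops(node2)
```

Remove the rollback line from `node1` and the final assertion fails: without compensating inverse operations (or explicit fork context), the two nodes' reduced meta states diverge.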

Again, thank you for attempting this; it is truly ambitious and starts a very good conversation.

@mmcs85

mmcs85 commented Nov 16, 2018

Hi there @wanderingbort. Thank you for reviewing this, because we really think this project needs validation from EOSIO devs to evaluate our expectations and understand its viability.

To be honest, my idea (and I think Andrés's as well) for this plugin at this phase was only to stream the database operations to a queue without any context such as the trx/block/fork they belong to, just as the chain_plugin fetches rows on every API request without taking that context into consideration.

  1. I think that to support this, it is always possible to subscribe to other signals and construct, on the plugin side, the relation between a DB op and its trx/blocks. For example, it is possible to add an event on applied actions and (I'm assuming this, so I need your advice) group ops by action, by storing the ops that happen before an action is applied and attaching them to it. This only works on a single thread and only if sequential order is guaranteed.
    Another possible solution is to add that information to each index operation. Could that impact performance?

  2. It depends on whether your reducer function has the ability to correct the algorithm on the undo operations.
    Also, I understand that in some cases you may want to know which op is being reversed as well, and for that an op needs some kind of ID identifying it as the undo of a previous operation.

Maybe I'm wrong, but since I subscribe to the undo signals and emit their reverse operations, in theory I'm streaming fork operations and the reverse operations when they are removed/(orphaned?) as well.

Should I subscribe to events on the reversible_blocks chainbase in the controller and to fork database push_block and pop_block?
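The grouping idea from point 1 could be sketched roughly like this (hypothetical event handler names; assumes single-threaded, strictly sequential signal delivery as noted above):

```python
# Sketch of grouping DB ops by action: buffer ops as they arrive,
# then attach the buffered batch to the applied-action event that
# follows them. Only valid if signals are delivered sequentially
# on a single thread.

class OpGrouper:
    def __init__(self):
        self.pending_ops = []
        self.grouped = []          # list of (action, [ops]) pairs

    def on_db_op(self, op):
        self.pending_ops.append(op)

    def on_applied_action(self, action):
        self.grouped.append((action, self.pending_ops))
        self.pending_ops = []

g = OpGrouper()
g.on_db_op(("emplace", "accounts", "row1"))
g.on_db_op(("modify", "balances", "row2"))
g.on_applied_action("transfer")
g.on_db_op(("erase", "accounts", "row1"))
g.on_applied_action("close")

assert g.grouped[0] == ("transfer", [("emplace", "accounts", "row1"),
                                     ("modify", "balances", "row2")])
```

This keeps the correlation work out of the signal emission path; the same logic could live on the receiver side instead of in the plugin.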

  3. Indeed, the price we pay for these features is that operations can be duplicated due to:
  • Multiple changes to the same row within an action
  • Fork apply and undo operations
  • Failed transactions with exceptions

But the thing is, as long as there is a use case for tracking this information, it's OK for me.
Also, there are some mechanisms to reduce its impact:

  • Avoid using this plugin on a node that has a public API available
  • Get only previously validated and executed transactions from local nodes if possible.
  • Filter the tables you want information from, since contract developers mostly care about a small subset of contract tables.
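The table-filtering mitigation could look something like this (the whitelist shape and field names here are assumptions for illustration, not the plugin's actual options):

```python
# Sketch of filtering streamed ops to a whitelist of tables:
# only forward ops whose (code, scope, table) triple is of interest.

FILTER = {("mycontract", "mycontract", "balances")}  # assumed config

def should_emit(op):
    return (op["code"], op["scope"], op["table"]) in FILTER

ops = [
    {"code": "mycontract", "scope": "mycontract", "table": "balances"},
    {"code": "othercode",  "scope": "othercode",  "table": "stats"},
]
emitted = [op for op in ops if should_emit(op)]
assert len(emitted) == 1
```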

Also, I just realized that a new state history plugin is being developed by you guys, so I'm not sure this is the best way to approach the need to track only table state.

@andresberrios
Author

Thanks @wanderingbort for taking the time to review this. To expand on what @mmcs85 has mentioned:

It is true that we initially were looking to only stream the state DB operations, without much associated context such as the action/transaction that caused them. This is simply because we were pursuing the use cases where that would be enough: after querying the state, the client could get notified of further modifications to the received snapshot via WebSockets or whatever other mechanism the developer implements on the receiver side, thus maintaining a consistent view of the state after any necessary undos are executed.

This does not mean that there aren't more use cases this system can cover very effectively, and that's also why we wanted to get in touch with you guys and see what you consider important to support. On top of that, as @mmcs85 explained, we already had some ideas for supporting further use cases.

  1. As @mmcs85 mentioned, there are some straightforward potential ways to support more context in the operations:

    • Regarding action/transaction context: Assuming a single thread for action processing and consistent sequential order of events (both assumptions seem to be currently valid), one could use the applied action event to track the checkpoints of action start and finish, and consider all state DB operations in between to belong to that particular action. This could be done either on the plugin side or on the receiver side, avoiding putting more load on the nodeos instance for tracking this.
    • Regarding undo operations: In a way, we already had this functionality supported in a previous version of the implementation. We originally were sending the commit, squash, and undo operations as well, along with the chainbase revision they corresponded to. This gave us context in the receiver to keep track of revisions (which are just denoted by the block number) and allowed us to group "changesets" or groups of state DB operations according to what block they belonged to. We would also keep a set of reversed operations ready to be applied in case of an undo, and these reversed operations were discarded once we received a commit operation, which (as we deduced) should signal block finality by committing the specified revision, making it un-undoable. You can check this implementation here, in a previous version. We eventually discarded this implementation for the time being since we found a way to send the actual reversed operations directly from chainbase, saving a lot of extra queries on the receiver side. We did keep in mind that some context for block finality and undo operations should be reintroduced at some point.
  2. Please correct me if I'm wrong, but I think this point is covered, since your reducer implementation would most likely handle the undos gracefully without any further complexity. For instance, if you wanted to keep track of the total sum of the numbers stored in the amount column for each row in the table, you would need to do 3 things:
    1. Handle the emplace operations by adding the amount of that row to the total.
    2. Handle the erase operations by subtracting the amount of the row to be removed from the total.
    3. Handle the modify operations by subtracting the old amount and adding the new one.
    This way, your reducer is agnostic of whether there was an undo or a fork, since it will always simply maintain consistency with whatever is actually stored in the table, as the undos will emit the reversed operations required to apply them. We rely on nodeos' own ability to keep its own chainbase DBs in a consistent state when forks happen, and we just follow those DBs as the single source of truth. As @mmcs85 said:

Maybe I'm wrong, but since I subscribe to the undo signals and emit their reverse operations, in theory I'm streaming fork operations and the reverse operations when they are removed/(orphaned?) as well.

Should I subscribe to events on the reversible_blocks chainbase in the controller and to fork database push_block and pop_block?

If this behaves as we believe it does, then we shouldn't need any further checks with reversible blocks or the fork database, as all of that would be handled by nodeos and applied to the state chainbase, which we hooked into.
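The three reducer steps above can be sketched as follows (hypothetical op shape; the point is that an undo arriving as an ordinary reverse operation needs no special handling):

```python
# Sketch of the three-step reducer described above: maintain the sum
# of an "amount" column from emplace/erase/modify ops. Because undos
# are streamed as ordinary reverse operations, the same three handlers
# keep the total consistent through forks without special casing.

def apply_op(total, op):
    kind = op["kind"]
    if kind == "emplace":
        return total + op["amount"]
    if kind == "erase":
        return total - op["amount"]
    if kind == "modify":
        return total - op["old_amount"] + op["amount"]
    raise ValueError(kind)

ops = [
    {"kind": "emplace", "amount": 10},
    {"kind": "emplace", "amount": 5},
    {"kind": "modify", "old_amount": 5, "amount": 7},   # total becomes 17
    # An undo of the modify arrives as its reverse operation:
    {"kind": "modify", "old_amount": 7, "amount": 5},   # back to 15
]
total = 0
for op in ops:
    total = apply_op(total, op)
assert total == 15
```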

Besides this, in some cases the developer might find it more valuable to run a reducer based on actual applied actions and their parameters instead of state DB operations. In that case, using this plugin, they can simply use the actions DB table. Each applied action will be in this table (which has (code, scope, table) as (system, system, actions) in the current implementation), and they can apply a forward reduction step when an action is emplaced, and a reversed reduction step when actions are erased from the table, such as when there's a fork.
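A sketch of that action-level reduction (the action shape here is hypothetical, not the actual actions table schema): apply a forward step when an action row is emplaced and the corresponding reverse step when it is erased:

```python
# Sketch of reducing over the actions table instead of raw state ops:
# forward step on emplace, reverse step on erase (e.g. after a fork
# removes the action row).

def forward(state, action):
    if action["name"] == "transfer":
        state[action["to"]] = state.get(action["to"], 0) + action["qty"]
        state[action["from"]] = state.get(action["from"], 0) - action["qty"]
    return state

def reverse(state, action):
    if action["name"] == "transfer":
        state[action["to"]] -= action["qty"]
        state[action["from"]] += action["qty"]
    return state

state = {}
t = {"name": "transfer", "from": "alice", "to": "bob", "qty": 3}
state = forward(state, t)
assert state == {"bob": 3, "alice": -3}
state = reverse(state, t)   # the fork erased the action row
assert state == {"bob": 0, "alice": 0}
```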

  3. I'm glad to see you've thought so deeply about this and found this potential mis-incentive. I can only answer with my personal view, as I have no hard proof, but it seems to me that it's hard to come up with a use case where developers would want to write twice to the same DB row in the same action just to trigger two different side-effect handlers. In any case I can think of, for side-effects (unlike reducers), it would make more sense to hook into applied actions (and maybe even irreversible ones) than into state DB changes. Even though this opens the possibility of hooking things up in weird, esoteric ways, I don't see it necessarily as an incentive to do so. The fine-grained (sub-atomic) nature of the state DB operation updates is more beneficial for other things, like reducers, state DB mirroring (to support more complex queries and indexing), and frontend real-time updates.

There might be other use cases that we're not yet considering, but for most dapp development needs that we've thought of, this implementation fulfills all needs so far. It's also worth mentioning that we've had strong interest in this system from some of the biggest dapps out there, so there's definitely a demand.

@b1bart
Contributor

b1bart commented Jul 30, 2019

In the time between this PR and now, we've successfully created plumbing to allow plugins to exist out of the repo. I am going to close this PR as it is stale; it should probably exist as a sibling repo that leverages this feature. See #5026

@b1bart b1bart closed this Jul 30, 2019