-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
indexer: backups are now a deterministic set of SQL statements #1338
base: main
Are you sure you want to change the base?
Conversation
Pull Request Test Coverage Report for Build 9498381141Details
💛 - Coveralls |
062a1a5
to
cc46776
Compare
Pull Request Test Coverage Report for Build 9498587081Details
💛 - Coveralls |
Pull Request Test Coverage Report for Build 9498735272Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. I said I would, but haven't had the time. LGTM with some nits.
Did you consider a test? Just to make sure that we don't regress with any other non-determinism in SQL backups in the future.
Also, I wonder if we should rename the indexer sql "backup" feature to "snapshot", as the latter is perhaps more well understood to mean "deterministic".
statement.WriteString(line) | ||
statement.WriteString("\n") | ||
|
||
if strings.HasSuffix(line, ";") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you sure that the entire text ends with ;
? what if it's like
foo;
bar;
baz
in that case you would ignore baz
, so perhaps add a sanity check at the end that there's no remaining statement.
Could you explain why this is now deterministic? thanks |
cc46776
to
47767ee
Compare
Pull Request Test Coverage Report for Build 9546598969Details
💛 - Coveralls |
it's deterministic because the SQL statements produced by each node are the same (bit-by-bit) . this was not the case with the raw db format. |
they are a (zstd compressed) set of SQL statements also, relevant methods now use io.Reader and io.Writer
47767ee
to
f345045
Compare
this is much less performant that simply exporting/importing the raw binary database, but makes the backup deterministic, allowing StateSync to pull the snapshot from many nodes at the same time