cometindex: speedup by committing event changes in batches of 1000 (#4854)

Instead of creating one transaction for each event we need to index, we
now commit the open transaction only every 1000 events (or when we've
caught up to the database).

This gives about a 5x speedup when catching up.
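
A minimal sketch of the batching pattern described above, not the indexer's actual code. `MockDb`, `MockTx`, and `index_in_batches` are hypothetical names introduced for illustration: the mock keeps rows in memory and only counts commits, standing in for the real database handle. It shows one transaction being reused across events, committed every `batch_size` events, and flushed once at the end.

```rust
/// Hypothetical in-memory stand-in for the destination database.
/// It records committed rows and counts how many commits happened.
#[derive(Default)]
struct MockDb {
    commits: usize,
    rows: Vec<u64>,
}

/// Hypothetical open transaction: writes are buffered until commit.
struct MockTx {
    pending: Vec<u64>,
}

impl MockDb {
    /// Open a new, empty transaction.
    fn begin(&self) -> MockTx {
        MockTx { pending: Vec::new() }
    }

    /// Apply the buffered writes and count the commit.
    fn commit(&mut self, tx: MockTx) {
        self.rows.extend(tx.pending);
        self.commits += 1;
    }
}

/// Write every event, committing once per `batch_size` events
/// instead of once per event, then flush whatever is left.
fn index_in_batches(db: &mut MockDb, events: &[u64], batch_size: usize) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut tx = db.begin();
    let mut processed = 0usize;
    for &event in events {
        tx.pending.push(event);
        processed += 1;
        // Close the transaction only at a batch boundary, then open a new one.
        if processed % batch_size == 0 {
            db.commit(tx);
            tx = db.begin();
        }
    }
    // Flush the final partial batch (this commit may be empty).
    db.commit(tx);
}
```

With 2500 events and a batch size of 1000, this sketch commits three times (two full batches plus the final flush), whereas a batch size of 1 commits after every event. The final flush matters: without it, events after the last batch boundary would never be committed.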

## Checklist before requesting a review

- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

  > indexing only
cronokirby committed Sep 17, 2024
1 parent 6c0ba1c commit ae4ed05
Showing 1 changed file with 9 additions and 3 deletions.
12 changes: 9 additions & 3 deletions crates/util/cometindex/src/indexer.rs
@@ -152,6 +152,7 @@ impl Indexer {
         let mut relevant_events = 0usize;

         let mut es = read_events(&src_db, watermark);
+        let mut dbtx = dst_db.begin().await?;
         while let Some(event) = es.next().await.transpose()? {
             if scanned_events % 1000 == 0 {
                 tracing::info!(scanned_events, relevant_events);
@@ -178,8 +179,6 @@ impl Indexer {

             relevant_events += 1;

-            // Otherwise we have something to process. Make a dbtx
-            let mut dbtx = dst_db.begin().await?;
             for index in indexes {
                 if index.is_relevant(&event.as_ref().kind) {
                     tracing::debug!(?event, ?index, "relevant to index");
@@ -188,8 +187,15 @@ impl Indexer {
             }
             // Mark that we got to at least this event
             update_watermark(&mut dbtx, event.local_rowid).await?;
-            dbtx.commit().await?;
+            // Only commit in batches of <= 1000 events, for about a 5x performance increase when
+            // catching up.
+            if relevant_events % 1000 == 0 {
+                dbtx.commit().await?;
+                dbtx = dst_db.begin().await?;
+            }
         }
+        // Flush out the remaining changes.
+        dbtx.commit().await?;

         Ok(())
     }