Attempt to reduce memory footprint of compaction #627
Signed-off-by: Ganesh Vernekar <[email protected]>
…d -> new series id) map. Signed-off-by: Ganesh Vernekar <[email protected]>
Results are a little disappointing.
Memory profile for this PR: compact_opt_PR.zip
Memory profile for master: compact_opt_master.zip
I managed to make it better, but it has a failing test in vertical compaction. Will push after I fix it.
This is the profile after the latest commit (getting rid of …).
A minute improvement after the last commit.
This zip contains memory and CPU profiles for both master and this PR while running the above benchmarks (the block creation was done independently and …).
…mory. Signed-off-by: Ganesh Vernekar <[email protected]>
While trying to reduce allocs, I instead managed to get B/op lower than before over the last few commits.
Still digging to reduce allocs further.
Found the culprit for increased allocs: https://github.com/prometheus/tsdb/blob/c1c39ed2e7900fb056445c9818754e3254518a49/compact.go#L870 (this line alone)
With the latest commit (re-using the …), … With that came more ….
I will be looking into reducing more allocs if possible.
This reduces the allocs of WriteLabelIndex significantly. Signed-off-by: Ganesh Vernekar <[email protected]>
Good news in the last 2 commits. Allocs reduced significantly and the diff for allocs is negative now. EDIT: Numbers updated after fixing the race.
More reduction in allocs in the last few commits!
This last commit f8ddc03 shed off about 2-3s of compaction time from a total of 15-17s before. The change was to avoid the repeated binary search on the …. Now it compares like this with the current master.
These benchmark results are on top of the recently merged optimizations and do not relate directly to the numbers in this comment #627 (comment) (those were w.r.t. master without any of the optimizations).
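The sentence above is truncated, so it is unclear exactly what was being binary-searched. As a generic, hypothetical sketch of the optimization it describes: when lookups into a sorted slice themselves arrive in ascending order, a forward-only cursor replaces a per-lookup `sort.Search` and turns m lookups from O(m log n) into a single O(n + m) pass. None of the names below come from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// lookupBinary finds the insertion index of v in the sorted slice s
// with a fresh binary search on every call: O(log n) per lookup.
func lookupBinary(s []uint64, v uint64) int {
	return sort.Search(len(s), func(i int) bool { return s[i] >= v })
}

// cursor exploits the fact that queries arrive in ascending order:
// it only ever moves forward, so a whole pass of m sorted queries
// costs O(n + m) instead of O(m log n).
type cursor struct {
	s   []uint64
	pos int
}

func (c *cursor) lookup(v uint64) int {
	for c.pos < len(c.s) && c.s[c.pos] < v {
		c.pos++
	}
	return c.pos
}

func main() {
	sorted := []uint64{2, 5, 9, 14, 20}
	queries := []uint64{5, 9, 20} // ascending, as during a compaction pass

	c := &cursor{s: sorted}
	for _, q := range queries {
		// Both strategies agree on the result; only the cost differs.
		fmt.Println(lookupBinary(sorted, q) == c.lookup(q))
	}
}
```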
First pass, makes sense overall. I want to see if we can simplify things further.
compact.go
Outdated
return errors.Wrap(err, "write postings")
postingBuf := make([]uint64, 0, 1000000)
var bigEndianPost index.Postings = index.NewBigEndianPostings(nil)
var listPost = index.NewListPostings(nil).(*index.ListPostings)
Can we make NewListPostings return a ListPostings?
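The suggestion is to have the constructor return the concrete type instead of the Postings interface, so the caller's type assertion (`index.NewListPostings(nil).(*index.ListPostings)`) becomes unnecessary. A minimal sketch of the idea, using a simplified stand-in for tsdb's index package rather than its real definitions:

```go
package main

import "fmt"

// Postings mirrors the tsdb index.Postings interface (simplified).
type Postings interface {
	Next() bool
	At() uint64
}

// ListPostings iterates an in-memory slice of series references.
type ListPostings struct {
	list []uint64
	cur  uint64
}

// NewListPostings returns the concrete *ListPostings rather than the
// Postings interface, so callers that want to Reset and reuse the same
// allocation don't need a type assertion.
func NewListPostings(list []uint64) *ListPostings {
	return &ListPostings{list: list}
}

func (p *ListPostings) Next() bool {
	if len(p.list) == 0 {
		return false
	}
	p.cur = p.list[0]
	p.list = p.list[1:]
	return true
}

func (p *ListPostings) At() uint64 { return p.cur }

// Reset lets a single ListPostings be reused across postings lists.
func (p *ListPostings) Reset(list []uint64) {
	p.list = list
	p.cur = 0
}

func main() {
	lp := NewListPostings([]uint64{1, 3, 5})
	var _ Postings = lp // the concrete type still satisfies the interface
	for lp.Next() {
		fmt.Println(lp.At())
	}
}
```

Returning the concrete type costs nothing for interface-based callers, since *ListPostings still implements Postings implicitly.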
head.go
Outdated
@@ -1349,6 +1361,7 @@ func (h *Head) getOrCreateWithID(id, hash uint64, lset labels.Labels) (*memSerie
h.metrics.series.Inc()
We're tracking this through NumSeries(). Can we just drop the gauge and use gaugeFunc()?
Will do. Also, I have another PR for NumSeries() (#656) if that would help in reviewing. Do you want it in a separate PR or inside this PR itself?
almost lgtm :) Will take another pass after suggestions are fixed before I approve though.
While running prombench with this PR, compaction for the head block was failing with the following error.
I am not able to replicate it yet. Compaction from head is working fine in a synthetic test.
Able to re-create it by just running Prometheus on my local machine! Will dig in more.
This fixes a series out-of-order bug when compacting from an on-disk block: postings used to contain invalid series references, because even deleted series were present in seriesMap with bogus series references. Signed-off-by: Ganesh Vernekar <[email protected]>
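A hypothetical sketch of the fix described in this commit message (all names are illustrative, not the PR's actual code): when remapping old series IDs to new ones during compaction, deleted series must be skipped entirely, otherwise the map hands out IDs for series that no longer exist and the rewritten postings reference bogus series, which later surfaces as out-of-order series.

```go
package main

import "fmt"

// buildSeriesMap assigns new, monotonically increasing IDs only to
// series that still exist. Including deleted series (as the buggy
// version effectively did) would leave bogus references in the map.
func buildSeriesMap(allRefs []uint64, deleted map[uint64]bool) map[uint64]uint64 {
	seriesMap := make(map[uint64]uint64, len(allRefs))
	var next uint64
	for _, ref := range allRefs { // allRefs is sorted by old series ID
		if deleted[ref] {
			continue // skip: no new ID, so no bogus reference can leak out
		}
		seriesMap[ref] = next
		next++
	}
	return seriesMap
}

func main() {
	refs := []uint64{10, 11, 12, 13}
	deleted := map[uint64]bool{11: true}
	// 10, 12, 13 get the new IDs 0, 1, 2; 11 gets no entry at all.
	fmt.Println(buildSeriesMap(refs, deleted))
}
```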
The previous way of merging remapped postings did not scale well: I was keeping a sorted list of postings and adding new postings to it while keeping it sorted, which was slow. During the prombench test, compaction of on-disk blocks got stuck at this point and allocations also shot up (there were ~13-14M series in total across all the blocks being compacted). Now I gather the postings list from one index reader at a time and merge the sorted lists in the old-fashioned way. I tested this on the indices I had downloaded from prombench, and they finally don't get stuck here. Below are the updated (synthetic) benchmark results, with 2 blocks, same as all the benchmarks in this PR thread.
This is for 3 blocks, an indication that with more blocks the time gets even worse:
I will test this on prombench now.
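The "old fashioned way" of merging two already-sorted postings lists can be sketched like this (a hypothetical illustration, not the PR's code): a single linear pass per reader is O(total), whereas inserting each posting into a sorted slice one at a time, as the earlier approach did, is O(n) per insert and collapses at ~13-14M series.

```go
package main

import "fmt"

// mergeSorted merges two already-sorted postings lists in one linear
// pass, deduplicating references that appear in both inputs.
func mergeSorted(a, b []uint64) []uint64 {
	out := make([]uint64, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case b[j] < a[i]:
			out = append(out, b[j])
			j++
		default: // same ref in both blocks; keep one copy
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}

func main() {
	// Fold in one (sorted) index reader's postings at a time.
	merged := []uint64{}
	for _, reader := range [][]uint64{{1, 4, 7}, {2, 4, 9}, {3, 8}} {
		merged = mergeSorted(merged, reader)
	}
	fmt.Println(merged) // all refs, sorted, duplicates collapsed
}
```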
Closing it for now. Analysis of this PR vs master is in this doc https://docs.google.com/document/d/1ZzZ8sslkA5LPshr9gqz-MmRpPmyY19jFDz-AKrkwpVw/edit?usp=sharing.
Thanks for that work guys, I think we learnt a lot from this despite the negative outcome of this particular experiment!
What does this PR do?
I will do any more cleanup needed and post the benchmark results tomorrow. Hoping for some good news.
Update: This PR ended up having many more memory-allocation savings in many places than just the above. This comment summarizes it all (hopefully).
UPDATE 2: As this PR got very complicated to review, it has been broken down into multiple PRs (more to come soon). For now, these are the child PRs which originate from this PR:
writeOffsetTable
keys in ReadOffsetTable
WriteChunks and writeHash
stringTuples.Swap
NumSeries() method for BlockReader interface