Let memdb cache last traversed node #1389

Merged 16 commits on Aug 8, 2024

@ekexium ekexium commented Jul 13, 2024

A common pattern in TiDB is to access the same key multiple times consecutively. We can skip the subsequent traversals when this happens.

Benchmark

Sysbench (standard DML)

I tested oltp_write_only and found no difference compared with master.

YCSB benchmark (pipelined DML)

In a YCSB benchmark with 3M rows, the computation duration (total duration minus flush wait) decreases, which indicates the optimization works as expected. We also observe a significant increase in flush_wait when this optimization is applied. IMO, that is a separate problem to track and solve.

Master
insert 47.73-9.74=37.99s
update 30.74-2.78=27.96s
delete 23.81-5.98=17.83s

[pipelined_flush.go:304] ["[pipelined dml] start to commit transaction"] [conn=2669674502] [session_alias=] [keys=5999998] [flush_wait_duration=9.742798374s] [total_duration=47.735029816s] [size=3.506GB] [startTS=451525108111769604]

[pipelined_flush.go:304] ["[pipelined dml] start to commit transaction"] [conn=2669674506] [session_alias=] [keys=2999999] [flush_wait_duration=2.787723191s] [total_duration=30.74270092s] [size=3.243GB] [startTS=451525199285190657]

[pipelined_flush.go:304] ["[pipelined dml] start to commit transaction"] [conn=2669674510] [session_alias=] [keys=5999998] [flush_wait_duration=5.983734665s] [total_duration=23.811122881s] [size=198MB] [startTS=451525285989580803]

This PR
insert 43.12-10.50=32.62s
update 31.76-4.96=26.8s
delete 28.81-15.06=13.75s

[pipelined_flush.go:304] ["[pipelined dml] start to commit transaction"] [conn=3839885318] [session_alias=] [keys=5999998] [flush_wait_duration=10.507358104s] [total_duration=43.121563977s] ["memdb cache hit count"=11999996] ["memdb cache miss count"=11999996] [size=3.506GB] [startTS=451516433638621187]

[pipelined_flush.go:304] ["[pipelined dml] start to commit transaction"] [conn=3839885320] [session_alias=] [keys=2999999] [flush_wait_duration=4.96714099s] [total_duration=31.767803066s] ["memdb cache hit count"=5999998] ["memdb cache miss count"=2999999] [size=3.243GB] [startTS=451516523593334791]

[pipelined_flush.go:304] ["[pipelined dml] start to commit transaction"] [conn=3839885324] [session_alias=] [keys=5999998] [flush_wait_duration=15.061686795s] [total_duration=28.810620825s] ["memdb cache hit count"=8999997] ["memdb cache miss count"=8999997] [size=198MB] [startTS=451516610572976131]

Microbenchmark

Master:

go test -run=^$ -bench=Record -benchtime=30s
goos: linux
goarch: amd64
pkg: github.com/pingcap/tidb/pkg/table/tables
cpu: AMD Ryzen 5 3600 6-Core Processor
BenchmarkAddRecordInPipelinedDML-12       	    3243	  10031009 ns/op	         0 cacheHit/op	         0 cacheMiss/op	      2006 ns/record
BenchmarkRemoveRecordInPipelinedDML-12    	    5106	   7356123 ns/op	         0 cacheHit/op	         0 cacheMiss/op	      1471 ns/record
BenchmarkUpdateRecordInPipelinedDML-12    	    4496	   8319425 ns/op	         0 cacheHit/op	         0 cacheMiss/op	      1664 ns/record
PASS
ok  	github.com/pingcap/tidb/pkg/table/tables	130.371s

This PR:

go test -run=^$ -bench=Record -benchtime=30s
goos: linux
goarch: amd64
pkg: github.com/pingcap/tidb/pkg/table/tables
cpu: AMD Ryzen 5 3600 6-Core Processor
BenchmarkAddRecordInPipelinedDML-12       	    4514	   8312115 ns/op	         2.000 cacheHit/op	         2.000 cacheMiss/op	      1662 ns/record
BenchmarkRemoveRecordInPipelinedDML-12    	    6220	   5921161 ns/op	         1.000 cacheHit/op	         2.000 cacheMiss/op	      1184 ns/record
BenchmarkUpdateRecordInPipelinedDML-12    	    5998	   6389768 ns/op	         2.000 cacheHit/op	         1.000 cacheMiss/op	      1278 ns/record
PASS
ok  	github.com/pingcap/tidb/pkg/table/tables	133.874s
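Extra columns such as cacheHit/op and ns/record are custom benchmark metrics. A minimal sketch of how such metrics can be reported with the standard library's `testing.B.ReportMetric` follows. The counters, the batch size `recordsPerOp`, the workload, and the unit names are all hypothetical, and whether the PR's benchmarks are written this way is an assumption.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// recordsPerOp is a hypothetical batch size: records handled per benchmark iteration.
const recordsPerOp = 5000

// counters stands in for the memdb hit/miss counters (hypothetical).
type counters struct{ hit, miss uint64 }

// processBatch simulates touching every record's key twice:
// the first access walks the tree, the second is served from the cache.
func processBatch(c *counters) {
	for i := 0; i < recordsPerOp; i++ {
		c.miss++
		c.hit++
	}
}

// runBench runs the benchmark outside "go test" and returns its result,
// including the custom metrics reported through ReportMetric.
func runBench() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		var c counters
		start := time.Now()
		for i := 0; i < b.N; i++ {
			processBatch(&c)
		}
		elapsed := time.Since(start)
		records := float64(b.N) * recordsPerOp
		b.ReportMetric(float64(c.hit)/records, "cacheHit/record")
		b.ReportMetric(float64(c.miss)/records, "cacheMiss/record")
		b.ReportMetric(float64(elapsed.Nanoseconds())/records, "ns/record")
	})
}

func main() {
	res := runBench()
	// The result's String form includes the custom metrics next to ns/op.
	fmt.Println(res)
}
```

Normalizing per record rather than per iteration makes runs with different batch sizes comparable.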

Signed-off-by: ekexium <[email protected]>
@ekexium ekexium changed the title Let memdb cache last traversed node [WIP] Let memdb cache last traversed node Jul 15, 2024
@ekexium ekexium changed the title [WIP] Let memdb cache last traversed node Let memdb cache last traversed node Jul 22, 2024
@ti-chi-bot ti-chi-bot bot added the dco-signoff: yes Indicates the PR's author has signed the dco. label Jul 30, 2024
@ekexium ekexium requested review from cfzjywxk, you06 and MyonKeminta and removed request for cfzjywxk and you06 July 31, 2024 14:40
zap.Duration("total_duration", c.txn.GetMemBuffer().GetFlushMetrics().TotalDuration),
zap.Duration("flush_wait_duration", c.txn.GetMemBuffer().GetMetrics().WaitDuration),
zap.Duration("total_duration", c.txn.GetMemBuffer().GetMetrics().TotalDuration),
zap.Uint64("memdb cache hit count", c.txn.GetMemBuffer().GetMetrics().MemDBHitCount),

The name "memdb cache hit count" could confuse readers who don't know the code. Perhaps add "traverse" or something similar to the log field name.

@cfzjywxk cfzjywxk merged commit aa8b338 into tikv:master Aug 8, 2024
11 checks passed