Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

region cache: batch find regions by key ranges from cache #1410

Merged
merged 8 commits into from
Aug 9, 2024

Conversation

wshwsh12
Copy link
Contributor

Currently, client-go only supports batch reading of region information from pd, but not from region cache.
If there is a large table with 200k regions, even if all regions are in memory, it still takes 0.8-0.9s to read all regions.
This PR optimizes the querying times to the B-tree, the time consumption reduces by about 40%.

This PR
Running tool: /gosdk/go1.22.1/bin/go test -benchmem -run=^$ -bench ^BenchmarkBatchLocateKeyRanges$ github.com/tikv/client-go/v2/internal/locate --tags=intest --count=1 -v

goos: linux
goarch: amd64
pkg: github.com/tikv/client-go/v2/internal/locate
cpu: 13th Gen Intel(R) Core(TM) i9-13900KF
BenchmarkBatchLocateKeyRanges
BenchmarkBatchLocateKeyRanges-32    	    1116	   1305628 ns/op	 1412405 B/op	   10921 allocs/op
PASS
ok  	github.com/tikv/client-go/v2/internal/locate	1.987s

Running tool: /gosdk/go1.22.1/bin/go test -benchmem -run=^$ -bench ^BenchmarkBatchLocateKeyRanges$ github.com/tikv/client-go/v2/internal/locate --tags=intest --count=1 -v

Master Branch
goos: linux
goarch: amd64
pkg: github.com/tikv/client-go/v2/internal/locate
cpu: 13th Gen Intel(R) Core(TM) i9-13900KF
BenchmarkBatchLocateKeyRanges
BenchmarkBatchLocateKeyRanges-32    	     502	   2367271 ns/op	 1512372 B/op	   20020 allocs/op
PASS
ok  	github.com/tikv/client-go/v2/internal/locate	1.831s

@ti-chi-bot ti-chi-bot bot added the dco-signoff: yes Indicates the PR's author has signed the dco. label Jul 31, 2024
@wshwsh12 wshwsh12 requested a review from you06 August 1, 2024 07:46
f := func(item *btreeItem) bool {
region := item.cachedRegion
if len(startKey) > 0 && bytes.Compare(region.StartKey(), startKey) <= 0 {
if !region.checkRegionCacheTTL(time.Now().Unix()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move time.Now().Unix() out of the check loop?

var regions []*Region
c.mu.RLock()
defer c.mu.RUnlock()
regions = c.mu.sorted.AscendGreaterOrEqual(startKey, endKey, limit)
regions = c.mu.sorted.DescendLessOrEqual(startKey, endKey)
slices.Reverse(regions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use ascend scan instead of using descend scan and revert it.

Copy link
Contributor Author

@wshwsh12 wshwsh12 Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just use ascend scan maybe lost the first region. For example, search range [b,d) from region [a, c) [c, e), use ascend scan can only get region [c, e).

regions = append(regions, region)
return false
}
if !region.checkRegionCacheTTL(time.Now().Unix()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also a now~

}

// DescendLess returns all items that are less than or equal to the key.
func (s *SortedRegions) DescendLessOrEqual(startKey, endKey []byte) (regions []*Region) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question, why or-equal for this location func? IMO, endKey should be excluded bound.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTree doesn't provide the function DescendLess...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concened here that this function holds a global lock, if it takes too long time, maybe this sql will affect the other sqls, especially the oltp queries.

}
hit := false
for _, r := range batchRegionInCache {
if !r.Contains(keyRange.StartKey) { // uncached hole, load the rest regions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this code. If [keyRange.StartKey, keyRange.Endkey) contains multiple regions, in my understanding only the first region can contain keyRange.StartKey?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After processing a region, I set keyRange.StartKey = r.endkey(). So the next region also can contain the keyRange.StartKey.

internal/locate/region_cache.go Outdated Show resolved Hide resolved
}

// DescendLess returns all items that are less than or equal to the key.
func (s *SortedRegions) DescendLessOrEqual(startKey, endKey []byte) (regions []*Region) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concened here that this function holds a global lock, if it takes too long time, maybe this sql will affect the other sqls, especially the oltp queries.

add benchmark

Signed-off-by: wshwsh12 <[email protected]>
Signed-off-by: wshwsh12 <[email protected]>
Signed-off-by: wshwsh12 <[email protected]>
Signed-off-by: wshwsh12 <[email protected]>
}
keyRange.StartKey = r.EndKey()
// Batch load rest regions from Cache.
batchSize := 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use defaultRegionsPerBatch as the batchSize?

keyRange.StartKey = r.EndKey()
}
if len(batchRegionInCache) < batchSize { // region cache miss, load the rest regions
break outer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like just use break here is ok

@@ -2078,7 +2105,8 @@ func (c *RegionCache) loadRegionByID(bo *retry.Backoffer, regionID uint64) (*Reg
}
}

// TODO(youjiali1995): for optimizing BatchLoadRegionsWithKeyRange, not used now.
// For optimizing BatchLocateKeyRanges, scanRegionsFromCache scans at most `limit` regions from cache.
// The first region's startKey is `startKey`, all regions returned are continuous and have no hole.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you need to emphasize that it is the caller's responsibility to make sure that startKey is a node in the B-tree, otherwise, the startKey will not be included in the return regions?

Signed-off-by: wshwsh12 <[email protected]>
if !region.checkRegionCacheTTL(now) {
return false
}
if len(lastStartKey) > 0 && !region.Contains(lastStartKey) { // uncached hole
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will len(lastStartKey) == 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some test cases have the key, for -inf to +inf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case len(lastStartKey) == 0 should never happen in current usage(start key is assigned here, and it's never empty if the code runs here).

When len(lastStartKey) == 0 and the first region is not start from -inf, we should stop scan regions, see the following test.

func (s *testRegionRequestToSingleStoreSuite) TestRegionCacheStartNonEmpty() {
	_ = s.cache.refreshRegionIndex(s.bo)
	r, _ := s.cache.scanRegionsFromCache(s.bo, []byte{}, nil, 10)
	s.Equal(len(r), 1)

	region, _ := s.cache.LocateRegionByID(s.bo, s.region)
	v2 := region.Region.confVer + 1
	r2 := metapb.Region{Id: region.Region.id, RegionEpoch: &metapb.RegionEpoch{Version: region.Region.ver, ConfVer: v2}, StartKey: []byte{1}}
	st := newUninitializedStore(s.store)

	s.cache.mu.Lock()
	s.cache.mu.sorted.Clear()
	s.cache.mu.Unlock()
	// region cache after clear: []

	s.cache.insertRegionToCache(&Region{meta: &r2, store: unsafe.Pointer(st), ttl: nextTTLWithoutJitter(time.Now().Unix())}, true, true)
	// region cache after insert: [[1, +inf)]

	r, _ = s.cache.scanRegionsFromCache(s.bo, []byte{}, nil, 10)
	s.Equal(len(r), 0) // fail in current implementation
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add the test TestRegionCacheStartNonEmpty and remove the len(lastStartKey) == 0 check. Now the code stop scan regions when len(lastStartKey) == 0 and the first region is not start from -inf. PTAL again.

Signed-off-by: wshwsh12 <[email protected]>
@windtalker
Copy link
Contributor

lgtm

Copy link
Contributor

@you06 you06 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest LGTM

// region cache after insert: [[1, +inf)]

r, _ = s.cache.scanRegionsFromCache(s.bo, []byte{}, nil, 10)
s.Equal(len(r), 0) // fail in current implementation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the comment // fail ...

@@ -2259,13 +2260,35 @@ func (s *testRegionRequestToSingleStoreSuite) TestRefreshCache() {
s.cache.insertRegionToCache(&Region{meta: &r2, store: unsafe.Pointer(st), ttl: nextTTLWithoutJitter(time.Now().Unix())}, true, true)

r, _ = s.cache.scanRegionsFromCache(s.bo, []byte{}, nil, 10)
s.Equal(len(r), 2)
s.Equal(len(r), 1)
Copy link
Contributor

@you06 you06 Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related to this PR I think.

Since region cache doesn't remove the first intersected region(it scan intersected region by AscendGreaterOrEqual), the outdated region (-inf, inf) is still alive and the new inserted valid region [{1}, inf) is ignored because the first seen region (-inf, inf) contains all the required ranges... We finally got outdated region though we know some valid regions.

But this won't affect finding region by single key(it use descend scan).

I don't know if removing all intersected regions is a good idea, what about adding some comments explain it in this test?

Signed-off-by: wshwsh12 <[email protected]>
@wshwsh12 wshwsh12 requested a review from you06 August 8, 2024 06:44
@you06 you06 merged commit f0f57f2 into tikv:master Aug 9, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has signed the dco.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants