Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNM] memdb: replace the current implementation with ART(adaptive radix tree) #1400

Draft
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

you06
Copy link
Contributor

@you06 you06 commented Jul 29, 2024

ref pingcap/tidb#55287

This PR introduce the ART(adaptive radix tree) as a faster replacement of the current memdb. In the micro bench, this implementation outperforms the current memdb in every case, faster in single thread as well as lower total CPU utilization.

The implementation is inspired by plar/go-adaptive-radix-tree, with some additional work:

  • Support memory arena, which reduces the allocation cost in GC language.
  • Support vlog, and cascade transaction(aka. staging/release/cleanup)
  • Support iterator.
  • Support tracking memory usage.

In a word, the ART has the same interface as current memdb.

…a faster replacement of current memdb.

Signed-off-by: you06 <[email protected]>
Signed-off-by: you06 <[email protected]>
@ti-chi-bot ti-chi-bot bot added the dco-signoff: yes Indicates the PR's author has signed the dco. label Jul 29, 2024
Signed-off-by: you06 <[email protected]>
@ekexium ekexium self-requested a review July 30, 2024 02:00
}
}

func (n4 *node4) init() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow reduce duplicated code in all inits?
Or can we consider using generics to reduce more duplicated code? Maybe a benchmark is needed.

return a.getNode256(n.addr)
}

func (an artNode) matchDeep(a *artAllocator, key Key, depth uint32) uint32 /* mismatch index*/ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we unify all receiver types to use pointers?

Comment on lines 504 to 508
for idx := 0; idx < int(n16.nodeNum); idx++ {
if n16.keys[idx] == c {
return idx, n16.children[idx]
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the paper uses SIMD or binary search here. We may benchmark the difference between the linear search and a binary search

go.mod Outdated
@@ -59,3 +60,5 @@ require (
gopkg.in/natefinch/lumberjack.v2 v2.2.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)

replace github.com/plar/go-adaptive-radix-tree => github.com/you06/go-adaptive-radix-tree v0.0.0-20240523051018-0278e8bfcd2b
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also need to review this? Shall we consider merge the changes to upstream?

Copy link
Contributor Author

@you06 you06 Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to review this, I import it only for benchmark test. My changes in 0278e8bfcd2b is only for the special usage in p-dml. As the mem arena implementation, I don't think the auther would like to accept it, since the impact of this change is too large(all the pointers are replaced by our self-defined address), and it actually sacrifice the read performance by introducing vlog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the memory arena without vlog is useful for them, but I don't have any performance data for it, what about discussing with them to see if they would like to adopt our improvement, which is not related to our implementation in client-go(I think client-go's requirements are quite unique).

@@ -0,0 +1,265 @@
package unionstore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file still needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some benchmark comparations between plar/go-adaptive-radix-tree and ours in memdb_bench_test.go, it should be removed later.

}

// ARTCheckpoint is the checkpoint of memory DB.
type ARTCheckpoint struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can merge it with MemDBCheckPoint, maybe

keptFlags := lf.getKeyFlags()
keptFlags = keptFlags.AndPersistent()
if keptFlags == 0 {
lf.markDelete()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we only mark it deleted, but not freeing the node as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Free the node needs to change the parent of it, and I'm not inclined to store the parent address in leaf (which consumes more memory).

Note delete node in the original memdb actually do not release any space, see comment, the only benefit of it is the suppress the height of tree, which is not a issue for ART.

So I think mark it as deleted won't make it worse at least.

Like the memdb's comment, I do not implement reusing leaf not memdb by now, because of the difficulty of length-variety.

node4size = 80
node16size = 236
node48size = 888
node256size = 3096
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to replace it with node256Size=Unsafe.SizeOf(node256{})?

}

func (n4 *node4) prevPresentIdx(start int) int {
mask := uint8(1<<(start+1) - 1) // e.g. start=3 => 0b000...0001111
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It might be clearer to add another pair of parentheses

return 8 - zeros - 1
}

func (n16 *node16) init() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow reduce duplicated code in all inits?
Or can we consider using generics to reduce more duplicated code? Maybe a benchmark is needed.

)

const (
alignMask = 1<<32 - 8 // 29 bit 1 and 3 bit 0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
alignMask = 1<<32 - 8 // 29 bit 1 and 3 bit 0.
alignMask = 0xFFFFFFF8 // 29 bits of 1 and 3 bits of 0

Is it better for 32-bit machine compatibility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied it from memdb's arena.
My understanding is because the address is 32bit integer, 32 bit mask is enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is will it overflow in a 32-bit machine?

Copy link
Contributor

@ekexium ekexium Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, modifying the type would necessitate changes in numerous locations where int is currently implemented. In my opinion, we should avoid using int in these places... I'm comfortable with simply declaring that we don't support 32-bit systems. Just let it panic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is will it overflow in a 32-bit machine?

The maxBlockSize is set to 128MB(128 << 20), which is far below maxuint32. The offset of address is less than maxBlockSize so it won't overflow.


We can declare it don't support 32-bit systems, but I still want to use uint32 address 1 it reduces the memory usage for address. Is const alignMask uint32 = 0xFFFFFFF8 what you mean by avoiding using int.

off uint32
}

func (addr nodeAddr) isNull() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function can be optimized as return addr == nullAddr || addr.idx == math.MaxUint32 || addr.off == math.MaxUint32
So it can be inlined and reduce branches. I've seen a notable time spent in it for the original memdb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What an inline trick.

it.idxes = it.idxes[:0]
}

type ArtMemKeyHandle struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be declared in art_iterator.go. MemKeyHandle can be used here.
Even if the vanilla memdb will be removed, some structures can be shared, I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use MemKeyHandle will lead to cyclic imports now actually.

ArtMemKeyHandle is same to MemKeyHandle because the mem arena is same, but I think they can be different for different MemBuffer implementation.

The best choice is associated type. But associated type is not supported, maybe try generic type in memBufferMutations if we want it to support all the MemBuffer implementation.

internal/unionstore/art/arena.go Outdated Show resolved Hide resolved
return tombstone
}
valOff := hdrOff - valLen
return block[valOff:hdrOff:hdrOff]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this line is copied from memdbArena. I'm just wondering why its capacity is specified as hdrOff

Copy link
Contributor Author

@you06 you06 Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See "Full slice expressions" sector in Slice expressions, the capacity is set to max - low(hdrOff - valOff in our code).

internal/unionstore/art/node.go Outdated Show resolved Hide resolved
@cfzjywxk
Copy link
Contributor

@you06
Please change it to draft PR or adding [DNM] flag on the title.

@you06 you06 changed the title memdb: replace the current implementation with ART(adaptive radix tree) [DNM] memdb: replace the current implementation with ART(adaptive radix tree) Aug 21, 2024
@you06 you06 marked this pull request as draft August 21, 2024 05:34
@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has signed the dco. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants