chore: merge from develop #224

Merged
merged 38 commits into from
Aug 5, 2024

Conversation

joway
Collaborator

@joway joway commented Aug 5, 2024

No description provided.

zhangyunhao116 and others added 30 commits January 18, 2022 15:14
use wyrand, see https://github.com/wangyi-fudan/wyhash.

name                                old time/op  new time/op  delta
SingleCore/math/rand-Uint32()-16    10.8ns ± 0%  10.8ns ± 0%     ~     (p=0.913 n=7+10)
SingleCore/fast-rand-Uint32()-16    2.26ns ± 0%  2.25ns ± 0%     ~     (p=0.015 n=10+10)
SingleCore/math/rand-Uint64()-16    11.1ns ± 0%  11.1ns ± 0%     ~     (p=0.014 n=10+8)
SingleCore/fast-rand-Uint64()-16    5.03ns ± 0%  4.75ns ± 0%   -5.50%  (p=0.000 n=10+9)
SingleCore/math/rand-Read1000-16     682ns ± 0%   682ns ± 0%     ~     (p=0.927 n=10+10)
SingleCore/fast-rand-Read1000-16     298ns ± 1%   150ns ± 0%  -49.69%  (p=0.000 n=10+9)
MultipleCore/math/rand-Uint32()-16   114ns ± 3%   113ns ± 4%     ~     (p=0.306 n=10+10)
MultipleCore/fast-rand-Uint32()-16  0.18ns ± 1%  0.18ns ± 2%     ~     (p=0.006 n=9+10)
MultipleCore/math/rand-Uint64()-16   115ns ± 6%   118ns ± 3%     ~     (p=0.018 n=10+9)
MultipleCore/fast-rand-Uint64()-16  0.39ns ± 1%  0.38ns ± 0%   -1.55%  (p=0.000 n=10+8)
MultipleCore/math/rand-Read1000-16  1.02µs ± 3%  1.03µs ± 4%     ~     (p=0.644 n=10+10)
MultipleCore/fast-rand-Read1000-16   112ns ± 0%   102ns ± 1%   -9.38%  (p=0.000 n=10+10)
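
For readers unfamiliar with wyrand, the generator step can be sketched as below. This is a minimal illustration, not the code from this commit: the `wyrand` type and its method names are hypothetical, and the two constants are the ones I recall from one published version of wyhash (other versions use different constants).

```go
package main

import (
	"fmt"
	"math/bits"
)

// wyrand is a hypothetical wrapper type; state is the 64-bit generator state.
type wyrand struct{ state uint64 }

// Uint64 advances the state by a fixed odd constant, then mixes it with a
// 64x64 -> 128-bit multiply and folds the two halves together with XOR.
// The constants are assumed from one wyhash release, not from this PR.
func (r *wyrand) Uint64() uint64 {
	r.state += 0xa0761d6478bd642f
	hi, lo := bits.Mul64(r.state, r.state^0xe7037ed1a0b428db)
	return hi ^ lo
}

// Uint32 truncates a 64-bit draw to its low 32 bits.
func (r *wyrand) Uint32() uint32 { return uint32(r.Uint64()) }

func main() {
	r := &wyrand{state: 42}
	fmt.Println(r.Uint64(), r.Uint32())
}
```

Each call costs one addition, one wide multiply and one XOR, which is a plausible reason for the small per-call times in the table above.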
* feat: support change min/max gc percent limit

* fix: make max/min gc percent private

* feat: add get min/max gc percent interface
Change-Id: I63cc5c0e8d7b69b627322220881ded0746001c18

Co-authored-by: gaohui.000 <[email protected]>
Change-Id: I7bff5f64b3b95eab93f8e46132dd264c519c6fd6

Co-authored-by: zhangjie.001 <[email protected]>
* fix(metainfo): fix value override by append

* ci(github): increase action timeout for benchdiff
Signed-off-by: cui fliter <[email protected]>
* feat: add RangePersistentValues

Change-Id: I60f02e8a0769ea855d775cbfc1980915bb76dfff

Change-Id: I1d9a81d81b7fec7ee90396ebdeaa2e8e6e213cfa

Co-authored-by: caimufu <[email protected]>
* feat(lscq): add arm64 support

Use the CASP instruction to implement double-width CAS on arm64.
The CASP instruction is available on Armv8.1 and later
(including Apple M1/M2/A12, Snapdragon 845, etc.).
All tests in lscq_test.go passed on a MacBook Air M1 2020 running macOS 12.5.1.

Change-Id: Ieb89fa9361f1fce8fb52b102dca556867f7e8e8a
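
As a reference for what "double-width CAS" means here, the following is a software model of the operation's semantics only. It assumes a hypothetical `uint128` type and a `compareAndSwapUint128` signature chosen for illustration, and it uses a mutex where the real implementation would use a single CASP (or an LDAXP/STLXP loop) in assembly.

```go
package main

import (
	"fmt"
	"sync"
)

// uint128 is a hypothetical two-word value updated as one unit.
type uint128 [2]uint64

// mu stands in for the atomicity that the hardware instruction provides.
var mu sync.Mutex

// compareAndSwapUint128 replaces both words with (new1, new2) only if they
// currently equal (old1, old2). It reports whether the swap happened.
func compareAndSwapUint128(addr *uint128, old1, old2, new1, new2 uint64) bool {
	mu.Lock()
	defer mu.Unlock()
	if (*addr)[0] != old1 || (*addr)[1] != old2 {
		return false
	}
	(*addr)[0], (*addr)[1] = new1, new2
	return true
}

func main() {
	v := uint128{1, 2}
	fmt.Println(compareAndSwapUint128(&v, 1, 2, 3, 4), v)
	fmt.Println(compareAndSwapUint128(&v, 1, 2, 5, 6), v)
}
```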

* use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on ARMv8.1+; use LDAXP/STLXP instructions on ARMv8.0 instead).
Currently, package golang.org/x/sys/cpu does not fully support darwin/arm64 (Apple M1/M2), so cpu.ARM64.HasATOMICS is false there even though the feature is actually present.
runtime/internal/cpu fully supports detecting CPU features on darwin/arm64, so extracting code from it may be a good workaround.

* feat(lscq): use LDAXP/STLXP for armv8.0 instead of CASP

Use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on ARMv8.1+; use LDAXP/STLXP instructions on ARMv8.0 instead).
Currently, package golang.org/x/sys/cpu does not fully support darwin/arm64 (Apple M1/M2), so cpu.ARM64.HasATOMICS is false there even though the feature is actually present.
runtime/internal/cpu fully supports detecting CPU features on darwin/arm64, so extracting code from it may be a good workaround.

* fix(lscq): fix bad indentation

Fix the bad indentation problem.

* fix(lscq): replace ORR with MOVD

Use MOVD instead of ORR to copy between registers. Tidy the indentation between operator and operand.

* fix(lscq): detect atomics feature correctly on darwin

On darwin, golang.org/x/sys/cpu.ARM64.HasATOMICS is set to false
when it should actually be true. Therefore, in order to use the faster
CASPD instruction on darwin_arm64, we can use sysctl/sysctlbyname
to detect CPU features on darwin, see https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code.
The sysctlEnabled function comes from internal/cpu.sysctlEnabled,
which calls sysctlbyname to detect whether a specific CPU feature is enabled.
It is fine to keep using golang.org/x/sys/cpu.ARM64.HasATOMICS on other operating systems.
One more bug fixed: MOVD R2, R6 => MOVD R6, R2. The bug could not be observed
while arm64HasAtomics was set to false.
All tests and benchmarks passed.
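
The detection rule described above can be sketched as a small decision function. This is an illustration under assumptions, not the commit's code: `hasAtomics` and its parameters are hypothetical, the probe is injected so the sketch runs on any platform, and the sysctl key name `hw.optional.armv8_1_atomics` is my assumption about which key a darwin probe would query.

```go
package main

import (
	"fmt"
	"runtime"
)

// hasAtomics decides whether the CASP fast path may be used.
//   goos          - target operating system, e.g. runtime.GOOS
//   generic       - result of a generic detector such as cpu.ARM64.HasATOMICS
//   sysctlEnabled - hypothetical probe that wraps sysctlbyname on darwin
func hasAtomics(goos string, generic bool, sysctlEnabled func(name string) bool) bool {
	if goos == "darwin" {
		// The generic detector reports false on darwin, so ask the OS instead.
		// The key name below is an assumption.
		return sysctlEnabled("hw.optional.armv8_1_atomics")
	}
	return generic
}

func main() {
	// Stub probe standing in for a real sysctlbyname call.
	probe := func(name string) bool { return name == "hw.optional.armv8_1_atomics" }
	fmt.Println(hasAtomics(runtime.GOOS, false, probe))
}
```

Injecting the probe keeps the decision testable without an arm64 machine; a real build would select the probe with build tags.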

Co-authored-by: yuchengye <[email protected]>
* chore: fix gctuner tests

* chore: fix finalizer test

* chore: use manual gc for finalizer test

Co-authored-by: Pure White <[email protected]>
The key problem is that LoadOrStore uses an outdated highest level to search the skip list. If the newly generated level (say newLevel) is bigger than the outdated highest level (say oldLevel), all slots in the slice preds[newLevel-oldLevel:newLevel] are nil, and the function panics.

The solution: if newLevel has updated the highest level, we simply continue the loop and find a new path. On that next pass, the latest highest level used by findNode is always bigger than or equal to newLevel, so the function won't panic.

name                            old time/op  new time/op  delta
LoadOrStoreExist-16             4.43ns ±11%  1.06ns ±21%  -76.05%  (p=0.000 n=10+10)
LoadOrStoreLazyExist-16         4.61ns ± 6%  1.16ns ± 0%  -74.90%  (p=0.000 n=9+7)
LoadOrStoreExistSingle-16       37.5ns ± 6%  10.3ns ± 0%  -72.61%  (p=0.000 n=9+9)
LoadOrStoreLazyExistSingle-16   38.3ns ±11%  10.6ns ± 0%  -72.43%  (p=0.000 n=10+10)
LoadOrStoreRandom-16             209ns ±14%   206ns ±14%     ~     (p=0.684 n=10+10)
LoadOrStoreLazyRandom-16         215ns ± 8%   207ns ± 9%     ~     (p=0.139 n=10+10)
LoadOrStoreRandomSingle-16      1.05µs ± 1%  1.04µs ± 4%     ~     (p=0.535 n=9+10)
LoadOrStoreLazyRandomSingle-16  1.07µs ± 1%  1.06µs ± 3%   -1.09%  (p=0.029 n=8+9)
* fix(metainfo): old kvs missed after `WithValues()`

* opt: add Len() API for better performance when dumping KVS
NX-Official and others added 8 commits March 15, 2024 14:28
* chore: fix license check ci

* chore: rm feishu notify since it no longer works

* chore: rm bench diff since it no longer works

* chore: change license ci

* chore: speed up ci

* chore: fix TestBreakerConcurrent ci

* chore: skip unstable unit test since it is time-sensitive

* chore: fix test race

* fix: FromHTTPHeader add stale

* fix: TestChannelNoConsumer race

* chore: grant owner to joway
* chore: rm linked fast rand for test

* chore: rm unstable ci assert

* chore: speed up TestDigest
@joway joway merged commit 3042f73 into main Aug 5, 2024
4 checks passed