Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: protect epoch checksum with its own mutex #2977

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

buck54321
Copy link
Member

@buck54321 buck54321 commented Sep 17, 2024

The trackedTrade.csum field was protected under the very broad trackedTrade.mtx. This created lock contention when preimage requests came in for cancel orders while we had long running ticks, since acceptCsum had to lock that mutex. With this change, we now protect the checksum fields for trades and cancels under a different mutex.

Additionally, there were two places where the trackedTrade.cancel field was being read unsafely. A solution for that is implemented.

Closes #2964
Supercedes #2973

Copy link
Member

@JoeGruffins JoeGruffins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but trying it out all of my cancels fail.

2024-09-18 08:07:17.441 [INF] CORE: Cancel order d208ee4fa7f536ffb2025bf6c1b31d1fbe9af6c506d3332fb70478ab3cac7e24 targeting order 7f0eec29050cd41ed20ce16aa5aec94c08bfb4cf6aeb8ef60b9c483478c078ab at 127.0.0.1:17273 has been placed
2024-09-18 08:07:17.441 [DBG] CORE: notify: |POKE| (order) Cancelling order - A cancel order has been submitted for order {{{order|7f0eec29050cd41ed20ce16aa5aec94c08bfb4cf6aeb8ef60b9c483478c078ab}}} - Order: 7f0eec29050cd41ed20ce16aa5aec94c08bfb4cf6aeb8ef60b9c483478c078ab
....
2024-09-18 08:07:30.000 [DBG] CORE: Received preimage request for order d208ee4fa7f536ffb2025bf6c1b31d1fbe9af6c506d3332fb70478ab3cac7e24 with known commitment 753dae1f98bd897467f8754912c82533b36612288f448f9dbefb472085a35fe7
2024-09-18 08:07:30.000 [DBG] CORE: async processPreimageRequest for d208ee4fa7f536ffb2025bf6c1b31d1fbe9af6c506d3332fb70478ab3cac7e24 succeeded
2024-09-18 08:07:30.000 [ERR] CORE[wss://127.0.0.1:17273/ws]: No handler found for response: {"type":2,"id":8886,"payload":{"result":null,"error":{"code":45,"message":"preimage hash f655dda0147785bf309308f63fe2c50a548f5ccecd864b91932db04823f23069 does not match order commitment 753dae1f98bd897467f8754912c82533b36612288f448f9dbefb472085a35fe7"}},"sig":""}
2024-09-18 08:07:30.011 [DBG] CORE: Score changed at 127.0.0.1:17273. New score is 13 / 60, tier = 10, penalties = 0
2024-09-18 08:07:30.011 [WRN] CORE: Deleting failed cancel order d208ee4fa7f536ffb2025bf6c1b31d1fbe9af6c506d3332fb70478ab3cac7e24 that targeted trade order 7f0eec29050cd41ed20ce16aa5aec94c08bfb4cf6aeb8ef60b9c483478c078ab
2024-09-18 08:07:30.013 [WRN] CORE: notify: |WARNING| (order) Failed cancel - Cancel order for order 7f0eec29050cd41ed20ce16aa5aec94c08bfb4cf6aeb8ef60b9c483478c078ab failed and is now deleted. - Order: 7f0eec29050cd41ed20ce16aa5aec94c08bfb4cf6aeb8ef60b9c483478c078ab

Comment on lines 4002 to +4012
tracker.mtx.RLock()
if !bytes.Equal(firstCSum, tracker.csum) {
csum := tracker.csum
tracker.mtx.RUnlock()
if !bytes.Equal(firstCSum, csum) {
t.Fatalf(
"[handlePreimageRequest] csum was changed, exp: %s, got: %s",
firstCSum,
tracker.csum,
csum,
)
}
tracker.mtx.RUnlock()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this one should also be holding the new tracker.csumMtx instead

Comment on lines 4073 to +4076
tracker.mtx.RLock()
if !bytes.Equal(csum, tracker.csum) {
checkSum := tracker.csum
tracker.mtx.RUnlock()
if !bytes.Equal(csum, checkSum) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should hold the csum mutex?

Comment on lines 4157 to +4160
tracker.mtx.RLock()
if !bytes.Equal(commitCSum, tracker.cancel.csum) {
cancelCsum := tracker.cancelCsum
tracker.mtx.RUnlock()
if !bytes.Equal(commitCSum, cancelCsum) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should hold the csum mutex?

Comment on lines 4247 to +4249
tracker.mtx.RLock()
if !bytes.Equal(firstCSum, tracker.cancel.csum) {
cancelCsum := tracker.cancelCsum
tracker.mtx.RUnlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should hold the csum mutex instead?

Comment on lines 4333 to +4336
tracker.mtx.RLock()
if !bytes.Equal(csum, tracker.cancel.csum) {
cancelCsum := tracker.cancelCsum
tracker.mtx.RUnlock()
if !bytes.Equal(csum, cancelCsum) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should hold the csum mutex?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

core: panic checking cancel order checksum
2 participants