-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
builder baseMap concurrent write fix lock #11588
Conversation
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
Trivy scan found the following vulnerabilities:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple notes, but what's unclear to me is why this would address the issue fully... The panic occurs at pkg/logql/log/labels.go:468
which is when the newly-retrieved map from the pool is clear
-ed, suggesting that we're putting a map back into the pool without removing all references to it. We can add the locks and that'll make the problem go away, but is the underlying issue not the references?
//for _, tt := range tests { | ||
// t.Run(tt.fmt, func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//for _, tt := range tests { | |
// t.Run(tt.fmt, func(t *testing.T) { |
ls := labels.FromStrings("cluster", "us-central-0") | ||
builder := NewBaseLabelsBuilder().ForLabels(ls, ls.Hash()) | ||
wg := sync.WaitGroup{} | ||
wg.Add(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this not need to be in the loop?
|
||
} | ||
wg.Done() | ||
time.Sleep(2 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the sleep for?
superseded by #11607 |
It looks like with either sufficiently large line formats, or line formats that have some kind of operation that results in two line formats being executed concurrently (somehow, I do not understand how), it's possible for us to end up with a concurrent write as part of
LineFormatter.Process
calls.This AFAICT existed before #11484 but a query was executed in one of our environments today that made us think the panic was from those changes. I suspect the baseMap usage is the underlying cause, and can confirm by deploying the changes from here into one of our environments tomorrow and running the same query I saw today that caused the panic.