-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix!(trace): simplify tracing and only write data in complete batches #1437
base: v0.34.x-celestia
Are you sure you want to change the base?
Conversation
pkg/trace/cached_file.go
Outdated
_, err := f.flush(buffer) | ||
if err != nil { | ||
f.logger.Error("tracer failed to write buffered files to file", "error", err) | ||
} | ||
buffer = buffer[:0] // reset buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we're continuing to treat the trace data as expendable and instead optimizing for never blocking. If there's an error writing to the file, then we simply log it and throw away the data.
If this is occurring, we either see it in the logs or the data itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
(I haven't looked into the implementation details of the cached file)
#### Using environment variables for s3 bucket | ||
|
||
Alternatively, you can set the following environment variables: | ||
or using aws s3 (after setting up the aws cli ofc): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or using aws s3 (after setting up the aws cli ofc): | |
or using aws s3 (after [setting up the aws cli](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html)): |
converting to a draft as we're still seeing the issue where the last line doesn't get completely written, even when we sync the file after each flush 🤔 |
Description
This PR removes all ability to push trace data automatically. Instead, we can rely on standard tools such as the aws-cli to push to an s3 bucket.
It also changes the mechanism for buffering, instead of buffering raw bytes, it buffers events atomically. This results in only every writing entire events. This should prevent all (if not the vast majority) of instances where we were writing incomplete json data to the end of the file, leaving it unreadable.
Combined with the removal of the ability to read the json files to push to s3, we now avoid always ignoring data when we push. Instead we just push the data after the experiment is ran. Note that we still ignore data if the buffers are full (should never happen assuming we don't log so much as to overwhelm the ssd and keep the buffers sufficiently high).
Lastly, this PR takes advantage of each file having its own buffer to also serialize and write data in parallel, which should further increase performance. We also replaced the mutex with a channel.
closes #1403
Obligatory Versioning Exclusion Note
While this PR is breaking in that it removes a lot of functionality, it doesn't break the config or comet RPC. This follows the practice that we've been doing with the
trace
package, which is treating it like a dependency. Meaning, this is like if we updated an external package from one major change to another, therefore we don't need to bump core's major release.