Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!(trace): simplify tracing and only write data in complete batches #1437

Draft
wants to merge 9 commits into
base: v0.34.x-celestia
Choose a base branch
from

Conversation

evan-forbes
Copy link
Member

@evan-forbes evan-forbes commented Jul 29, 2024

Description

This PR removes all ability to push trace data automatically. Instead, we can rely on standard tools such as the aws-cli to push to an s3 bucket.

It also changes the mechanism for buffering, instead of buffering raw bytes, it buffers events atomically. This results in only every writing entire events. This should prevent all (if not the vast majority) of instances where we were writing incomplete json data to the end of the file, leaving it unreadable.

Combined with the removal of the ability to read the json files to push to s3, we now avoid always ignoring data when we push. Instead we just push the data after the experiment is ran. Note that we still ignore data if the buffers are full (should never happen assuming we don't log so much as to overwhelm the ssd and keep the buffers sufficiently high).

Lastly, this PR takes advantage of each file having its own buffer to also serialize and write data in parallel, which should further increase performance. We also replaced the mutex with a channel.

closes #1403

Obligatory Versioning Exclusion Note

While this PR is breaking in that it removes a lot of functionality, it doesn't break the config or comet RPC. This follows the practice that we've been doing with the trace package, which is treating it like a dependency. Meaning, this is like if we updated an external package from one major change to another, therefore we don't need to bump core's major release.

@evan-forbes evan-forbes added WS: Big Blonks 🔭 Improving consensus critical gossiping protocols fix WS: Maintenance 🔧 labels Jul 29, 2024
@evan-forbes evan-forbes self-assigned this Jul 29, 2024
@evan-forbes evan-forbes requested a review from a team as a code owner July 29, 2024 12:06
@evan-forbes evan-forbes requested review from cmwaters, ninabarbakadze and staheri14 and removed request for a team July 29, 2024 12:06
Comment on lines 82 to 86
_, err := f.flush(buffer)
if err != nil {
f.logger.Error("tracer failed to write buffered files to file", "error", err)
}
buffer = buffer[:0] // reset buffer
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we're continuing to treat the trace data as expendable and instead optimizing for never blocking. If there's an error writing to the file, then we simply log it and throw away the data.

If this is occurring, we either see it in the logs or the data itself.

@evan-forbes evan-forbes changed the title fix!: simplify tracing and only write data in complete batches fix!(trace): simplify tracing and only write data in complete batches Jul 29, 2024
cmwaters
cmwaters previously approved these changes Jul 31, 2024
Copy link
Contributor

@cmwaters cmwaters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

(I haven't looked into the implementation details of the cached file)

rootulp
rootulp previously approved these changes Aug 5, 2024
#### Using environment variables for s3 bucket

Alternatively, you can set the following environment variables:
or using aws s3 (after setting up the aws cli ofc):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
or using aws s3 (after setting up the aws cli ofc):
or using aws s3 (after [setting up the aws cli](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html)):

@evan-forbes evan-forbes dismissed stale reviews from cmwaters and rootulp via 18ce0ea August 15, 2024 01:27
@evan-forbes evan-forbes marked this pull request as draft August 15, 2024 04:49
@evan-forbes
Copy link
Member Author

converting to a draft as we're still seeing the issue where the last line doesn't get completely written, even when we sync the file after each flush 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix WS: Big Blonks 🔭 Improving consensus critical gossiping protocols WS: Maintenance 🔧
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate malformed data in JSONL traced files
3 participants