fix!(trace): simplify tracing and only write data in complete batches #1437

evan-forbes · 2024-07-29T12:06:04Z

Description

This PR removes all ability to push trace data automatically. Instead, we can rely on standard tools such as the aws-cli to push to an s3 bucket.

It also changes the buffering mechanism: instead of buffering raw bytes, it buffers events atomically, so only entire events are ever written. This should prevent all, or at least the vast majority, of the cases where we were writing incomplete JSON data to the end of the file, leaving it unreadable.

Combined with the removal of the ability to read the JSON files to push to s3, we no longer have to ignore data whenever we push. Instead, we just push the data after the experiment has run. Note that we still ignore data if the buffers are full (this should never happen, assuming we don't log so much that we overwhelm the SSD and we keep the buffers sufficiently large).

Lastly, this PR takes advantage of each file having its own buffer to also serialize and write data in parallel, which should further increase performance. We also replaced the mutex with a channel.

closes #1403

Obligatory Versioning Exclusion Note

While this PR is breaking in that it removes a lot of functionality, it doesn't break the config or comet RPC. This follows the practice that we've been using with the trace package, which is treating it like a dependency. In other words, this is as if we updated an external package from one major version to another, so we don't need to bump core's major release.

evan-forbes · 2024-07-29T12:11:08Z

pkg/trace/cached_file.go

+			_, err := f.flush(buffer)
+			if err != nil {
+				f.logger.Error("tracer failed to write buffered files to file", "error", err)
+			}
+			buffer = buffer[:0] // reset buffer


Here we're continuing to treat the trace data as expendable and optimizing for never blocking instead. If there's an error writing to the file, we simply log it and throw away the data.

If this is occurring, we either see it in the logs or the data itself.
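
To make the "never block" trade-off concrete, here is a minimal sketch of a non-blocking enqueue that drops an event when the buffer is full. The helper `tryEnqueue` is hypothetical and is not taken from `cached_file.go`.

```go
package main

import "fmt"

// tryEnqueue hands an event to the writer goroutine without ever blocking the
// caller. It returns false when the channel is full and the event was dropped.
func tryEnqueue(ch chan<- string, ev string) bool {
	select {
	case ch <- ev:
		return true
	default:
		// Buffer is full: treat the trace data as expendable and move on.
		return false
	}
}

func main() {
	ch := make(chan string, 2) // small buffer to force a drop
	dropped := 0
	for i := 0; i < 3; i++ {
		if !tryEnqueue(ch, fmt.Sprintf("event %d", i)) {
			dropped++
		}
	}
	fmt.Println("queued:", len(ch), "dropped:", dropped) // queued: 2 dropped: 1
}
```

Counting or logging the drops, as in `main`, is one way to make lost data visible after the fact.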

cmwaters

👍

(I haven't looked into the implementation details of the cached file)

rootulp · 2024-08-05T18:06:47Z

pkg/trace/README.md

-#### Using environment variables for s3 bucket
-
-Alternatively, you can set the following environment variables:
+or using aws s3 (after setting up the aws cli ofc):


Suggested change

or using aws s3 (after setting up the aws cli ofc):

or using aws s3 (after [setting up the aws cli](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html)):

evan-forbes · 2024-08-15T04:51:10Z

converting to a draft as we're still seeing the issue where the last line doesn't get completely written, even when we sync the file after each flush 🤔

evan-forbes added 2 commits July 28, 2024 14:27

fix!: simplify the tracer by removing read functionality and guarante…

ab118aa

…eing only full writes occur

chore: clean up and optimize by not syncing too many times

3f7e302

evan-forbes added WS: Big Blonks 🔭 Improving consensus critical gossiping protocols fix WS: Maintenance 🔧 labels Jul 29, 2024

evan-forbes self-assigned this Jul 29, 2024

evan-forbes requested a review from a team as a code owner July 29, 2024 12:06

evan-forbes requested review from cmwaters, ninabarbakadze and staheri14 and removed request for a team July 29, 2024 12:06

chore: linter

99dd93e

evan-forbes changed the title ~~fix!: simplify tracing and only write data in complete batches~~ fix!(trace): simplify tracing and only write data in complete batches Jul 29, 2024

docs: update readme

2b66d3f

cmwaters previously approved these changes Jul 31, 2024

rootulp previously approved these changes Aug 5, 2024

evan-forbes and others added 3 commits August 12, 2024 07:54

Merge branch 'v0.34.x-celestia' into evan/fix-trace-files

a862881

Merge branch 'v0.34.x-celestia' into evan/fix-trace-files

c8dd1d5

fix: readd the sync to finalize flush

18ce0ea

evan-forbes dismissed stale reviews from cmwaters and rootulp via 18ce0ea August 15, 2024 01:27

evan-forbes marked this pull request as draft August 15, 2024 04:49

evan-forbes added 2 commits August 15, 2024 16:37

fix: try forcing the file to be opened with os.O_SYNC

9a476ab

fix: forgot to pass total

1d9d9bf
