rust: expose api to write messages directly #693

neilisaac · 2022-11-01T20:53:31Z

Currently Writer::write_to_known_channel takes a byte slice argument, so messages must be encoded into a buffer before writing to the log. In order to efficiently write protobuf messages to mcap (#688) without encoding into a vec first, we would like the Writer to provide a method that exposes std::io::Write.

Writer could potentially expose a method to borrow a Write object, e.g. Writer<W>::message_writer(&'a mut self, channel_id, sequence, log_time, publish_time) -> MessageWriter<'a, W>, where MessageWriter implements Write and computes the message length for you. This would allow us to encode protobuf messages using protobuf::MessageDyn::write_to_writer_dyn(&self, w: &mut dyn Write).
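To make the proposal concrete, here is a minimal sketch of what such a MessageWriter could look like for an un-chunked file. It is hypothetical and not part of the mcap crate. It assumes the underlying writer also implements Seek, so the record length can be patched in once the payload has been streamed, and it assumes a message record layout of opcode 0x05, a u64 little-endian record length, then channel_id (u16), sequence (u32), log_time (u64), publish_time (u64) and the payload; check that layout against the MCAP spec before relying on it.

```rust
use std::io::{self, Cursor, Seek, SeekFrom, Write};

/// Hypothetical streaming writer for a single message record.
/// Not part of the mcap crate; the record layout used here is an assumption.
pub struct MessageWriter<'a, W: Write + Seek> {
    inner: &'a mut W,
    len_pos: u64,    // file offset of the u64 record-length field
    record_len: u64, // bytes written after the length field so far
}

impl<'a, W: Write + Seek> MessageWriter<'a, W> {
    pub fn new(
        inner: &'a mut W,
        channel_id: u16,
        sequence: u32,
        log_time: u64,
        publish_time: u64,
    ) -> io::Result<Self> {
        inner.write_all(&[0x05])?; // assumed message opcode
        let len_pos = inner.stream_position()?;
        inner.write_all(&0u64.to_le_bytes())?; // placeholder length
        inner.write_all(&channel_id.to_le_bytes())?;
        inner.write_all(&sequence.to_le_bytes())?;
        inner.write_all(&log_time.to_le_bytes())?;
        inner.write_all(&publish_time.to_le_bytes())?;
        // The fixed header fields count towards the record length.
        Ok(Self { inner, len_pos, record_len: 2 + 4 + 8 + 8 })
    }

    /// Seeks back, overwrites the placeholder with the real length,
    /// then restores the position to the end of the record.
    pub fn finish(self) -> io::Result<u64> {
        let end = self.inner.stream_position()?;
        self.inner.seek(SeekFrom::Start(self.len_pos))?;
        self.inner.write_all(&self.record_len.to_le_bytes())?;
        self.inner.seek(SeekFrom::Start(end))?;
        Ok(self.record_len)
    }
}

impl<W: Write + Seek> Write for MessageWriter<'_, W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Payload bytes go straight to the underlying writer; we only count them.
        let n = self.inner.write(buf)?;
        self.record_len += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Usage example: stream a payload into an in-memory file.
pub fn encode_example(payload: &[u8]) -> io::Result<Vec<u8>> {
    let mut file = Cursor::new(Vec::new());
    let mut msg = MessageWriter::new(&mut file, 1, 0, 100, 100)?;
    msg.write_all(payload)?; // an encoder taking `&mut dyn Write` could be used here
    msg.finish()?;
    Ok(file.into_inner())
}
```

Because the length field precedes the payload, a writer like this cannot stay strictly append-only: if the process dies before finish() runs, the placeholder length is left in the file.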


james-rms · 2022-11-01T22:16:06Z

Related to #658 - when writing in chunks you either need to:

1. Maintain a buffer for the chunk content in progress, then once the chunk is complete, write the full chunk record at once.
2. Write a dummy chunk header to the file, then write the chunk content directly to the file, then once the chunk is done, seek back to the chunk header and rewrite the header content with the now-known record length, compressed length, uncompressed length, CRC etc.

Option 1 precludes zero-copy writing, though I agree we could potentially reduce the number of copies to 1.
Option 2 means we diverge from our append-only writing strategy, meaning that if the writer dies we can be left with a corrupt chunk record at the end of the file. This might still be reasonable, since the corruption is limited to the last chunk, which would not be written at all under option 1.
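As an illustration of the buffered approach with a single copy, here is a minimal sketch in which messages are encoded directly into the chunk buffer through its Write implementation and the chunk is emitted in one append-only write once its length is known. BufferedChunk is a hypothetical type, and the framing (a bare u64 little-endian length followed by the content) is deliberately simplified; a real MCAP chunk record carries more header fields (sizes, CRC, compression).

```rust
use std::io::{self, Write};

/// Hypothetical in-progress chunk; not part of the mcap crate.
pub struct BufferedChunk {
    buf: Vec<u8>,
}

impl BufferedChunk {
    pub fn new() -> Self {
        Self { buf: Vec::new() }
    }

    /// Encoders write straight into the chunk buffer (Vec<u8> implements Write),
    /// so each message is copied once: into the buffer, not into a temporary Vec first.
    pub fn record_writer(&mut self) -> &mut Vec<u8> {
        &mut self.buf
    }

    /// Emits the simplified chunk framing (length, then content) and resets the buffer.
    /// Nothing is written to `out` until the whole chunk is known, so the output
    /// stays append-only and no seek is needed.
    pub fn flush_to<W: Write>(&mut self, out: &mut W) -> io::Result<u64> {
        let len = self.buf.len() as u64;
        out.write_all(&len.to_le_bytes())?;
        out.write_all(&self.buf)?;
        self.buf.clear();
        Ok(len)
    }
}
```

If the writer dies before flush_to is called, the in-progress chunk is simply missing from the file rather than partially written.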

For this reason I feel zero-copy writing is best suited for writing un-chunked MCAPs, which have the disadvantage that their messages aren't indexed.

neilisaac · 2022-11-01T22:36:00Z

@james-rms understood. My benchmarking numbers aren't fully verified yet, but this might be helpful to motivate this ask.

I benchmarked this by writing an uncompressed mcap file (containing protobuf PointCloud messages) to memory (tmpfs) using this library, as well as my own mcap implementation (which doesn't currently support chunks or compression), and found this library to be about 50% slower. I'm guessing it's due to chunked two-copy vs unchunked zero-copy writing.

Using chunking is desirable because I'd like to use zstd compression too. I'm not sure whether eliminating the chunk buffer is viable with compression or not though.

neilisaac added the feature New feature or request label Nov 1, 2022

neilisaac mentioned this issue Nov 1, 2022

rust: optional protobuf schema and direct protobuf message writing support #688

Open

james-rms added the rust Related to the rust implementation label Nov 1, 2022
