types: serialize: constrain the new serialization traits to make them easier and safer to use #855

piodul · 2023-10-26T09:56:01Z

Currently, `SerializeRow` and `SerializeCql` traits are just given a mutable reference to a `Vec<u8>` and asked to append their CQL representation to the end. While simple, there are some issues with the interface:

- The serialize method has access to the serialized representation of the values that were appended before it. It's not necessary for a correct implementation to have access to it.
- Implementors technically can append any byte sequence to the end, but actually are expected to produce a CQL [value] containing the serialized value.

While the `SerializeRow` and `SerializeCql` traits are not generally meant to be implemented manually by users, we can make the interface easier to use and harder to misuse by making it append-only, restricting what users are allowed to append, and, with a dash of type-level magic, requiring them to append something.

Introduce `RowWriter` and `CellWriter` traits which satisfy the above wishes and constraints, and pass them instead of `Vec<u8>` in `SerializeRow` and `SerializeCql`.
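To illustrate the idea, here is a minimal sketch of an append-only cell writer. It is not the code from this PR: the method names `set_null` and `set_value`, the `WrittenCellProof` token, the `VecCellWriter` type, and the `serialize_i32` helper are all hypothetical. The writer never exposes the bytes written before it, it can only emit a well-formed CQL [value] (a 4-byte big-endian length followed by the bytes, or a length of -1 for null), and the proof token in the return type forces the serializer to write something.

```rust
/// Token returned once a cell has been written (hypothetical name).
/// The private field means it can only be created inside this module,
/// i.e. by a writer.
pub struct WrittenCellProof(());

/// Append-only handle for exactly one CQL [value] (hypothetical shape).
/// Methods consume `self`, so each cell can be written at most once.
pub trait CellWriter {
    fn set_null(self) -> WrittenCellProof;
    fn set_value(self, bytes: &[u8]) -> WrittenCellProof;
}

/// `Vec<u8>`-backed writer: it can append to the buffer but never exposes
/// what was written before it.
pub struct VecCellWriter<'buf> {
    buf: &'buf mut Vec<u8>,
}

impl<'buf> VecCellWriter<'buf> {
    pub fn new(buf: &'buf mut Vec<u8>) -> Self {
        Self { buf }
    }
}

impl CellWriter for VecCellWriter<'_> {
    fn set_null(self) -> WrittenCellProof {
        let buf = self.buf;
        // A [value] with length -1 represents null; no bytes follow.
        buf.extend_from_slice(&(-1i32).to_be_bytes());
        WrittenCellProof(())
    }

    fn set_value(self, bytes: &[u8]) -> WrittenCellProof {
        let buf = self.buf;
        // A [value] is a 4-byte big-endian length followed by the contents.
        let len = i32::try_from(bytes.len()).expect("value too large for a CQL [value]");
        buf.extend_from_slice(&len.to_be_bytes());
        buf.extend_from_slice(bytes);
        WrittenCellProof(())
    }
}

/// Hypothetical serializer: the return type can only be satisfied by
/// calling one of the writer's methods.
pub fn serialize_i32<W: CellWriter>(value: i32, writer: W) -> WrittenCellProof {
    writer.set_value(&value.to_be_bytes())
}
```

With this shape, a serializer that forgets to write anything does not compile, because it has no other way to obtain a `WrittenCellProof`.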

The new traits have two implementations: a `Vec<u8>`-backed one that actually appends the bytes given to it, and a `usize`-backed one that only measures the length of the output without writing anything. Running the latter before the actual serialization makes it possible to preallocate the right number of bytes and then serialize without reallocations. Before implementing this optimization, it should be measured whether the reallocation cost always outweighs the calculation cost.
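The measure-then-write idea can be sketched as follows. This is an assumption-laden illustration rather than the PR's implementation: `ByteSink`, `CountingSink`, `write_row`, and `serialize_preallocated` are hypothetical names, and rows are simplified to slices of strings.

```rust
/// Hypothetical minimal sink: the only operation is appending bytes.
pub trait ByteSink {
    fn append(&mut self, bytes: &[u8]);
}

/// `Vec<u8>`-backed sink: actually stores the bytes.
impl ByteSink for Vec<u8> {
    fn append(&mut self, bytes: &[u8]) {
        self.extend_from_slice(bytes);
    }
}

/// `usize`-backed sink: only counts how many bytes would be written.
pub struct CountingSink(pub usize);

impl ByteSink for CountingSink {
    fn append(&mut self, bytes: &[u8]) {
        self.0 += bytes.len();
    }
}

/// Writes every cell as a CQL [value]: a 4-byte big-endian length
/// followed by the cell's bytes.
pub fn write_row<S: ByteSink>(row: &[&str], sink: &mut S) {
    for cell in row {
        let len = i32::try_from(cell.len()).expect("cell too large");
        sink.append(&len.to_be_bytes());
        sink.append(cell.as_bytes());
    }
}

/// Two passes: measure first, then serialize into an exactly sized buffer.
pub fn serialize_preallocated(row: &[&str]) -> Vec<u8> {
    let mut counter = CountingSink(0);
    write_row(row, &mut counter);

    let mut buf = Vec::with_capacity(counter.0);
    write_row(row, &mut buf);
    buf
}
```

The sizing pass runs the same serialization logic twice, which is why the trade-off against reallocation cost needs to be benchmarked rather than assumed.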

Refs: #801

- I have split my patch into logically separate commits.
- All commit messages clearly explain what they change and why.
- I added relevant tests for new features and bug fixes.
- All commits compile, pass static checks and pass test.
- PR description sums up the changes and reasons why they should be introduced.
- I have provided docstrings for the public items that I want to introduce.
- ~~I have adjusted the documentation in ./docs/source/.~~
- ~~I added appropriate Fixes: annotations to PR description.~~

piodul · 2023-11-16T09:34:58Z

@Lorak-mmk review ping

piodul · 2023-11-17T13:57:12Z

v2:

- More docstrings
- `RowWriter` no longer prefixes its output with a u16 value indicating the number of values written
- Some items that were accidentally imported via the `_macro_internal` path in tests are now imported through a proper path

piodul · 2023-11-17T14:34:26Z

v2.1:

- Rebased
- Addressed clippy's complaints about needless `std::mem::drop`

Introduce the `read_value` function, which reads a [value] as specified in the CQL protocol. It will be used in the next commit to make the interface of the `SerializedValues` iterators more correct.

Currently, the `SerializedValues::iter()` method treats both null and unset values as `None`, and `iter_name_value_pairs()` assumes that values are never null or unset and panics if they are. Make the interface more correct by adjusting both methods to return `RawValue`. The iterators will be used in the next commit to implement the fallback that makes it possible to implement `SerializeRow`/`SerializeCql` via the legacy `ValueList`/`Value` traits.
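For readers unfamiliar with the [value] encoding, a reader along these lines could be sketched as below. The CQL protocol defines a [value] as a 4-byte big-endian signed length `n` followed by `n` bytes, with `n == -1` meaning null and `n == -2` meaning "not set". The exact shape of `RawValue`, the `ReadError` type, and the function signature here are assumptions made for illustration, not a copy of the code in this PR.

```rust
/// A [value] as it appears on the wire (assumed shape).
#[derive(Debug, PartialEq, Eq)]
pub enum RawValue<'a> {
    Null,
    Unset,
    Value(&'a [u8]),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ReadError {
    UnexpectedEof,
    InvalidLength(i32),
}

/// Reads one [value] from the front of `buf` and advances `buf` past it.
/// On error, `buf` is left unchanged.
pub fn read_value<'a>(buf: &mut &'a [u8]) -> Result<RawValue<'a>, ReadError> {
    // Copy the inner slice out so that sub-slices keep the full lifetime 'a.
    let data: &'a [u8] = *buf;
    if data.len() < 4 {
        return Err(ReadError::UnexpectedEof);
    }
    let (len_bytes, rest) = data.split_at(4);
    let len = i32::from_be_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]);
    match len {
        // -1 encodes null, -2 encodes "not set"; no bytes follow either.
        -1 => {
            *buf = rest;
            Ok(RawValue::Null)
        }
        -2 => {
            *buf = rest;
            Ok(RawValue::Unset)
        }
        n if n < 0 => Err(ReadError::InvalidLength(n)),
        n => {
            let n = n as usize;
            if rest.len() < n {
                return Err(ReadError::UnexpectedEof);
            }
            let (value, tail) = rest.split_at(n);
            *buf = tail;
            Ok(RawValue::Value(value))
        }
    }
}
```

Returning a three-way `RawValue` instead of `Option` is what lets callers tell null and unset apart, which is the distinction the iterator change above is about.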

Lorak-mmk · 2023-11-23T12:16:21Z

Actually I have one question: is mod.rs the best place for those new traits and structs? I don't really know Rust conventions in this matter, but it is quite a lot of code, wouldn't it be better to make a new file for it?

piodul · 2023-11-23T12:25:47Z

> Actually I have one question: is mod.rs the best place for those new traits and structs? I don't really know Rust conventions in this matter, but it is quite a lot of code, wouldn't it be better to make a new file for it?

Makes sense, I can move the CellWriter and friends to a separate module.

piodul · 2023-11-23T13:45:48Z

v3: moved the newly introduced types and traits to a separate module (they are reexported from the place they were defined previously)


piodul · 2023-11-23T13:47:56Z

v3.1: fixed links in the docs

Lorak-mmk

I think there is nothing blocking this, do you think we can merge it?

piodul · 2023-11-24T17:28:44Z

> I think there is nothing blocking this, do you think we can merge it?

Yes, I'll go ahead and merge it.

piodul requested a review from Lorak-mmk October 26, 2023 09:56

piodul force-pushed the adjust-interface branch 2 times, most recently from 281cabd to 25677ba Compare October 27, 2023 07:25

Lorak-mmk mentioned this pull request Nov 12, 2023

Switch Session to new serialization traits #858

Merged

12 tasks

piodul force-pushed the adjust-interface branch from 25677ba to cc2476c Compare November 17, 2023 13:56

piodul force-pushed the adjust-interface branch 2 times, most recently from e672d2f to 7038cbc Compare November 17, 2023 14:33

piodul added 2 commits November 20, 2023 01:57

types: introduce read_value

7767f17


piodul force-pushed the adjust-interface branch from 7038cbc to 448acd2 Compare November 20, 2023 00:58

Lorak-mmk approved these changes Nov 23, 2023

piodul force-pushed the adjust-interface branch from 448acd2 to a00b284 Compare November 23, 2023 13:44

piodul force-pushed the adjust-interface branch from a00b284 to 29a37b4 Compare November 23, 2023 13:47

piodul requested a review from Lorak-mmk November 23, 2023 14:56

Lorak-mmk approved these changes Nov 24, 2023

piodul merged commit 46e33c9 into scylladb:main Nov 24, 2023
8 checks passed

piodul mentioned this pull request Nov 30, 2023

Serialization refactor: macros for value/row serialization #851

Merged

8 tasks

Lorak-mmk mentioned this pull request Dec 21, 2023

Serialization refactor: add new serialization traits #801

Closed
