Skip to content
This repository has been archived by the owner on Jul 5, 2024. It is now read-only.

Bytecode, and a collection of bytecodes. #1529

Merged
merged 37 commits into from
Jul 24, 2023

Conversation

ChihChengLiang
Copy link
Collaborator

@ChihChengLiang ChihChengLiang commented Jul 13, 2023

Description

We revisit the handling of the bytecodes in the codebase. We reduce manual manipulation of bytes and use a more streamlined API with eth_types::Bytecode and bus_mapping::CodeDB.

Issue Link

Missed out work in #1391

Type of change

Refactor

Contents

  • Removed witness::bytecode module.
  • Extend eth_types::Bytecode and bus_mapping::CodeDB.
    • eth_types::Bytecode handles a single bytecode instance, and
    • CodeDB handles a group of bytecode instances
  • rewrite the bytecode circuit witness assignment.
  • rewrite the bytecode circuit tests, so that overwriting will not be included in the production code.
  • deduplicated the Mock account testing code.

Rationale

  • I kept the scope of the change small to reduce the review burden.
  • I will add more rationale with review comments.

@github-actions github-actions bot added crate-bus-mapping Issues related to the bus-mapping workspace member crate-circuit-benchmarks Issues related to the circuit-benchmarks workspace member crate-eth-types Issues related to the eth-types workspace member crate-zkevm-circuits Issues related to the zkevm-circuits workspace member T-bench Type: benchmark improvements labels Jul 13, 2023
@ChihChengLiang ChihChengLiang marked this pull request as ready for review July 13, 2023 16:23
bus-mapping/src/state_db.rs Outdated Show resolved Hide resolved
.code()
.iter()
.cloned()
.chain(0u8..((32 - len % 32) as u8))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep the original code from tracer tests. But I keep wondering if we should pad only zeros but not 0...something

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right, This should a list of zeroes, not a sequence!
I guess most of the times it works because the first zero would be a STOP, so the rest are ignored. And also the although in memory it's multiples of 32, when using this bytecode in memory a length is specified, so the padding will be unused. But I still think it's confusing not having just zeroes!

I think it's worth changing this.

eth-types/src/geth_types.rs Outdated Show resolved Hide resolved
Comment on lines 81 to 83
std::borrow::Cow::Owned(Vec::new())
vec![]
} else {
std::borrow::Cow::Borrowed(&builder.code_db.0[&actual.code_hash])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why we currently need std::borrow::Cow::Owned and std::borrow::Cow::Borrowed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the usage of Borrowed/Owned It's can allow auto lazily clone only on write happened. It's for memory usage optimisation. Maybe it's still valuable to keep it in this place since bytecode might be pretty longer and introduce memory overhead

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this with @ed255. We concluded that the Cow was there to prevent cloning twice.

  • Cow is used to keep actual_code a reference to the byte vector so that we don't copy bytecodes from codeDB here.
  • The line found: Bytes::from(actual_code.to_vec()), below clones bytes with the to_vec method.

I resolved with another method that only clones once.

challenges: &Challenges<Value<F>>,
fail_fast: bool,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed fail_fast because it was never triggered.

@@ -484,13 +567,12 @@ impl<F: Field> BytecodeCircuitConfig<F> {
&self,
layouter: &mut impl Layouter<F>,
size: usize,
witness: &[UnrolledBytecode<F>],
overwrite: &UnrolledBytecode<F>,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overwrite is gone in the assignment code.

@hero78119 hero78119 self-requested a review July 20, 2023 06:25
Copy link
Member

@hero78119 hero78119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just have a first pass, all changes outside bytecode circuit LGTM! thanks for using new api to clean up codebase quite a lot!

Will have second review on bytecode circuit logic thoroughly sooner.

eth-types/src/geth_types.rs Outdated Show resolved Hide resolved
Comment on lines 81 to 83
std::borrow::Cow::Owned(Vec::new())
vec![]
} else {
std::borrow::Cow::Borrowed(&builder.code_db.0[&actual.code_hash])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the usage of Borrowed/Owned It's can allow auto lazily clone only on write happened. It's for memory usage optimisation. Maybe it's still valuable to keep it in this place since bytecode might be pretty longer and introduce memory overhead

bus-mapping/src/state_db.rs Outdated Show resolved Hide resolved
bus-mapping/src/state_db.rs Show resolved Hide resolved
@ChihChengLiang ChihChengLiang force-pushed the prune-bytecode-before-bytecode-circuit branch from d9f720a to 84bc322 Compare July 21, 2023 14:12
@ed255 ed255 self-requested a review July 24, 2023 12:39
Copy link
Member

@hero78119 hero78119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The rewrite of bytecode witness assignment and unify padding/normal row into single set_row api is awesome, plus the bytecode test api reorganized.

Just have few nitpick regarding to function naming, and return type suggestion.

bus-mapping/src/state_db.rs Show resolved Hide resolved
bus-mapping/src/state_db.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/bytecode_circuit/test.rs Outdated Show resolved Hide resolved
Copy link
Member

@ed255 ed255 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM! Very nice clean up 🎉
I left 3 comments, please take a look!

eth-types/src/bytecode.rs Outdated Show resolved Hide resolved
.code()
.iter()
.cloned()
.chain(0u8..((32 - len % 32) as u8))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right, This should a list of zeroes, not a sequence!
I guess most of the times it works because the first zero would be a STOP, so the rest are ignored. And also the although in memory it's multiples of 32, when using this bytecode in memory a length is specified, so the padding will be unused. But I still think it's confusing not having just zeroes!

I think it's worth changing this.

address: Address::repeat_byte(0xff),
code: code.into(),
nonce: U64::from(!is_empty as u64),
balance: if is_empty { 0 } else { 0xdeadbeefu64 }.into(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this I think this function generates an account that is either empty, or has code and positive balance. So I would change the function documentation accordingly, and maybe rename the function to mock_code_balance, or something that doesn't imply "code or balance".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! I renamed the function and updated the documentation 2228e4d

@ChihChengLiang ChihChengLiang force-pushed the prune-bytecode-before-bytecode-circuit branch from 84bc322 to 2228e4d Compare July 24, 2023 14:46
Copy link
Member

@ed255 ed255 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great clean up 👯‍♂️

@ChihChengLiang ChihChengLiang added this pull request to the merge queue Jul 24, 2023
Merged via the queue into main with commit 8a633f7 Jul 24, 2023
11 checks passed
@ChihChengLiang ChihChengLiang deleted the prune-bytecode-before-bytecode-circuit branch July 24, 2023 16:26
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
crate-bus-mapping Issues related to the bus-mapping workspace member crate-circuit-benchmarks Issues related to the circuit-benchmarks workspace member crate-eth-types Issues related to the eth-types workspace member crate-zkevm-circuits Issues related to the zkevm-circuits workspace member T-bench Type: benchmark improvements
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants