Skip to content
This repository has been archived by the owner on May 3, 2024. It is now read-only.

BLAKE2b hash function and BLAKE2b-512 compression function of EIP-152 #56

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

AlekseiVambol
Copy link

@AlekseiVambol AlekseiVambol commented Jan 17, 2023

Issue: #51

Safe Rust implementation of the BLAKE2b hash function (as "digest") in accordance with RFC-7693 [1] and BLAKE2b-512 compression function (as "contract") described in EIP-152 [2]. The API and implementation details can be understood using only [1] for all functions except for "contract". Understanding this one also requires using [2].

References
[1] https://www.rfc-editor.org/rfc/rfc7693
[2] https://eips.ethereum.org/EIPS/eip-152

Copy link

@Brechtpd Brechtpd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

zkevm-circuits/src/blake2b/Cargo.toml Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines 5 to 7
fn gf<F:FieldExt>(n: u64) -> Expression<F> {
Expression::Constant(F::from(n))
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you import gadgets::util::Expr you can simply do n.expr()

Copy link

@Brechtpd Brechtpd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just some minor suggestions as feedback.

}

impl<F:FieldExt> ACell<F> {
fn express(&self, meta: &mut VirtualCells<F>) -> Expression<F> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn express(&self, meta: &mut VirtualCells<F>) -> Expression<F> {
fn expr(&self, meta: &mut VirtualCells<F>) -> Expression<F> {

Just because this is the way it's called in most other code in the project.

}

fn byte_cells_to_number<F:FieldExt>(meta: &mut VirtualCells<F>, cells: &[ACell<F>]) -> Expression<F> {
cells.iter().map(|c| c.express(meta)).rfold(gf(0), |e, b| { e * gf(256) + b })
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cells.iter().map(|c| c.express(meta)).rfold(gf(0), |e, b| { e * gf(256) + b })
cells.iter().map(|cell| cell.expr(meta)).rfold(0.expr(), |value, part| { value * gf(256) + part })

or something like that I think improves readability a bit, but yes it is a bit longer.

I see use a lot of single letter variable names elsewhere as well, making them a bit longer so they have some meaning I think is worth it.

for (c, b) in cells.iter().zip(bytes.iter()) {
region.assign_advice(|| name, c.column, row + (c.offset as usize), || Value::known(F::from(*b as u64)))?;
}
return Ok(());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Ok(());
Ok(())

I rarely see return used in rust, so I think a good habit to not use it where possible

summands: &[&U64ACells<F>; S],
result: &U64ACells<F>,
carry: ACell<F>) -> Self {
let mut sum = result.to_vec(); sum.push(carry);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sneaky 2 statements on the same line!? Haha, I would prefer this to be on another line though. :)
Maybe a good place to use concat instead to make it a one liner:

let sum = [result.to_vec(), vec![carry]].concat();

(something like that should work I think, untested).

return AddGate {selector, sum: sum.try_into().unwrap() };
}

fn put(&self, region: &mut Region<F>, row: usize, summands: &[u64; S]) -> Result<u64, Error>{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn put(&self, region: &mut Region<F>, row: usize, summands: &[u64; S]) -> Result<u64, Error>{
fn assign(&self, region: &mut Region<F>, row: usize, summands: &[u64; S]) -> Result<u64, Error>{

also just the more commonly used name everywhere else.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more github friendly alternative to spreadsheet files is to put the table in a markdown file (.md).
You could use this tool to help you https://www.tablesgenerator.com/markdown_tables

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@vlopes11 vlopes11 self-requested a review July 4, 2023 09:42
Copy link

@vlopes11 vlopes11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! This is a preliminary review with a couple of suggestions for the off-circuit implementation & a couple of optional nits

zkevm-circuits/src/blake2b/circuit/Cargo.toml Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/Cargo.toml Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/hash/tests/integration.rs Outdated Show resolved Hide resolved
the value of SIGMA[i % 10][j] is replaced with j by p(i). On the other hand, per(SIGMA[i % 10]) replaces j with SIGMA[i % 10][j], therefore,
p(i) is per'(SIGMA[i % 10]). So P(i) = per'(SIGMA[(i + 1) % 10]) * per(SIGMA[i % 10]). The permutations computed by this formula are used by
RoundGate class gate to compute the values of m permuted for the next rounds without the need to access the value of m.
*/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: as per convention, we usually leave the first comment of a module for its documentation. Since this is lib.rs, it should be the crate documentation.

They are expected to be specific to the implementation of the module/crate. One good approach for valuable comments that gives further context is to create a separated markdown file, and link it as a read more

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description is specific to the implementation. It also introduces new notions and formulas with proofs, which are used and mentioned in further comments. I will consider the possible placing of this description into a separate file, but it is logically inseparable from other comments.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely keep the description, the question is mostly related to how usually rust module docs are inserted into a file (i.e. they are the first comment, as per convention)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your proposal? Should I write something else as the first comment and make the current first comment the second one? Or use "//!" for comments?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one alternative to include this on a markdown file and using #![doc = include_str!("../docs/xxx.md")] if we want to add it as documentation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave the first comment it as it is till for now. Perhaps, some comment will have to be added before it during integration.

zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
Copy link

@vlopes11 vlopes11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A second round of review. I validated the correctness of the implementation, and it LGTM.

I added a couple of observations & points as alternative approaches. We could proceed by discussing the trade-offs for such propositions.

zkevm-circuits/src/blake2b/hash/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/hash/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/hash/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
impl <F:FieldExt, const R:usize> CompressionCircuit<F,R> {
// Creates a CompressionCircuit for the specified binary logarithm
// of the circuit's height and BLAKE2b compression function inputs
pub fn new(k: u32, inputs: &[CompressionInput]) -> Self {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a generic IntoIterator<Item = CompressionInput> instead of a slice. Otherwise, we will force the user to have an allocated (likely heap) slice of inputs so we can create our own vec and reallocate it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do this - it takes ownership of the argument. For me it is undesirable behavior, so I will left it as it is.

zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/blake2b/circuit/src/lib.rs Outdated Show resolved Hide resolved
Copy link

@vlopes11 vlopes11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is critical to have circuit_check verify the expected test vector output; unless I missed something, it is not doing that currently

Comment on lines 36 to 45
// Tests the circuit instance, which has height 2^17 and processes the BLAKE2b
// compression function inputs corresponding to the EIP-152 test vectors 4-7
#[test]
fn circuit_check() {
let k = 17;
let vectors: Vec<CompressionInput> = EIP152_VECTORS.iter().map(|v| hex_to_input(v[0])).collect();
let circuit = CompressionCircuit::<Fr,128>::new(k, &vectors);
let prover = MockProver::run(k, &circuit, vec![]).unwrap();
prover.assert_satisfied();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing the part of this test that we assert the output equals the expected test vector output. Did I overlook something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we cannot obtain the output directly from the MockProver, we have to check it in a different way. The witnesses generated as the result of circuit synthesis are compared with the results computed by the function not used for witness generation:
// Checking the correctness of the computation
let (h_ex, rlc_ex) = RLCTable::compress_with_rlc(challenge, &input.h, &input.m, input.t, input.f, input.r);
let correctness = (h == h_ex) && (format!("{:?}", rlc) == format!("{:?}", rlc_ex));
assert!(correctness, "Processing of a BLAKE2b compression function input was incorrect! This input is {:?}", input);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a suggestion of a possible solution.

Given the blake2b circuit takes two advice inputs/witnesses (besides, ofc, the internal ones): the input of the hash, and its output.

First we fetch the input and output from the advice
$$(input, output) \gets \mathbb{A}$$

Then we compute the blake hash into result as the circuit synthesize
$$result \gets \mathbb{C}$$

Finally, we assert the computed result to be the expected witness
$$result \equiv output$$

This approach forces the prover to compute the expected output himself, and allows the users of the circuit to pass this output advice column to different circuits. With that, we can easily pass the expected input/output from the test vectors

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we do not need this, because in the current architecture the supercircuit will compute input/output pair (outside the circuit) during the interaction and hash them by means of random linear combination hash. The input/output pairs and their rlc hashes are also computed by the current circuit (inside it). The rlc hash computed by the supercircuit is going to be constrained to be into the lookup table containing all rlc's computed by the current circuit. Thus, if some i/o pair produced by this circuit is incorrect, then lookup constraint will not be satisfied.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood; however, it is very expensive to test the super circuit. In case a bug is introduced here, it is preferable to test the blake circuit individually rather than depending on the full integrated test.

This is ofc not a blocker, so I'll open an issue to add a unit test in the future

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do no need to test it in the supercircuit. The current test are sufficient. If MockProver doesn't panic during performing the test, this means that the computations performed during the circuit synthesis produce the expected output and all witness data satisfy the constraints. The correctness of the algorithm computing the expected compression function (outside the circuit) is checked by its own test.

@ggkitsas ggkitsas mentioned this pull request Jul 17, 2023
6 tasks
@vlopes11 vlopes11 mentioned this pull request Jul 20, 2023
Copy link

@vlopes11 vlopes11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good once the nits are fixed

This crate is not used in the circuit and is not going to be integrated in zk-EVM. It was the first step towards the circuit. Some code from this crate was a bit rewritten and added to the circuit-related files.
Deleted to be replaced with the updated version, which has new folder structure.
Cargo.toml:
1. Default comment was removed.
2. "hex = "0.4"" was added instead of "hex = "0.4.3"".

Cargo.lock: Recreated.

blake2.rs: The new file containing the "blake2b" module, almost all content of which (except for the "test" and "benchmark" submodules) is the same as the one of "lib.rs" just before it undergone the last change from the list below. The content of the "test" submodule is related to the one of "integration.rs" in the similar way. The same is true about the relation between the "test" submodule and "bench.rs".

lib.rs:
1. "Chunks" instead of "chunks" at the end of the comments for "fn create_chuncks".
2. "pub" was removed from many places: now the data types of the circuit's basic building blocks are not exported.
3. Trailing spaces were removed.
4. "Thus, 56 * 8 + 32 cells or 240 cell pairs are required" was added after "56 byte-column chunks of height 8 and 32 byte-column cells".
5. The comment "14 bit columns are required: 10 collumns for..." was added.
6. The end of "fn synthesize" was rewritten in order to check the correctness of the computation only if tests are run.
7  The "blake2" submodule was renamed into "compression".
8. The content is moved to the "blake2b" module and replaced with "mod blake2b;".

integration.rs:
1. Trailing spaces were removed.
2. The content is moved to the "test" submodule of the "blake2b" module; the file was deleted.

bench.rs:
1. Trailing spaces were removed.
2. The comment describing the correct way of running "fn bench()" was added.
3. The content is moved to the "benchmark" submodule of the "blake2b" module; the file was deleted.
Changing the comment describing the correct way of running "fn bench()".
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants