-
Notifications
You must be signed in to change notification settings - Fork 125
BLAKE2b hash function and BLAKE2b-512 compression function of EIP-152 #56
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
The code I have shown 30/01/23.
fn gf<F:FieldExt>(n: u64) -> Expression<F> { | ||
Expression::Constant(F::from(n)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you import gadgets::util::Expr
you can simply do n.expr()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Just some minor suggestions as feedback.
} | ||
|
||
impl<F:FieldExt> ACell<F> { | ||
fn express(&self, meta: &mut VirtualCells<F>) -> Expression<F> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn express(&self, meta: &mut VirtualCells<F>) -> Expression<F> { | |
fn expr(&self, meta: &mut VirtualCells<F>) -> Expression<F> { |
Just because this is the way it's called in most other code in the project.
} | ||
|
||
fn byte_cells_to_number<F:FieldExt>(meta: &mut VirtualCells<F>, cells: &[ACell<F>]) -> Expression<F> { | ||
cells.iter().map(|c| c.express(meta)).rfold(gf(0), |e, b| { e * gf(256) + b }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cells.iter().map(|c| c.express(meta)).rfold(gf(0), |e, b| { e * gf(256) + b }) | |
cells.iter().map(|cell| cell.expr(meta)).rfold(0.expr(), |value, part| { value * gf(256) + part }) |
or something like that I think improves readability a bit, but yes it is a bit longer.
I see use a lot of single letter variable names elsewhere as well, making them a bit longer so they have some meaning I think is worth it.
for (c, b) in cells.iter().zip(bytes.iter()) { | ||
region.assign_advice(|| name, c.column, row + (c.offset as usize), || Value::known(F::from(*b as u64)))?; | ||
} | ||
return Ok(()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return Ok(()); | |
Ok(()) |
I rarely see return
used in rust, so I think a good habit to not use it where possible
summands: &[&U64ACells<F>; S], | ||
result: &U64ACells<F>, | ||
carry: ACell<F>) -> Self { | ||
let mut sum = result.to_vec(); sum.push(carry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sneaky 2 statements on the same line!? Haha, I would prefer this to be on another line though. :)
Maybe a good place to use concat
instead to make it a one liner:
let sum = [result.to_vec(), vec![carry]].concat();
(something like that should work I think, untested).
return AddGate {selector, sum: sum.try_into().unwrap() }; | ||
} | ||
|
||
fn put(&self, region: &mut Region<F>, row: usize, summands: &[u64; S]) -> Result<u64, Error>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn put(&self, region: &mut Region<F>, row: usize, summands: &[u64; S]) -> Result<u64, Error>{ | |
fn assign(&self, region: &mut Region<F>, row: usize, summands: &[u64; S]) -> Result<u64, Error>{ |
also just the more commonly used name everywhere else.
…ake2b/circuit/Cargo.toml
…ake2b/hash/src/lib.rs
…s/src/blake2b/hash/Cargo.toml
…its/src/blake2b/hash/tests/integration.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more github friendly alternative to spreadsheet files is to put the table in a markdown file (.md).
You could use this tool to help you https://www.tablesgenerator.com/markdown_tables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! This is a preliminary review with a couple of suggestions for the off-circuit implementation & a couple of optional nits
the value of SIGMA[i % 10][j] is replaced with j by p(i). On the other hand, per(SIGMA[i % 10]) replaces j with SIGMA[i % 10][j], therefore, | ||
p(i) is per'(SIGMA[i % 10]). So P(i) = per'(SIGMA[(i + 1) % 10]) * per(SIGMA[i % 10]). The permutations computed by this formula are used by | ||
RoundGate class gate to compute the values of m permuted for the next rounds without the need to access the value of m. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: as per convention, we usually leave the first comment of a module for its documentation. Since this is lib.rs
, it should be the crate documentation.
They are expected to be specific to the implementation of the module/crate. One good approach for valuable comments that gives further context is to create a separated markdown file, and link it as a read more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description is specific to the implementation. It also introduces new notions and formulas with proofs, which are used and mentioned in further comments. I will consider the possible placing of this description into a separate file, but it is logically inseparable from other comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definitely keep the description, the question is mostly related to how usually rust module docs are inserted into a file (i.e. they are the first comment, as per convention)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is your proposal? Should I write something else as the first comment and make the current first comment the second one? Or use "//!" for comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one alternative to include this on a markdown file and using #![doc = include_str!("../docs/xxx.md")]
if we want to add it as documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave the first comment it as it is till for now. Perhaps, some comment will have to be added before it during integration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A second round of review. I validated the correctness of the implementation, and it LGTM.
I added a couple of observations & points as alternative approaches. We could proceed by discussing the trade-offs for such propositions.
impl <F:FieldExt, const R:usize> CompressionCircuit<F,R> { | ||
// Creates a CompressionCircuit for the specified binary logarithm | ||
// of the circuit's height and BLAKE2b compression function inputs | ||
pub fn new(k: u32, inputs: &[CompressionInput]) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a generic IntoIterator<Item = CompressionInput>
instead of a slice. Otherwise, we will force the user to have an allocated (likely heap) slice of inputs so we can create our own vec and reallocate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do this - it takes ownership of the argument. For me it is undesirable behavior, so I will left it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is critical to have circuit_check
verify the expected test vector output; unless I missed something, it is not doing that currently
// Tests the circuit instance, which has height 2^17 and processes the BLAKE2b | ||
// compression function inputs corresponding to the EIP-152 test vectors 4-7 | ||
#[test] | ||
fn circuit_check() { | ||
let k = 17; | ||
let vectors: Vec<CompressionInput> = EIP152_VECTORS.iter().map(|v| hex_to_input(v[0])).collect(); | ||
let circuit = CompressionCircuit::<Fr,128>::new(k, &vectors); | ||
let prover = MockProver::run(k, &circuit, vec![]).unwrap(); | ||
prover.assert_satisfied(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm missing the part of this test that we assert the output equals the expected test vector output. Did I overlook something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we cannot obtain the output directly from the MockProver, we have to check it in a different way. The witnesses generated as the result of circuit synthesis are compared with the results computed by the function not used for witness generation:
// Checking the correctness of the computation
let (h_ex, rlc_ex) = RLCTable::compress_with_rlc(challenge, &input.h, &input.m, input.t, input.f, input.r);
let correctness = (h == h_ex) && (format!("{:?}", rlc) == format!("{:?}", rlc_ex));
assert!(correctness, "Processing of a BLAKE2b compression function input was incorrect! This input is {:?}", input);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a suggestion of a possible solution.
Given the blake2b
circuit takes two advice inputs/witnesses (besides, ofc, the internal ones): the input of the hash, and its output.
First we fetch the input and output from the advice
Then we compute the blake hash into result as the circuit synthesize
Finally, we assert the computed result to be the expected witness
This approach forces the prover to compute the expected output himself, and allows the users of the circuit to pass this output
advice column to different circuits. With that, we can easily pass the expected input/output from the test vectors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we do not need this, because in the current architecture the supercircuit will compute input/output pair (outside the circuit) during the interaction and hash them by means of random linear combination hash. The input/output pairs and their rlc hashes are also computed by the current circuit (inside it). The rlc hash computed by the supercircuit is going to be constrained to be into the lookup table containing all rlc's computed by the current circuit. Thus, if some i/o pair produced by this circuit is incorrect, then lookup constraint will not be satisfied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood; however, it is very expensive to test the super circuit. In case a bug is introduced here, it is preferable to test the blake circuit individually rather than depending on the full integrated test.
This is ofc not a blocker, so I'll open an issue to add a unit test in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do no need to test it in the supercircuit. The current test are sufficient. If MockProver doesn't panic during performing the test, this means that the computations performed during the circuit synthesis produce the expected output and all witness data satisfy the constraints. The correctness of the algorithm computing the expected compression function (outside the circuit) is checked by its own test.
Co-authored-by: Victor Lopes <[email protected]>
Co-authored-by: Victor Lopes <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good once the nits are fixed
This crate is not used in the circuit and is not going to be integrated in zk-EVM. It was the first step towards the circuit. Some code from this crate was a bit rewritten and added to the circuit-related files.
Deleted to be replaced with the updated version, which has new folder structure.
Cargo.toml: 1. Default comment was removed. 2. "hex = "0.4"" was added instead of "hex = "0.4.3"". Cargo.lock: Recreated. blake2.rs: The new file containing the "blake2b" module, almost all content of which (except for the "test" and "benchmark" submodules) is the same as the one of "lib.rs" just before it undergone the last change from the list below. The content of the "test" submodule is related to the one of "integration.rs" in the similar way. The same is true about the relation between the "test" submodule and "bench.rs". lib.rs: 1. "Chunks" instead of "chunks" at the end of the comments for "fn create_chuncks". 2. "pub" was removed from many places: now the data types of the circuit's basic building blocks are not exported. 3. Trailing spaces were removed. 4. "Thus, 56 * 8 + 32 cells or 240 cell pairs are required" was added after "56 byte-column chunks of height 8 and 32 byte-column cells". 5. The comment "14 bit columns are required: 10 collumns for..." was added. 6. The end of "fn synthesize" was rewritten in order to check the correctness of the computation only if tests are run. 7 The "blake2" submodule was renamed into "compression". 8. The content is moved to the "blake2b" module and replaced with "mod blake2b;". integration.rs: 1. Trailing spaces were removed. 2. The content is moved to the "test" submodule of the "blake2b" module; the file was deleted. bench.rs: 1. Trailing spaces were removed. 2. The comment describing the correct way of running "fn bench()" was added. 3. The content is moved to the "benchmark" submodule of the "blake2b" module; the file was deleted.
Changing the comment describing the correct way of running "fn bench()".
Issue: #51
Safe Rust implementation of the BLAKE2b hash function (as "digest") in accordance with RFC-7693 [1] and BLAKE2b-512 compression function (as "contract") described in EIP-152 [2]. The API and implementation details can be understood using only [1] for all functions except for "contract". Understanding this one also requires using [2].
References
[1] https://www.rfc-editor.org/rfc/rfc7693
[2] https://eips.ethereum.org/EIPS/eip-152