
Standardizing judging criteria for partial-XX tag #163

Open
McCoady opened this issue Sep 10, 2024 · 4 comments

Comments


McCoady commented Sep 10, 2024

In the current system there's no agreed-upon method used by judges when handing out partial-XX tags.

Most judges use the tag very rarely, while others use it more liberally. If you look through various finalised contest repos, you'll find some contests where 0-5 issues are marked partial-XX and others with 50+.

Previously this was a somewhat arbitrary decision which likely didn't have a major effect on a warden's overall payout. However, given the (somewhat) recent addition of hunter/gatherer rewards, issues marked with a partial-XX tag are ineligible for those rewards, so being issued a partial-XX tag can cause a warden to miss out on either or both.

Personally I prefer the option where the tags are used sparingly and the majority of issues are deemed either satisfactory or unsatisfactory; however, I'd welcome the opinions of the various C4 judges on the matter. 😀


3docSec commented Sep 11, 2024

I see it as you do - something to use very sparingly, because the concept of a "valid" report already covers the vast majority of cases: the report "a) found the bug and b) identified how it's relevant".

The exception to that, and the situation where I'd seriously consider partials, is when a vulnerability is made up of multiple, somewhat independent aspects.

Example 1, classic & easy: the case of "multiple instances of the same vulnerability grouped under a single primary", as described here. Contracts A and B do not handle fee-on-transfer tokens correctly. Report X identifies the vulnerability in both contracts A and B, report Y only in A => X becomes primary, Y becomes a dupe at partial-50.

Example 2, rare: "two L vulnerabilities" that are reported as chained into an "H-impact attack". Imagine a different report that identifies one of the two L's, but not the other one nor the H-impact chaining attack: this one deserves "some credit", because fixing a single L alone makes the H attack non-viable, but it can't deserve "full credit" because it misses the second mitigation and any indication of the true urgency of the vulnerability. Here I'd rather go with a partial-25.
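To make the payout effect concrete, here is a minimal sketch of how a partial-XX tag could scale a duplicate's share of a finding's pot. This is purely illustrative: the `split_pot` function, the numbers, and the simple pro-rata split are my assumptions for the sake of the example, not C4's actual award formula, which is more involved.

```python
# Illustrative only: NOT Code4rena's real award formula. Assumes a simple
# pro-rata split of a finding's pot, where a partial-XX tag scales a
# duplicate's weight to XX% before shares are computed.

def split_pot(pot: float, weights: dict[str, float]) -> dict[str, float]:
    """Split `pot` among reports in proportion to their weights."""
    total = sum(weights.values())
    return {report: pot * w / total for report, w in weights.items()}

# Example 1 above: report X (full credit) and report Y (partial-50)
# share a hypothetical 1000-unit pot for the same finding.
shares = split_pot(1000.0, {"X": 1.0, "Y": 0.5})
# X gets 1000 * 1.0/1.5 ≈ 666.67, Y gets 1000 * 0.5/1.5 ≈ 333.33
```

Under this toy model, a partial-25 tag (Example 2) would shrink that report's share further still, which is exactly why the hunter/gatherer ineligibility on top of the reduced share feels like a double penalty.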


alex-ppg commented Sep 11, 2024

I personally believe that partial rewards should be imposed strictly; otherwise, wardens are incentivized to put less work into how their submissions look and into arguing their validity. The C4 guidelines are explicit and state that a submission to a C4 contest should be what a reader would expect to find in an audit report, and any submission that does not fall into that category should be penalized IMO.

I fall in the "liberally" crowd as I tend to be quite strict with my judgments; these are my personal requirements, which I believe could become somewhat standardized:


QA Upgraded Issues

Any submission that was submitted as a QA report but was of HM severity will acquire a corresponding penalty due to a mis-assessment of the exhibit's impact. As an auditor myself, I quite frequently see projects simply acknowledge low-severity issues while remediating high- and medium-severity ones, meaning that this particular distinction (as the reward pool also indicates) is very important to ensure a vulnerability is given the attention it needs.

Impact Identification

A submission that is correct in pointing out a vulnerability in the code but fails to adequately rationalize it should be penalized, as otherwise we incentivize users to submit poorly thought-out submissions. For example, a submission that details that a token ID of 0 should be prevented in the staking system, but fails to state that its absence would lead to some form of financial impact, should be penalized. For obvious vulnerabilities this penalty should be somewhat less stringent (i.e. a 25% one) than for highly nuanced ones.

Relative Quality

A restriction of the C4 contest format is that a single submission must be made primary within a particular duplicate set. As such, it is not possible to impose a uniform penalty based on quality, so I personally apply a comparative quality penalty instead. If all submissions in a particular duplicate set are of low quality I do not penalize this trait; however, if the set contains 2-3 exemplary submissions I am inclined to incentivize those in relation to the reward they acquire.

After all, we should strive to increase the quality of the reports C4 ultimately produces, and one way to achieve this is to "benefit" high-quality submissions and not only the primary ones. We can achieve this indirectly by penalizing lower-quality reports.


There are several edge cases to consider, and other instances should still fall under the discretion of each judge; however, it might be good for both judges and wardens to have some examples of why a particular submission was penalized, which judges could reference to further substantiate their judgments.

@Brivan-26

Hi @alex-ppg I believe the following POV is very strict:

Any submission that was submitted as a QA report but was of HM severity will acquire a corresponding penalty due to a mis-assessment of the exhibit's impact.

  1. Some issues are not clearly of Medium or Low severity and require some subjectivity from the judge. There are issues from past contests that were argued to be either Low or Medium and ended up Medium by the judge's decision, so penalizing those who submitted the issue as QA is unfair.

Concerning the Relative Quality section, I believe the insufficient quality report tag helps with that, and if there have to be penalties for such reports, they should be applied at the insufficient quality report tag level instead of via partial-XX credits.

@alex-ppg

Hey @Brivan-26, thank you for your feedback. If a finding is borderline medium and requires a lot of pieces to fall into place for it to be considered as such, then the penalty may or may not be applied. Keep in mind that these are guidelines rather than precise rulesets, and penalization is, and should remain, up to the discretion of each judge.

Personally, I think the penalty should be applied in all cases, as a finding that was submitted only as part of a QA report but is of medium severity has a very high likelihood of being skipped over. The main reason it became part of the HM pool is that another warden was able to better articulate its impact, and thus it does not deserve a full reward IMHO.

On the other hand, simply applying the insufficient quality report tag to wardens who have successfully pinpointed a finding but were unable to articulate it properly is overly strict and discourages newcomers to the space. We are meant to help these people gain experience and motivation, and stripping them of their entire reward instead of partially penalizing them is not going to help with that.
