diff --git a/index.html b/index.html
index 4f175ae..fd1c7c4 100644
--- a/index.html
+++ b/index.html
@@ -14,6 +14,7 @@

News

@@ -57,10 +58,10 @@

Overview

Important Dates

@@ -69,9 +70,9 @@

Rules

  • Open Format: This is an open competition. All participants are encouraged to share their methods upon conclusion of the competition, and outstanding submissions will be highlighted in a joint publication. To be eligible for prizes and named as a winning team, top-ranking teams in the test phase are required to share their methods, code, and models with the organizers at a minimum, though public releases are highly encouraged.
  • Registration: Double registration is not allowed. We expect teams to self-certify that none of their members are part of a different team registered for the competition, and we will actively monitor for violations of this rule. Teams may participate in multiple tracks. Organizers are not allowed to participate in the competition or win prizes.
  • Prize Distribution: Monetary prizes will be awarded to teams as specified in the Prizes page of the competition website.
-  • Compute Limits: In both tracks, methods developed for the Base Model subtrack must run in at most 5 days on a single 80GB A100 GPUs to generate a submission for the test phase. For Large Model subtracks, methods must be able to run in at most 5 days on 8x 80GB A100 GPUs to generate a submission for the test phase. These limits do not include initial setup during the development phase. This is a soft requirement; we will only check it for the winning teams, and understandable reasons for exceeding the limit by a modest amount are acceptable. We expect that most methods will not require this much compute.
+  • Compute Limits: In both tracks, methods developed for the Base Model subtrack must run in at most 2 A100 GPU-days to generate a submission for the test phase. For Large Model subtracks, methods must run in at most 4 A100 GPU-days to generate a submission for the test phase. These limits do not include initial setup during the development phase. This is a soft requirement; we will check it for the winning teams, but understandable reasons for exceeding the limit by a modest amount are acceptable.
  • Allowed Methods (Trojan Detection Track): The use of features that are clearly loopholes is not allowed (e.g., metadata). We may not anticipate all loopholes and we encourage participants to alert us to their existence. Legitimate features that do not constitute loopholes include all features derived from the trained parameters of networks, the target strings, training triggers, and text datasets. Similar to the Red Teaming Track, we also do not allow the submission of prompts that are intended to make LLMs copy-paste target strings from the prompt into the generation.
-  • Allowed Methods (Red Teaming Track): Methods used for generating submitted test cases must satisfy the following requirements. (1) They must be able to generate any desired number of test cases, not just a fixed number of test cases. (2) They must be fully automated. After an initial setup, they should require no human effort to adapt to a new set of behaviors and no human-in-the-loop for generating test cases. (3) They must not use DeBERTaV3 models (to minimize overlap with the automated metrics). (4) They must cause LLMs to generate behaviors mostly from scratch. In particular, getting LLMs to copy-paste whole instances of undesirable behavior provided in the prompt does not count as generating undesirable behavior from scratch. (5) They must not intentionally boost the diversity metric in an artificial manner, e.g., by adding random text to the beginning of a prompt.
+  • Allowed Methods (Red Teaming Track): Methods used for generating submitted test cases must satisfy the following requirements. (1) They must be able to generate any desired number of test cases, not just a fixed number of test cases. (2) They must be fully automated. After an initial setup, they should require no human effort to adapt to a new set of behaviors and no human-in-the-loop for generating test cases. (3) They must cause LLMs to generate behaviors mostly from scratch. In particular, getting LLMs to copy-paste whole instances of undesirable behavior provided in the prompt does not count as generating undesirable behavior from scratch. (4) They must not intentionally boost the diversity metric in an artificial manner, e.g., by adding random text to the beginning of a prompt.
  • Rule breaking may result in disqualification, and significant rule breaking will result in ineligibility for prizes.
  • These rules are an initial set; during registration, we require participants to consent to rule changes should an urgent need arise. If a situation arises that was not anticipated, we will implement a fair solution, ideally by consensus of the participants.