adding runner descriptions

ucl-dark · Oct 24, 2023 · 409e7d6
1 parent c6452ee
commit 409e7d6
Showing 1 changed file with 55 additions and 0 deletions.
diff --git a/docs/getting-started/runners.md b/docs/getting-started/runners.md
@@ -23,6 +23,61 @@ In order for this approach to work the observation vector needs to include one e
 
 See [this experiment](https://github.com/akbir/pax/blob/9d3fa62e34279a338c07cffcbf208edc8a95e7ba/pax/conf/experiment/rice/weight_sharing.yaml) for an example of how to configure it.
 
+## Evo Hardstop
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+This runner stops the opponent's learning during training, which corresponds to the hardstop challenge of Shaper.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Scanned
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+Here we also scan over the evolutionary steps, which makes compilation longer and training shorter, but means that logging stats is not possible.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed LR Runner (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+This runner randomly samples learning rates for the opponents.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+The payoff matrix is randomly sampled at each rollout. Each opponent has a different payoff matrix.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff Gen (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+The payoff matrix is randomly sampled at each rollout. Each opponent has the same payoff matrix.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed IPD Payoff (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+This runner randomly samples payoffs that follow Iterated Prisoner's Dilemma [constraints](https://en.wikipedia.org/wiki/Prisoner%27s_dilemma).
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
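
A minimal sketch of what "payoffs that follow Iterated Prisoner's Dilemma constraints" could mean in code, assuming the usual conditions `T > R > P > S` and `2R > T + S` from the linked article. The function names, the sampling range, and the use of rejection sampling are illustrative assumptions, not the runner's actual implementation.

```python
import random


def is_ipd(t, r, p, s):
    """Return True if (T, R, P, S) satisfy the IPD constraints."""
    # Temptation > Reward > Punishment > Sucker, and mutual cooperation
    # must beat alternating between exploiting and being exploited.
    return t > r > p > s and 2 * r > t + s


def sample_ipd_payoffs(rng, low=-5.0, high=5.0):
    """Rejection-sample (T, R, P, S) until the IPD constraints hold.

    `low` and `high` are hypothetical bounds chosen for illustration.
    """
    while True:
        # Sorting in descending order gives T >= R >= P >= S by construction;
        # is_ipd then checks strictness and the 2R > T + S condition.
        t, r, p, s = sorted(
            (rng.uniform(low, high) for _ in range(4)), reverse=True
        )
        if is_ipd(t, r, p, s):
            return t, r, p, s


rng = random.Random(0)
t, r, p, s = sample_ipd_payoffs(rng)
print(f"T={t:.2f} R={r:.2f} P={p:.2f} S={s:.2f}")
```

Rejection sampling keeps the sketch short; a constructive sampler would avoid the retry loop at the cost of a less obvious distribution over payoffs.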
+
+## Evo Mixed Payoff Input (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+The payoff matrix is randomly sampled at each rollout. Each opponent has the same payoff matrix. The agent observes the payoff matrix as part of its input.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff Only Opp (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning. 
+Noise is added to the opponents' IPD-like payoff matrix at each rollout. Each opponent has the same noise added.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
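
A rough sketch of the "same noise for every opponent" idea, under stated assumptions: the base payoffs, the Gaussian noise, the `scale` parameter, and the function name are all hypothetical, and the reading that only the opponent's entries are perturbed is an interpretation of the description rather than something the docs confirm.

```python
import random

# Illustrative IPD payoffs per joint action as [first agent, opponent].
# These values are an assumption for the example.
BASE = [[-1.0, -1.0], [-3.0, 0.0], [0.0, -3.0], [-2.0, -2.0]]


def noisy_opponent_payoffs(base, num_opponents, rng, scale=0.1):
    """Draw noise once per rollout and apply it to every opponent.

    Only the opponent's entry of each row is perturbed; the first
    agent's payoffs are left unchanged (an assumed interpretation).
    """
    # One noise value per joint action, sampled a single time.
    noise = [rng.gauss(0.0, scale) for _ in base]
    noisy = [[own, opp + eps] for (own, opp), eps in zip(base, noise)]
    # Every opponent receives an independent copy of the same matrix.
    return [[row[:] for row in noisy] for _ in range(num_opponents)]


rng = random.Random(0)
matrices = noisy_opponent_payoffs(BASE, num_opponents=3, rng=rng)
print(matrices[0])
```

Calling the function again in the next rollout would draw fresh noise, while all opponents within a single rollout still share one matrix.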