From 409e7d612b2eaed29fef342634a33e377ec9cf77 Mon Sep 17 00:00:00 2001
From: Aidandos
Date: Tue, 24 Oct 2023 15:54:44 +0000
Subject: [PATCH] adding runner descriptions

---
 docs/getting-started/runners.md | 55 +++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/docs/getting-started/runners.md b/docs/getting-started/runners.md
index 43dc241d..45d55770 100644
--- a/docs/getting-started/runners.md
+++ b/docs/getting-started/runners.md
@@ -23,6 +23,61 @@ In order for this approach to work the observation vector needs to include one e
 See [this experiment](https://github.com/akbir/pax/blob/9d3fa62e34279a338c07cffcbf208edc8a95e7ba/pax/conf/experiment/rice/weight_sharing.yaml) for an example of how to configure it.
+## Evo Hardstop
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+This runner stops the learning of an opponent during training, which corresponds to the hardstop challenge of Shaper.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Scanned
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+Here we also scan over the evolutionary steps, which makes compilation longer and training shorter, but logging stats is not possible.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed LR Runner (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+This runner randomly samples learning rates for the opponents.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+The payoff matrix is randomly sampled at each rollout, and each opponent has a different payoff matrix.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff Gen (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+The payoff matrix is randomly sampled at each rollout, and all opponents share the same payoff matrix.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed IPD Payoff (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+This runner randomly samples payoffs that satisfy the Iterated Prisoner's Dilemma [constraints](https://en.wikipedia.org/wiki/Prisoner%27s_dilemma).
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff Input (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+The payoff matrix is randomly sampled at each rollout, all opponents share the same payoff matrix, and the payoff matrix is provided to the agent as part of its observation.
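+
+As a purely illustrative sketch (not the actual runner code; the helper name and array shapes are hypothetical), observing the payoff matrix amounts to flattening it and appending it to the observation vector:
+
+```python
+import jax.numpy as jnp
+
+def augment_obs_with_payoff(obs: jnp.ndarray, payoff: jnp.ndarray) -> jnp.ndarray:
+    """Hypothetical helper: append the flattened payoff matrix to the observation."""
+    return jnp.concatenate([obs, payoff.reshape(-1)], axis=-1)
+
+# Example: a 5-dim one-hot IPD observation plus a 4x2 payoff matrix gives a 13-dim input.
+obs = jnp.zeros(5).at[0].set(1.0)
+payoff = jnp.array([[-1.0, -1.0], [-3.0, 0.0], [0.0, -3.0], [-2.0, -2.0]])
+print(augment_obs_with_payoff(obs, payoff).shape)  # (13,)
+```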
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
+
+## Evo Mixed Payoff Only Opp (experimental)
+
+The Evo Runner optimizes the first agent using evolutionary learning.
+Noise is added to the opponents' IPD-like payoff matrix at each rollout, with the same noise added for every opponent.
+
+See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
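+
+To make the shared-noise idea concrete, here is a minimal, hypothetical sketch (not the runner's actual implementation): one noise sample is drawn per rollout and the same perturbed matrix is used for every opponent.
+
+```python
+import jax
+import jax.numpy as jnp
+
+def perturb_opponent_payoffs(rng: jax.Array, base_payoff: jnp.ndarray,
+                             num_opponents: int, scale: float = 0.5) -> jnp.ndarray:
+    """Hypothetical helper: add one shared noise draw to every opponent's payoff matrix."""
+    noise = scale * jax.random.normal(rng, base_payoff.shape)
+    return jnp.tile((base_payoff + noise)[None], (num_opponents, 1, 1))
+
+# Example: perturb a common IPD payoff matrix once and broadcast it to 4 opponents.
+rng = jax.random.PRNGKey(0)
+ipd_payoff = jnp.array([[-1.0, -1.0], [-3.0, 0.0], [0.0, -3.0], [-2.0, -2.0]])
+print(perturb_opponent_payoffs(rng, ipd_payoff, num_opponents=4).shape)  # (4, 4, 2)
+```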