adding runner descriptions
Aidandos committed Oct 24, 2023
1 parent c6452ee commit 409e7d6
Showing 1 changed file with 55 additions and 0 deletions.
55 changes: 55 additions & 0 deletions docs/getting-started/runners.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,61 @@ In order for this approach to work the observation vector needs to include one e

See [this experiment](https://github.com/akbir/pax/blob/9d3fa62e34279a338c07cffcbf208edc8a95e7ba/pax/conf/experiment/rice/weight_sharing.yaml) for an example of how to configure it.

## Evo Hardstop

The Evo Runner optimizes the first agent using evolutionary learning.
This runner stops the opponent's learning during training, which corresponds to the hardstop challenge of Shaper.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.

## Evo Scanned

The Evo Runner optimizes the first agent using evolutionary learning.
This variant also scans over the evolutionary steps, which lengthens compilation and shortens training, but makes it impossible to log stats.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.

## Evo Mixed LR Runner (experimental)

The Evo Runner optimizes the first agent using evolutionary learning.
This runner randomly samples learning rates for the opponents.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
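As a rough illustration of per-opponent learning-rate sampling, here is a minimal sketch in plain Python. The function name `sample_learning_rates`, the log-uniform distribution, and the bounds are assumptions made for this example; they are not taken from the runner's code.

```python
import math
import random


def sample_learning_rates(rng, num_opps, low=1e-4, high=1e-1):
    """Draw one learning rate per opponent, log-uniformly in [low, high].

    Hypothetical helper: the distribution and bounds are illustrative only.
    """
    log_low, log_high = math.log(low), math.log(high)
    return [math.exp(rng.uniform(log_low, log_high)) for _ in range(num_opps)]


rng = random.Random(0)
learning_rates = sample_learning_rates(rng, num_opps=4)
```

Sampling in log space spreads the draws evenly across orders of magnitude, which is a common choice for learning rates; a uniform draw would work the same way structurally.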

## Evo Mixed Payoff (experimental)

The Evo Runner optimizes the first agent using evolutionary learning.
The payoff matrix is randomly sampled at each rollout, and each opponent has a different payoff matrix.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.

## Evo Mixed Payoff Gen (experimental)

The Evo Runner optimizes the first agent using evolutionary learning.
The payoff matrix is randomly sampled at each rollout, and each opponent has the same payoff matrix.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
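The difference between this runner and Evo Mixed Payoff comes down to whether one matrix is drawn per rollout or one per opponent. The sketch below shows both modes in plain Python; `sample_matrix`, `sample_rollout_payoffs`, the `shared` flag, and the sampling range are hypothetical names and values chosen for illustration, not the runner's actual interface.

```python
import random


def sample_matrix(rng, low=-3.0, high=3.0):
    # 2x2 payoff matrix with entries drawn uniformly; the range is arbitrary.
    return [[rng.uniform(low, high) for _ in range(2)] for _ in range(2)]


def sample_rollout_payoffs(rng, num_opps, shared):
    """Return one payoff matrix per opponent for a single rollout.

    shared=True  -> every opponent gets the same matrix (Mixed Payoff Gen style).
    shared=False -> every opponent gets its own matrix (Mixed Payoff style).
    """
    if shared:
        matrix = sample_matrix(rng)
        return [matrix for _ in range(num_opps)]
    return [sample_matrix(rng) for _ in range(num_opps)]


rng = random.Random(0)
same_for_all = sample_rollout_payoffs(rng, num_opps=3, shared=True)
one_per_opp = sample_rollout_payoffs(rng, num_opps=3, shared=False)
```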

## Evo Mixed IPD Payoff (experimental)

The Evo Runner optimizes the first agent using evolutionary learning.
This runner randomly samples payoffs that follow Iterated Prisoner's Dilemma [constraints](https://en.wikipedia.org/wiki/Prisoner%27s_dilemma).

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
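The Prisoner's Dilemma constraints are usually stated as T > R > P > S (temptation, reward, punishment, sucker's payoff), with the additional condition 2R > T + S for the iterated game. A minimal sketch of sampling payoffs under those constraints by rejection follows; the function names, the sampling range, and the matrix layout are assumptions for this example and may differ from what the runner does.

```python
import random


def sample_ipd_payoffs(rng, low=-5.0, high=5.0):
    """Draw (T, R, P, S) with T > R > P > S and 2R > T + S by rejection."""
    while True:
        # Sorting four draws in descending order gives T >= R >= P >= S.
        t, r, p, s = sorted((rng.uniform(low, high) for _ in range(4)), reverse=True)
        if t > r > p > s and 2 * r > t + s:
            return t, r, p, s


def payoff_matrix(t, r, p, s):
    # Row player's payoffs. Rows: own action, columns: opponent action
    # (0 = cooperate, 1 = defect). This layout is an illustrative choice.
    return [[r, s], [t, p]]


rng = random.Random(0)
t, r, p, s = sample_ipd_payoffs(rng)
matrix = payoff_matrix(t, r, p, s)
```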

## Evo Mixed Payoff Input (experimental)

The Evo Runner optimizes the first agent using evolutionary learning.
The payoff matrix is randomly sampled at each rollout, and each opponent has the same payoff matrix. The agent observes the payoff matrix as part of its input.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
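One simple way to expose the payoff matrix to the agent is to flatten it and append it to the observation vector. The sketch below assumes that approach; `observe_with_payoffs` and the five-element observation are hypothetical and only illustrate the idea, not the runner's actual observation format.

```python
def observe_with_payoffs(observation, payoff_matrix):
    """Append the flattened payoff matrix to the observation vector."""
    flat_payoffs = [value for row in payoff_matrix for value in row]
    return list(observation) + flat_payoffs


# Example: a five-element one-hot observation followed by a 2x2 payoff matrix.
obs = observe_with_payoffs(
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [[-1.0, -3.0], [0.0, -2.0]],
)
```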

## Evo Mixed Payoff Only Opp (experimental)

The Evo Runner optimizes the first agent using evolutionary learning.
Noise is added to the opponents' IPD-like payoff matrix at each rollout. Each opponent has the same noise added.

See [this experiment](https://github.com/akbir/pax/blob/9a01bae33dcb2f812977be388751393f570957e9/pax/conf/experiment/ipd/shaper_att_v_tabular.yaml) for an example of how to configure it.
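To make the "same noise for every opponent" idea concrete, here is a minimal sketch that draws one noise matrix per rollout and adds it to a base IPD matrix for each opponent. The base payoffs, the Gaussian noise, the `scale` parameter, and the function names are all assumptions for illustration; the runner's own noise model may differ.

```python
import random

# Illustrative IPD payoffs for the row player.
# Rows: own action, columns: opponent action (0 = cooperate, 1 = defect).
BASE_IPD = [[-1.0, -3.0], [0.0, -2.0]]


def sample_noise(rng, scale=0.1):
    # One Gaussian perturbation per matrix entry (assumed noise model).
    return [[rng.gauss(0.0, scale) for _ in range(2)] for _ in range(2)]


def noisy_opponent_payoffs(rng, num_opps, scale=0.1):
    """Return one perturbed payoff matrix per opponent for a single rollout."""
    noise = sample_noise(rng, scale)  # drawn once, shared by all opponents
    noisy = [[BASE_IPD[i][j] + noise[i][j] for j in range(2)] for i in range(2)]
    return [noisy for _ in range(num_opps)]


opponent_payoffs = noisy_opponent_payoffs(random.Random(0), num_opps=3)
```

Because the noise is unconstrained, the perturbed matrix may no longer satisfy the strict Prisoner's Dilemma ordering, which is consistent with calling it IPD-like.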


