Welcome to this comprehensive guide on integrating reinforcement learning (RL) with Mesa environments. Mesa, an agent-based modeling framework, offers an excellent platform to experiment with RL algorithms. In this tutorial, we'll explore several examples of how RL can be applied to various Mesa environments, starting with the Epstein Civil Violence model.
Before diving into the implementation, take a moment to familiarize yourself with the Epstein Civil Violence model. This will give you a solid understanding of the environment we’ll be working with.
Next, ensure all dependencies are installed by following the instructions in the README.md.
To begin, let’s import the required modules for the Epstein Civil Violence model:
```python
from epstein_civil_violence.model import EpsteinCivilViolence_RL
from epstein_civil_violence.server import run_model
from epstein_civil_violence.train import config
from train import train_model
```
Here’s what each import provides:

- `EpsteinCivilViolence_RL`: Contains the core model and environment.
- `run_model`: Configures and runs the model for inference.
- `config`: Defines the parameters for training the model.
- `train_model`: Includes functions for training the RL agent using RLlib.
Let's load and reset the environment. This also allows us to inspect the observation space:
```python
env = EpsteinCivilViolence_RL()
observation, info = env.reset(seed=42)
```
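To actually inspect the spaces (assuming the environment exposes Gymnasium-style `observation_space` and `action_space` attributes, which the random-action loop below also relies on), you can print them directly:

```python
# Print the spaces the RL agents observe and act in
# (assumes Gymnasium-style attributes on the environment).
print("Observation space:", env.observation_space)
print("Action space:", env.action_space)
```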
To get a feel for how the environment operates, let's run it for a few steps using random actions. We’ll sample the action space for these actions:
```python
for _ in range(10):
    action_dict = {}
    for agent in env.schedule.agents:
        action_dict[agent.unique_id] = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action_dict)
    if terminated or truncated:
        observation, info = env.reset()
```
Now that you're familiar with the environment, let's train the RL model using the preset configuration:
```python
train_model(config, num_iterations=1, result_path='results.txt', checkpoint_dir='checkpoints')
```
Feel free to modify the training parameters in the `train_config.py` file to experiment with different outcomes.
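The exact contents of `config` are defined by the tutorial code, but if it is a standard RLlib-style configuration dictionary, adjusting it might look roughly like the sketch below. The keys shown here are common RLlib options used for illustration, not necessarily the ones `train_config.py` actually defines:

```python
# Hypothetical tweaks, assuming `config` is a plain RLlib-style dict;
# check train_config.py for the keys it actually defines.
config["num_workers"] = 2          # parallel rollout workers
config["lr"] = 5e-5                # learning rate
config["train_batch_size"] = 4000  # samples per training iteration

# Train for more iterations with the adjusted configuration.
train_model(config, num_iterations=10, result_path='results.txt', checkpoint_dir='checkpoints')
```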
After training, you can visualize the results by running inference on the model. Mesa's built-in visualization tools will help you launch a webpage to view the model's performance:
```python
server = run_model(path='checkpoints')
server.port = 6005
server.launch(open_browser=True)
```
In the example above, we utilized RLlib to integrate reinforcement learning algorithms with the Mesa environment, which is particularly useful when you want different policies for different agents. However, if your use case requires a simpler setup where all agents follow the same policy, you can opt for Stable-Baselines. An example of integrating Stable-Baselines with Mesa can be found in the Boltzmann Money model.
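For reference, a shared-policy setup with Stable-Baselines3 typically looks like the following sketch. The environment class name is a placeholder; see the Boltzmann Money example for the actual implementation, which needs to expose a single Gymnasium-compatible interface:

```python
from stable_baselines3 import PPO

# `BoltzmannWealthModelRL` is a placeholder for the Gymnasium-compatible env
# defined in the Boltzmann Money example; substitute the actual class.
env = BoltzmannWealthModelRL()
model = PPO("MlpPolicy", env, verbose=1)  # one policy shared by every agent
model.learn(total_timesteps=10_000)
model.save("ppo_boltzmann_money")
```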
If you're ready to explore RL in different agent-based scenarios, you can start by experimenting with various examples we provide at Mesa-Examples. These examples cover a range of scenarios and offer a great starting point for understanding how to apply RL within Mesa environments.
If you have your own scenario in mind, you can create it as a Mesa model by following this series of Tutorials. Once your scenario is set up as a Mesa model, you can refer to the code in the provided implementations to see how the RL components are built on top of the respective Mesa models.