datacollector: Allow collecting data from Agent (sub)classes #2300

Merged
merged 6 commits into from
Sep 21, 2024

Conversation

EwoutH
Member

@EwoutH EwoutH commented Sep 19, 2024

Enhanced Mesa's DataCollector to allow collecting data from Agent (sub)classes, providing more flexible and granular data collection capabilities.

Motive

To enable more comprehensive data collection in multi-agent simulations, allowing researchers to track attributes and behaviors specific to different agent types, including custom Agent subclasses.

Implementation

  • Modified DataCollector class to accept agenttype_reporters parameter
  • Added _new_agenttype_reporter method for handling agent-type-specific reporters
  • Updated collect method to handle agent-type-specific data collection
  • Added get_agenttype_vars_dataframe method for retrieving agent-type-specific data
  • Updated Model class to support agenttype_reporters in initialize_data_collector
  • Added support for collecting data from all Agent subclasses, not just predefined agent types
  • Updated docstrings and module-level documentation
  • Added comprehensive unit tests for the new functionality

Usage Examples

from mesa import DataCollector, Model

# Wolf and Sheep are Agent subclasses that share the base class Animal (defined elsewhere).


class MyModel(Model):
    def __init__(self):
        super().__init__()
        self.datacollector = DataCollector(
            agent_reporters={"life_span": "life_span"},
            # The new agenttype_reporters argument
            agenttype_reporters={
                Wolf: {"sheep_eaten": "sheep_eaten"},
                Sheep: {"wool": "wool_amount"},
                Animal: {"energy": "energy"},  # Collects from all animals
            },
        )


model = MyModel()

# Retrieve data for a specific agent type
wolf_data = model.datacollector.get_agenttype_vars_dataframe(Wolf)
# Retrieve data for all Animal subclasses, which in this case are Wolf and Sheep
animal_data = model.datacollector.get_agenttype_vars_dataframe(Animal)

Additional Notes

  • Backward compatible with existing DataCollector usage
  • Supports collecting data from custom Agent subclasses and superclasses

Part of #348.

@EwoutH EwoutH added the enhancement Release notes label label Sep 19, 2024
@EwoutH EwoutH changed the title datacollector: Allow collecting data from Agent type datacollector: Allow collecting data from Agent classes Sep 19, 2024

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| BoltzmannWealth | small | 🔵 +1.3% [+0.1%, +2.5%] | 🔵 +0.2% [-0.1%, +0.5%] |
| BoltzmannWealth | large | 🔵 -0.3% [-0.9%, +0.2%] | 🔵 -1.4% [-3.6%, +1.8%] |
| Schelling | small | 🔵 +0.1% [-0.3%, +0.4%] | 🔵 -0.4% [-0.6%, -0.2%] |
| Schelling | large | 🔵 +0.1% [-0.6%, +0.8%] | 🔵 -2.1% [-3.4%, -0.8%] |
| WolfSheep | small | 🔵 +0.9% [+0.6%, +1.1%] | 🔵 +0.0% [-0.2%, +0.3%] |
| WolfSheep | large | 🔵 -3.0% [-4.1%, -1.7%] | 🟢 -6.4% [-7.6%, -5.3%] |
| BoidFlockers | small | 🔵 +1.0% [+0.3%, +1.5%] | 🔵 -0.2% [-0.9%, +0.5%] |
| BoidFlockers | large | 🔵 +0.8% [+0.3%, +1.2%] | 🔵 +0.5% [-0.3%, +1.2%] |

@EwoutH EwoutH added the backport-candidate PRs we might want to backport to an earlier branch label Sep 19, 2024
@rht
Contributor

rht commented Sep 19, 2024

What about cases like Epstein civil violence, where you want to collect data on citizens that are jailed, a grouping that is not based on type?

@EwoutH
Member Author

EwoutH commented Sep 19, 2024

This PR doesn't fix everything; we clearly need a new DataCollector. But it's a minimally invasive, fully backwards compatible extension to the current DataCollector that everybody knows. It's even backportable to Mesa 2.x.

In Epstein civil violence, you can log if agents are jailed or not and filter afterwards. Not ideal, but workable.

Of course, I'm open to other suggestions for backwards compatible alternatives.
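
The "log the jailed state and filter afterwards" workaround can be sketched as follows. This is a minimal illustration in plain Python, not Mesa code: the `Record` class, its fields, and the sample rows are hypothetical stand-ins for rows that an agent-level reporter might produce.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One hypothetical collected row: a single agent at a single step."""
    step: int
    agent_id: int
    jailed: bool
    grievance: float


# Made-up sample rows, standing in for collected agent-level data
records = [
    Record(step=0, agent_id=1, jailed=False, grievance=0.2),
    Record(step=0, agent_id=2, jailed=True, grievance=0.9),
    Record(step=1, agent_id=1, jailed=True, grievance=0.4),
    Record(step=1, agent_id=2, jailed=True, grievance=0.9),
]


def jailed_per_step(rows):
    """Count jailed agents per step by filtering after collection."""
    counts = {}
    for row in rows:
        counts[row.step] = counts.get(row.step, 0) + int(row.jailed)
    return counts


jailed_counts = jailed_per_step(records)
jailed_only = [row for row in records if row.jailed]
```

The cost of this approach is that every agent is logged at every step, even when only the jailed subset is of interest.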

@rht
Contributor

rht commented Sep 19, 2024

The code path has been well tested with model reporter and agent reporter. Also, better than nothing. I'm fine with this PR.

@quaquel
Member

quaquel commented Sep 20, 2024

There are a few lines currently not covered by the tests, but otherwise, this seems fine. It solves a long-standing open issue in a backward-compatible way. It makes sense to add it while we continue work on building the foundations for a better datacollector.

@EwoutH
Member Author

EwoutH commented Sep 20, 2024

Added some additional tests.

I consciously named it agenttype_reporters, to leave room for other future arguments like agentset_reporters or custom_agent_reporters.

This functionality could probably be implemented more elegantly by reusing more of the existing code.

But I think the API is solid, and there are distinct use cases for it. One huge advantage of collecting by type is that you can ensure all agents have the attributes and methods you're collecting. It also elegantly uses the pre-existing model.agents_by_type dict, which is a very fast way to get these agent groups (since select can be an expensive operation).

Added `agenttype_reporters` to Mesa's DataCollector, enabling collection of data specific to agent types.
Added three new test methods to cover the missing codepaths:

1. `test_agenttype_reporter_string_attribute`: This test covers the case where the reporter is a string (attribute name).
2. `test_agenttype_reporter_function_with_params`: This test covers the case where the reporter is a list (function with parameters).
3. `test_agenttype_reporter_multiple_types`: This test explicitly checks that adding reporters for multiple agent types works correctly, which covers the case where `agent_type` is not initially in `self.agenttype_reporters`.
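
The reporter forms these tests exercise (an attribute name as a string, and a list holding a function plus its parameters) can be illustrated with a small dispatcher. This is a sketch under stated assumptions, not Mesa's implementation: `resolve_reporter`, `scaled_wool`, and the stand-in `Sheep` class are hypothetical names introduced only for this example.

```python
class Sheep:
    """Stand-in agent class with a single attribute (not a Mesa Agent)."""

    def __init__(self, wool_amount):
        self.wool_amount = wool_amount


def scaled_wool(agent, factor):
    """Example reporter function that takes an extra parameter."""
    return agent.wool_amount * factor


def resolve_reporter(agent, reporter):
    """Evaluate one reporter against one agent (illustrative only)."""
    if isinstance(reporter, str):
        # String reporter: treated as an attribute name
        return getattr(agent, reporter, None)
    if isinstance(reporter, list):
        # List reporter: [function, [param1, param2, ...]]
        func, params = reporter
        return func(agent, *params)
    if callable(reporter):
        # Plain callable: called with the agent as its only argument
        return reporter(agent)
    raise TypeError(f"Unsupported reporter: {reporter!r}")


sheep = Sheep(wool_amount=3)
by_attribute = resolve_reporter(sheep, "wool_amount")
by_function_with_params = resolve_reporter(sheep, [scaled_wool, [2]])
by_callable = resolve_reporter(sheep, lambda a: a.wool_amount + 1)
```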
@EwoutH EwoutH force-pushed the datacollector_agentset_reporters branch from f0c7f0d to 3dbb5b6 Compare September 20, 2024 08:19
@EwoutH EwoutH changed the title datacollector: Allow collecting data from Agent classes datacollector: Allow collecting data from Agent (sub)classes Sep 20, 2024
@EwoutH EwoutH added feature Release notes label and removed enhancement Release notes label labels Sep 20, 2024
@EwoutH EwoutH added this to the v3.0 milestone Sep 20, 2024
@EwoutH
Member Author

EwoutH commented Sep 20, 2024

In 86b184d and 3dbb5b6 I updated this feature to allow collecting from any Agent (sub)class, including a class like Animal, which is not present in model.agents_by_type but is a valid Agent (sub)class. It even allows passing Agent itself, but that's of course redundant with agent_reporters.

        self.datacollector = DataCollector(
            agenttype_reporters={
                Sheep: {"wool": "wool_amount"},
                Animal: {"energy": "energy"}  # Collects from all animals
            }
        )

I would love some final reviews on this (also from @Corvince).
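
One way to picture collecting from a class that is not itself a key in the by-type mapping is a direct lookup with an `isinstance` fallback. The sketch below is plain Python and only an assumption about how such a lookup could work; `agents_of_type` and the stand-in classes are hypothetical, and the plain dict merely mimics the shape of `model.agents_by_type`.

```python
class Agent:
    """Stand-in base class (not mesa.Agent)."""


class Animal(Agent):
    def __init__(self, energy):
        self.energy = energy


class Wolf(Animal):
    pass


class Sheep(Animal):
    pass


class Grass(Agent):
    pass


def agents_of_type(agents_by_type, agent_type):
    """Return agents for a class, falling back to an isinstance scan."""
    if agent_type in agents_by_type:
        # Fast path: the exact class is already a key in the mapping
        return list(agents_by_type[agent_type])
    # Slow path: a superclass such as Animal that has no direct instances
    return [
        agent
        for group in agents_by_type.values()
        for agent in group
        if isinstance(agent, agent_type)
    ]


agents_by_type = {
    Wolf: [Wolf(energy=5)],
    Sheep: [Sheep(energy=2), Sheep(energy=3)],
    Grass: [Grass()],
}
animal_energy = sorted(a.energy for a in agents_of_type(agents_by_type, Animal))
sheep_count = len(agents_of_type(agents_by_type, Sheep))
```

In this sketch, a class that has direct instances takes the fast path and therefore returns only those exact-type instances; the scan over all groups is used only when the class is not a key.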


Finally, DataCollector can create a pandas DataFrame from each collection.

The default DataCollector here makes several assumptions:
* The model has an agent list called agents
* The model has a dictionary of AgentSets called agents_by_type
* For collecting agent-level variables, agents must have a unique_id
Member


Do we need this part about assumptions? All of this is in Model. Why not say: "The DataCollector is designed to work with Model or any subclass thereof"?

Member

@quaquel quaquel left a comment


2 minor remaining comments but otherwise seems good to go.

All three assumptions are now guaranteed, because we require the Agent and Model superclasses to always be initialized. They are therefore no longer relevant to the user.
@EwoutH
Member Author

EwoutH commented Sep 21, 2024

Thanks for reviewing!

I'm merging to keep this moving forward, but @Corvince, I would still really love to hear your thoughts on the API.

@EwoutH EwoutH merged commit e6874ad into projectmesa:main Sep 21, 2024
10 of 12 checks passed
EwoutH added a commit to EwoutH/mesa that referenced this pull request Sep 24, 2024
…mesa#2300)

Labels
backport-candidate PRs we might want to backport to an earlier branch feature Release notes label