Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace weakref dict with normal dict #1993

Closed
wants to merge 6 commits into from

Conversation

Corvince
Copy link
Contributor

Just noticed that with the cell space testing agentrefs became the main bottleneck. This PR is just to run the benchmark code and see how much difference it makes. Maybe we can talk about the advantages and disadvantages of weakrefs some more. (That is to say I want to start a discussion, not necessarily get this merged).

Copy link

Performance benchmarks:

Model Size Init time [95% CI] Run time [95% CI]
Schelling small 🟢 -22.3% [-22.6%, -22.0%] 🟢 -25.9% [-26.1%, -25.8%]
Schelling large 🟢 -23.7% [-24.3%, -23.1%] 🟢 -26.4% [-27.2%, -25.6%]
WolfSheep small 🟢 -25.9% [-26.4%, -25.5%] 🟢 -22.3% [-22.4%, -22.2%]
WolfSheep large 🟢 -48.0% [-57.1%, -37.6%] 🟢 -24.7% [-25.5%, -24.0%]
BoidFlockers small 🟢 -6.7% [-7.2%, -6.1%] 🔵 -1.9% [-2.5%, -1.3%]
BoidFlockers large 🟢 -7.0% [-7.6%, -6.3%] 🔵 +0.1% [-0.5%, +0.7%]

@EwoutH
Copy link
Member

EwoutH commented Jan 23, 2024

That are some serious numbers. Quite consistent reduction in runtime of ~25%.

Wonder how we can keep the functionality of weakness without the (majority of) the performance impact.

At least good to know that this is the effect.

@rht
Copy link
Contributor

rht commented Jan 23, 2024

Cython has __weakref__ functionality that could be tested for performance. I will allocate a time this week to migrate the build system from Hatch to Meson, so that experimenting with Cython code is much easier.

@EwoutH
Copy link
Member

EwoutH commented Jan 23, 2024

Didn't we just migrate to hatch? Anyway, that would be great!

@quaquel
Copy link
Member

quaquel commented Jan 24, 2024

Thanks for checking this. It was also on my list of things to check. I am surprised by the overhead of weakrefs. I knew it is done in pure python and that there thus would be overhead. I never imagined that it was this big.

The reason for using weakrefs in AgentSet was to avoid memory leaks. It is also convenient for the user that you don't have to remove agents from the space, the scheduler, and whatever else there might be. However, if the performance overhead is so large, I am wondering whether the cure is not worse than the problem.

@quaquel
Copy link
Member

quaquel commented Jan 24, 2024

So, I ran into the slowness again. While playing with #1994, I moved Cell.agents to Cell._agents and turned Cell.agents into a property that returned an AgentSet. On WolfSheep, this doubles the runtime, which is just plain silly.

@quaquel
Copy link
Member

quaquel commented Jan 24, 2024

So the problem seems to be WeakkeyDictionary which is a pure python implementation. It might be possible to substantially improve our speed without losing the use of weakrefs.

@quaquel
Copy link
Member

quaquel commented Jan 25, 2024

A quick update: I looked into the current implementation of WeakkeyDict. It is basically a clever wrapper around a normal dict with weakkey.ref() keys. Some minor speedups may be possible because the code uses some slightly dated constructions and has protection for multithreading (which we don't need). I am going to test this over the weekend. Since it is normal Python code, I also see no reason why Cython could not be used. Again, I'll try to test over the weekend.

@quaquel
Copy link
Member

quaquel commented Jan 26, 2024

Another experiment. I changed the implementation of Agentset.shuffle in 2 ways: I shuffle the weakrefs themselves, and it defaults to inplace updating (quick dirty check, can be changed also within the schedulers. The first column below is the current code on main. I'll do a comparison against @Corvince version with a dict later this morning.

Model Size Init time [95% CI] Run time [95% CI]
Schelling small 🔵 -0.1% [-0.5%, +0.3%] 🟢 -8.0% [-8.1%, -7.9%]
Schelling large 🔵 -1.2% [-2.6%, +0.3%] 🟢 -8.2% [-9.2%, -7.2%]
WolfSheep small 🔵 -3.6% [-4.4%, -2.8%] 🟢 -19.9% [-20.2%, -19.5%]
WolfSheep large 🟢 -56.3% [-63.0%, -43.8%] 🟢 -19.0% [-21.8%, -15.5%]
BoidFlockers small 🔵 +0.9% [-0.6%, +2.6%] 🔵 -3.4% [-4.4%, -2.3%]
BoidFlockers large 🔵 +0.1% [-1.3%, +1.5%] 🔵 -1.9% [-3.1%, -1.0%]

The new code is below. The key slowdown compared to a normal dict is that when iterating over a WeakkeyDict, you are turning all weakrefs into hard refs, and (in agentset) back again into weakrefs. This is rather inefficient as evidenced above.

    def shuffle(self, inplace: bool = True) -> AgentSet:
        """
        Randomly shuffle the order of agents in the AgentSet.

        Args:
            inplace (bool, optional): If True, shuffles the agents in the current AgentSet; otherwise, returns a new shuffled AgentSet. Defaults to False.

        Returns:
            AgentSet: A shuffled AgentSet. Returns the current AgentSet if inplace is True.
        """
        weakrefs = list(self._agents.keyrefs())
        self.random.shuffle(weakrefs)

        if inplace:
            self._agents.data = {entry:None for entry in weakrefs}
            return self
        else:
            return AgentSet((agent for ref in weakrefs if (agent:=ref()) is not None), self.model)

@EwoutH
Copy link
Member

EwoutH commented Jan 26, 2024

Great work! Curious if we can get towards that 25%, or even more with Cython.

@quaquel
Copy link
Member

quaquel commented Jan 26, 2024

For completeness: dict (@Corvince version) against updated shuffle. So, there is still a performance loss for using WeakkeyDict, but updating the shuffle can help a lot. I am not sure why large Schelling is not performant. I am also curious why the init times are so much better for a dict.

Model Size Init time [95% CI] Run time [95% CI]
Schelling small 🟢 -17.3% [-17.7%, -16.7%] 🟢 -16.7% [-17.0%, -16.4%]
Schelling large 🟢 -19.8% [-20.4%, -19.1%] 🟢 -24.1% [-24.9%, -23.3%]
WolfSheep small 🟢 -18.5% [-19.3%, -17.8%] 🟢 -5.2% [-5.7%, -4.7%]
WolfSheep large 🟢 -17.6% [-18.8%, -16.1%] 🟢 -13.2% [-15.4%, -11.3%]
BoidFlockers small 🟢 -7.9% [-9.2%, -6.8%] 🔵 -0.9% [-1.5%, -0.4%]
BoidFlockers large 🟢 -7.1% [-8.5%, -5.6%] 🔵 -0.6% [-1.3%, +0.2%]

@quaquel
Copy link
Member

quaquel commented Jan 26, 2024

With #2007 merged and my suggested change to shuffle, we are getting close. Here, I compare the dict version to the current main with the updated shuffle implementation. As can be seen, we now lose about 15% with weakrefs. We effectively have halved the performance difference on the large models.

Model Size Init time [95% CI] Run time [95% CI]
Schelling small 🟢 -19.5% [-20.2%, -18.7%] 🔵 +1.1% [+0.5%, +1.8%]
Schelling large 🟢 -21.5% [-22.6%, -20.6%] 🟢 -12.0% [-16.3%, -8.0%]
WolfSheep small 🟢 -18.4% [-19.0%, -17.7%] 🟢 -5.4% [-5.9%, -4.8%]
WolfSheep large 🟢 -17.0% [-18.5%, -15.3%] 🟢 -12.5% [-14.2%, -10.9%]
BoidFlockers small 🟢 -8.6% [-10.2%, -7.0%] 🔵 -1.0% [-1.8%, -0.3%]
BoidFlockers large 🟢 -7.0% [-8.5%, -5.5%] 🔵 -1.8% [-3.0%, -0.4%]

@quaquel quaquel added trigger-benchmarks Special label that triggers the benchmarking CI and removed trigger-benchmarks Special label that triggers the benchmarking CI labels Feb 24, 2024
@quaquel quaquel added trigger-benchmarks Special label that triggers the benchmarking CI and removed trigger-benchmarks Special label that triggers the benchmarking CI labels Feb 24, 2024
@quaquel quaquel added the trigger-benchmarks Special label that triggers the benchmarking CI label Feb 24, 2024
@quaquel quaquel added trigger-benchmarks Special label that triggers the benchmarking CI and removed trigger-benchmarks Special label that triggers the benchmarking CI labels Feb 24, 2024
Copy link

Performance benchmarks:

Model Size Init time [95% CI] Run time [95% CI]
Schelling small 🟢 -22.7% [-23.0%, -22.2%] 🔵 -2.6% [-2.9%, -2.4%]
Schelling large 🟢 -29.8% [-39.6%, -24.6%] 🔵 -1.6% [-2.7%, -0.4%]
WolfSheep small 🟢 -26.4% [-26.7%, -26.1%] 🟢 -5.6% [-5.8%, -5.5%]
WolfSheep large 🟢 -28.8% [-35.9%, -24.9%] 🟢 -5.8% [-7.9%, -3.5%]
BoidFlockers small 🟢 -6.6% [-7.1%, -6.1%] 🔵 -0.8% [-1.6%, +0.0%]
BoidFlockers large 🟢 -8.4% [-9.0%, -7.8%] 🔵 -1.1% [-1.7%, -0.4%]

@quaquel
Copy link
Member

quaquel commented Feb 24, 2024

After me being stupid for a a bit, I got the benchmarks to run again. So the performance difference of dict compared to weakkeydict is still there but it is a lot smaller than it was before.

@Corvince
Copy link
Contributor Author

Yes I think the difference is negligible at this point. I think we can close this PR then, do you agree?

@quaquel quaquel closed this Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
trigger-benchmarks Special label that triggers the benchmarking CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants