Reimplement Lib/Set.hpp as a wrapper of std::unordered_set #551

inpefess · 2024-05-08T11:13:11Z

Why?

And why have original basic container implementations instead of standard ones?

Why not replace Vampire's set with `std::unordered_set`?

That would mean many more changes at once. This issue proposes to begin with only one file (Lib/Set.hpp).

We have Vampire's set rewritten as a wrapper around `std::unordered_set`, so what?

We can replace the wrapper's method calls with the standard ones of the wrapped set, case by case, wherever they appear in the code (some parts might need more effort to refactor than others). Ideally, this process ends with the original implementation being replaced entirely by the standard one.

We can also encourage new code to use standard sets.

If such an approach works, we can do the same with other containers.
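To make the incremental migration described above more concrete, here is a minimal sketch of what such a wrapper might look like. The class name `SetWrapper` and its members (`insert`, `contains`, `remove`, `size`, `std_set`) are hypothetical and are not claimed to match the actual interface in Lib/Set.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical sketch: member names are illustrative only and are NOT
// claimed to match the real Lib::Set interface.
template <typename Val, typename Hash = std::hash<Val>>
class SetWrapper {
public:
  // Inserts val if absent; returns a reference to the stored value either way.
  const Val& insert(const Val& val) { return *_impl.insert(val).first; }

  // True if an equal value is stored.
  bool contains(const Val& val) const { return _impl.find(val) != _impl.end(); }

  // Returns true if the value was present and has been removed.
  bool remove(const Val& val) { return _impl.erase(val) > 0; }

  std::size_t size() const { return _impl.size(); }

  // Escape hatch for the migration: call sites can switch to the
  // standard API one at a time through this accessor.
  std::unordered_set<Val, Hash>& std_set() { return _impl; }

private:
  std::unordered_set<Val, Hash> _impl;
};
```

A call site that currently uses a wrapper method could first be rewritten against `std_set()` and later against a plain `std::unordered_set`, at which point the wrapper method can be deleted.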

Why start with sets?

We already have unit tests for this class.


inpefess · 2024-05-08T11:14:25Z

I can work on it if there are no objections against the very idea.

mezpusz · 2024-05-08T12:50:49Z

I'm not against the idea, just wanted to note that Vampire's Set is closer to std::unordered_set than std::set to my knowledge, using hashing rather than red-black trees/heaps/whatnot. Also note that we use open addressing, and STL could be using something more sophisticated (and more efficient). But in general, especially in critical code such as in TermSharing, we should be careful with the change.

MichaelRawson · 2024-05-08T14:38:02Z

I am very excited to see this kind of change proposed! Thanks for filing this. If you are sufficiently brave to tackle this please go ahead and try it once there are no more comments here, otherwise I will do it at some point.

It may be that either

1. It is not possible to use std::unordered_set to implement Lib::Set because of the interface and how it is used.
2. We lose performance this way.

The former can probably be solved by changing how we use sets on a case-by-case basis. The latter I could forgive if the penalty is minimal, as in my experience the straight-line speed rarely translates to many solved problems.

Another thorny aspect may be the Hash parameter, which is used somewhat inconsistently. This can and should be cleaned up anyway.

JakobR · 2024-05-08T17:05:42Z

When I last tested std::unordered_map, it was quite a bit slower than Vampire's DHMap, at least on my machine. I would assume the implementations for sets are similar. (It seems to be a common criticism of std::unordered_map that it is relatively slow because of some API requirements of the C++ standard. But I don't remember the details.)

inpefess · 2024-05-08T18:02:08Z

Great! Thank you for your comments. I will give it a try.

MichaelRawson · 2024-05-09T06:00:37Z

A solution if std::unordered_map turns out to be too slow is to rewrite the container as simply as possible. We only need one set/map structure, not two, and I personally doubt double-hashing (the "DH" in DHMap/DHSet) is worth the extra complexity.
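For readers unfamiliar with the term, the following is a small illustration of open addressing with double hashing: on a collision, the probe sequence advances by a step derived from a second hash of the key. This is a toy sketch under stated assumptions (fixed prime capacity, unsigned integer keys, no deletion, made-up hash functions); it is not Vampire's DHMap/DHSet code.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy fixed-capacity set of unsigned integers using open addressing with
// double hashing. All names and both hash functions are assumptions made
// for illustration; this is not Vampire's DHMap/DHSet.
class TinyDoubleHashSet {
public:
  // The capacity should be a prime >= 2, so that every step size in
  // [1, capacity - 1] eventually visits all slots.
  explicit TinyDoubleHashSet(std::size_t primeCapacity = 13)
    : _slots(primeCapacity) {}

  // Returns true if key was newly inserted, false if it was already
  // present or the table is full.
  bool insert(unsigned key) {
    std::size_t pos = first(key);
    const std::size_t step = second(key);
    for (std::size_t i = 0; i < _slots.size(); ++i) {
      if (!_slots[pos]) { _slots[pos] = key; return true; }
      if (*_slots[pos] == key) return false;
      pos = (pos + step) % _slots.size();
    }
    return false;  // every slot is occupied by another key
  }

  bool contains(unsigned key) const {
    std::size_t pos = first(key);
    const std::size_t step = second(key);
    for (std::size_t i = 0; i < _slots.size(); ++i) {
      if (!_slots[pos]) return false;  // an empty slot ends the probe sequence
      if (*_slots[pos] == key) return true;
      pos = (pos + step) % _slots.size();
    }
    return false;
  }

private:
  // First hash: the starting slot.
  std::size_t first(unsigned key) const { return key % _slots.size(); }
  // Second hash: the probe step, never zero.
  std::size_t second(unsigned key) const { return 1 + key % (_slots.size() - 1); }

  std::vector<std::optional<unsigned>> _slots;
};
```

With the default capacity of 13, the keys 1, 14 and 27 all start at slot 1 but use different steps (2, 3 and 4), which is the point of the second hash. Linear probing would use a step of 1 for every key, which is simpler and is one candidate for "as simply as possible".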

inpefess · 2024-05-09T09:16:45Z

I see at least one non-standard method, `rawFindOrInsert` (vampire/Lib/Set.hpp, line 170 in 1916f50: `* Checks whether a value with a given hashCode is in the map.`), giving a user a surprising amount of liberty (to use different hash functions for different items stored, for example). Happily, it appears only in Kernel/Term.cpp (e.g. vampire/Kernel/Term.cpp, line 850 in 1916f50: `Term::termHash(function, [&](auto i){ return args[i]; }, arity),`), and I don't see any per-item hash functions there. I assume this issue to be blocked by that. I will propose a possible refactoring in a separate issue, e.g. defining a pertinent hash function at the moment the set of terms is created instead of passing hash functions when inserting.
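As a rough illustration of fixing the hash function when the set is created, the sketch below passes a hasher as a template argument of `std::unordered_set`, so that no hash code is supplied at insertion time. `ToyTerm`, `ToyTermHash`, `ToyTermSet` and `findOrInsert` are hypothetical stand-ins invented for this example; they do not reflect the real `Term` class or `Term::termHash`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shared term: a function symbol and arguments.
struct ToyTerm {
  unsigned functor;
  std::vector<unsigned> args;
  bool operator==(const ToyTerm& other) const {
    return functor == other.functor && args == other.args;
  }
};

// The hash function is chosen once, as part of the set's type, rather than
// being passed along with every insertion.
struct ToyTermHash {
  std::size_t operator()(const ToyTerm& t) const {
    std::size_t h = std::hash<unsigned>{}(t.functor);
    for (unsigned a : t.args) {
      // A common hash-combining step; the constant is arbitrary here.
      h ^= std::hash<unsigned>{}(a) + 0x9e3779b9u + (h << 6) + (h >> 2);
    }
    return h;
  }
};

using ToyTermSet = std::unordered_set<ToyTerm, ToyTermHash>;

// Returns the stored term equal to t, inserting t first if it is absent.
// References to elements of std::unordered_set stay valid across rehashing.
inline const ToyTerm& findOrInsert(ToyTermSet& set, ToyTerm t) {
  return *set.insert(std::move(t)).first;
}
```

One difference from a raw interface is that this version needs a fully constructed key before the lookup, whereas passing a precomputed hash code allows probing first. Whether that matters for term sharing is something only measurements on the real code can answer.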

MichaelRawson · 2024-05-09T16:22:15Z

Fine by me - go straight to PR if you like, rawFindOrInsert is not ideal and I would gladly be rid of it.

quickbeam123 · 2024-05-15T10:42:34Z

This is definitely a nice exercise!

However, please try to approach this locally first (if at all possible - maybe focusing on an expected hot spot), so that you don't spend too much time on it, before we can compare the performance.

I wouldn't be too happy to accept a large refactor which on top of things slows Vampire down a bit, just to be closer to using standard classes.
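A local comparison of this kind could start from a small timing harness such as the sketch below. It is only an assumption about how one might measure: `BenchResult` and `benchInsertLookup` are hypothetical names, the workload (distinct pseudo-random unsigned keys, insert then look up) is arbitrary, and a container with a different interface would need a thin adapter exposing `insert` and `count`.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <unordered_set>

// Result of one run: number of successful lookups and elapsed wall time.
struct BenchResult {
  std::size_t hits;
  double seconds;
};

// Inserts n distinct keys into a default-constructed SetLike and then looks
// each of them up again. SetLike must provide insert(unsigned) and
// count(unsigned), as std::unordered_set<unsigned> does.
template <typename SetLike>
BenchResult benchInsertLookup(std::size_t n) {
  // Multiplying by an odd constant modulo 2^32 scatters the keys while
  // keeping them distinct (for n below 2^32).
  auto key = [](std::size_t i) { return static_cast<unsigned>(i * 2654435761u); };

  SetLike set;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < n; ++i) {
    set.insert(key(i));
  }
  std::size_t hits = 0;
  for (std::size_t i = 0; i < n; ++i) {
    hits += set.count(key(i));
  }
  const auto stop = std::chrono::steady_clock::now();
  return {hits, std::chrono::duration<double>(stop - start).count()};
}
```

Calling `benchInsertLookup<std::unordered_set<unsigned>>(1000000)` and the same function on an adapter for the existing set gives two timings to compare. A synthetic loop like this says little about end-to-end prover performance, so it is best treated as a first sanity check before measuring a real hot spot.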

inpefess changed the title ~~Reimplement Lib/Set.hpp as a wrapper of std::set~~ Reimplement Lib/Set.hpp as a wrapper of std::unordered_set May 8, 2024
