What guarantees do u32 Atom hash values offer? #224

Open
dekellum opened this issue Sep 25, 2019 · 1 comment


@dekellum
Contributor

This crate uses the phf* (perfect hash function) crates underneath. If I understand correctly, phf guarantees perfect (collision-free) hash values for a static set aggregated at compile time.

However, as implemented here, these u64 hashes are shift-xor'd down to u32 values:

// This may or may not be great...

Does this not reintroduce a risk of collision? Or does string_cache_codegen also guarantee that the 32-bit hash doesn't collide for a static set? Or does string_cache not really need to care about such collisions?

Depending on the answer, I might be able to offer a doc PR.

@devongovett

We found two strings that collide: "rgb2hex" and "hex2rgb". Both fit within MAX_INLINE_LEN, so they hit this code path when hashing.

string-cache/src/atom.rs

Lines 144 to 148 in 0cf93bb

INLINE_TAG => {
    let data = self.unsafe_data.get();
    // This may or may not be great...
    ((data >> 32) ^ data) as u32
}
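
To make the collision concrete, here is a minimal standalone sketch. It does not use string_cache itself: `pack_inline` and `fold_hash` are hypothetical helpers, and the packing (tag and length in the lowest byte, string bytes in the next seven bytes) is an assumption about the inline layout, not a copy of the crate's code. Only the fold expression is taken from the snippet above.

```rust
// Assumed constants modelling an inline atom; not taken from the crate.
const INLINE_TAG: u64 = 0b01;
const LEN_OFFSET: u64 = 4;
const MAX_INLINE_LEN: usize = 7;

/// Hypothetical helper: pack a short string into a u64, with the tag and
/// length in byte 0 and the string bytes in bytes 1..=7.
fn pack_inline(s: &str) -> u64 {
    let bytes = s.as_bytes();
    assert!(bytes.len() <= MAX_INLINE_LEN);
    let mut data = INLINE_TAG | ((bytes.len() as u64) << LEN_OFFSET);
    for (i, &b) in bytes.iter().enumerate() {
        data |= (b as u64) << (8 * (i + 1));
    }
    data
}

/// The fold from the snippet above: xor the high half into the low half.
fn fold_hash(data: u64) -> u32 {
    ((data >> 32) ^ data) as u32
}

fn main() {
    let a = pack_inline("rgb2hex");
    let b = pack_inline("hex2rgb");
    // The packed values differ, but the folded 32-bit values are equal.
    println!("{:#018x} -> {:#010x}", a, fold_hash(a));
    println!("{:#018x} -> {:#010x}", b, fold_hash(b));
    assert_ne!(a, b);
    assert_eq!(fold_hash(a), fold_hash(b));
}
```

Under this model the fold pairs each of the first three characters with the character four positions later, and xor is commutative, so swapping "rgb" and "hex" around the shared "2" leaves every xor'd byte unchanged.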

Should this not write the actual bytes to the hasher rather than only a u32?

state.write_u32(self.get_hash())
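
As a rough illustration of the alternative being asked about, the sketch below compares feeding a hasher the folded u32 against feeding it the full 64 bits. `FoldedKey` and `FullKey` are hypothetical stand-ins, not types from string_cache, and the two sample values are chosen only so that their 32-bit halves are swapped.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical key that hashes only the folded 32 bits, as in the issue.
struct FoldedKey(u64);

impl Hash for FoldedKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u32(((self.0 >> 32) ^ self.0) as u32)
    }
}

/// Hypothetical key that feeds all 64 bits to the hasher.
struct FullKey(u64);

impl Hash for FullKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.0)
    }
}

/// Run a value through the standard library's default hasher.
fn finish_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Two distinct values whose high and low halves are swapped.
    let a = 0x1111_2222_3333_4444_u64;
    let b = 0x3333_4444_1111_2222_u64;

    // Folding discards the difference before the hasher ever sees it.
    assert_eq!(finish_of(&FoldedKey(a)), finish_of(&FoldedKey(b)));
    // Writing the full value lets the hasher distinguish the two inputs.
    assert_ne!(finish_of(&FullKey(a)), finish_of(&FullKey(b)));
}
```

Whether the crate should do this depends on what the u32 is used for elsewhere, which this sketch does not address.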
