Rethink label caching #9

Open
dnmilne opened this issue Dec 16, 2013 · 0 comments


dnmilne commented Dec 16, 2013

Label caching is only needed for wikification, and there the cache sees far more misses than hits: we check every n-gram in the document, and most of them are nonsense phrases. A Bloom filter would cheaply reject nearly all of the misses (it never rejects a real label, but it lets a small fraction of non-labels through), and looking up the remaining candidates directly in the database would probably be fast enough.
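
A rough sketch of the idea, in Python for brevity and not taken from the project's code. The names `BloomFilter`, `ngrams`, `find_labels`, and the `lookup_label` callback (standing in for the database lookup) are all hypothetical, and the sizing uses the standard Bloom filter formulas for a target false-positive rate:

```python
import hashlib
import math


class BloomFilter:
    """Small Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, expected_items, false_positive_rate=0.01):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n = max(1, expected_items)
        self.num_bits = max(
            8, math.ceil(-n * math.log(false_positive_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, text):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd, so never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, text):
        for pos in self._positions(text):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, text):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(text)
        )


def ngrams(tokens, max_len):
    """Yield every n-gram of 1..max_len tokens as a space-joined string."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield " ".join(tokens[start:end])


def find_labels(tokens, label_filter, lookup_label, max_len=3):
    """Return {ngram: lookup result} for n-grams that are real labels.

    lookup_label is a hypothetical stand-in for the database query; it is
    only called for n-grams that pass the Bloom filter.
    """
    found = {}
    for gram in ngrams(tokens, max_len):
        if gram not in label_filter:
            continue  # definitely not a label, so skip the database
        result = lookup_label(gram)  # may be None on a filter false positive
        if result is not None:
            found[gram] = result
    return found
```

To try it, build the filter once from all known labels (calling `add` for each), then pass a tokenized document to `find_labels` with something like `dict.get` over an in-memory label table in place of the real database. The filter only decides whether a lookup is worth making, so a false positive costs one wasted query and never produces a wrong result.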
