Key bugs using a 32-bit signed int calculated using a fast hash #6

Open
wants to merge 3 commits into
base: master

davcamer

The hash algorithm is based on Java's String.hashCode.

Code was taken from a response on StackOverflow:
http://stackoverflow.com/questions/7616461/generate-a-hash-from-string-in-javascript-jquery

This cut indexing time by 50% and index size by 20%.
Nothing is free though, and insert time is increased by 25%.

Inserting 1000 documents is still 6x faster than indexing them.
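
For readers who have not seen it, a hash in the style of Java's String.hashCode can be sketched roughly as below. This is a minimal illustration of the general technique, not necessarily the exact code in this pull request; the function name `hashCode` is an assumption.

```javascript
// Sketch of a Java String.hashCode-style hash (function name is illustrative).
function hashCode(str) {
  var hash = 0;
  for (var i = 0; i < str.length; i++) {
    // Equivalent to hash * 31 + charCode, written as a shift and a subtract.
    hash = ((hash << 5) - hash) + str.charCodeAt(i);
    // Truncate to a 32-bit signed integer, matching Java's int overflow.
    hash |= 0;
  }
  return hash;
}
```

The result is always a 32-bit signed int, so it makes a much shorter view key than the full string it was computed from. Distinct strings can collide, which is the usual trade-off with a fast hash.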

TL;DR This reduces the number of views, which cuts size on disk
and view server processing time, and in some cases also reduces
the number of trips to the view server.

This is a bit of a novel because it is a squash of three
well-researched but minor commits.

Remove reduce functions which only returned null

The description of Lookup Views on the wiki specifically mentions that
null is an appropriate emit value for a view that will only be used for lookup.
I believe that defining a reduce function for such a view requires
roundtrips to the view server without accomplishing any actual work. Leaving
the function out, on the other hand, makes the view skip the reduce step
by default, at least on the query side (and hopefully during index
generation as well), as described in the query options section.

http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views#Lookup_Views
http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options
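
As a rough illustration of what a lookup view without a reduce looks like, here is a minimal sketch. The view name, the `installation_id` field, and the local `emit` stand-in are assumptions made so the snippet runs outside CouchDB; they are not taken from this change.

```javascript
// Stand-in for CouchDB's emit(), so the map function can run outside the view server.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Lookup-only map function: the key carries the information, the value is null.
// The field name installation_id is hypothetical.
function mapByInstallationId(doc) {
  if (doc.installation_id) {
    emit(doc.installation_id, null);
  }
}

// Design document fragment: the view has a "map" member and no "reduce" member.
var views = {
  "by-installation-id": {
    map: mapByInstallationId.toString()
  }
};
```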

Use built-in _sum function where appropriate

This should eliminate trips to the view server by doing the summation in the
Erlang process. Documentation from couchapp (erica's predecessor) describes
a reduce file containing only '_sum' as the correct way to do this for a couchapp.
Several sources mention that the built-in function is more efficient
than the equivalent JavaScript function.

http://couchapp.org/page/faq

http://wiki.apache.org/couchdb/Built-In_Reduce_Functions
http://docs.couchdb.org/en/latest/ddocs.html
http://nosql.mypopescu.com/post/773435732/couchdb-built-in-reduce-functions
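
To make the difference concrete, here is a small sketch that puts a hand-written JavaScript sum reduce next to the built-in form. The view names and the `day` field are hypothetical; only the `"_sum"` string is the built-in that the links above describe.

```javascript
// Hand-written reduce: CouchDB has to ship keys and values to the JavaScript view server.
function jsSumReduce(keys, values, rereduce) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical map source shared by both views.
var mapSource = "function (doc) { emit(doc.day, 1); }";

var views = {
  // Reduce stored as JavaScript source.
  "reports-per-day-js": { map: mapSource, reduce: jsSumReduce.toString() },
  // Reduce stored as the built-in name, evaluated inside the Erlang process.
  "reports-per-day-builtin": { map: mapSource, reduce: "_sum" }
};
```

Both views should return the same totals; the second one only changes where the summation runs.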

Replacing reports-per views with recent-items where possible

- reports-per-android-version => recent-items-by-androidver
- reports-per-app-version-code => recent-items-by-appver
- reports-per-app-version-name => recent-items-by-appvercode
- recent-items-by-bug => recent-items-by-bug-by-installation-id

For these views, and also for recent-items and recent-items-by-installation-id, the view
value no longer captures the summary data. Instead, Acralyzer will use the include_docs
option to get the needed data in the browser. This change:
- makes document storage about one-third smaller
- makes view indexing about one-quarter faster
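
On the client side, the include_docs approach could look roughly like the sketch below. The host, database name, design document name, and helper names are placeholders chosen for illustration, not Acralyzer's actual code; the `/_design/.../_view/...` path and the `include_docs`, `descending`, and `limit` options are standard CouchDB view query features.

```javascript
// Build a view query URL. All names passed in below are placeholders.
function buildViewUrl(baseUrl, db, designDoc, view, options) {
  var query = Object.keys(options).map(function (name) {
    return encodeURIComponent(name) + "=" + encodeURIComponent(String(options[name]));
  }).join("&");
  return baseUrl + "/" + encodeURIComponent(db) +
    "/_design/" + encodeURIComponent(designDoc) +
    "/_view/" + encodeURIComponent(view) + "?" + query;
}

// With include_docs=true each row carries the full document in row.doc,
// so the view value itself can stay null.
function docsFromViewResponse(response) {
  return response.rows.map(function (row) {
    return row.doc;
  });
}

var url = buildViewUrl("http://localhost:5984", "acra-myapp", "acra-storage",
  "recent-items", { include_docs: true, descending: true, limit: 10 });
```

The trade-off is that each query response gets larger, since whole documents are returned, in exchange for a smaller index.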
@davcamer
Author

Here's the larger set of changes we've made on our acra-storage deployment. Through some poor decisions about reporting, we ended up with a VERY large set of crash reports -- millions. These changes greatly improved the size and speed of our couch instance. Indexing was an order of magnitude faster.

I had been waiting to see if the initial change that keys bugs by hash code was accepted, but thinking about it more, it seems better to get the full range of changes in the open. With the full picture, any discussion of individual changes might be easier.

@KevinGaudin
Member

I'm sorry for coming late here; time is hard to find these days.

Thanks a lot for your contribution, I'm starting to review your proposed changes.
