Introduce Barnes-Hut approximation #25
Merged
Conversation
Nothing seemed to change, really.
Codecov Report
Attention: Patch coverage is

```
@@            Coverage Diff             @@
##           master      #25      +/-  ##
==========================================
+ Coverage    8.42%    12.33%    +3.90%
==========================================
  Files           8        10        +2
  Lines         178       227       +49
==========================================
+ Hits           15        28       +13
- Misses        163       199       +36
```
Potential to participate in simpler arithmetic
Now I know why the tree3 tests fail: some points that share a prefix with the others may lie outside the box spanned by the first and the last points in the sorted list that share that prefix.
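For illustration, a minimal, self-contained sketch of the effect, assuming a 2D Morton (Z-order) encoding; the encoding, point values, and names here are my own stand-ins, not the PR's code. Three points share a code prefix (they sit in the same quadrant), yet the box spanned by the first and last points in Morton order does not contain the middle one:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <iostream>

// Hypothetical 2D Morton (Z-order) encoding over 3-bit coordinates, used
// only to illustrate the failing case; the PR's actual mask/prefix types
// may differ.
std::uint32_t morton(std::uint32_t x, std::uint32_t y) {
  std::uint32_t code = 0;
  for (int i = 0; i < 3; ++i) {
    code |= ((x >> i) & 1u) << (2 * i);      // x bits at even positions
    code |= ((y >> i) & 1u) << (2 * i + 1);  // y bits at odd positions
  }
  return code;
}

int main() {
  // Three points in the same quadrant, so their codes share a 2-bit prefix.
  std::array<std::array<std::uint32_t, 2>, 3> pts{{{0, 6}, {3, 5}, {1, 7}}};
  std::sort(pts.begin(), pts.end(), [](auto a, auto b) {
    return morton(a[0], a[1]) < morton(b[0], b[1]);
  });
  // Morton order: (3,5), (0,6), (1,7). The box spanned by the first and
  // last points is x in [1,3], y in [5,7], yet (0,6) shares the prefix
  // and lies outside it (x = 0 < 1).
  for (auto p : pts)
    std::cout << '(' << p[0] << ',' << p[1] << ") -> " << morton(p[0], p[1]) << '\n';
}
```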
Getting the idea now. (See img2.jpeg)
vector::erase invalidates iterators at and after the erased element, so erasing at begin is no good. Use a list instead. Noticeable perf improvement as a side effect.
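A minimal sketch of the erase-while-iterating pattern that motivates the switch, assuming a hypothetical `expired` predicate; `std::list::erase` invalidates only the erased node and returns an iterator to the next element:

```cpp
#include <list>

// Erase-while-iterating over a std::list: erase() is O(1), nothing shifts,
// and iterators to other elements stay valid. `prune` and `expired` are
// illustrative names, not the PR's API.
template <class Particle, class Pred>
void prune(std::list<Particle> &particles, Pred expired) {
  for (auto it = particles.begin(); it != particles.end();) {
    if (expired(*it))
      it = particles.erase(it);  // returns the next valid iterator
    else
      ++it;
  }
}
```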
Dim particles at the cutoff (inverse-square falloff, used purely for style). Screenshot.
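For reference, a tiny sketch of what inverse-square dimming could look like; the function and its parameters are hypothetical, not the project's actual rendering code:

```cpp
#include <algorithm>

// Hypothetical "dim at cutoff" styling: full alpha inside the cutoff
// radius, inverse-square falloff beyond it.
float dim_alpha(float distance, float cutoff) {
  if (distance <= cutoff) return 1.0f;
  float r = cutoff / distance;
  return std::clamp(r * r, 0.0f, 1.0f);  // inverse-square, for style only
}
```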
Now 59-60 FPS at the start. How? Stop recomputing everything. The bottleneck is equal_range; mitigate it by cutting the number of particles, where possible, on every pass of the main loop.
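A sketch of the kind of lookup that can become the bottleneck, assuming particles kept sorted by Morton code; `group_range`, `prefix`, and `mask` are illustrative names, not the PR's API:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Particle { std::uint64_t code{}; /* position, mass, ... */ };

// With particles sorted by Morton code, std::equal_range finds the
// contiguous run whose codes match `prefix` under a high-bit `mask`.
// Masking high bits is monotone, so the masked ordering is consistent
// with the full-code ordering.
auto group_range(const std::vector<Particle> &sorted,
                 std::uint64_t prefix, std::uint64_t mask) {
  Particle probe{prefix & mask};
  auto by_masked_code = [mask](const Particle &a, const Particle &b) {
    return (a.code & mask) < (b.code & mask);
  };
  return std::equal_range(sorted.begin(), sorted.end(), probe, by_masked_code);
  // Each call is O(log n); calling it per group per frame is what made it
  // a bottleneck, hence the later bottom-up construction without search.
}
```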
Take the idea of re-using groups a bit further.
Importantly: don't copy state.
Separate state from "view": the algorithm only requires a view of the particles.
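A minimal sketch of the state/view split, assuming C++20 `std::span`; `build_tree` is a hypothetical stand-in for the actual entry point:

```cpp
#include <span>
#include <vector>

struct Particle { float x{}, y{}, mass{}; };

// The algorithm only reads particles, so it takes a non-owning std::span
// instead of copying the simulation state.
void build_tree(std::span<const Particle> view) {
  for (const Particle &p : view) {
    (void)p;  // read-only traversal; the owning container is never copied
  }
}

// Usage: the simulation keeps ownership; the algorithm borrows a view.
// std::vector<Particle> state = ...;
// build_tree(state);
```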
The check for the full bit pattern was unreliable -> fixed, though I have no idea why it was unreliable.
Sorting is now the bottleneck, and I can't push it further (I think). On MSVC, stable_sort is much faster than sort; I have no idea why.
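The per-frame sort might look like the following sketch (names assumed). One plausible, unverified explanation for the `stable_sort` win is frame-to-frame coherence: the array is nearly sorted each frame, and a merge-based stable sort can exploit long pre-sorted runs:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Particle { std::uint64_t code{}; /* ... */ };

// Re-sort particles by Morton code each frame. Since particles move only
// slightly between frames, the input is nearly sorted already.
void sort_by_code(std::vector<Particle> &particles) {
  std::stable_sort(particles.begin(), particles.end(),
                   [](const Particle &a, const Particle &b) {
                     return a.code < b.code;
                   });
}
```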
Use a tree built at the beginning of the frame (considered necessary for the gravity simulation, to avoid duplicated work).
Identify allocation as a non-essential bottleneck --> re-use allocated memory --> replace the list with a vector.
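A sketch of the reuse pattern under these assumptions: one long-lived vector, `clear()` retaining capacity, `reserve()` before in-place construction; `GroupArena` is an illustrative name:

```cpp
#include <cstddef>
#include <vector>

struct Group { /* node data */ };

// Keep one vector alive across frames. clear() destroys elements but keeps
// capacity, so after the first few frames the per-frame rebuild allocates
// nothing at all.
struct GroupArena {
  std::vector<Group> groups;
  void begin_frame(std::size_t expected) {
    groups.clear();            // capacity retained
    groups.reserve(expected);  // grow once, then construct in place
  }
};
```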
* WIP: draft out an interface
* WIP: 2
* WIP: 3
* Fix typing errors; make able to build
  Has a defect, crashes.
* Fix a few tree-building defects
  Other defects remain.
* Fix average finding routine
* WIP 4
* Fix averaging
* Fix sorting performance problem
* Make it work; comment the code
* Edit some comments in barnes_hut.h
* Make mask type general
* clang-tidy
* Fix build
* Remove tree3demo
  I'll be making a change in the way grouping works.
* Fix perf due to binary search; rid group() free function
  Though not a regression, it was still a problem. In the top-down approach, binary search had a significant impact on perf. In this bottom-up approach, on the other hand, there is no need for binary search. Measured a latency improvement in first-time construction of groups.
* Optimize for latency, sacrificing measured memory usage
  Noticed many memmove calls, plus that emplace was always freeing and allocating new memory. A vector typically allocates memory in powers of two, or else in some sort of geometric sequence. So: reserve memory up front and then construct in place; no more problem. Got 60 FPS @ 50,000 particles.
* Table.h: Attempt to use Barnes-Hut in gravity simulation
* Make it "work", but degrade accuracy and speed
  Latency in the 1,000-particle case doubled on average; the demo at the beginning is breaking down.
* Remove ref to the area rectangle-circle collision routine for viz
  Incorrect routine.
* Remove variable timing for eval of physics
  Integrators are not known to cope well with variable timing.
* Raise particle count ceiling to 5,000
* Recycle memory for copy
* WIP: New implementation with an actual tree
  Does not compile yet. Preliminary idea. Worried about shared-pointer overhead, but it should reduce traversal overhead in `run`.
* Update barnes_hut.h
* WIP: Refactor barnes_hut.h
* Update barnes_hut.h
* Update barnes_hut.h
* Fix much, but hit stack overflow
  Probably just use dumb pointers or a list.
* WIP: Convert to regular pointers
* WIP: Use a different LCRS approach
* WIP: Make able to compile
  Crashes during tree construction, though.
* WIP: Include unincluded headers
* WIP: Rename deleteGroup to delete_group
* WIP: Prevent immediate crash (MSVC)
* WIP: Fix null dereference by holding lower-layer root (x) constant
* WIP
* WIP: Fix crashes (few-particle)
* Simplify
* WIP
* WIP
* WIP: two-particle case
  No more crashes or memory leaks, but duplication problems.
* WIP dedup
* WIP
* Ignore clangd cache
* Change signature of run()
* WIP: New algorithm design
  Make layers explicit.
* Refactor. Still has aliasing problem, though.
* Compress things somewhat
* build -> parent
* Fix comment about prefixes
* Simplify constructor for B
* Solve aliasing problem
* Bring `explicit` back
* Make compile on MSVC
* Make it work for 400 particles
* Clean up somewhat
* Minor cleanup
* Find leak
* Add assert to find bug
* Remove memory leak
* Tweaks
* Make grass compile
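Since the log mentions moving from `shared_ptr` to plain pointers and a left-child/right-sibling (LCRS) layout, here is a minimal sketch of such a node under those assumptions; the field names are illustrative, not the PR's actual layout:

```cpp
// LCRS node with plain pointers, after the shared_ptr version proved too
// heavy: each node stores its leftmost child and its next sibling.
struct Node {
  Node *child{};    // leftmost child, or nullptr for a leaf
  Node *sibling{};  // next sibling in the same layer
  float x{}, y{};   // center of mass of the subtree
  float mass{};     // total mass of the subtree
};

// Iterative child traversal: no recursion, so deep trees cannot trigger
// the kind of stack overflow mentioned in the log.
template <class F>
void for_each_child(Node *parent, F visit) {
  for (Node *c = parent->child; c; c = c->sibling) visit(c);
}
```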
Toward an O(n log n) approximation scheme [where n is the number of particles].
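For context, a minimal sketch of the standard Barnes-Hut acceptance test that produces the O(n log n) behavior; the names and the 2D specialization are assumptions, not this PR's exact interface:

```cpp
#include <cmath>

// Barnes-Hut acceptance test: a whole group of particles is folded into a
// single center-of-mass interaction when it subtends a small enough angle
// as seen from the evaluation point. `theta` is the usual opening
// parameter; smaller theta means more accuracy and more work.
bool far_enough(float group_extent, float dx, float dy, float theta) {
  float dist = std::sqrt(dx * dx + dy * dy);  // particle-to-group distance
  return group_extent < theta * dist;         // accept: one interaction
}
```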