Introduce Barnes-Hut approximation (#25)
* Demonstrate Morton code stuff (bad)

* Improve Morton demo; test

* Add colors and filling animation to Morton demo

* Demonstrate four quadrants

* Expand range to (-2, 2) x (-2, 2)

Nothing seemed to change really.

* quadrantdemo: Attach a screenshot

* morton: Minor formatting

* WIP: tree.h

* Remove tree.h inherited from branch tree2

* Do square Morton code

* Exclude INT32_MAX from integer conversion

* Enable more warnings

* CI: Update APT

* Remove /WX options

* Newton: Cut "c0 -= c0"

* Halton: Remove unsigned in index and base

They can potentially participate in simpler arithmetic this way.

* Cut dead function morton32; add tests

* WIP: Another tree...

* Update tree.h

* Update tree.h

* tree.h: Attempt to improve documentation

* tree.h: Attempt to improve documentation 2

* tree.h: First draft of the algorithm

* tree.h: Add smoke test [fail]

* Visualize masked Morton code boxes

Now I know why tree3 tests fail. Some points that share a prefix with the others may lie outside the box created by the first and the last points in the list that share the prefix.
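
The effect described here can be reproduced with a small sketch. It assumes a 2-D Morton code with x in the even bits and y in the odd bits. The names `spread16`, `morton_sketch`, `Pt`, `in_box`, and `first_last_box_misses_a_point` are hypothetical and only serve the illustration; they are not the functions in tree.h or barnes_hut.h.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Spread the low 16 bits of v so that they occupy the even bit positions.
constexpr std::uint32_t spread16(std::uint32_t v) {
  v &= 0x0000FFFFu;
  v = (v | (v << 8)) & 0x00FF00FFu;
  v = (v | (v << 4)) & 0x0F0F0F0Fu;
  v = (v | (v << 2)) & 0x33333333u;
  v = (v | (v << 1)) & 0x55555555u;
  return v;
}

// Hypothetical Morton code (assumption: x in even bits, y in odd bits).
constexpr std::uint32_t morton_sketch(std::uint16_t x, std::uint16_t y) {
  return spread16(x) | (spread16(y) << 1);
}

struct Pt {
  std::uint16_t x, y;
};

// True if p lies inside the axis-aligned box spanned by corners a and b.
inline bool in_box(Pt p, Pt a, Pt b) {
  return std::min(a.x, b.x) <= p.x && p.x <= std::max(a.x, b.x) &&
         std::min(a.y, b.y) <= p.y && p.y <= std::max(a.y, b.y);
}

// Three points whose codes agree once the low two bits are masked off.
// Returns true if any of them falls outside the box spanned by the first
// and last point in Morton order.
inline bool first_last_box_misses_a_point() {
  std::vector<Pt> pts{{0, 1}, {1, 0}, {0, 0}};
  std::sort(pts.begin(), pts.end(), [](Pt a, Pt b) {
    return morton_sketch(a.x, a.y) < morton_sketch(b.x, b.y);
  });
  // Sorted order is (0,0), (1,0), (0,1): the first/last box is the
  // segment x = 0, y in [0, 1], which does not contain (1,0).
  const Pt first = pts.front();
  const Pt last = pts.back();
  return std::any_of(pts.begin(), pts.end(),
                     [&](Pt p) { return !in_box(p, first, last); });
}
```

In this sketch the safe box for a prefix is the cell implied by the mask itself, not the span between the first and last members.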

* quadrantdemo: Add circles, two centers.

Getting the idea now.

(See img2.jpeg)

* WIP: Add algo, demo

* tree.h: Appears to work.

* tree3demo: Track mouse

* tree3demo: Compute nodes only once

* tree3demo: Decouple extra data

* tree3demo: Apply movable mask

* tree3demo: 60 FPS, resizable window, pan & zoom

* tree3demo: Visualize angle rejection (left click)

* tree3demo: Various improvements

* tree3demo: Allow flight of particles

* tree3demo: Print numbers of accepted nodes

* tree3demo: Attach a screenshot

* Delete /Testing

* barnes_hut.h: WIP (dfs)

Still figuring out

* barnes_hut.h: WIP (dfs) 2

Still figuring out

* barnes_hut.h: Remove Node structure

Not necessary.

* barnes_hut.h: Simplify; remove dfs()

Dfs approach considered not necessary.

* barnes_hut.h: WIP (run_level)

I tried to translate instructions from my notes to code.

Turns out, only parts of the instructions belong in barnes_hut.h because there are many assumptions that functions in the file can't possibly make (or depend on).

* barnes_hut.h: i -> g

* barnes_hut.h: WIP (run_level) 2

I think I'm going to remove this function outright because it's little more than a linear scan.

* Update barnes_hut.h

* tree3demo: Simplify radius computation

* WIP: Add hierarchydemo

Crashes because of an attempt to erase an already-erased particle

* hierarchydemo: Fix erase() bug; add flight

* hierarchydemo: Comply with iterator rules

vector::erase invalidates iterators at and after the erased position (every iterator, when erasing at begin), so that's no good.

Use list.

Noticeable perf improvement as a side effect.
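
As a minimal sketch of the iterator rule in question (not the code in hierarchydemo): `std::list::erase` invalidates only the iterator to the erased element and returns the iterator to the next one, so a sweep can continue safely from the return value. The names `is_dead` and `sweep`, and the use of plain `int` ids in place of particles, are assumptions for illustration.

```cpp
#include <list>

// Hypothetical predicate standing in for "this particle should be removed".
inline bool is_dead(int id) { return id % 2 == 0; }

// Remove matching elements while iterating over the list.
// list::erase leaves all other iterators valid and returns the iterator
// following the erased element, so the loop resumes from that.
inline std::list<int> sweep(std::list<int> particles) {
  for (auto it = particles.begin(); it != particles.end();) {
    if (is_dead(*it))
      it = particles.erase(it); // never touch the erased iterator again
    else
      ++it;
  }
  return particles;
}
```

The same loop shape also works for `std::vector`, but there any iterator saved elsewhere that points at or past the erased position becomes invalid.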

* hierarchydemo: Various changes (and 10,000 particles)

* hierarchydemo: Reset (R); tweak

* hierarchydemo: Add comments

* hierarchydemo: Add fog; use 50,000 particles, etc.

Dim particles at cutoff (inverse-square used for style).

Screenshot.

* hierarchydemo: Reuse particles

Now, 59-60 FPS in the beginning.

How?

Stop recomputing everything. The bottleneck is equal_range; solve that by cutting the number of particles, where possible, at every loop iteration in main.

* hierarchydemo: Use vector for particles; const iterator

Take the idea of re-using groups a bit further.

* hierarchydemo: Simplify hot loop.

Importantly: Don't copy state

* hierarchydemo: Reorganize

Separate state vs. "view."

Algorithm only requires a view of particles.

* hierarchydemo: Start from lowest level of detail

* hierarchydemo: Fix `refine`; tweak

Unreliable check for full bit pattern -> fixed.

No idea why it was unreliable.

* hierarchydemo: Edit a couple comments

* hierarchydemo: Use stable_sort instead of sort

Sorting is the bottleneck. Can't go further (I think).

On MSVC, stable_sort is much faster than sort. I have no idea why.

* hierarchydemo: Precompute Morton; 100,000 particles.

* clang-tidy

* hierarchydemo: Implement experimental "copies" algorithm

Use a tree that is built in the beginning of the frame (considered needed for the gravity simulation to avoid duplicated work).

* hierarchydemo: Optimize memory usage; 50,000 particles

Identify allocation as a non-essential bottleneck
--> Re-use allocated memory
--> Replace list with vector

* clang-tidy

* Move fixedmorton32 to barnes_hut.h

* Rename fixedmorton32 to morton

* Clean up code

* Fix build

* Factor out construction of the Barnes-Hut tree from the demo (#26)

* WIP: draft out an interface

* WIP: 2

* WIP: 3

* Fix typing errors; make able to build

Has a defect, crashes.

* Fix a few tree-building defects

Other defects remain.

* Fix average finding routine

* WIP 4

* Fix averaging

* Fix sorting performance problem

* Make it work; comment the code

* Edit some comments in barnes_hut.h

* Make mask type general

* clang-tidy

* Fix build

* Remove tree3demo

I'll be making a change in the way grouping works

* Fix perf due to binary search; get rid of group() free function

Though not a regression, it was still a problem.

In the top-down approach, binary search had a significant impact on perf.

In this bottom-up approach, on the other hand, there is no need to binary search.

Measure latency improvement in first-time construction of groups.

* Optimize for latency sacrificing measured memory usage

Noticed many memmove calls, plus that emplace was always freeing and allocating new memory.

A vector typically grows its capacity in powers of two, or else in some other geometric sequence.

So reserve the memory up front and construct the elements in place; no more problem.

Got 60 FPS @ 50,000 particles.
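
The reserve-then-construct-in-place idea can be sketched as follows. This is an illustration under assumptions, not the project's code: `Group` and `count_reallocations` are hypothetical names, and capacity changes are used as a stand-in for buffer reallocations.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type standing in for a group of particles.
struct Group {
  float x, y, radius;
  Group(float x_, float y_, float r_) : x(x_), y(y_), radius(r_) {}
};

// Append n elements with emplace_back and count how many times the
// vector's capacity changed along the way (each change is a reallocation).
inline std::size_t count_reallocations(std::size_t n, bool reserve_first) {
  std::vector<Group> groups;
  if (reserve_first)
    groups.reserve(n); // one allocation up front
  std::size_t reallocations = 0;
  std::size_t capacity = groups.capacity();
  for (std::size_t i = 0; i < n; ++i) {
    // Construct in place; no temporary Group is copied or moved in.
    groups.emplace_back(static_cast<float>(i), static_cast<float>(i), 1.0f);
    if (groups.capacity() != capacity) {
      ++reallocations;
      capacity = groups.capacity();
    }
  }
  return reallocations;
}
```

With `reserve_first` set, the loop performs no reallocation; without it, the count depends on the standard library's growth factor.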

* demo/main.cpp: Put [[maybe_unused]]

* Use Barnes-Hut in gravity simulation (#27)

* Table.h: Attempt to use Barnes-Hut in gravity simulation

* Make it "work" but degrade accuracy and speed

Latency in the case of 1,000 particles doubled on average

The demo in the beginning is breaking down

* Remove ref to the area rectangle-circle collision routine for viz

Incorrect routine

* Remove variable timing for eval of physics

Integrators not known to cope well with variable timing

* Raise particle count ceiling to 5,000

* Recycle memory for copy

* WIP: New implementation with an actual tree

Does not compile yet.

Preliminary idea.

Worried about shared pointer overhead.

But should reduce traversal overhead in `run`.

* Update barnes_hut.h

* WIP: Refactor barnes_hut.h

* Update barnes_hut.h

* Update barnes_hut.h

* Fix much, but hit stack overflow

Probably just use dumb pointers or a list.

* WIP: Convert to regular pointers

* WIP: Use a different LCRS approach

* WIP: Make able to compile

Crashes during tree construction, though

* WIP: Include unincluded headers

* WIP: Rename deleteGroup to delete_group

* WIP: Prevent immediate crash (MSVC)

* WIP: Fix null dereference by holding lower layer root (x) constant

* WIP

* WIP: Fix crashes (few-particle)

* Simplify

* WIP

* WIP

* WIP: two-particle case

No more crashes or memory leaks, but duplication problems.

* WIP dedup

* WIP

* Ignore clangd cache

* Change signature of run()

* WIP: New algorithm design

Make layers explicit

* Refactor. Still has aliasing problem, though.

* Compress things somewhat

* build -> parent

* Fix comment about prefixes

* Simplify constructor for B

* Solve aliasing problem

* Bring `explicit` back

* Make compile on MSVC

* Make it work for 400 particles

* Clean up somewhat

* Minor cleanup

* Find leak

* Add assert to find bug

* Remove memory leak

* Tweaks

* Make grass compile

* Tweak main demo

* Compute CM properly

* Add various improvements + complexity measurement
axionbuster authored Mar 6, 2024
1 parent ef31e16 commit 19ab5c8
Showing 27 changed files with 1,128 additions and 488 deletions.
88 changes: 44 additions & 44 deletions .clang-format
@@ -1,5 +1,5 @@
---
Language: Cpp
Language: Cpp
# BasedOnStyle: LLVM
AccessModifierOffset: -2
AlignAfterOpenBracket: Align
@@ -9,7 +9,7 @@ AlignConsecutiveAssignments: None
AlignConsecutiveBitFields: None
AlignConsecutiveDeclarations: None
AlignEscapedNewlines: Right
AlignOperands: Align
AlignOperands: Align
AlignTrailingComments: true
AllowAllArgumentsOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: true
@@ -29,21 +29,21 @@ AttributeMacros:
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
AfterCaseLabel: false
AfterClass: false
AfterCaseLabel: false
AfterClass: false
AfterControlStatement: Never
AfterEnum: false
AfterFunction: false
AfterNamespace: false
AfterEnum: false
AfterFunction: false
AfterNamespace: false
AfterObjCDeclaration: false
AfterStruct: false
AfterUnion: false
AfterStruct: false
AfterUnion: false
AfterExternBlock: false
BeforeCatch: false
BeforeElse: false
BeforeCatch: false
BeforeElse: false
BeforeLambdaBody: false
BeforeWhile: false
IndentBraces: false
BeforeWhile: false
IndentBraces: false
SplitEmptyFunction: true
SplitEmptyRecord: true
SplitEmptyNamespace: true
@@ -57,21 +57,21 @@ BreakConstructorInitializersBeforeComma: false
BreakConstructorInitializers: BeforeColon
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
QualifierAlignment: Leave
CompactNamespaces: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DeriveLineEnding: true
DerivePointerAlignment: false
DisableFormat: false
DisableFormat: false
EmptyLineAfterAccessModifier: Never
EmptyLineBeforeAccessModifier: LogicalBlock
ExperimentalAutoDetectBinPacking: false
PackConstructorInitializers: BinPack
BasedOnStyle: ''
BasedOnStyle: ''
ConstructorInitializerAllOnOneLineOrOnePerLine: false
AllowAllConstructorInitializersOnNextLine: true
FixNamespaceComments: true
@@ -81,20 +81,20 @@ ForEachMacros:
- BOOST_FOREACH
IfMacros:
- KJ_IF_MAYBE
IncludeBlocks: Preserve
IncludeBlocks: Preserve
IncludeCategories:
- Regex: '^"(llvm|llvm-c|clang|clang-c)/'
Priority: 2
SortPriority: 0
CaseSensitive: false
- Regex: '^(<|"(gtest|gmock|isl|json)/)'
Priority: 3
SortPriority: 0
CaseSensitive: false
- Regex: '.*'
Priority: 1
SortPriority: 0
CaseSensitive: false
- Regex: '^"(llvm|llvm-c|clang|clang-c)/'
Priority: 2
SortPriority: 0
CaseSensitive: false
- Regex: '^(<|"(gtest|gmock|isl|json)/)'
Priority: 3
SortPriority: 0
CaseSensitive: false
- Regex: '.*'
Priority: 1
SortPriority: 0
CaseSensitive: false
IncludeIsMainRegex: '(Test)?$'
IncludeIsMainSourceRegex: ''
IndentAccessModifiers: false
@@ -103,16 +103,16 @@ IndentCaseBlocks: false
IndentGotoLabels: true
IndentPPDirectives: None
IndentExternBlock: AfterExternBlock
IndentRequires: false
IndentWidth: 2
IndentRequires: false
IndentWidth: 2
IndentWrappedFunctionNames: false
InsertTrailingCommas: None
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: true
LambdaBodyIndentation: Signature
MacroBlockBegin: ''
MacroBlockEnd: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBinPackProtocolList: Auto
@@ -131,13 +131,13 @@ PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 60
PenaltyIndentedWhitespace: 0
PointerAlignment: Right
PPIndentWidth: -1
PPIndentWidth: -1
ReferenceAlignment: Pointer
ReflowComments: true
ReflowComments: true
RemoveBracesLLVM: false
SeparateDefinitionBlocks: Leave
ShortNamespaceLines: 1
SortIncludes: CaseSensitive
SortIncludes: CaseSensitive
SortJavaStaticImport: Before
SortUsingDeclarations: true
SpaceAfterCStyleCast: false
@@ -154,34 +154,34 @@ SpaceBeforeParensOptions:
AfterForeachMacros: true
AfterFunctionDefinitionName: false
AfterFunctionDeclarationName: false
AfterIfMacros: true
AfterIfMacros: true
AfterOverloadedOperator: false
BeforeNonEmptyParentheses: false
SpaceAroundPointerQualifiers: Default
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyBlock: false
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: Never
SpacesInAngles: Never
SpacesInConditionalStatement: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInLineCommentPrefix:
Minimum: 1
Maximum: -1
Minimum: 1
Maximum: -1
SpacesInParentheses: false
SpacesInSquareBrackets: false
SpaceBeforeSquareBrackets: false
BitFieldColonSpacing: Both
Standard: Latest
Standard: Latest
StatementAttributeLikeMacros:
- Q_EMIT
StatementMacros:
- Q_UNUSED
- QT_REQUIRE_VERSION
TabWidth: 8
UseCRLF: false
UseTab: Never
TabWidth: 8
UseCRLF: false
UseTab: Never
WhitespaceSensitiveMacros:
- STRINGIZE
- PP_STRINGIZE
77 changes: 38 additions & 39 deletions .clang-tidy
@@ -1,46 +1,45 @@
---
Checks: 'clang-diagnostic-*,clang-analyzer-*'
Checks: 'clang-diagnostic-*,clang-analyzer-*'
WarningsAsErrors: ''
HeaderFilterRegex: ''
AnalyzeTemporaryDtors: false
FormatStyle: none
User: axion
FormatStyle: none
CheckOptions:
- key: llvm-else-after-return.WarnOnConditionVariables
value: 'false'
- key: modernize-loop-convert.MinConfidence
value: reasonable
- key: modernize-replace-auto-ptr.IncludeStyle
value: llvm
- key: cert-str34-c.DiagnoseSignedUnsignedCharComparisons
value: 'false'
- key: google-readability-namespace-comments.ShortNamespaceLines
value: '10'
- key: cert-err33-c.CheckedFunctions
value: '::aligned_alloc;::asctime_s;::at_quick_exit;::atexit;::bsearch;::bsearch_s;::btowc;::c16rtomb;::c32rtomb;::calloc;::clock;::cnd_broadcast;::cnd_init;::cnd_signal;::cnd_timedwait;::cnd_wait;::ctime_s;::fclose;::fflush;::fgetc;::fgetpos;::fgets;::fgetwc;::fopen;::fopen_s;::fprintf;::fprintf_s;::fputc;::fputs;::fputwc;::fputws;::fread;::freopen;::freopen_s;::fscanf;::fscanf_s;::fseek;::fsetpos;::ftell;::fwprintf;::fwprintf_s;::fwrite;::fwscanf;::fwscanf_s;::getc;::getchar;::getenv;::getenv_s;::gets_s;::getwc;::getwchar;::gmtime;::gmtime_s;::localtime;::localtime_s;::malloc;::mbrtoc16;::mbrtoc32;::mbsrtowcs;::mbsrtowcs_s;::mbstowcs;::mbstowcs_s;::memchr;::mktime;::mtx_init;::mtx_lock;::mtx_timedlock;::mtx_trylock;::mtx_unlock;::printf_s;::putc;::putwc;::raise;::realloc;::remove;::rename;::scanf;::scanf_s;::setlocale;::setvbuf;::signal;::snprintf;::snprintf_s;::sprintf;::sprintf_s;::sscanf;::sscanf_s;::strchr;::strerror_s;::strftime;::strpbrk;::strrchr;::strstr;::strtod;::strtof;::strtoimax;::strtok;::strtok_s;::strtol;::strtold;::strtoll;::strtoul;::strtoull;::strtoumax;::strxfrm;::swprintf;::swprintf_s;::swscanf;::swscanf_s;::thrd_create;::thrd_detach;::thrd_join;::thrd_sleep;::time;::timespec_get;::tmpfile;::tmpfile_s;::tmpnam;::tmpnam_s;::tss_create;::tss_get;::tss_set;::ungetc;::ungetwc;::vfprintf;::vfprintf_s;::vfscanf;::vfscanf_s;::vfwprintf;::vfwprintf_s;::vfwscanf;::vfwscanf_s;::vprintf_s;::vscanf;::vscanf_s;::vsnprintf;::vsnprintf_s;::vsprintf;::vsprintf_s;::vsscanf;::vsscanf_s;::vswprintf;::vswprintf_s;::vswscanf;::vswscanf_s;::vwprintf_s;::vwscanf;::vwscanf_s;::wcrtomb;::wcschr;::wcsftime;::wcspbrk;::wcsrchr;::wcsrtombs;::wcsrtombs_s;::wcsstr;::wcstod;::wcstof;::wcstoimax;::wcstok;::wcstok_s;::wcstol;::wcstold;::wcstoll;::wcstombs;::wcstombs_s;::wcstoul;::wcstoull;::wcstoumax;::wcsxfrm;::wctob;::wctrans;::wctype;::wmemchr;::wprintf_s;::wscanf;::wscanf_s;'
- key: cert-oop54-cpp.WarnOnlyIfThisHasSuspiciousField
value: 'false'
- key: cert-dcl16-c.NewSuffixes
value: 'L;LL;LU;LLU'
- key: google-readability-braces-around-statements.ShortStatementLines
value: '1'
- key: cppcoreguidelines-non-private-member-variables-in-classes.IgnoreClassesWithAllMemberVariablesBeingPublic
value: 'true'
- key: google-readability-namespace-comments.SpacesBeforeComments
value: '2'
- key: modernize-loop-convert.MaxCopySize
value: '16'
- key: modernize-pass-by-value.IncludeStyle
value: llvm
- key: modernize-use-nullptr.NullMacros
value: 'NULL'
- key: llvm-qualified-auto.AddConstToQualified
value: 'false'
- key: modernize-loop-convert.NamingStyle
value: CamelCase
- key: llvm-else-after-return.WarnOnUnfixable
value: 'false'
- key: google-readability-function-size.StatementThreshold
value: '800'
- key: llvm-else-after-return.WarnOnConditionVariables
value: 'false'
- key: modernize-loop-convert.MinConfidence
value: reasonable
- key: modernize-replace-auto-ptr.IncludeStyle
value: llvm
- key: cert-str34-c.DiagnoseSignedUnsignedCharComparisons
value: 'false'
- key: google-readability-namespace-comments.ShortNamespaceLines
value: '10'
- key: cert-err33-c.CheckedFunctions
value: '::aligned_alloc;::asctime_s;::at_quick_exit;::atexit;::bsearch;::bsearch_s;::btowc;::c16rtomb;::c32rtomb;::calloc;::clock;::cnd_broadcast;::cnd_init;::cnd_signal;::cnd_timedwait;::cnd_wait;::ctime_s;::fclose;::fflush;::fgetc;::fgetpos;::fgets;::fgetwc;::fopen;::fopen_s;::fprintf;::fprintf_s;::fputc;::fputs;::fputwc;::fputws;::fread;::freopen;::freopen_s;::fscanf;::fscanf_s;::fseek;::fsetpos;::ftell;::fwprintf;::fwprintf_s;::fwrite;::fwscanf;::fwscanf_s;::getc;::getchar;::getenv;::getenv_s;::gets_s;::getwc;::getwchar;::gmtime;::gmtime_s;::localtime;::localtime_s;::malloc;::mbrtoc16;::mbrtoc32;::mbsrtowcs;::mbsrtowcs_s;::mbstowcs;::mbstowcs_s;::memchr;::mktime;::mtx_init;::mtx_lock;::mtx_timedlock;::mtx_trylock;::mtx_unlock;::printf_s;::putc;::putwc;::raise;::realloc;::remove;::rename;::scanf;::scanf_s;::setlocale;::setvbuf;::signal;::snprintf;::snprintf_s;::sprintf;::sprintf_s;::sscanf;::sscanf_s;::strchr;::strerror_s;::strftime;::strpbrk;::strrchr;::strstr;::strtod;::strtof;::strtoimax;::strtok;::strtok_s;::strtol;::strtold;::strtoll;::strtoul;::strtoull;::strtoumax;::strxfrm;::swprintf;::swprintf_s;::swscanf;::swscanf_s;::thrd_create;::thrd_detach;::thrd_join;::thrd_sleep;::time;::timespec_get;::tmpfile;::tmpfile_s;::tmpnam;::tmpnam_s;::tss_create;::tss_get;::tss_set;::ungetc;::ungetwc;::vfprintf;::vfprintf_s;::vfscanf;::vfscanf_s;::vfwprintf;::vfwprintf_s;::vfwscanf;::vfwscanf_s;::vprintf_s;::vscanf;::vscanf_s;::vsnprintf;::vsnprintf_s;::vsprintf;::vsprintf_s;::vsscanf;::vsscanf_s;::vswprintf;::vswprintf_s;::vswscanf;::vswscanf_s;::vwprintf_s;::vwscanf;::vwscanf_s;::wcrtomb;::wcschr;::wcsftime;::wcspbrk;::wcsrchr;::wcsrtombs;::wcsrtombs_s;::wcsstr;::wcstod;::wcstof;::wcstoimax;::wcstok;::wcstok_s;::wcstol;::wcstold;::wcstoll;::wcstombs;::wcstombs_s;::wcstoul;::wcstoull;::wcstoumax;::wcsxfrm;::wctob;::wctrans;::wctype;::wmemchr;::wprintf_s;::wscanf;::wscanf_s;'
- key: cert-oop54-cpp.WarnOnlyIfThisHasSuspiciousField
value: 'false'
- key: cert-dcl16-c.NewSuffixes
value: 'L;LL;LU;LLU'
- key: google-readability-braces-around-statements.ShortStatementLines
value: '1'
- key: cppcoreguidelines-non-private-member-variables-in-classes.IgnoreClassesWithAllMemberVariablesBeingPublic
value: 'true'
- key: google-readability-namespace-comments.SpacesBeforeComments
value: '2'
- key: modernize-loop-convert.MaxCopySize
value: '16'
- key: modernize-pass-by-value.IncludeStyle
value: llvm
- key: modernize-use-nullptr.NullMacros
value: 'NULL'
- key: llvm-qualified-auto.AddConstToQualified
value: 'false'
- key: modernize-loop-convert.NamingStyle
value: CamelCase
- key: llvm-else-after-return.WarnOnUnfixable
value: 'false'
- key: google-readability-function-size.StatementThreshold
value: '800'
...
