Improvements for transpose, and more. #1624

Merged

GillesDuvert
Contributor

Transpose is improved by forcing the use of our own (pre-existing) multi-threaded code instead of Eigen::'s. I checked it to be 4+ times faster. Since Transpose is used in several GDL functions, they benefit from the speed-up as well. The previous choice of Eigen:: seems strange in retrospect, but it was based on actual measurements at the time; optimizations have been added since.

I've added a command-line switch (--with-eigen-transpose) to enable Eigen::Transpose, in case it proves faster on some architectures or after Eigen:: makes further progress.

Second, another switch (--smart-tpool) enables a thread-pool mode in which, when threads are available, TPOOL_MIN_ELTS is also more or less the number of elements that each thread will process, so that GDL may use fewer threads than the machine can provide (some machines running GDL have 64 or more cores). Obviously it is not worth starting 128 threads if 10 would already do the job in time. To get more concurrent threads, decrease TPOOL_MIN_ELTS; to get fewer, increase it. Adjust it to find the optimum for a specific case. This may be a cure for #1149.
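
To make the rule above concrete, here is a minimal sketch of how a thread count could be derived from a per-thread minimum. It is not the code in this PR: `smart_thread_count` and its parameter names are hypothetical, and the only thing taken from the description is the idea that each thread should process roughly TPOOL_MIN_ELTS elements, capped by the number of available threads.

```cpp
#include <algorithm>
#include <cstddef>

// Illustrative only: `smart_thread_count` and its parameters are hypothetical
// names, not GDL's actual implementation.
// Returns how many threads to start so that each one handles about
// `min_elts_per_thread` elements, never exceeding `max_threads`.
std::size_t smart_thread_count(std::size_t n_elements,
                               std::size_t min_elts_per_thread,
                               std::size_t max_threads)
{
    // Degenerate settings: fall back to a single thread.
    if (max_threads == 0 || min_elts_per_thread == 0) return 1;

    // Number of chunks of the minimum size that fit in the array.
    const std::size_t wanted = n_elements / min_elts_per_thread;

    // At least one thread, at most what the machine provides.
    return std::min(std::max<std::size_t>(wanted, 1), max_threads);
}
```

Under these assumptions, 1,000,000 elements with a minimum of 100,000 per thread on a 128-thread machine would start 10 threads; lowering the minimum to 10,000 would start 100.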

On the other hand, the number of elements processed by one thread is not always what governs the overall time spent. The time spent per element, whether it is a simple addition or a long procedure, is also a key factor. The parallelize() function (in basegdl.cpp) accepts modifiers to change this behaviour. I've tweaked a few, but this is not very 'adaptive'; introspection will be needed.
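
One way to picture the effect of per-element cost is to weight the element count before comparing it with the minimum chunk size. The sketch below is an assumption made for illustration: it does not reproduce the signature or the modifiers of parallelize(), and `threads_for_workload` and `cost_per_element` are invented names.

```cpp
#include <cstddef>

// Hypothetical sketch, not GDL's parallelize(): scale the element count by a
// rough relative cost per element (1.0 = a simple addition) so that expensive
// operations are split across more threads than cheap ones.
std::size_t threads_for_workload(std::size_t n_elements,
                                 std::size_t min_elts_per_thread,
                                 std::size_t max_threads,
                                 double cost_per_element)
{
    // Degenerate settings: fall back to a single thread.
    if (min_elts_per_thread == 0 || max_threads == 0) return 1;
    if (cost_per_element < 0.0) cost_per_element = 0.0;

    // An operation that costs k times more behaves like an array k times larger.
    const double effective = static_cast<double>(n_elements) * cost_per_element;
    const double ratio = effective / static_cast<double>(min_elts_per_thread);

    // Clamp in floating point first to avoid overflow when converting.
    if (ratio < 1.0) return 1;
    if (ratio >= static_cast<double>(max_threads)) return max_threads;
    return static_cast<std::size_t>(ratio);
}
```

With a cost of 1.0 this reduces to the plain elements-per-thread rule. A fixed cost factor is still a static guess, which is the sense in which such tuning is not adaptive.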

Running and testing GDL with --smart-tpool on machines with a large number of threads would be invaluable.

…olution instead of Eigen::'s.

Added a command-line switch to enable Eigen::Transpose, in case it proves faster on some architectures or after Eigen:: makes further progress.
Ensured (sort of) that in multithreaded mode, TPOOL_MIN_ELTS is also more or less the number of elements that each thread will process, so that GDL may use fewer threads than the machine can provide (some machines running GDL have 64 or more cores).
Obviously it is not worth starting 128 threads if 10 would already do the job in time.
…hreads used ensure each thread will process more or less TPOOL_MIN_ELTS, not a diminutive number given the number of available threads, which can be large, 128 or more.
…ode) use of the max available number of threads, or other variant.
codecov bot commented Aug 29, 2023

Codecov Report

Patch coverage: 32.25% and project coverage change: -0.02% ⚠️

Comparison is base (907e95f) 41.02% compared to head (0757621) 41.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1624      +/-   ##
==========================================
- Coverage   41.02%   41.00%   -0.02%     
==========================================
  Files         355      355              
  Lines       95032    95045      +13     
  Branches    19527    19531       +4     
==========================================
- Hits        38987    38974      -13     
- Misses      56045    56071      +26     
Files Changed Coverage Δ
src/basic_fun.cpp 51.41% <ø> (ø)
src/gdl.cpp 59.39% <0.00%> (-1.48%) ⬇️
src/objects.cpp 97.16% <ø> (ø)
src/objects.hpp 66.66% <ø> (ø)
src/basegdl.cpp 6.32% <25.00%> (-0.45%) ⬇️
src/math_fun_jmg.cpp 28.12% <33.33%> (ø)
src/datatypes.cpp 42.24% <57.14%> (-0.70%) ⬇️
src/minmax_include.cpp 36.64% <100.00%> (ø)

@GillesDuvert GillesDuvert merged commit 104626d into gnudatalanguage:master Aug 30, 2023
5 of 8 checks passed
@GillesDuvert GillesDuvert deleted the improvements_transpose branch September 8, 2023 17:53