-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance optimizations and 0.5 compatibility #56
Conversation
5ae4d05
to
62b777c
Compare
Ok, Julia 0.5 is out so would be great to look more into this. Is more work needed on this PR? Anything in particular I can help out with? |
62b777c
to
6051e08
Compare
Just rebased it to the master. Recently I was using my Once this PR is merged, I'll update the PR #46 for asynchronous parallel evaluator (synchronous one is good for NES-like methods, the asynchronous should improve Borg performance). I was not doing systematic comparison of the package performance using 0.4 and 0.5 julia. Perhaps you can run the benchmarks and compare the times to detect potential regressions? |
Ok, I only ran 2 repetitions per JuliaVersion+GitBranch but the top 5 performance changes per test file are: test_mutation_operators.jl (fastest = 3.86 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_mutation_operators.jl: 4.29 (+11.2%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_mutation_operators.jl: 69.53 (+1699.7%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_mutation_operators.jl: 87.21 (+2157.2%)
test_bimodal_cauchy_distribution.jl (fastest = 0.07 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_bimodal_cauchy_distribution.jl: 0.15 (+128.7%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_bimodal_cauchy_distribution.jl: 0.31 (+368.9%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_bimodal_cauchy_distribution.jl: 1.38 (+2005.3%)
test_search_space.jl (fastest = 2.65 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_search_space.jl: 3.15 (+18.9%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_search_space.jl: 32.95 (+1145.5%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_search_space.jl: 37.65 (+1323.0%)
test_random_search.jl (fastest = 0.29 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_random_search.jl: 0.31 (+6.0%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_random_search.jl: 3.07 (+964.6%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_random_search.jl: 3.81 (+1218.5%)
test_frequency_adaptation.jl (fastest = 0.34 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_frequency_adaptation.jl: 0.38 (+11.4%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_frequency_adaptation.jl: 3.48 (+913.8%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_frequency_adaptation.jl: 4.3 (+1151.8%) and for the overall testing time: TOTAL TIME for 26 test files, ef29295cefda4a67 (fastest = 53.05 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, TOTAL TIME for 26 test files, ef29295cefda4a67: 211.63 (+298.9%)
TOTAL TIME for 26 test files, 0f6275f23e64ceb4 (fastest = 53.08 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, TOTAL TIME for 26 test files, 0f6275f23e64ceb4: 182.54 (+243.9%) |
Great, thanks a lot! I will try to look into that. |
Yes, these are the only ones that are faster on 0.5.0 (on my machine): test_tracing.jl (fastest = 5.69 secs for 0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_tracing.jl: 5.74 (+0.8%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_tracing.jl: 5.84 (+2.6%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_tracing.jl: 7.0 (+23.0%)
test_toplevel_bboptimize.jl (fastest = 7.19 secs for 0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_toplevel_bboptimize.jl: 7.55 (+5.0%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_toplevel_bboptimize.jl: 7.65 (+6.4%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_toplevel_bboptimize.jl: 7.77 (+8.1%) |
So it seems mutation operators and SearchSpace should be the main focus. Why are they so much slower on 0.5.0? |
Do you want me to merge this one and you do a new branch or should I wait? |
I think it should be safe to merge, especially if you want to take a look at 0.5 regressions yourself. I don't know why mutations and search space are so slow. There was quite some refactoring of the array indexing in 0.5, maybe that's the issue. Mutation tests also actively use search space, so maybe it's the root cause. There are several tests in |
Ok, I'll merge for now. There are a few regressions but let's handle them when we dive deeper. I will look into the SearchSpace tests and code for now and we can sync before you start digging in. Thanks. |
Actually with 3 repetitions per JuliaVersion+BBOBranch and timing with CPUTime rather than Base I'm somewhat reluctant merging this: TOTAL TIME for 26 test files, ef29295cefda4a67 (fastest = 53.14 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, TOTAL TIME for 26 test files, ef29295cefda4a67: 219.87 (+313.7%)
TOTAL TIME for 26 test files, 0f6275f23e64ceb4 (fastest = 51.8 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, TOTAL TIME for 26 test files, 0f6275f23e64ceb4: 175.8 (+239.4%)
test_mutation_operators.jl (fastest = 3.77 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_mutation_operators.jl: 4.18 (+10.7%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 67.31 (+1684.5%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_mutation_operators.jl: 87.74 (+2226.3%)
test_search_space.jl (fastest = 2.66 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_search_space.jl: 2.76 (+3.8%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 28.0 (+953.1%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_search_space.jl: 44.14 (+1560.0%) I will investigate SearchSpace on a branch off master for now. |
Hmm, interesting. In principle, the PR doesn't touch the search space, yet the tests show something that is called from test_search_space have become slower. |
Yes, the inner loop in the test inside this context: context("rand_individuals correctly handles column-wise generation in assymetric search spaces") do
for i in 1:NumTestRepetitions÷10
numdims = rand(1:13)
minbounds = rand(numdims)
ds = rand(1:10, numdims) .* rand(numdims)
maxbounds = minbounds .+ ds
parambounds = collect(zip(minbounds, maxbounds))
ss = RangePerDimSearchSpace(parambounds)
@fact mins(ss) --> minbounds
@fact maxs(ss) --> maxbounds
@fact round(deltas(ss), 6) --> round(ds, 6)
# Now generate 100 individuals and make sure they are all within bounds
inds = rand_individuals(ss, 100)
@fact size(inds, 2) --> 100
for i in 1:size(inds, 2)
for d in 1:numdims
@fact (minbounds[d] <= inds[d,i] <= maxbounds[d]) --> true
end
end
end
end takes 62% of the time of running the test_search_space.jl tests. |
Could it be that the ternary interval operator is much slower in 0.5? |
Or it could be FactCheck that is causing this. Investigating... |
Seems this is a FactCheck problem. Times to run test_search_space.jl test file in different situations: Julia version, Inner loop variant, Time
0.5.0, two binary comparisons, 55.5
0.5.0, ternary comparison, 27.8
0.5.0, no comparison, 13.3
0.4.6, two binary comparisons, 8.4
0.4.6, ternary comparison, 7.8
0.4.6, no comparison, 7.0 So it seems it might all come down to FactCheck being very slow on 0.5.0. I suggest we switch to the new Base.Test. What do you think? |
63bb3bb
to
b306104
Compare
When converting tests I found that @test_throws finds that these two lines in test_parameters.jl actually raises MethodError's rather than the expected (?) KeyError's: # Actually throws MethodError:
@test_throws KeyError (dc1[:NotThere] = 3)
# Actually throws MethodError:
@test_throws KeyError (dc1[:NotThere] = "abc") Can you please check what should be expected here so we are sure before updating it. |
Dict throws |
Ok, I ported the main test suite to BaseTestNext. The mutation operators and search space doesn't seem to be a problem with 0.5.0 but there seems to be some work to do in test_mutation_operators.jl (fastest = 0.88 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_mutation_operators.jl: 1.04 (+18.6%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 3.77 (+330.4%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 67.31 (+7580.7%)
test_search_space.jl (fastest = 2.63 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 2.66 (+1.0%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_search_space.jl: 2.96 (+12.5%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 28.0 (+963.4%)
TOTAL TIME for 26 test files, bd18b1c2d90640a8 (fastest = 55.44 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, TOTAL TIME for 26 test files, bd18b1c2d90640a8: 62.61 (+12.9%)
test_fitness.jl (fastest = 2.48 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_fitness.jl: 5.65 (+128.0%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.39 (+157.7%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.46 (+160.6%) |
|
Except for test_fitness, the 13% increased total time on 0.5.0 might mainly be the increased compilation times from the LLVM change in 0.5.0 (release notes says that compilation times should be expected to be somewhat longer). |
could improve type inference for top-level objects and allow running tests using `reload()`
so that DictChain throws the same KeyError exception as Dict
NP, NC parameters should always be integer constants, but in some contexts the real type of CrossoverOperator{NP,NC} could not be deduced at compile time, so numparents/children() result type would be Any, unless explicitly specified
this allows Julia to compile recombination and candidate processing code knowing the exact type of recombination operator
- avoid recursion when sampling - avoid keyword arguments when sampling
since Julia 0.4.0- is no longer supported
since we don't want to extend these definitions
since Julia 0.5 is mature
use @inbounds, map() and zip() instead of comprehensions to avoid costly element accesses
9bbfc8e
to
cb94553
Compare
Just rebased the PR against the master. The |
Thanks for this, it is looking good. The test_mutation_operators.jl (fastest = 0.88 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_mutation_operators.jl: 1.0 (+13.7%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_mutation_operators.jl: 1.0 (+13.9%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_mutation_operators.jl: 1.0 (+14.3%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_mutation_operators.jl: 1.02 (+16.6%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_mutation_operators.jl: 1.04 (+18.6%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 3.77 (+330.4%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 67.31 (+7580.7%)
test_search_space.jl (fastest = 2.63 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 2.66 (+1.0%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_search_space.jl: 2.89 (+9.9%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_search_space.jl: 2.95 (+12.1%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_search_space.jl: 2.96 (+12.5%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_search_space.jl: 3.14 (+19.3%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_search_space.jl: 3.28 (+24.5%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 28.0 (+963.4%)
test_fitness.jl (fastest = 2.48 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_fitness.jl: 5.65 (+128.0%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_fitness.jl: 6.38 (+157.3%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.39 (+157.7%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.46 (+160.6%)
0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_fitness.jl: 6.52 (+163.2%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_fitness.jl: 7.0 (+182.4%)
0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_fitness.jl: 7.09 (+186.0%) There seems to be no big performance differences with this branch (and still some perf regression in fitness) but since this has other test additions and fixes I will go ahead and merge. |
Some performance optimizations for Borg based on the quick look at profiling and
@code_warntype
. I hope to detect more bottlenecks and update this PR.It looks like some Julia codegen optimizations are broken on v0.5, several people reported ~x5 slowdowns (see e.g. JuliaLang/julia#18129 JuliaLang/julia#18135). BBO might suffer from this as well, but I'm not sure what is the most effective (and automated) way to detect and isolate such issues.