Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimizations and 0.5 compatibility #56

Merged
merged 19 commits into from
Sep 22, 2016

Conversation

alyst
Copy link
Contributor

@alyst alyst commented Aug 19, 2016

Some performance optimizations for Borg based on the quick look at profiling and @code_warntype. I hope to detect more bottlenecks and update this PR.

It looks like some Julia codegen optimizations are broken on v0.5, several people reported ~x5 slowdowns (see e.g. JuliaLang/julia#18129 JuliaLang/julia#18135). BBO might suffer from this as well, but I'm not sure what is the most effective (and automated) way to detect and isolate such issues.

@coveralls
Copy link

coveralls commented Aug 19, 2016

Coverage Status

Coverage increased (+0.1%) to 66.977% when pulling 5a1cb28 on alyst:misc_perf_opt into c399d40 on robertfeldt:master.

@coveralls
Copy link

coveralls commented Aug 19, 2016

Coverage Status

Coverage increased (+0.1%) to 66.977% when pulling f799808 on alyst:misc_perf_opt into a4b1b36 on robertfeldt:master.

@coveralls
Copy link

coveralls commented Aug 19, 2016

Coverage Status

Coverage increased (+0.05%) to 66.916% when pulling b2ca2dc on alyst:misc_perf_opt into a4b1b36 on robertfeldt:master.

@alyst alyst force-pushed the misc_perf_opt branch 2 times, most recently from 5ae4d05 to 62b777c Compare August 30, 2016 13:45
@coveralls
Copy link

coveralls commented Aug 30, 2016

Coverage Status

Coverage decreased (-0.6%) to 66.324% when pulling 5ae4d05 on alyst:misc_perf_opt into f0389b7 on robertfeldt:master.

@coveralls
Copy link

coveralls commented Aug 30, 2016

Coverage Status

Coverage increased (+0.02%) to 66.931% when pulling 62b777c on alyst:misc_perf_opt into f0389b7 on robertfeldt:master.

@coveralls
Copy link

coveralls commented Aug 30, 2016

Coverage Status

Coverage increased (+0.06%) to 66.978% when pulling 62b777c on alyst:misc_perf_opt into f0389b7 on robertfeldt:master.

@robertfeldt
Copy link
Owner

Ok, Julia 0.5 is out so would be great to look more into this. Is more work needed on this PR? Anything in particular I can help out with?

@alyst
Copy link
Contributor Author

alyst commented Sep 20, 2016

Just rebased it to the master. Recently I was using my tests branch (this branch + asynchronous parallel evaluator) and it was working fine on 0.5-rcN, so I think it should fine to merge.

Once this PR is merged, I'll update the PR #46 for asynchronous parallel evaluator (synchronous one is good for NES-like methods, the asynchronous should improve Borg performance).

I was not doing systematic comparison of the package performance using 0.4 and 0.5 julia. Perhaps you can run the benchmarks and compare the times to detect potential regressions?

@coveralls
Copy link

coveralls commented Sep 20, 2016

Coverage Status

Coverage increased (+0.8%) to 67.679% when pulling 6051e08 on alyst:misc_perf_opt into f0389b7 on robertfeldt:master.

@robertfeldt
Copy link
Owner

Ok, I only ran 2 repetitions per JuliaVersion+GitBranch but the top 5 performance changes per test file are:

test_mutation_operators.jl (fastest = 3.86 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_mutation_operators.jl: 4.29 (+11.2%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_mutation_operators.jl: 69.53 (+1699.7%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_mutation_operators.jl: 87.21 (+2157.2%)

test_bimodal_cauchy_distribution.jl (fastest = 0.07 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_bimodal_cauchy_distribution.jl: 0.15 (+128.7%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_bimodal_cauchy_distribution.jl: 0.31 (+368.9%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_bimodal_cauchy_distribution.jl: 1.38 (+2005.3%)

test_search_space.jl (fastest = 2.65 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_search_space.jl: 3.15 (+18.9%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_search_space.jl: 32.95 (+1145.5%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_search_space.jl: 37.65 (+1323.0%)

test_random_search.jl (fastest = 0.29 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_random_search.jl: 0.31 (+6.0%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_random_search.jl: 3.07 (+964.6%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_random_search.jl: 3.81 (+1218.5%)

test_frequency_adaptation.jl (fastest = 0.34 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_frequency_adaptation.jl: 0.38 (+11.4%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_frequency_adaptation.jl: 3.48 (+913.8%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_frequency_adaptation.jl: 4.3 (+1151.8%)

and for the overall testing time:

TOTAL TIME for 26 test files, ef29295cefda4a67 (fastest = 53.05 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, TOTAL TIME for 26 test files, ef29295cefda4a67: 211.63 (+298.9%)

TOTAL TIME for 26 test files, 0f6275f23e64ceb4 (fastest = 53.08 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, TOTAL TIME for 26 test files, 0f6275f23e64ceb4: 182.54 (+243.9%)

@alyst
Copy link
Contributor Author

alyst commented Sep 21, 2016

Great, thanks a lot! I will try to look into that.
Had you observed any tests where 0.5 is faster?

@robertfeldt
Copy link
Owner

Yes, these are the only ones that are faster on 0.5.0 (on my machine):

test_tracing.jl (fastest = 5.69 secs for 0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_tracing.jl: 5.74 (+0.8%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_tracing.jl: 5.84 (+2.6%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_tracing.jl: 7.0 (+23.0%)

test_toplevel_bboptimize.jl (fastest = 7.19 secs for 0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_toplevel_bboptimize.jl: 7.55 (+5.0%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt, test_toplevel_bboptimize.jl: 7.65 (+6.4%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master, test_toplevel_bboptimize.jl: 7.77 (+8.1%)

@robertfeldt
Copy link
Owner

So it seems mutation operators and SearchSpace should be the main focus. Why are they so much slower on 0.5.0?

@robertfeldt
Copy link
Owner

Do you want me to merge this one and you do a new branch or should I wait?

@alyst
Copy link
Contributor Author

alyst commented Sep 21, 2016

I think it should be safe to merge, especially if you want to take a look at 0.5 regressions yourself.
Your benchmark shows there are some regressions of this PR over master on 0.5, so maybe it would have to be revisited later. I plan to look at it, but probably over the weekend.

I don't know why mutations and search space are so slow. There was quite some refactoring of the array indexing in 0.5, maybe that's the issue. Mutation tests also actively use search space, so maybe it's the root cause. There are several tests in test_search_space, so I need to dissect further.

@robertfeldt
Copy link
Owner

Ok, I'll merge for now. There are a few regressions but let's handle them when we dive deeper. I will look into the SearchSpace tests and code for now and we can sync before you start digging in. Thanks.

@alyst alyst changed the title WIP: Performance optimizations Performance optimizations and 0.5 compatibility Sep 21, 2016
@robertfeldt
Copy link
Owner

Actually with 3 repetitions per JuliaVersion+BBOBranch and timing with CPUTime rather than Base I'm somewhat reluctant merging this:

TOTAL TIME for 26 test files, ef29295cefda4a67 (fastest = 53.14 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, TOTAL TIME for 26 test files, ef29295cefda4a67: 219.87 (+313.7%)

TOTAL TIME for 26 test files, 0f6275f23e64ceb4 (fastest = 51.8 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, TOTAL TIME for 26 test files, 0f6275f23e64ceb4: 175.8 (+239.4%)

test_mutation_operators.jl (fastest = 3.77 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_mutation_operators.jl: 4.18 (+10.7%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 67.31 (+1684.5%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_mutation_operators.jl: 87.74 (+2226.3%)

test_search_space.jl (fastest = 2.66 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_search_space.jl: 2.76 (+3.8%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 28.0 (+953.1%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/6051e0, test_search_space.jl: 44.14 (+1560.0%)

I will investigate SearchSpace on a branch off master for now.

@alyst
Copy link
Contributor Author

alyst commented Sep 21, 2016

Hmm, interesting. In principle, the PR doesn't touch the search space, yet the tests show something that is called from test_search_space have become slower.

@robertfeldt
Copy link
Owner

robertfeldt commented Sep 21, 2016

Yes, the inner loop in the test inside this context:

  context("rand_individuals correctly handles column-wise generation in assymetric search spaces") do
    for i in 1:NumTestRepetitions÷10
      numdims = rand(1:13)
      minbounds = rand(numdims)
      ds = rand(1:10, numdims) .* rand(numdims)
      maxbounds = minbounds .+ ds
      parambounds = collect(zip(minbounds, maxbounds))
      ss = RangePerDimSearchSpace(parambounds)
      @fact mins(ss) --> minbounds
      @fact maxs(ss) --> maxbounds
      @fact round(deltas(ss), 6) --> round(ds, 6)

      # Now generate 100 individuals and make sure they are all within bounds
      inds = rand_individuals(ss, 100)
      @fact size(inds, 2) --> 100
      for i in 1:size(inds, 2)
        for d in 1:numdims
          @fact (minbounds[d] <= inds[d,i] <= maxbounds[d]) --> true
        end
      end
    end
  end

takes 62% of the time of running the test_search_space.jl tests.

@robertfeldt
Copy link
Owner

Could it be that the ternary interval operator is much slower in 0.5?

@robertfeldt
Copy link
Owner

Or it could be FactCheck that is causing this. Investigating...

@robertfeldt
Copy link
Owner

robertfeldt commented Sep 21, 2016

Seems this is a FactCheck problem. Times to run test_search_space.jl test file in different situations:

Julia version, Inner loop variant, Time
0.5.0, two binary comparisons, 55.5
0.5.0, ternary comparison, 27.8
0.5.0, no comparison, 13.3
0.4.6, two binary comparisons, 8.4
0.4.6, ternary comparison, 7.8
0.4.6, no comparison, 7.0

So it seems it might all come down to FactCheck being very slow on 0.5.0. I suggest we switch to the new Base.Test. What do you think?

@alyst alyst force-pushed the misc_perf_opt branch 3 times, most recently from 63bb3bb to b306104 Compare September 21, 2016 14:58
@robertfeldt
Copy link
Owner

When converting tests I found that @test_throws finds that these two lines in test_parameters.jl actually raises MethodError's rather than the expected (?) KeyError's:

# Actually throws MethodError: 
@test_throws KeyError (dc1[:NotThere] = 3)
# Actually throws MethodError:
@test_throws KeyError (dc1[:NotThere] = "abc")

Can you please check what should be expected here so we are sure before updating it.

@alyst
Copy link
Contributor Author

alyst commented Sep 21, 2016

Dict throws KeyError in similar case, so DictChain should be updated to have the same behaviour. The signatures of get/set methods should be relaxed by removing the types of key and value params.

@robertfeldt
Copy link
Owner

Ok, I ported the main test suite to BaseTestNext. The mutation operators and search space doesn't seem to be a problem with 0.5.0 but there seems to be some work to do in test_fitness:

test_mutation_operators.jl (fastest = 0.88 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_mutation_operators.jl: 1.04 (+18.6%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 3.77 (+330.4%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 67.31 (+7580.7%)

test_search_space.jl (fastest = 2.63 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 2.66 (+1.0%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_search_space.jl: 2.96 (+12.5%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 28.0 (+963.4%)

TOTAL TIME for 26 test files, bd18b1c2d90640a8 (fastest = 55.44 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, TOTAL TIME for 26 test files, bd18b1c2d90640a8: 62.61 (+12.9%)

test_fitness.jl (fastest = 2.48 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_fitness.jl: 5.65 (+128.0%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.39 (+157.7%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.46 (+160.6%)

@robertfeldt
Copy link
Owner

...master/770a2f is the latest commit.

@robertfeldt
Copy link
Owner

Except for test_fitness, the 13% increased total time on 0.5.0 might mainly be the increased compilation times from the LLVM change in 0.5.0 (release notes says that compilation times should be expected to be somewhat longer).

could improve type inference for top-level objects and allow
running tests using `reload()`
so that DictChain throws the same KeyError exception as Dict
NP, NC parameters should always be integer constants, but in some
contexts the real type of CrossoverOperator{NP,NC} could not be
deduced at compile time, so numparents/children() result type would
be Any, unless explicitly specified
this allows Julia to compile recombination and candidate
processing code knowing the exact type of recombination operator
- avoid recursion when sampling
- avoid keyword arguments when sampling
since Julia 0.4.0- is no longer supported
since we don't want to extend these definitions
since Julia 0.5 is mature
use @inbounds, map() and zip() instead of comprehensions
to avoid costly element accesses
@alyst
Copy link
Contributor Author

alyst commented Sep 22, 2016

Just rebased the PR against the master. The DictChain get()/set() and show() should be fixed now.

@coveralls
Copy link

coveralls commented Sep 22, 2016

Coverage Status

Coverage increased (+0.9%) to 68.084% when pulling cb94553 on alyst:misc_perf_opt into d249ab8 on robertfeldt:master.

@robertfeldt
Copy link
Owner

Thanks for this, it is looking good. The d2... is latest master and cb...is latest in this PR and f0... was master when tests were using FactCheck (so might not be testing all the same things anylonger):

test_mutation_operators.jl (fastest = 0.88 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_mutation_operators.jl: 1.0 (+13.7%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_mutation_operators.jl: 1.0 (+13.9%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_mutation_operators.jl: 1.0 (+14.3%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_mutation_operators.jl: 1.02 (+16.6%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_mutation_operators.jl: 1.04 (+18.6%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 3.77 (+330.4%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_mutation_operators.jl: 67.31 (+7580.7%)

test_search_space.jl (fastest = 2.63 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f):
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 2.66 (+1.0%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_search_space.jl: 2.89 (+9.9%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_search_space.jl: 2.95 (+12.1%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_search_space.jl: 2.96 (+12.5%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_search_space.jl: 3.14 (+19.3%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_search_space.jl: 3.28 (+24.5%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_search_space.jl: 28.0 (+963.4%)

test_fitness.jl (fastest = 2.48 secs for 0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b):
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/f0389b, test_fitness.jl: 5.65 (+128.0%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_fitness.jl: 6.38 (+157.3%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.39 (+157.7%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/770a2f, test_fitness.jl: 6.46 (+160.6%)
  0.5.0, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_fitness.jl: 6.52 (+163.2%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/master/d249ab, test_fitness.jl: 7.0 (+182.4%)
  0.4.6, git@github.com:robertfeldt/BlackBoxOptim.jl.git/alyst-misc_perf_opt/cb9455, test_fitness.jl: 7.09 (+186.0%)

There seems to be no big performance differences with this branch (and still some perf regression in fitness) but since this has other test additions and fixes I will go ahead and merge.

@robertfeldt robertfeldt merged commit cb94553 into robertfeldt:master Sep 22, 2016
@alyst alyst deleted the misc_perf_opt branch July 24, 2017 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants