Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of sequential batch optimization #74

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Initial implementation of sequential batch optimization #74

wants to merge 1 commit into from

Conversation

nrontsis
Copy link

@nrontsis nrontsis commented Aug 26, 2017

This is a first attempt to partially address #71

Minimal example (modified version of this):

import numpy as np
from GPflowOpt.domain import ContinuousParameter
import GPflow
from GPflowOpt.bo import BayesianOptimizer
from GPflowOpt.design import LatinHyperCube
from GPflowOpt.acquisition import QEI_CL


def fx(X):
    X = np.atleast_2d(X)
    # Return objective & gradient
    return np.sum(np.square(X), axis=1)[:, None]


domain = ContinuousParameter('x1', -2, 2) + ContinuousParameter('x2', -1, 2)

# Use standard Gaussian process Regression
lhd = LatinHyperCube(21, domain)
X = lhd.generate()
Y = fx(X)
model = GPflow.gpr.GPR(X, Y, GPflow.kernels.Matern52(2, ARD=True))

# Now create the Bayesian Optimizer
alpha = QEI_CL(model, batch_size=5)
optimizer = BayesianOptimizer(domain, alpha)

# Run the Bayesian optimization
# with optimizer.silent():
r = optimizer.optimize(fx, n_iter=15)
print(r)

In my setup, this PR and the original master code occasionally exhibits:

  1. Optimisation failures
Warning: optimization restart #/# failed
  1. NaN/Inf in the gradient
Warning: inf or nan in gradient: replacing with zeros
  1. Cholesky decomposition failures.

@codecov-io
Copy link

codecov-io commented Aug 26, 2017

Codecov Report

Merging #74 into master will decrease coverage by 1.72%.
The diff coverage is 45.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #74      +/-   ##
==========================================
- Coverage    99.8%   98.07%   -1.73%     
==========================================
  Files          17       18       +1     
  Lines        1013     1041      +28     
==========================================
+ Hits         1011     1021      +10     
- Misses          2       20      +18
Impacted Files Coverage Δ
GPflowOpt/acquisition/__init__.py 100% <100%> (ø) ⬆️
GPflowOpt/acquisition/acquisition.py 100% <100%> (ø) ⬆️
GPflowOpt/acquisition/ei.py 100% <100%> (ø) ⬆️
GPflowOpt/bo.py 89.28% <27.27%> (-9.39%) ⬇️
GPflowOpt/acquisition/qei_cl.py 41.17% <41.17%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc36cfe...40c79b8. Read the comment docs.

@javdrher
Copy link
Member

Great job, this is great to start from. I think we can do some reshaping to make these features more part of the framework and less specific to the qEI implementation which is otherwise great. I'll add some notes inline.

For the troubles you encounter, 1-2 aren't something to be super worried about, 3 is concerning. This can usually be remedied by adding some priors on the hyperparameters (or some things we are discussing in #73 )

Copy link
Member

@javdrher javdrher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets consider this first, afterwards we might even consider moving the qEI into the normal EI class?

Y = np.vstack((self.Y_,) + (self.fmin.value,)*len(args))
self.set_data(X, Y)

def __enter__(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two notes:

  • I dislike the old style of doing contexts and prefer the contextmanager decorator (see https://jeffknupp.com/blog/2016/03/07/python-with-context-managers/ fun with contextlib).
  • I think, with contextlib, this should be moved to Acquisition as the batch context is something which is potentially applicable to any acquisition implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about the following in acquisition:

class SequentialBatch(object):
      def __init__(acquisition, size):
            self.acquisition = acquisition
            self.size = size

     def optimize(optimizer):
         points = []
         def inverse_acquisition(X):
               ....

         for j in range(self.acquisition.batch_size):
             result = optimizer.optimize(inverse_acquisition)
             batch.append(result.x)
             self.acquisition.set_batch(np.concatenate(points))
     return np.concatenate(points)

...

class Acquisition(Parameterized):
....

    def set_batch(self, batch_points):
        pass  # To be implemented in subclasses, e.g. qEI

    @contextmanager
    def batch(size):
          X_orig, Y_orig = self.data
          yield SequentialBatch(self, size)
          self.set_batch(np.empty((0, self.domain.size)))
          self.set_data(X_orig, Y_orig)

Then in BayesianOptimizer you could do:

    if self.batch_size > 1:
        with self.acquisition.batch(self.batch_size) as batch:
            points = batch.optimize(self.optimizer)
       self._update_model_data(points)

with some further thought I think you could even see the one point strategy here as a batch with size 1 here and simply do this by default in BayesianOptimizer. What about my proposal for set_batch? It seems intuitive to simply pass in an array which is the current batch. Then if you pass a 0 x dim it signals clearing. This way has some advantages I think

  • The interface for adding batch support to an acquisition implementation is limited to implementing set_batch.
  • if we'd ever have asynchronous batches, once a point was evaluate you could simply perform set_batch(2d array \ evaluated point) and optimize. The great thing is that given this angle, all that would be needed to expand towards such approach is another object which is yielded from the context, e.g. AsyncSequentialBatch or something. Just a motivation for the design in the snippet)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing I just realized, with the pseudocode of the previous reply, the logic of selecting the points relays to a different policy object (the SequentialBatch). It would be perfectly possible to add a TrueBatch which duplicates the domain and selects the points at once. Swtiching between both approaches would simply require to instantiate BayesianOptimizer with a different object in its constructor. This would be a very simple and clean way to make both approaches co-exist if we merge this branch with the other batch approach.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for the very detailed response! Sorry the delay from my part; the past week was busy.

Having a batch contextmanager in Acquisition assumes that the initialization and finilization for the batch is irrespective of the batch policy. I think this is restrictive, particularly due to the fact that it cannot be "overwritten" by the SequentialBatch class. This is why I tried to add the context in the final class itself. Note that, to my understanding, this cannot be done with contextlib decorators. On the other hand, I agree that set_batch should be included in the base Acquisition class as it can be overwritten by other child classes.

If we only plan to use the contextmanager to call optimize, as you suggest doing in BayesianOptimizer:

with self.acquisition.batch(self.batch_size) as batch:
    points = batch.optimize(self.optimizer)

then there is no need to use the context, at least in BayesianOptimizer. The function batch.optimize() should be responsible for initializing/clean up (either with a help of a context internally or by itself).

But I do believe that a context is useful. It gives us safe access to penalised/modified acquisition functions via set_batch, which would be useful for researching the effect of the batch heuristics used.

I could add a function optimise in the batch acquisition class. It would be something like:

def optimize(optimizer):
         points = []
         def inverse_acquisition(X):
               ....
         # Use the context here
         with self:
                # Build the batch...
         return points

and the BayesianOptimiser would simply do:

if self.batch_size > 1:
        # No need for context here, it is used inside acquisition.optimize
        points = acquisition.optimize(self.optimizer)

Finally, this might be on the pedantic side, but I would call the function get_suggestion instead of optimize. I think optimise is not necessarily descriptive of the process done and it would make me wonder why not directly use optimizer.optimize.

Looking forward to hearing your thoughts on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have discussed this today and considered some scenario's. I think we won't be able to solve everything in one PR unfortunately, we are proposing the following for now to get started which should be a nice compromise:
(1) do not implement a context for now
(2) Class wise, we'd like to go for the following
image
I favor your idea of naming (i.e. get_suggestion) as optimize would be confusing with GPflow, and confusing with optimizer.optimize.
i.e., the get_suggestion in sequential batch looks a lot like the sequential batch above and also contains the operation of the context currently. If a sequential strategy really diverts from the default it can override get_suggestion. Furthermore, the methods get_suggestion relies on go in the interface of the derived class (i.e. in SequentialBatchAcquisition, a set_batch is required, this method is missing from the other classes). In Acquisition get_suggestion simply optimizes, and BayesianOptimizer calls acquisition.get_suggestion, it doesn't even need to verify batch size.

I think this is a good design to start with, its also a plain OO solution to solve this without extra complexity. We also added for instance an ABAcquisition, which could be used to implement for instance https://arxiv.org/pdf/1704.03651.pdf which isn't a true batch approach but it shares some semantics. Unfortunately, this currently isn't compatible with the aggregation mechanics and MCMC components we have in place currently. I'm still in the process of discussing this with @icouckuy but we should be able to resolve this later on it will however involve some reshuffling but right now I think its fully manageable.

What do you think. You could add SequentialBatchAcquisition, I'll do ParallelBatchAcquisition.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I think it's a good plan. I will try to update the PR accordingly for SequentialBatchAcquisition by early next week.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, to give you an idea: this is where we are heading at the moment:
image
still discussion ongoing though. Ultimately the idea is that Acquisition/SequentialBatchAcquisition become the same thing (set_batch by default doing nothing, for batch support it is sufficient to override). The design of the UML also takes into account the current capabilities of the framework to add and multiply, use MCMC etc. Your code will be easy to fit in so don't worry about this for now.

Kriging Is Well-Suited to Parallelize Optimization.
"""

def __init__(self, model, batch_size):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking to derive batch_size in acquisition each time set_batch is called, so we can keep batch_size in BayesianOptimizer. This would also permit changing the size of the batches during the optimization.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

@javdrher javdrher mentioned this pull request Aug 27, 2017
@javdrher javdrher mentioned this pull request Sep 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants