Initial implementation of sequential batch optimization #74

nrontsis · 2017-08-26T16:25:35Z

This is a first attempt to partially address #71

Minimal example (modified version of this):

import numpy as np
from GPflowOpt.domain import ContinuousParameter
import GPflow
from GPflowOpt.bo import BayesianOptimizer
from GPflowOpt.design import LatinHyperCube
from GPflowOpt.acquisition import QEI_CL


def fx(X):
    X = np.atleast_2d(X)
    # Return objective & gradient
    return np.sum(np.square(X), axis=1)[:, None]


domain = ContinuousParameter('x1', -2, 2) + ContinuousParameter('x2', -1, 2)

# Use standard Gaussian process Regression
lhd = LatinHyperCube(21, domain)
X = lhd.generate()
Y = fx(X)
model = GPflow.gpr.GPR(X, Y, GPflow.kernels.Matern52(2, ARD=True))

# Now create the Bayesian Optimizer
alpha = QEI_CL(model, batch_size=5)
optimizer = BayesianOptimizer(domain, alpha)

# Run the Bayesian optimization
# with optimizer.silent():
r = optimizer.optimize(fx, n_iter=15)
print(r)

In my setup, this PR and the original master code occasionally exhibits:

Optimisation failures

Warning: optimization restart #/# failed

NaN/Inf in the gradient

Warning: inf or nan in gradient: replacing with zeros

Cholesky decomposition failures.

codecov-io · 2017-08-26T16:34:19Z

Codecov Report

Merging #74 into master will decrease coverage by 1.72%.
The diff coverage is 45.45%.

@@            Coverage Diff             @@
##           master      #74      +/-   ##
==========================================
- Coverage    99.8%   98.07%   -1.73%     
==========================================
  Files          17       18       +1     
  Lines        1013     1041      +28     
==========================================
+ Hits         1011     1021      +10     
- Misses          2       20      +18

Impacted Files	Coverage Δ
GPflowOpt/acquisition/__init__.py	`100% <100%> (ø)`	⬆️
GPflowOpt/acquisition/acquisition.py	`100% <100%> (ø)`	⬆️
GPflowOpt/acquisition/ei.py	`100% <100%> (ø)`	⬆️
GPflowOpt/bo.py	`89.28% <27.27%> (-9.39%)`	⬇️
GPflowOpt/acquisition/qei_cl.py	`41.17% <41.17%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc36cfe...40c79b8. Read the comment docs.

javdrher · 2017-08-27T21:23:24Z

Great job, this is great to start from. I think we can do some reshaping to make these features more part of the framework and less specific to the qEI implementation which is otherwise great. I'll add some notes inline.

For the troubles you encounter, 1-2 aren't something to be super worried about, 3 is concerning. This can usually be remedied by adding some priors on the hyperparameters (or some things we are discussing in #73 )

javdrher

Lets consider this first, afterwards we might even consider moving the qEI into the normal EI class?

javdrher · 2017-08-27T21:27:57Z

GPflowOpt/acquisition/qei_cl.py

+        Y = np.vstack((self.Y_,) + (self.fmin.value,)*len(args))
+        self.set_data(X, Y)
+
+    def __enter__(self):


Two notes:

I dislike the old style of doing contexts and prefer the contextmanager decorator (see https://jeffknupp.com/blog/2016/03/07/python-with-context-managers/ fun with contextlib).

I think, with contextlib, this should be moved to Acquisition as the batch context is something which is potentially applicable to any acquisition implementation.

what about the following in acquisition:

class SequentialBatch(object): def __init__(acquisition, size): self.acquisition = acquisition self.size = size def optimize(optimizer): points = [] def inverse_acquisition(X): .... for j in range(self.acquisition.batch_size): result = optimizer.optimize(inverse_acquisition) batch.append(result.x) self.acquisition.set_batch(np.concatenate(points)) return np.concatenate(points) ... class Acquisition(Parameterized): .... def set_batch(self, batch_points): pass # To be implemented in subclasses, e.g. qEI @contextmanager def batch(size): X_orig, Y_orig = self.data yield SequentialBatch(self, size) self.set_batch(np.empty((0, self.domain.size))) self.set_data(X_orig, Y_orig)

Then in BayesianOptimizer you could do:

if self.batch_size > 1: with self.acquisition.batch(self.batch_size) as batch: points = batch.optimize(self.optimizer) self._update_model_data(points)

with some further thought I think you could even see the one point strategy here as a batch with size 1 here and simply do this by default in BayesianOptimizer. What about my proposal for set_batch? It seems intuitive to simply pass in an array which is the current batch. Then if you pass a 0 x dim it signals clearing. This way has some advantages I think

The interface for adding batch support to an acquisition implementation is limited to implementing set_batch.

if we'd ever have asynchronous batches, once a point was evaluate you could simply perform set_batch(2d array \ evaluated point) and optimize. The great thing is that given this angle, all that would be needed to expand towards such approach is another object which is yielded from the context, e.g. AsyncSequentialBatch or something. Just a motivation for the design in the snippet)

One more thing I just realized, with the pseudocode of the previous reply, the logic of selecting the points relays to a different policy object (the SequentialBatch). It would be perfectly possible to add a TrueBatch which duplicates the domain and selects the points at once. Swtiching between both approaches would simply require to instantiate BayesianOptimizer with a different object in its constructor. This would be a very simple and clean way to make both approaches co-exist if we merge this branch with the other batch approach.

Hey, thanks for the very detailed response! Sorry the delay from my part; the past week was busy.

Having a batch contextmanager in Acquisition assumes that the initialization and finilization for the batch is irrespective of the batch policy. I think this is restrictive, particularly due to the fact that it cannot be "overwritten" by the SequentialBatch class. This is why I tried to add the context in the final class itself. Note that, to my understanding, this cannot be done with contextlib decorators. On the other hand, I agree that set_batch should be included in the base Acquisition class as it can be overwritten by other child classes.

If we only plan to use the contextmanager to call optimize, as you suggest doing in BayesianOptimizer:

with self.acquisition.batch(self.batch_size) as batch: points = batch.optimize(self.optimizer)

then there is no need to use the context, at least in BayesianOptimizer. The function batch.optimize() should be responsible for initializing/clean up (either with a help of a context internally or by itself).

But I do believe that a context is useful. It gives us safe access to penalised/modified acquisition functions via set_batch, which would be useful for researching the effect of the batch heuristics used.

I could add a function optimise in the batch acquisition class. It would be something like:

def optimize(optimizer): points = [] def inverse_acquisition(X): .... # Use the context here with self: # Build the batch... return points

and the BayesianOptimiser would simply do:

if self.batch_size > 1: # No need for context here, it is used inside acquisition.optimize points = acquisition.optimize(self.optimizer)

Finally, this might be on the pedantic side, but I would call the function get_suggestion instead of optimize. I think optimise is not necessarily descriptive of the process done and it would make me wonder why not directly use optimizer.optimize.

Looking forward to hearing your thoughts on this.

We have discussed this today and considered some scenario's. I think we won't be able to solve everything in one PR unfortunately, we are proposing the following for now to get started which should be a nice compromise:
(1) do not implement a context for now
(2) Class wise, we'd like to go for the following

I favor your idea of naming (i.e. get_suggestion) as optimize would be confusing with GPflow, and confusing with optimizer.optimize.
i.e., the get_suggestion in sequential batch looks a lot like the sequential batch above and also contains the operation of the context currently. If a sequential strategy really diverts from the default it can override get_suggestion. Furthermore, the methods get_suggestion relies on go in the interface of the derived class (i.e. in SequentialBatchAcquisition, a set_batch is required, this method is missing from the other classes). In Acquisition get_suggestion simply optimizes, and BayesianOptimizer calls acquisition.get_suggestion, it doesn't even need to verify batch size.

I think this is a good design to start with, its also a plain OO solution to solve this without extra complexity. We also added for instance an ABAcquisition, which could be used to implement for instance https://arxiv.org/pdf/1704.03651.pdf which isn't a true batch approach but it shares some semantics. Unfortunately, this currently isn't compatible with the aggregation mechanics and MCMC components we have in place currently. I'm still in the process of discussing this with @icouckuy but we should be able to resolve this later on it will however involve some reshuffling but right now I think its fully manageable.

What do you think. You could add SequentialBatchAcquisition, I'll do ParallelBatchAcquisition.

Yup, I think it's a good plan. I will try to update the PR accordingly for SequentialBatchAcquisition by early next week.

Great, to give you an idea: this is where we are heading at the moment:

still discussion ongoing though. Ultimately the idea is that Acquisition/SequentialBatchAcquisition become the same thing (set_batch by default doing nothing, for batch support it is sufficient to override). The design of the UML also takes into account the current capabilities of the framework to add and multiply, use MCMC etc. Your code will be easy to fit in so don't worry about this for now.

javdrher · 2017-08-27T21:28:48Z

GPflowOpt/acquisition/qei_cl.py

+    Kriging Is Well-Suited to Parallelize Optimization.
+    """
+
+    def __init__(self, model, batch_size):


I was thinking to derive batch_size in acquisition each time set_batch is called, so we can keep batch_size in BayesianOptimizer. This would also permit changing the size of the batches during the optimization.

Initial implementation of sequential batch optimization

40c79b8

javdrher reviewed Aug 27, 2017

View reviewed changes

javdrher mentioned this pull request Aug 27, 2017

Callback #73

Merged

javdrher mentioned this pull request Sep 17, 2017

Batch support (phase 1) #81

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial implementation of sequential batch optimization #74

Initial implementation of sequential batch optimization #74

nrontsis commented Aug 26, 2017 •

edited

Loading

codecov-io commented Aug 26, 2017 •

edited

Loading

javdrher commented Aug 27, 2017

javdrher left a comment

javdrher Aug 27, 2017

javdrher Aug 27, 2017

javdrher Aug 28, 2017

nrontsis Sep 1, 2017

javdrher Sep 4, 2017

nrontsis Sep 7, 2017

javdrher Sep 7, 2017

javdrher Aug 27, 2017

nrontsis Sep 1, 2017

Initial implementation of sequential batch optimization #74

Are you sure you want to change the base?

Initial implementation of sequential batch optimization #74

Conversation

nrontsis commented Aug 26, 2017 • edited Loading

codecov-io commented Aug 26, 2017 • edited Loading

Codecov Report

javdrher commented Aug 27, 2017

javdrher left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nrontsis commented Aug 26, 2017 •

edited

Loading

codecov-io commented Aug 26, 2017 •

edited

Loading