
Continuous inter-point constraints #345

Open · wants to merge 12 commits into base: main
1 change: 1 addition & 0 deletions CHANGELOG.md
AVHopp marked this conversation as resolved.
Collaborator:

General comment: throughout the PR, the writing is not consistent. Sometimes you write "interpoint" and sometimes "inter-point". Let's settle on one and be consistent.

Collaborator:

The new interpoint flag should be mentioned in the user guide.

@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html)
### Added
- `allow_missing` and `allow_extra` keyword arguments to `Objective.transform`
- Example for a traditional mixture
- Continuous inter-point constraints via new `is_interpoint` attribute

### Changed
- `SubstanceParameter` encodings are now computed exclusively with the
43 changes: 35 additions & 8 deletions baybe/constraints/continuous.py
@@ -5,6 +5,7 @@
import gc
import math
from collections.abc import Collection, Sequence
from itertools import chain, repeat
from typing import TYPE_CHECKING, Any

import numpy as np
@@ -45,6 +46,15 @@ class ContinuousLinearConstraint(ContinuousConstraint):
rhs: float = field(default=0.0, converter=float, validator=finite_float)
"""Right-hand side value of the in-/equality."""

is_interpoint: bool = field(default=False)
Collaborator:

Suggested change
is_interpoint: bool = field(default=False)
is_interpoint: bool = field(default=False, validator=instance_of(bool))
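
For context, the suggested validator presumably assumes the usual attrs import:

```python
from attrs.validators import instance_of  # assumed import for the suggested validator
```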

Collaborator:

Not sure what is more in line with our convention, is_interpoint or interpoint 🤔 The former is what we typically use for properties, while the latter makes a bit more sense from a constructor perspective. @Scienfitz: opinions?

Collaborator:

I agree, I would use the latter for arguments.

"""Flag for defining an interpoint constraint.

An inter-point constraint is a constraint that is defined over full batches. That
is, an inter-point constraint of the form ``param_1 + 2*param_2 <= 2`` means that
the sum of ``param_1`` plus two times the sum of ``param_2`` across the full batch
must not exceed 2.
Comment on lines +52 to +55
Collaborator:

Suggested change
An inter-point constraint is a constraint that is defined over full batches. That
is, an inter-point constraint of the form ``param_1 + 2*param_2 <= 2`` means that
the sum of ``param_1`` plus two times the sum of ``param_2`` across the full batch
must not exceed 2.
While intra-point constraints impose conditions on each individual point of a batch,
inter-point constraints do so **across** the points of the batch. That is, an
inter-point constraint of the form ``x_1 + x_2 <= 1`` enforces that the sum of all
``x_1`` values plus the sum of all ``x_2`` values in the batch must not exceed 1.

"""

@coefficients.validator
def _validate_coefficients( # noqa: DOC101, DOC103
self, _: Any, coefficients: list[float]
@@ -98,7 +108,10 @@ def _drop_parameters(
)

def to_botorch(
self, parameters: Sequence[NumericalContinuousParameter], idx_offset: int = 0
self,
parameters: Sequence[NumericalContinuousParameter],
idx_offset: int = 0,
batch_size: int = 1,
Collaborator:

I wouldn't give this an int default (`.recommend` also does not have a default for batch size) but would make the default `None`.

) -> tuple[Tensor, Tensor, float]:
"""Cast the constraint in a format required by botorch.

@@ -108,6 +121,8 @@
Args:
parameters: The parameter objects of the continuous space.
idx_offset: Offset to the provided parameter indices.
batch_size: The batch size used in the recommendation. Necessary for
interpoint constraints, ignored by all others.

Returns:
The tuple required by botorch.
@@ -117,16 +132,28 @@
from baybe.utils.torch import DTypeFloatTorch

param_names = [p.name for p in parameters]
param_indices = [
param_names.index(p) + idx_offset
for p in self.parameters
if p in param_names
]
if not self.is_interpoint:
param_indices = [
param_names.index(p) + idx_offset
for p in self.parameters
if p in param_names
]
coefficients = self.coefficients
torch_indices = torch.tensor(param_indices)
else:
param_index = {name: param_names.index(name) for name in self.parameters}
param_indices_interpoint = [
(batch, param_index[name] + idx_offset)
for name in self.parameters
for batch in range(batch_size)
]
coefficients = list(chain(*zip(*repeat(self.coefficients, batch_size))))
torch_indices = torch.tensor(param_indices_interpoint)

return (
torch.tensor(param_indices),
torch_indices,
torch.tensor(
[self._multiplier * c for c in self.coefficients], dtype=DTypeFloatTorch
[self._multiplier * c for c in coefficients], dtype=DTypeFloatTorch
),
np.asarray(self._multiplier * self.rhs, dtype=DTypeFloatNumpy).item(),
)
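
As a reviewing aid, here is what the `chain`/`zip`/`repeat` expansion in the interpoint branch evaluates to, sketched standalone for a hypothetical two-parameter constraint and `batch_size=3`:

```python
from itertools import chain, repeat

coefficients = [1.0, 2.0]  # hypothetical constraint coefficients
batch_size = 3

# Each coefficient is repeated once per point in the batch, matching the
# (batch, parameter) index pairs above, which iterate over batches fastest.
expanded = list(chain(*zip(*repeat(coefficients, batch_size))))
print(expanded)  # [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
```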
6 changes: 4 additions & 2 deletions baybe/recommenders/pure/bayesian/botorch.py
@@ -185,12 +185,12 @@ def _recommend_continuous(
num_restarts=self.n_restarts,
raw_samples=self.n_raw_samples,
equality_constraints=[
c.to_botorch(subspace_continuous.parameters)
c.to_botorch(subspace_continuous.parameters, batch_size=batch_size)
for c in subspace_continuous.constraints_lin_eq
]
or None, # TODO: https://github.com/pytorch/botorch/issues/2042
inequality_constraints=[
c.to_botorch(subspace_continuous.parameters)
c.to_botorch(subspace_continuous.parameters, batch_size=batch_size)
for c in subspace_continuous.constraints_lin_ineq
]
or None, # TODO: https://github.com/pytorch/botorch/issues/2042
@@ -234,6 +234,8 @@ def _recommend_hybrid(
Returns:
The recommended points.
"""
# TODO Interpoint constraints are not yet enabled in hybrid search spaces
Collaborator:

Can you raise an actual NotImplementedError here? Otherwise, these constraints would just silently be ignored.
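
A minimal sketch of such a guard, assuming the `has_interpoint_constraints` property introduced in this PR is reachable via the search space's continuous subspace:

```python
# Hypothetical guard at the top of _recommend_hybrid:
if searchspace.continuous.has_interpoint_constraints:
    raise NotImplementedError(
        "Interpoint constraints are not yet supported in hybrid search spaces."
    )
```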


# For batch size > 1, this optimizer needs a MC acquisition function
if batch_size > 1 and not self.acquisition_function.is_mc:
raise IncompatibleAcquisitionFunctionError(
117 changes: 109 additions & 8 deletions baybe/searchspace/continuous.py
@@ -5,7 +5,7 @@
import gc
import warnings
from collections.abc import Collection, Sequence
from itertools import chain
from itertools import chain, repeat
from typing import TYPE_CHECKING, Any, cast

import numpy as np
@@ -82,6 +82,10 @@ def __str__(self) -> str:
nonlin_constraints_list = [
constr.summary() for constr in self.constraints_nonlin
]
nonlin_constraints_list = [
Collaborator:

duplication

constr.summary() for constr in self.constraints_nonlin
]

param_df = pd.DataFrame(param_list)
lin_eq_df = pd.DataFrame(eq_constraints_list)
lin_ineq_df = pd.DataFrame(ineq_constraints_list)
@@ -282,6 +286,24 @@ def comp_rep_bounds(self) -> pd.DataFrame:
index=["min", "max"],
)

@property
def is_constrained(self) -> bool:
"""Return whether the subspace is constrained in any way."""
return any(
(
self.constraints_lin_eq,
self.constraints_lin_ineq,
self.constraints_nonlin,
)
)

@property
def has_interpoint_constraints(self) -> bool:
"""Return whether or not the space has any interpoint constraints."""
return any(
c.is_interpoint for c in self.constraints_lin_eq + self.constraints_lin_ineq
)

Comment on lines +289 to +306
Collaborator:

Docstrings of properties should read as if they were an attribute, i.e. no verb like `Return ...`. Instead, something like `Boolean flag indicating ...`.
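
For concreteness, the `has_interpoint_constraints` property from this diff rewritten in the requested attribute style (one possible phrasing, not a committed suggestion):

```python
@property
def has_interpoint_constraints(self) -> bool:
    """Boolean flag indicating whether the space has interpoint constraints."""
    return any(
        c.is_interpoint for c in self.constraints_lin_eq + self.constraints_lin_ineq
    )
```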

def _drop_parameters(self, parameter_names: Collection[str]) -> SubspaceContinuous:
"""Create a copy of the subspace with certain parameters removed.

@@ -359,7 +381,10 @@ def samples_random(self, n_points: int = 1) -> pd.DataFrame:
)
return self.sample_uniform(n_points)

def sample_uniform(self, batch_size: int = 1) -> pd.DataFrame:
def sample_uniform(
Collaborator:

What's this reformat about? By mistake?

self,
batch_size: int = 1,
) -> pd.DataFrame:
"""Draw uniform random parameter configurations from the continuous space.

Args:
@@ -383,14 +408,17 @@ def sample_uniform(self, batch_size: int = 1) -> pd.DataFrame:

if not self.parameters:
return pd.DataFrame(index=pd.RangeIndex(0, batch_size))

if (
len(self.constraints_lin_eq) == 0
and len(self.constraints_lin_ineq) == 0
and len(self.constraints_cardinality) == 0
):
# If the space is completely unconstrained, we can sample from bounds.
if not self.is_constrained:
return self._sample_from_bounds(batch_size, self.comp_rep_bounds.values)

if self.has_interpoint_constraints:
return self._sample_from_polytope_with_interpoint_constraints(
Collaborator:

Is it feasible to keep splitting the sample function up like that? E.g., what happens if we have interpoint + cardinality constraints?

Collaborator:

Isn't `_sample_from_polytope` a special case of the new function? If so, I think it would be better design if one was contained in the other, or one called the other.

batch_size, self.comp_rep_bounds.values
)

# If there are neither cardinality nor interpoint constraints, we sample
# directly from the polytope
if len(self.constraints_cardinality) == 0:
return self._sample_from_polytope(batch_size, self.comp_rep_bounds.values)

@@ -404,6 +432,79 @@ def _sample_from_bounds(self, batch_size: int, bounds: np.ndarray) -> pd.DataFrame:

return pd.DataFrame(points, columns=self.parameter_names)

def _sample_from_polytope_with_interpoint_constraints(
self,
batch_size: int,
bounds: np.ndarray,
) -> pd.DataFrame:
"""Draw uniform random samples from a polytope with interpoint constraints."""
# If the space has interpoint constraints, we need to sample from a larger
# search space that models the batch size via additional dimensions. This is
# necessary since `get_polytope_samples` cannot handle inter-point constraints;
# see https://github.com/pytorch/botorch/issues/2468

import torch
from botorch.utils.sampling import get_polytope_samples

from baybe.utils.numerical import DTypeFloatNumpy
from baybe.utils.torch import DTypeFloatTorch

# The number of parameters is needed in several places for adjusting indices
num_of_params = len(self.parameters)

eq_constraints, ineq_constraints = [], []

# We start with the general constraints before going to interpoint constraints
for c in [*self.constraints_lin_eq, *self.constraints_lin_ineq]:
if not c.is_interpoint:
param_indices, coefficients, rhs = c.to_botorch(
self.parameters, batch_size=batch_size
Collaborator:

Suggested change
self.parameters, batch_size=batch_size
self.parameters

)
for b in range(batch_size):
botorch_tuple = (
param_indices + b * num_of_params,
coefficients,
rhs,
)
if c.is_eq:
eq_constraints.append(botorch_tuple)
else:
ineq_constraints.append(botorch_tuple)
else:
# Get the indices of the parameters used in the constraint
param_index = {
name: self.parameter_names.index(name) for name in c.parameters
}
param_indices_list = [
batch * num_of_params + param_index[param]
for param in c.parameters
for batch in range(batch_size)
]
coefficients_list = list(
chain(*zip(*repeat(c.coefficients, batch_size)))
)
botorch_tuple = (
torch.tensor(param_indices_list),
torch.tensor(coefficients_list, dtype=DTypeFloatTorch),
np.asarray(c.rhs, dtype=DTypeFloatNumpy).item(),
)
if c.is_eq:
eq_constraints.append(botorch_tuple)
else:
ineq_constraints.append(botorch_tuple)

bounds_joint = torch.cat(
[torch.from_numpy(bounds) for _ in range(batch_size)], dim=-1
)
points = get_polytope_samples(
n=1,
bounds=bounds_joint,
equality_constraints=eq_constraints,
inequality_constraints=ineq_constraints,
)
points = points.reshape(batch_size, points.shape[-1] // batch_size)
return pd.DataFrame(points, columns=self.parameter_names)
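
To make the final reshape easier to verify: the joint polytope has `batch_size * num_of_params` dimensions, so the single sample of shape `(1, batch_size * num_of_params)` is folded back into `batch_size` points. A toy sketch with hypothetical numbers:

```python
import numpy as np

batch_size, num_of_params = 3, 2
points = np.arange(batch_size * num_of_params).reshape(1, -1)  # stand-in for the sample
points = points.reshape(batch_size, points.shape[-1] // batch_size)
print(points.shape)  # (3, 2): one row per point of the batch
```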

def _sample_from_polytope(
self, batch_size: int, bounds: np.ndarray
) -> pd.DataFrame:
62 changes: 60 additions & 2 deletions examples/Constraints_Continuous/linear_constraints.py
@@ -18,7 +18,9 @@
from botorch.test_functions import Rastrigin

from baybe import Campaign
from baybe.constraints import ContinuousLinearConstraint
from baybe.constraints import (
ContinuousLinearConstraint,
Collaborator:

strange reformat

)
from baybe.objectives import SingleTargetObjective
from baybe.parameters import NumericalContinuousParameter
from baybe.searchspace import SearchSpace
@@ -89,7 +91,7 @@

SMOKE_TEST = "SMOKE_TEST" in os.environ

BATCH_SIZE = 2 if SMOKE_TEST else 3
BATCH_SIZE = 4 if SMOKE_TEST else 5
N_ITERATIONS = 2 if SMOKE_TEST else 3

for k in range(N_ITERATIONS):
@@ -140,3 +142,59 @@
"2.0*x_2 + 3.0*x_4 <= 1.0 satisfied in all recommendations? ",
(2.0 * measurements["x_2"] + 3.0 * measurements["x_4"]).le(1.0 + TOLERANCE).all(),
)


### Using inter-point constraints
Collaborator:

Interpoint constraints should be their own example file; this feels just tacked on to this example.


# It is also possible to require inter-point constraints, which constrain the value of
# a single parameter across a full batch.
# Since these constraints require information about the batch size, they are not used
# during the creation of the search space but handed over to the `recommend` call.
Collaborator:

outdated due to design change

# This example models the following inter-point constraints and combines them also
# with regular constraints.
Collaborator:

that does not appear to be true

# 1. The sum of `x_1` across the batch needs to be >= 2.5.
# 2. The sum of `x_2` across the batch needs to be exactly 5.
# 3. The sum of `2*x_3` minus the sum of `x_4` across the batch needs to be >= 5.


inter_constraints = [
ContinuousLinearConstraint(
parameters=["x_1"], operator=">=", coefficients=[1], rhs=2.5, is_interpoint=True
),
ContinuousLinearConstraint(
parameters=["x_2"], operator="=", coefficients=[1], rhs=5, is_interpoint=True
),
ContinuousLinearConstraint(
parameters=["x_3", "x_4"],
operator=">=",
coefficients=[2, -1],
rhs=5,
is_interpoint=True,
),
]

### Construct a search space using the inter-point constraints instead of the previous ones

inter_searchspace = SearchSpace.from_product(
parameters=parameters, constraints=inter_constraints
)

inter_campaign = Campaign(
searchspace=inter_searchspace,
objective=objective,
)

for k in range(N_ITERATIONS):
rec = inter_campaign.recommend(batch_size=BATCH_SIZE)

# Target values are looked up via the botorch wrapper
target_values = []
for index, row in rec.iterrows():
target_values.append(WRAPPED_FUNCTION(*row.to_list()))

rec["Target"] = target_values
inter_campaign.add_measurements(rec)
# Check inter-point constraints
assert rec["x_1"].sum() >= 2.5 - TOLERANCE
Collaborator:

Prints are more important than asserts in examples (but nothing speaks against having both).

assert np.isclose(rec["x_2"].sum(), 5)
assert 2 * rec["x_3"].sum() - rec["x_4"].sum() >= 5 - TOLERANCE
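
Following the review comment above, prints mirroring the asserts could look like this (a sketch reusing the names from this example):

```python
print(
    "Sum of x_1 >= 2.5 in the batch? ",
    rec["x_1"].sum() >= 2.5 - TOLERANCE,
)
print("Sum of x_2 == 5 in the batch? ", np.isclose(rec["x_2"].sum(), 5))
print(
    "2 * sum of x_3 - sum of x_4 >= 5 in the batch? ",
    (2 * rec["x_3"].sum() - rec["x_4"].sum()) >= 5 - TOLERANCE,
)
```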