Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring the package to Pytorch autodiff #14

Merged
merged 57 commits into from
May 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
57 commits
Select commit Hold shift + click to select a range
642aa20
fix quadratic module
fabian-sp Mar 8, 2024
3c65d53
test passing (for now)
fabian-sp Mar 12, 2024
15e533d
formatting
fabian-sp Mar 12, 2024
4dc7bda
add function module
fabian-sp Mar 12, 2024
1711d35
try merge
fabian-sp Mar 12, 2024
26cda8e
adapt multi axis test
fabian-sp Mar 13, 2024
65b3feb
start the function object
fabian-sp Mar 13, 2024
db8ffbf
add differentiability flag
fabian-sp Mar 13, 2024
5dcc855
max of linear module
fabian-sp Mar 15, 2024
347f644
adapt training script
fabian-sp Mar 15, 2024
29c751b
reshape test
fabian-sp Mar 15, 2024
0f27a88
fix bug in old function, and add torch replacement
fabian-sp Mar 18, 2024
f6fd926
set default as float
fabian-sp Mar 18, 2024
e36995d
log some metrics
fabian-sp Mar 18, 2024
b594714
start to move jacobian computation
fabian-sp Apr 5, 2024
d97980f
progress
fabian-sp Apr 5, 2024
1096a0d
seems to work
fabian-sp Apr 8, 2024
75bf7a6
rewrite tests and scripts
fabian-sp Apr 8, 2024
93170ba
update readme
fabian-sp Apr 8, 2024
791c7fc
update readme
fabian-sp Apr 8, 2024
890958f
add device
fabian-sp Apr 8, 2024
e3cf0e5
remove copy of matrix, and add timings
fabian-sp Apr 9, 2024
65eb32e
add example and plotting utils
fabian-sp Apr 9, 2024
66aff8e
set as in paper
fabian-sp Apr 9, 2024
fa17989
minor edits
fabian-sp Apr 9, 2024
f86df42
Improve timing logging
phschiele Apr 17, 2024
c5c7af6
Uses OSQP
phschiele Apr 17, 2024
9326cfb
make qp solver configurable
fabian-sp Apr 17, 2024
d50853b
add default qp solver
fabian-sp Apr 17, 2024
62cc1a2
switch off verbosity
fabian-sp Apr 17, 2024
775ee8e
tests pass locally
fabian-sp Apr 18, 2024
7a1c796
Merge pull request #10 from fabian-sp/p-speed-up-qp
fabian-sp Apr 18, 2024
5a444a0
move subproblem to single file
fabian-sp Apr 18, 2024
e0e10f3
direct call initial
fabian-sp Apr 19, 2024
bc46c5d
one test fails due to dual infeasible
fabian-sp Apr 19, 2024
ccbd52b
add tikohonov, and subproblem test
fabian-sp Apr 22, 2024
39cf7da
handle device properly
fabian-sp Apr 22, 2024
d9f37fa
tikhonov regularization as option
fabian-sp Apr 23, 2024
253ee9d
Update src/ncopt/functions/main.py
fabian-sp Apr 24, 2024
db7c5e0
Merge pull request #12 from fabian-sp/f-direct-qp-call
fabian-sp Apr 24, 2024
baa478c
sth fails for qp when having zero samples
fabian-sp Apr 24, 2024
cfe380f
torch empty caused the problem
fabian-sp Apr 25, 2024
7c18349
make sample points an attribute
fabian-sp Apr 25, 2024
b12920e
update readme
fabian-sp Apr 26, 2024
79c9ef2
update readme
fabian-sp Apr 29, 2024
d90120c
update readme
fabian-sp Apr 29, 2024
6f8b5d0
change default sampling points and example folder
fabian-sp Apr 30, 2024
5c03d6f
add checkpoint example
fabian-sp Apr 30, 2024
25787e2
fix math formatting
fabian-sp Apr 30, 2024
5237127
add timings plot
fabian-sp Apr 30, 2024
0f82153
add docstrings
fabian-sp May 2, 2024
15f01ab
correct typos
fabian-sp May 2, 2024
05c8e90
Merge pull request #13 from fabian-sp/f-handle-differentiable
fabian-sp May 6, 2024
c8bcd47
change license and names
fabian-sp May 6, 2024
7ef96f4
log at first iter
fabian-sp May 7, 2024
d1dca6c
typos
phschiele May 15, 2024
60b8951
editable install command
fabian-sp May 15, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions CITATION.cff
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@ cff-version: 1.2.0
authors:
- family-names: Schaipp
given-names: Fabian
title: "Python implementation of the SQP-GS algorithm"
- family-names: Schiele
given-names: Philipp
title: "Constrained optimization for Pytorch using the SQP-GS algorithm."
version: 0.1.0
date-released: 2021-09-03
date-released: 2024-xx-xx
30 changes: 5 additions & 25 deletions LICENSE
Original file line number Diff line number Diff line change
@@ -1,29 +1,9 @@
BSD 3-Clause License
MIT License

Copyright (c) 2021, Fabian Schaipp
All rights reserved.
Copyright 2024 Fabian Schaipp, Philipp Schiele

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
138 changes: 108 additions & 30 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,64 +1,142 @@
# ncOPT
This repository contains a `Python` implementation of the SQP-GS (*Sequential Quadratic Programming - Gradient Sampling*) algorithm by Curtis and Overton [1].
This repository is for solving **constrained optimization problems** where objective and/or constraint functions are (arbitrary) **PyTorch modules**. It is mainly intended for optimization with pre-trained networks, but might be useful also in other contexts.

**Note:** this implementation is a **prototype code**, it has been tested only for a simple problem and it is not performance-optimized. A Matlab implementation is available from the authors of the paper, see [2].
## Short Description

## Installation

If you want to install an editable version of this package in your Python environment, run the command
The algorithms in this package can solve problems of the form

```
python -m pip install --editable .
min f(x)
s.t. g(x) <= 0
h(x) = 0
```

## Mathematical description
where `f`, `g` and `h` are locally Lipschitz functions.

Key functionalities of the package:

* forward and backward pass of functions can be done on GPU
* batched evaluation and Jacobian computation using PyTorch's `autograd`; no gradient implementation needed
* support for inequality *and* equality constraints; constraints can use only a subset of the optimization variable as input

The algorithm can solve problems of the form

### Table of contents

1. [Installation](#installation)
2. [Getting started](#getting-started)
- [Solver interface](#solver-interface)
- [The `ObjectiveOrConstraint` class](#the-objectiveorconstraint-class)
- [Functionalities](#functionalities)
3. [Examples](#examples)
4. [References](#references)

### Disclaimer

1) We have not (yet) extensively tested the solver on large-scale problems.
2) The implemented solver is designed for nonsmooth, nonconvex problems, and as such, can solve a very general problem class. If your problem has a specific structure (e.g. convexity), then you will almost certainly get better performance by using software/solvers that are specifically written for the respective problem type. As starting point, check out [`cvxpy`](https://www.cvxpy.org/).
3) The solver is not guaranteed to converge to a (global) solution (see the theoretical results in [1] for convergence to local solutions under certain assumptions).



## Installation

The package is available from PyPi with

```
min f(x)
s.t. g(x) <= 0
h(x) = 0
python -m pip install ncopt
```

where `f`, `g` and `h` are locally Lipschitz functions. Hence, the algorithm can solve problems with nonconvex and nonsmooth objective and constraints. For details, we refer to the original paper.

## Example
Alternatively, for an editable version of this package in your Python environment, clone this repository and run the command

The code was tested for a 2-dim nonsmooth version of the Rosenbrock function, constrained with a maximum function. See Example 5.1 in [1]. For this problem, the analytical solution is known. The picture below shows the trajectory of SQP-GS for different starting points. The final iterates are marked with the black plus while the analytical solution is marked with the golden star. We can see that the algorithm finds the minimizer consistently.
```
python -m pip install --editable .
```

To reproduce this experiment, see the file `example_rosenbrock.py`.
## Getting started

![SQP-GS trajectories for a 2-dim example](data/img/rosenbrock.png "SQP-GS trajectories for a 2-dim example")
### Solver interface

The main solver implemented in this package is called SQP-GS, and has been developed by Curtis and Overton in [1]. See [the detailed documentation](src/ncopt/sqpgs/).

## Implementation details
The solver can be called via
The SQP-GS solver can be called via

```python
from ncopt.sqpgs import SQPGS
problem = SQPGS(f, gI, gE)
problem.solve()
```
It has three main arguments, called `f`, `gI` and `gE`. `f` is the objective. `gI` and `gE` are lists of inequality and equality constraint functions. Each element of `gI` and `gE` as well as the objective `f` needs to be an instance of a class which contains the following properties. The constraint functions are allowed to have multi-dimensional output.
Here `f` is the objective function, and `gI` and `gE` are a list of inequality and equality constraints. An empty list can be passed if no (in)equality constraints are needed.

### The `ObjectiveOrConstraint` class

The objective `f` and each element of `gI` and `gE` should be passed as an instance of [`ncopt.functions.ObjectiveOrConstraint`](src/ncopt/functions/main.py) (a simple wrapper around a `torch.nn.Module`).

* Each constraint function is allowed to have multi-dimensional output (see example below).
* **IMPORTANT:** For the objective, we further need to specify the dimension of the optimization variable with the argument `dim`. For each constraint, we need to specify the output dimension with `dim_out`.


For example, a linear constraint function `Ax - b <= 0` can be implemented as follows:

```python
from ncopt.functions import ObjectiveOrConstraint
A = .. # your data
b = .. # your data
g = ObjectiveOrConstraint(torch.nn.Linear(2, 2), dim_out=2)
g.model.weight.data = A # pass A
g.model.bias.data = -b # pass b
```

### Functionalities

Let's assume we have a `torch.nn.Module`, call it `model`, which we want to use as objective/constraint. For the solver, we can pass the function as

```
f = ObjectiveOrConstraint(model)
```

* **Dimension handling:** Each function must be designed for batched evaluation (that is, the first dimension of the input is the batch size). Also, the output shape of `model` must be two-dimensional, that is `(batch_size, dim_out)`, where `dim_out` is the output dimension for a constraint, and `dim_out=1` for the objective.

* **Device handling:** The forward pass, and Jacobian calculation is done on the device on which the parameters of your model. For example, you can use `model.to(device)` before creating `f`. See this [Colab example](https://colab.research.google.com/drive/1scsusR4Fggo-vT-IPYsoa3ccROmGQkZ8?usp=sharing) how to use a GPU.

### Attributes
* **Input preparation**: Different constraints might only need a part of the optimization variable as input, or might require additional preparation such as reshaping from vector to image. (Note that the optimization variable is handled always as vector). For this, you can specify a callable `prepare_input` when initializing a `ObjectiveOrConstraint` object. Any reshaping or cropping etc. can be handled with this function. Please note that `prepare_input` should be compatible with batched forward passes.

* `self.dim`: integer, specifies dimension of the input argument.
* `self.dimOut`: integer, specifies dimension of the output.
## Examples
### 2D Nonsmooth Rosenbrock

### Methods
This example is taken from Example 5.1 in [1] and involves minimizing a nonsmooth Rosenbrock function, constrained with a maximum function. The picture below shows the trajectory of the SQP-GS solver for different starting points. The final iterates are marked with the black plus while the analytical solution is marked with the golden star. We can see that the algorithm finds the minimizer consistently.

[Link to example script](examples/example_rosenbrock.py)

![SQP-GS trajectories for a 2-dim example](data/img/rosenbrock.png "SQP-GS trajectories for a 2-dim example")

* `self.eval`: evaluates the function at a point `x`.
* `self.grad`: evaluates the gradient at a point `x`. Here, the Jacobian must be returned, i.e. an array of shape `dimOut x dim`.
### Sparse signal recovery

This example is taken from Example 5.3 in [1]. We minimize the q-norm $||x||_q$ under the constraint of approximate signal recovery $||Rx-y|| \leq \delta$. Here $R$ comes from the Discrete Cosine Transform.

[Link to example script](examples/example_residual.py)

When $q\geq 1$ the problem is convex and rather easy to solve. For $q<1$, we observed that the number of sample points need to be increased a lot in order to make the subproblems solvable in reasonable time.

This problem has dimension 256, with one scalar constraint. To give a feeling for runtime, below is the runtime per iteration (for $q=1$), split into the main parts: sample points and compute gradients (`sample_and_grad`), solve the quadratic subproblem (`subproblem`), do the update step of iterate and approximate Hessian (`step`), and all other routines.

![Timings for SQP-GS with dim 256](data/img/timings_residual.png "Timings for SQP-GS with dim 256")

### Pretrained neural network constraint

This toy example illustrates how to use a pretrained neural network as constraint function in `ncOPT`. We train a simple model to learn the mapping $(x_1,x_2) \mapsto \max(\sqrt{2}x_1, 2x_2) -1$. Then, we load the model checkpoint to use it as constraint.

Below we show the feasible set (in blue), and the final iterate, if we use as objective the squared distance to the vector of ones.

[Link to example script](examples/example_checkpoint.py)

[Link to training script](scripts/train_max_fun.py)


![SQP-GS final iterate with learned constraint](data/img/checkpoint.png "SQP-GS final iterate with learned constraint")

For an example, see the classes defined in `ncopt/funs.py`.
Moreover, we implemented a class for a constraint coming from a Pytorch neural network (i.e. `g_i(x)` is an already trained neural network). For this, see `ncopt/torch_obj.py`.



## References
[1] Frank E. Curtis and Michael L. Overton, A sequential quadratic programming algorithm for nonconvex, nonsmooth constrained optimization,
SIAM Journal on Optimization 2012 22:2, 474-500, https://doi.org/10.1137/090780201.

[2] Frank E. Curtis and Michael L. Overton, MATLAB implementation of SLQP-GS, https://coral.ise.lehigh.edu/frankecurtis/software/.
Binary file modified data/checkpoints/max2d.pt
Binary file not shown.
Binary file added data/img/checkpoint.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified data/img/rosenbrock.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added data/img/timings_residual.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
74 changes: 74 additions & 0 deletions examples/example_checkpoint.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
"""
Illustrates how to use a pretrained neural network as constraint function.

We load a checkpoint, that has been trained with the script in scripts/train_max_fun.py.
"""

import matplotlib.pyplot as plt
import numpy as np
import torch

from ncopt.functions import ObjectiveOrConstraint
from ncopt.functions.max_linear import MaxOfLinear
from ncopt.functions.quadratic import Quadratic
from ncopt.sqpgs import SQPGS

# %% Load the checkpoint
checkpoint_dir = "../data/checkpoints/max2d.pt"

model = MaxOfLinear(input_dim=2, output_dim=2)

checkpoint = torch.load(checkpoint_dir)
model.load_state_dict(checkpoint["model_state_dict"])

print("Weights:", model.linear.weight)
print("Bias:", model.linear.bias)


# %% Problem setup

# Define the constraint
g = ObjectiveOrConstraint(model, dim_out=1)

# Define the objective: f(x) = 0.5*||x-(1,1)||^2
params = (torch.eye(2), -torch.ones(2), torch.tensor(1.0))
f = ObjectiveOrConstraint(Quadratic(params=params), dim=2, is_differentiable=True)

problem = SQPGS(f, [g], [], x0=None, tol=1e-10, max_iter=500, verbose=True)
x = problem.solve()

print("Final iterate: ", x)

# %% Plot the feasible region

_x, _y = np.linspace(-2, 2, 100), np.linspace(-2, 2, 100)
X, Y = np.meshgrid(_x, _y)
Z = np.zeros_like(X)

for j in np.arange(X.shape[0]):
for i in np.arange(X.shape[1]):
Z[i, j] = g.single_eval(np.array([X[i, j], Y[i, j]]))

Z[Z > 0] = np.nan # only show feasible set

fig, ax = plt.subplots(figsize=(4, 4))

# Plot contour of feasible set
ax.contourf(X, Y, Z, cmap="Blues", levels=np.linspace(-4, 0, 20), antialiased=True, lw=0, zorder=0)

# plot circle
ax.scatter(1, 1, marker="o", s=50, c="k")
dist_to_ones = np.linalg.norm(x - np.ones(2))
c = plt.Circle((1, 1), radius=dist_to_ones, facecolor=None, fill=False, edgecolor="grey")
ax.add_patch(c)
# plot final iterate
ax.scatter(x[0], x[1], marker="o", s=50, c="silver", label="Final iterate")


ax.set_xlim(-2, 2)
ax.set_ylim(-2, 2)
ax.legend(loc="upper left")
ax.set_title("Minimize distance to black dot")

fig.tight_layout()
fig.savefig("../data/img/checkpoint.png")
91 changes: 91 additions & 0 deletions examples/example_residual.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
"""
Implements Example 5.3 in

Frank E. Curtis and Michael L. Overton, A sequential quadratic programming
algorithm for nonconvex, nonsmooth constrained optimization,
SIAM Journal on Optimization 2012 22:2, 474-500, https://doi.org/10.1137/090780201.

Useful script for testing and performance optimization.
"""

import matplotlib.pyplot as plt
import numpy as np
import torch
from scipy.fft import dct

from ncopt.functions import ObjectiveOrConstraint
from ncopt.functions.norm_residual import NormResidual
from ncopt.sqpgs import SQPGS

# %%
np.random.seed(1234)

d = 256 # problem dimension
m = 32 # number of samples
q = 1.0 # residual norm order. Value less than 1.0 makes problem non-convex.

device = "cuda" if torch.cuda.is_available() else "cpu"

# %% Objective function: ||x||_q

Id, zeros = torch.eye(d, d), torch.zeros(d)
obj = NormResidual(params=(Id, zeros), q=q, offset=0.0)

obj.to(device)

# %% Constraint: ||Rx-y|| <= delta

num_zeros = int(0.9 * d)
oracle = np.concatenate((np.zeros(num_zeros), np.random.randn(d - num_zeros)))
np.random.shuffle(oracle)

# first m rows of discrete dxd cosine transformation matrix
R = torch.from_numpy(dct(np.eye(d), axis=0)[:m, :]).type(torch.float32)
y = (R @ oracle).type(torch.float32)
delta = 1.0

const = NormResidual(params=(R, y), q=2, offset=delta)
const.to(device)

assert np.allclose(
const.forward(torch.from_numpy(oracle).type(torch.float32).reshape(1, -1).to(device)).item(),
-delta,
)

# %% Set up problem

f = ObjectiveOrConstraint(obj, dim=d)

gI = [ObjectiveOrConstraint(const, dim_out=1)]
gE = []

if q >= 1:
options = {"num_points_obj": 5, "num_points_gI": 5, "qp_solver": "osqp"}
else:
# only tested for q=0.7
options = {"num_points_obj": 100, "num_points_gI": 100, "qp_solver": "osqp"}

problem = SQPGS(f, gI, gE, x0=None, tol=1e-10, max_iter=450, options=options, verbose=True)

# %% Solve

np.random.seed(123)
torch.manual_seed(123)
x = problem.solve()

# %% Plotting

fig, ax = problem.plot_timings()
fig.savefig("../data/img/timings_residual.png")
fig, ax = problem.plot_metrics()


# plot solution vs oracle
fig, ax = plt.subplots(figsize=(7, 3))
ax.plot(oracle, c="k", lw=1, label="Oracle")
ax.plot(x, c="steelblue", lw=2, label="Final iterate")
ax.set_xlabel("Coordinate")
ax.set_ylabel(r"$x_i$")
fig.tight_layout()

# %%
Loading