Replies: 6 comments 2 replies
---
Hey @mcleantom! The best way at the time of writing would probably be to implement the vector math as-is. One of the future goals on the AeroSandbox TODO list is to implement an interface to black-box functions where you can either supply your own gradient or use finite-differencing. Once this is in place, you'll be able to drop in your PyTorch/TF model and simply provide its gradient.
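In the meantime, a minimal sketch of "implementing the vector math as-is" for a single dense layer might look like this (hypothetical shapes; assumes the input is handled as a column vector):

```python
import aerosandbox as asb
import aerosandbox.numpy as np

# Stand-in trained parameters for one dense layer, y = W x + b:
W = np.random.randn(5, 3)
b = np.random.randn(5, 1)

opti = asb.Opti()
x = opti.variable(init_guess=np.ones((3, 1)))  # network input as a design variable
y = W @ x + b  # plain vector math; CasADi traces this symbolically, so it stays differentiable
```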
---
Long-winded update, but see NeuralFoil and its AeroSandbox implementation as an example of exactly this!
---
Hey @peterdsharpe, thanks for the reply! I've had a look at the NeuralFoil library and I've had a go at building a simple neural network in ASB (or, casadi under the hood), and I'm having an issue calculating the dot product within casadi. The neural network code, with the conversion from PyTorch, looks like:

```python
import torch.nn as nn
import aerosandbox.numpy as np


class Layer:
    def __init__(self, input_size, output_size, weights=None, biases=None):
        self.input_size = input_size
        self.output_size = output_size
        if weights is None:
            self.weights = np.random.randn(input_size, output_size)
        else:
            self.weights = weights
        if biases is None:
            self.biases = np.random.randn(output_size)
        else:
            self.biases = biases

    def forward(self, x):
        return np.dot(x, self.weights) + self.biases


class Tanh:
    def __init__(self):
        pass

    def forward(self, x):
        return np.tanh(x)


class Relu:
    def __init__(self):
        pass

    def forward(self, x):
        return np.maximum(x, 0)


class Sequential:
    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    @classmethod
    def from_torch(cls, model):
        layers = []
        for layer in model:
            if isinstance(layer, nn.Linear):
                layers.append(Layer(
                    input_size=layer.in_features,
                    output_size=layer.out_features,
                    weights=np.array(layer.weight.detach().numpy().T),
                    biases=np.array(layer.bias.detach().numpy())
                ))
            elif isinstance(layer, nn.ReLU):
                layers.append(Relu())
            elif isinstance(layer, nn.Tanh):
                layers.append(Tanh())
        return cls(layers)

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)
```

And my unit test for this looks like:

```python
from HydroSandbox.models.neural_network import Sequential
from unittest import TestCase
import torch
import torch.nn as tnn
import aerosandbox.numpy as np
import aerosandbox as asb
torch.manual_seed(0)

class TestNeuralNetworks(TestCase):
    def get_torch_model(self):
        torch_model = tnn.Sequential(
            tnn.Linear(3, 5),
            tnn.ReLU(),
            tnn.Linear(5, 2),
            tnn.ReLU(),
            tnn.Linear(2, 1)
        )
        torch_model.eval()
        return torch_model

    def test_torch_model_and_aerosandbox_model_same_result(self):
        model_input = np.ones((1, 3))
        torch_model = self.get_torch_model()
        torch_input = torch.tensor(model_input, dtype=torch.float32)
        torch_output = torch_model(torch_input)
        aero_model = Sequential.from_torch(torch_model)
        aero_output = aero_model(model_input)
        assert np.isclose(torch_output.detach().numpy(), aero_output, atol=1e-5).all()
        opti = asb.Opti()
        opti_input = opti.variable(init_guess=model_input)
        aero_model(opti_input)
```

The assertion works fine, but the line `aero_model(opti_input)` raises the following error:

```
Error
Traceback (most recent call last):
  File "/home/tom/src/vpp3/HydroSandbox/models/test_neural_network.py", line 34, in test_torch_model_and_aerosandbox_model_same_result
    aero_model(opti_input.T)
  File "/home/tom/src/vpp3/HydroSandbox/models/neural_network.py", line 70, in __call__
    return self.forward(*args, **kwargs)
  File "/home/tom/src/vpp3/HydroSandbox/models/neural_network.py", line 49, in forward
    x = layer.forward(x)
  File "/home/tom/src/vpp3/HydroSandbox/models/neural_network.py", line 21, in forward
    return np.dot(x, self.weights) + self.biases
  File "/home/tom/miniconda3/envs/vpp3/lib/python3.10/site-packages/aerosandbox/numpy/linalg_top_level.py", line 19, in dot
    return _cas.dot(a, b)
  File "/home/tom/miniconda3/envs/vpp3/lib/python3.10/site-packages/casadi/casadi.py", line 36361, in dot
    return _casadi.dot(*args)
RuntimeError: .../casadi/core/mx_node.cpp:955: Assertion "size2()==y.size2() && size1()==y.size1()" failed:
MXNode::dot: Dimension mismatch. dot requires its two arguments to have equal shapes, but got (3, 1) and (5, 3).
```

With numpy, the matrices are handled correctly for the dot product:

```python
import numpy as np

A = np.array([[1, 2, 3]])
B = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
C = np.dot(A, B)  # array([[30, 36, 42]])
```

However, when you use a casadi matrix, there is a dimension mismatch error:

```python
import casadi as ca

A = ca.MX(*A.shape)
B = ca.MX(*B.shape)
C = ca.dot(A, B)
```

```
Traceback (most recent call last):
  File "/home/tom/.local/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-72-07337ba0e18e>", line 1, in <module>
    C = ca.dot(A, B)
  File "/home/tom/miniconda3/envs/vpp3/lib/python3.10/site-packages/casadi/casadi.py", line 36361, in dot
    return _casadi.dot(*args)
RuntimeError: .../casadi/core/matrix_impl.hpp:2000: Assertion "x.size()==y.size()" failed:
dot: Dimension mismatch
```

I was wondering if you knew why this might happen?
---
Hi @mcleantom, I regularly have the same issue. It happens because `casadi.dot` and `numpy.dot` behave differently depending on the shapes of the input arrays: `numpy.dot` performs a matrix multiplication instead of a dot product when A or B are 2-D arrays, whereas `casadi.dot` does not. This is why you get this error: a standard dot product is not actually defined for the shapes you are using.
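To make the difference concrete, here is a small illustration (the shapes are chosen arbitrarily):

```python
import numpy as np
import casadi as ca

A = np.array([[1.0, 2.0, 3.0]])  # shape (1, 3)
B = np.ones((3, 3))              # shape (3, 3)

np.dot(A, B)  # 2-D inputs: numpy silently dispatches to matrix multiplication -> shape (1, 3)

x = ca.MX.sym("x", 1, 3)
y = ca.MX.sym("y", 1, 3)
ca.dot(x, y)                        # casadi's dot is a true inner product: equal shapes in, scalar out
ca.mtimes(x, ca.MX.sym("B", 3, 3))  # casadi's matrix multiply, the analogue of np.dot on 2-D arrays
```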
I thought about two solutions as a workaround for this problem:

@peterdsharpe any opinion on this?
---
More specifically for this problem - Charles' diagnosis is exactly correct in that a dot product probably isn't the right tool for the job here, and the NumPy definition of the dot product function is quite overloaded. Rather, this should really be a matrix multiply (just based on tensor shapes). One idea for a backend-agnostic implementation (i.e., one that plays well with NumPy, CasADi, and PyTorch) is to use the Python matmul operator `@`, replacing `np.dot(x, self.weights) + self.biases` with `self.weights @ x + self.biases`, which is the same syntax seen in NeuralFoil's evaluation. I know for sure that this works in CasADi and NumPy, and I would bet (fingers crossed) that this works with PyTorch as well - a quick sanity-check sketch is below. Let me know!
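A minimal cross-backend sketch of that swap (a sketch, not tested as written; it assumes `x` is treated as a column vector and the weights/biases live in NumPy arrays):

```python
import numpy
import casadi as ca
import torch

W = numpy.random.randn(5, 3)  # stand-in weights
b = numpy.random.randn(5, 1)  # stand-in biases, kept 2-D so broadcasting is unambiguous

x_np = numpy.ones((3, 1))
y_np = W @ x_np + b  # NumPy: plain matrix multiply, shape (5, 1)

x_mx = ca.MX.sym("x", 3, 1)
y_mx = W @ x_mx + b  # CasADi: NumPy defers to the MX matmul overload, giving a symbolic (5, 1)

x_t = torch.ones(3, 1, dtype=torch.float64)
y_t = torch.from_numpy(W) @ x_t + torch.from_numpy(b)  # PyTorch: shape (5, 1)
```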
---
Thanks for the help, I managed to get it working. Here is my updated code:

```python
import torch.nn as nn
import aerosandbox.numpy as np


class Layer:
    def __init__(self, input_size, output_size, weights=None, biases=None):
        self.input_size = input_size
        self.output_size = output_size
        if weights is None:
            self.weights = np.random.randn(input_size, output_size)
        else:
            self.weights = weights
        if biases is None:
            self.biases = np.random.randn(output_size)
        else:
            self.biases = biases

    def forward(self, x):
        # Reshape the biases to a column vector so the addition broadcasts correctly.
        return self.weights @ x + np.reshape(self.biases, (-1, 1))


class Tanh:
    def __init__(self):
        pass

    def forward(self, x):
        return np.tanh(x)


class Relu:
    def __init__(self):
        pass

    def forward(self, x):
        return np.maximum(x, 0)


class Sequential:
    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    @classmethod
    def from_torch(cls, model):
        layers = []
        for layer in model:
            if isinstance(layer, nn.Linear):
                layers.append(Layer(
                    input_size=layer.in_features,
                    output_size=layer.out_features,
                    weights=np.array(layer.weight.detach().numpy()),
                    biases=np.array(layer.bias.detach().numpy())
                ))
            elif isinstance(layer, nn.ReLU):
                layers.append(Relu())
            elif isinstance(layer, nn.Tanh):
                layers.append(Tanh())
        return cls(layers)

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)
```

There was a mistake in the conversion from the PyTorch model to the Sequential model (previously I transposed the weights to try and get it to work). The unit test now looks like:

```python
from HydroSandbox.models.neural_network import Sequential
from unittest import TestCase
import torch
import torch.nn as tnn
import aerosandbox.numpy as np
import aerosandbox as asb
torch.manual_seed(0)

class TestNeuralNetworks(TestCase):
    def get_torch_model(self):
        torch_model = tnn.Sequential(
            tnn.Linear(3, 5),
            tnn.ReLU(),
            tnn.Linear(5, 2),
            tnn.ReLU(),
            tnn.Linear(2, 1)
        )
        torch_model.eval()
        return torch_model

    def test_torch_model_and_aerosandbox_model_get_same_result_2(self):
        N_in_features = 26
        N_out_features = 6
        torch_model = tnn.Sequential(
            tnn.Linear(N_in_features, 128),
            tnn.ReLU(),
            tnn.Linear(128, 128),
            tnn.ReLU(),
            tnn.Linear(128, 128),
            tnn.ReLU(),
            tnn.Linear(128, N_out_features),
        )
        x_in = np.random.randn(1, N_in_features)
        torch_result = torch_model(torch.tensor(x_in).float())
        aero_model = Sequential.from_torch(torch_model)
        aero_result = aero_model(x_in.T)
        self.assertTrue(np.isclose(torch_result.detach().numpy(), aero_result.T, atol=1e-5).all())
        opti = asb.Opti()
        x_opti = opti.variable(init_guess=x_in.T).T  # Can't create the variable without transposing
        opti_result = aero_model(x_opti.T)
        opti.minimize(opti_result[0])
        try:  # We just want to be able to use the opti.debug.value function, which is only available after solve.
            sol = opti.solve(max_iter=0)
        except Exception:
            pass
        # Not sure why it returns a 1d array; maybe that information is lost when converted to a variable?
        x_opti_value = np.atleast_2d(opti.debug.value(x_opti))
        opti_result_value = opti.debug.value(opti_result)
        aero_model_result_2 = aero_model(x_opti_value.T).flatten()
        self.assertTrue(np.isclose(opti_result_value, aero_model_result_2, atol=1e-5).all())
```

I would be interested in working on the aerosandbox implementation of the black-box function interface mentioned above. I would also be interested in getting either l4casadi or ml-casadi to work with aerosandbox; it would be useful to be able to generalize this so that any torch model works (currently, this code only works with Sequential neural networks with tanh/relu activations). Do you have any examples of using the casadi.Function object within aerosandbox? I see its use quite frequently, but I am not sure how it actually works (my current, possibly wrong, mental model is sketched below). Thanks for your help.
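Here is the minimal casadi.Function usage I've pieced together so far, in case it helps frame the question:

```python
import casadi as ca

# Wrap a symbolic expression as a reusable function:
x = ca.MX.sym("x", 3)           # symbolic 3-vector input
y = ca.sumsqr(x)                # symbolic expression built from x
f = ca.Function("f", [x], [y])  # name, list of inputs, list of outputs

print(f(ca.DM([1, 2, 3])))      # numeric evaluation -> 14
```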
---
I have a model which uses a neural network to predict a force, given a set of inputs. Is there an easy way to implement this model in AeroSandbox that will allow the optimisation algorithm to work? Should I just manually implement the vector math of w*a + b, or is there an easier way to integrate a TensorFlow/PyTorch NN model into the code?