Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot construct JacobianFactor in Python #248

Open
gchenfc opened this issue Aug 28, 2021 · 11 comments
Open

Cannot construct JacobianFactor in Python #248

gchenfc opened this issue Aug 28, 2021 · 11 comments
Labels
bug Something isn't working invalid This doesn't seem right

Comments

@gchenfc
Copy link
Member

gchenfc commented Aug 28, 2021

Description

I believe something is wrong with constructing JacobianFactors from the result of a linearize() operation when using the python wrapper. Tagging @varunagrawal and @ProfFan due to the fact this is python related.

Specifically, the RTTI dynamic casts used in JacobianFactor constructors do not seem to be working properly. Here are two example usages:

https://github.com/borglab/gtsam/blob/ff7ddf48bd8bb2f993437f01cd32a4727d6f023b/gtsam/linear/JacobianFactor.cpp#L58-L67

https://github.com/borglab/gtsam/blob/ff7ddf48bd8bb2f993437f01cd32a4727d6f023b/gtsam/linear/JacobianFactor.cpp#L196-L211

For whatever reason, nonlinear factors constructed in python, which are then called with .linearize() to create a JacobianFactor, cannot be passed into the JacobianFactor(factor) constructors because boost::dynamic_pointer_cast<JacobianFactor>(factor) returns a null pointer.

Steps to reproduce

This example python script fails:

import gtsam
import gtdynamics as gtd

def main():
    f = gtd.MinTorqueFactor(0, gtsam.noiseModel.Unit.Create(1))

    init = gtsam.Values()
    init.insert(0, 0.0)

    flin = f.linearize(init)
    flin.print("Linearized Factor: ")
    print("\nLinearized factor has type: ", type(flin), '\n')
    # output: Linearized factor has type:  <class 'gtsam.gtsam.JacobianFactor'>

    gtsam.JacobianFactor(flin)  # this throws an error

if __name__ == '__main__':
    main()

Which throws the error:

In JacobianFactor(const GaussianFactor& rhs), rhs is neither a JacobianFactor nor a HessianFactor

Expected behavior

The python result should match that of this example C++ script succeeds:

#include <gtdynamics/factors/MinTorqueFactor.h>

#include <gtsam/nonlinear/NonlinearFactorGraph.h>

#include <iostream>

using namespace gtsam;
using namespace gtdynamics;

int main() {
  auto f = MinTorqueFactor(0, noiseModel::Unit::Create(1));

  auto init = Values();
  init.insert(0, 0.0);

  auto flin = f.linearize(init);
  flin->print("Linearized Factor: ");
  std::cout << "\nLinearized factor has type: " << typeid(*flin).name() << std::endl<< std::endl;
  // Output: Linearized factor has type: N5gtsam14JacobianFactorE

  auto jf = JacobianFactor(*flin);
}

Environment

MacOS 11.3.1, MacBook Air (M1, 2020),
AppleClang 12.0.5.12050022
gtsam hash b99bf4e92912f4ad0020f79d6b222a3d6593514f

I also experienced similar errors when I ran on an ubuntu intel lab machine, but I didn't rigorously test these particular test files.

Additional information

This only occurs for nonlinear factors defined outside of the GTSAM repo. So, for example, PriorFactor, BetweenFactor, etc all still work as expected.

This causes noticeable failures on optimizations with Constrained noise models, may be causing silent correctness errors on other optimizations, and is likely causing all non-GTSAM factors to be calling the HessianFactor constructor instead of the JacobianFactor constructor (performance penalty).

@gchenfc gchenfc added bug Something isn't working invalid This doesn't seem right high priority High priority task item labels Aug 28, 2021
@ProfFan
Copy link
Contributor

ProfFan commented Aug 28, 2021

I hate C++ RTTI, we should adopt a LLVM-styled RTTI some day...

*No capacity to work on this for now, need to get ICRA ddl done

@ProfFan ProfFan closed this as completed Aug 28, 2021
@ProfFan ProfFan reopened this Aug 28, 2021
@gchenfc gchenfc removed the high priority High priority task item label Aug 28, 2021
@gchenfc
Copy link
Member Author

gchenfc commented Aug 28, 2021

Temporarily using this workaround to use name-based type matching:

JacobianFactor::JacobianFactor(const GaussianFactor& gf) {
  // Copy the matrix data depending on what type of factor we're copying from
  if (!strcmp(typeid(gf).name(), typeid(JacobianFactor).name()))
    *this = JacobianFactor(*static_cast<const JacobianFactor*>(&gf));
  else if (!strcmp(typeid(gf).name(), typeid(HessianFactor).name()))
    *this = JacobianFactor(*static_cast<const HessianFactor*>(&gf));
  else
    throw std::invalid_argument(
        "In JacobianFactor(const GaussianFactor& rhs), rhs is neither a JacobianFactor nor a HessianFactor");
}

And do a similar thing in _convertOrCastToJacobians. See borglab/gtsam@bb39346

Note: for constrained optimization, you still need to manually set the linearSolverType param to QR - for some reason GTSAM is also not properly recognizing that the constrained noise models and auto-switching to QR. e.g.

params = gtsam.LevenbergMarquardtParams()
params.setLinearSolverType("MULTIFRONTAL_QR")
result = gtsam.LevenbergMarquardtOptimizer(graph, init, params).optimize()

I'm leaving this open because it needs to eventually be fixed, but this workaround works for now.

@varunagrawal
Copy link
Collaborator

Yeah we'll figure it out after ICRA :)

@gchenfc
Copy link
Member Author

gchenfc commented Oct 27, 2021

Just checking, does anyone else get this error? For me just compiling everything standard and running make python-test in gtdynamics produces this issue in the Simulator. Alternatively, minimum failing example (copy-pasted from above):

import gtsam
import gtdynamics as gtd

def main():
    f = gtd.MinTorqueFactor(0, gtsam.noiseModel.Unit.Create(1))

    init = gtsam.Values()
    init.insert(0, 0.0)

    flin = f.linearize(init)
    flin.print("Linearized Factor: ")
    print("\nLinearized factor has type: ", type(flin), '\n')
    # output: Linearized factor has type:  <class 'gtsam.gtsam.JacobianFactor'>

    gtsam.JacobianFactor(flin)  # this throws an error

if __name__ == '__main__':
    main()

@varunagrawal @yetongumich @ProfFan

@ProfFan
Copy link
Contributor

ProfFan commented Oct 27, 2021

I think I already fixed this in GTSAM, maybe the GTD wrapper does not have such changes applied.

Search for RTTI in the GTSAM Issues

@gchenfc
Copy link
Member Author

gchenfc commented Oct 27, 2021

Are you referring to borglab/gtsam#218 ?

If so, it looks like -fvisibility-ms-compat flag is already set inside function pybind_wrap

But this definitely sounds like the right issue (Mac dynamic linking related)

@ProfFan
Copy link
Contributor

ProfFan commented Oct 27, 2021

AFAIK I am suspecting that we have some violation of ODR such that multiple copies of the same class is generated.
Have no proof, the only way to be sure is adding prints in the JacobianFactor constructor and print out the typeinfo struct.

@gchenfc
Copy link
Member Author

gchenfc commented Oct 27, 2021

If that were the case then wouldn't this be affecting all systems and not just mac?

@ProfFan
Copy link
Contributor

ProfFan commented Oct 28, 2021

No, this linking fiasco only happens with macOS... I don't exactly know why, but the visibility of the symbols of dynamic libs on macOS seems to be the culprit.

I also think it would be really helpful if you can check if the GTD lib and GTSAM is both generating typeinfo for JacobianFactor.

@gchenfc
Copy link
Member Author

gchenfc commented Dec 23, 2021

Sorry for forgetting about this for a long time.

Adding debug prints

In JacobianFactor constructor, I added:

std::cout << "Constructing a Jacobian factor from a Gaussian Factor!!\n"  //
          << "typeid of JacobianFactor is " << typeid(JacobianFactor).name()
          << '\t' << typeid(JacobianFactor).hash_code() << "\n"   //
          << "typeid of input argument is " << typeid(gf).name()  //
          << '\t' << typeid(gf).hash_code() << std::endl;

Then in GTDynamics MinTorqueFactor construtor, I added

std::cout << "Constructing a MinTorqueFactor..." << std::endl;
std::cout << "typeid of gtdynamics' JacobianFactor is "
          << typeid(gtsam::JacobianFactor).name() << '\t'
          << typeid(gtsam::JacobianFactor).hash_code() << std::endl;

Test code

The test code in c++:

auto values = gtsam::Values();
values.insert(0, 0.0);
{
  auto f = gtsam::PriorFactor(0, 0.0, gtsam::noiseModel::Unit::Create(1));
  auto gaussian_factor = f.linearize(values);
  gtsam::JacobianFactor jacobian_factor(*gaussian_factor);
}
{
  auto f = gtdynamics::MinTorqueFactor(0, gtsam::noiseModel::Unit::Create(1));
  auto gaussian_factor = f.linearize(values);
  gtsam::JacobianFactor jacobian_factor(*gaussian_factor);
}

and in python:

values = gtsam.Values()
values.insert(0, 0.0)

f = gtsam.PriorFactorDouble(0, 0.0, gtsam.noiseModel.Unit.Create(1))
gaussian_factor = f.linearize(values)
jacobian_factor = gtsam.JacobianFactor(gaussian_factor)

f = gtd.MinTorqueFactor(0, gtsam.noiseModel.Unit.Create(1))
gaussian_factor = f.linearize(values)
jacobian_factor = gtsam.JacobianFactor(gaussian_factor)  # this throws an error

Results

c++ results:

Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE    4351243596
typeid of input argument is N5gtsam14JacobianFactorE    4351243596
Constructing a MinTorqueFactor...
typeid of gtdynamics' JacobianFactor is N5gtsam14JacobianFactorE        4351243596
Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE    4351243596
typeid of input argument is N5gtsam14JacobianFactorE    4351243596

python results:

Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE    4398406401
typeid of input argument is N5gtsam14JacobianFactorE    4398406401
Constructing a MinTorqueFactor...
typeid of gtdynamics' JacobianFactor is N5gtsam14JacobianFactorE        4796888396
Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE    4398406401
typeid of input argument is N5gtsam14JacobianFactorE    4796888396
Traceback (most recent call last):
  File "/Users/gerry/GIT_REPOS/GTDynamics/gtdynamics/cablerobot/src/test.py", line 22, in <module>
    main()
  File "/Users/gerry/GIT_REPOS/GTDynamics/gtdynamics/cablerobot/src/test.py", line 14, in main
    jacobian_factor = gtsam.JacobianFactor(gaussian_factor)  # this throws an error
ValueError: In JacobianFactor(const GaussianFactor& rhs), rhs is neither a JacobianFactor nor a HessianFactor

So I think this is an indication that the python wrapper is using a different version of gtdynamics, so maybe we just need GTSAM_EXPORT in gtdynamics?

Going to continue working on this.

@gchenfc
Copy link
Member Author

gchenfc commented Dec 23, 2021

I figured out the problem: my gtdynamics.cpython-39-darwin.so and gtsam.cpython-39-darwin.so files are linking to different places. Specifically:

  • gtdynamics.cpython-39-darwin.so is linking to /Users/gerry/lib/libgtsamDebug.4.dylib while
  • gtsam.cpython-39-darwin.so is linking to /Users/gerry/GIT_REPOS/gtsam/build/gtsam/libgtsamDebug.4.dylib

(found using otool -l)

Note that also gtdynamics.cpython-39-darwin.so is using @rpath/libgtdynamicsDebug.dylib, but rpath is set to:

Load command 26
          cmd LC_RPATH
      cmdsize 64
         path /Users/gerry/GIT_REPOS/GTDynamics/build/gtdynamics (offset 12)
Load command 27
          cmd LC_RPATH
      cmdsize 32
         path /Users/gerry/lib (offset 12)

so actually I think it's also going to be linking to the "wrong" file.

(the -fvisibility-ms-compat was not the issue - linking to the correct executable fixes the problem)


I don't know if this is specific to my machine or not. Would someone else mind pasting their output of:

otool -l /Users/gerry/miniforge3/lib/python3.9/site-packages/gtdynamics-1.0.0-py3.9.egg/gtdynamics/gtdynamics.cpython-39-darwin.so
otool -l /Users/gerry/miniforge3/lib/python3.9/site-packages/gtsam-4.1.0-py3.9.egg/gtsam/gtsam.cpython-39-darwin.so

(replacing /Users/gerry/miniforge3 with wherever your python packages are; the folder can be found using python -c 'import gtsam; print(gtsam.__path__)')

??? @varunagrawal @ProfFan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working invalid This doesn't seem right
Projects
None yet
Development

No branches or pull requests

3 participants