test-cpp-contact-cholesky failure with GCC 13.3.0 #2304
Comments
Same version of Eigen? |
yes |
And, only on aarch64:
Not sure whether this is related to #2277, but we'll also deactivate this one on aarch64 for now. |
I would check if we're initializing all matrices/vectors with zeros. |
I think I can confirm this is an issue in GCC 13.3.0 with this reproducer:
FROM debian
RUN --mount=type=cache,sharing=locked,target=/var/cache/apt \
--mount=type=cache,sharing=locked,target=/var/lib/apt \
apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -qqy --no-install-recommends \
autoconf \
build-essential \
bzip2 \
cmake \
g++-multilib \
gcc-multilib \
git \
libboost-all-dev \
libeigen3-dev \
liburdfdom-dev \
libtinyxml-dev \
make \
wget \
xz-utils
ARG GCC_VERSION=13.2.0
ENV GCC_VERSION=$GCC_VERSION
WORKDIR /src
ADD https://gmplib.org/download/gmp/gmp-6.3.0.tar.xz .
ADD https://www.mpfr.org/mpfr-current/mpfr-4.2.1.tar.xz .
ADD https://ftp.gnu.org/gnu/mpc/mpc-1.3.1.tar.gz .
ADD https://gcc.gnu.org/pub/gcc/infrastructure/isl-0.24.tar.bz2 .
RUN tar xf gmp* \
&& tar xf mpfr* \
&& tar xf mpc* \
&& tar xf isl*
ADD https://gcc.gnu.org/pub/gcc/releases/gcc-$GCC_VERSION/gcc-$GCC_VERSION.tar.xz .
RUN tar xf gcc-$GCC_VERSION.tar.xz
WORKDIR gcc-$GCC_VERSION
RUN mv ../gmp-6.3.0 gmp \
&& mv ../mpfr-4.2.1 mpfr \
&& mv ../mpc-1.3.1 mpc \
&& mv ../isl-0.24 isl
RUN ./configure --enable-languages=c,c++ \
&& make -j 8 \
&& make -j 8 install
WORKDIR /src
ADD https://github.com/stack-of-tasks/pinocchio/releases/download/v3.0.0/pinocchio-3.0.0.tar.gz .
RUN tar xf pinocchio-3.0.0.tar.gz
WORKDIR pinocchio-3.0.0
RUN cmake -B build -S . -DCMAKE_BUILD_TYPE=Release -DBUILD_PYTHON_INTERFACE=OFF -DCMAKE_CXX_STANDARD=14
RUN cmake --build build -j 4
ENV LD_LIBRARY_PATH=/usr/local/lib64
#CMD ./build/unittest/test-cpp-contact-cholesky running
|
the top right corners in l. 359 start with:
vs
|
Taking another look at this, it seems we have a
RUN cmake -B build -S . -DCMAKE_BUILD_TYPE=Release -DBUILD_PYTHON_INTERFACE=OFF -DCMAKE_CXX_STANDARD=14
RUN sed -i 's/-O3/-O2/' build/CMakeCache.txt
RUN cmake --build build -j 4 -t test-cpp-contact-cholesky
ENV LD_LIBRARY_PATH=/usr/local/lib64
RUN ./build/unittest/test-cpp-contact-cholesky
Digging further, looking at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-O3, I tried to replace the failing The good solution would obviously be to trash all our C/C++ code base and start again in a sane language. But this might not be on the roadmap for now, so I guess I'll just go for |
Conda-forge now uses GCC 13.3 by default, and we can now reproduce this issue. Here are all the failing tests:
|
As stated here, -On is not equivalent to enabling all the corresponding -f optimization flags: some unnamed optimizations are activated only by the -On options. |
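Since individual -f flags cannot reproduce -O3, one way to narrow things down is to lower the optimization level for a single region of code instead. The following is only a sketch, assuming GCC's `#pragma GCC optimize` extension (the GCC manual describes the corresponding `optimize` attribute as intended for debugging purposes only). `mark_columns` is a hypothetical stand-in for the suspect loop, not pinocchio code, and whether the pragma has the desired effect on inlined template code such as contact-info.hpp was not checked in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The pragmas are GCC-specific, so they are guarded for other compilers.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize("O2") // functions defined until pop_options use -O2
#endif

// Hypothetical stand-in for the suspect loop: sets `count` entries of
// `sparsity` to true, starting at index `first`.
void mark_columns(std::vector<bool> & sparsity, std::size_t first, std::size_t count)
{
  assert(first <= sparsity.size());
  for (std::size_t k = 0; k < count && first + k < sparsity.size(); ++k)
  {
    sparsity[first + k] = true;
  }
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options // restore the command-line optimization level
#endif
```

If the wrong result disappears when only this region is compiled at -O2, that points at an -O3-only transformation of this code rather than at its callers.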
Ok, thanks. It looks once again like a GCC issue then. But to report it, we should probably work on an MRE. For that, we can try to dump the matrices used in one of those failing tests. |
I'm working on it :) |
Note for tomorrow:
|
I found a piece of code that behaves differently between O2 and O3 with g++ 13.3: https://github.com/stack-of-tasks/pinocchio/blob/master/include/pinocchio/algorithm/contact-info.hpp#L747-L750
for (int k = 0; k < joint1.nv(); ++k, ++current1_col_id)
{
colwise_joint1_sparsity[current1_col_id] = true;
}
This loop doesn't run at O3 for the first joint, so colwise_joint1_sparsity is not correctly initialized. Also, if I remove the following else clause, the code works at both O2 and O3 (in this case the else clause is not needed).
else
{
const JointModel & joint2 = model.joints[current2_id];
joint2_span_indexes.push_back((Eigen::DenseIndex)current2_id);
Eigen::DenseIndex current2_col_id = joint2.idx_v();
for (int k = 0; k < joint2.nv(); ++k, ++current2_col_id)
{
colwise_joint2_sparsity[current2_col_id] = true;
}
current2_id = model.parents[current2_id];
}
@nim65s, @jcarpent do you see any particular undefined behavior that could explain why g++ doesn't run the last iteration of the for loop when the else clause is present? I think I have three options:
I think I will try the last option first, but I'm afraid it will not be easy: the optimization may be triggered by some earlier code (the contact-info.hpp code is all inline). |
As I feared, it will be difficult to get an MRE. The following doesn't reproduce the bug.
MRE.cpp
#include <cstddef>
#include <iostream>
#include <Eigen/Core>
#include <vector>
extern std::size_t JOINT1_ID;
extern std::size_t JOINT2_ID;
extern int NV;
extern int NJOINT;
extern int JOINT_IDX_V[];
extern int JOINT_NV[];
extern int JOINT_PARENTS[];
void printVector(const Eigen::VectorX<bool>& v);
int main()
{
Eigen::VectorX<bool> colwise_joint1_sparsity(NV);
Eigen::VectorX<bool> colwise_joint2_sparsity(NV);
std::vector<int> joint1_span_indexes;
joint1_span_indexes.reserve(NJOINT);
std::vector<int> joint2_span_indexes;
joint2_span_indexes.reserve(NJOINT);
static const bool default_sparsity_value = false;
colwise_joint1_sparsity.fill(default_sparsity_value);
colwise_joint2_sparsity.fill(default_sparsity_value);
std::size_t current1_id = 0;
if (JOINT1_ID > 0)
current1_id = JOINT1_ID;
std::size_t current2_id = 0;
if (JOINT2_ID > 0)
current2_id = JOINT2_ID;
while (current1_id != current2_id)
{
if (current1_id > current2_id)
{
int current1_col_id = JOINT_IDX_V[current1_id];
joint1_span_indexes.push_back(current1_id);
for (int k = 0; k < JOINT_NV[current1_id]; ++k, ++current1_col_id)
{
colwise_joint1_sparsity[current1_col_id] = true;
}
current1_id = JOINT_PARENTS[current1_id];
}
else
{
int current2_col_id = JOINT_IDX_V[current2_id];
joint2_span_indexes.push_back(current2_id);
for (int k = 0; k < JOINT_NV[current2_id]; ++k, ++current2_col_id)
{
colwise_joint2_sparsity[current2_col_id] = true;
}
current2_id = JOINT_PARENTS[current2_id];
}
}
printVector(colwise_joint1_sparsity);
}
MRE_module.cpp
#include <cstddef>
#include <iostream>
#include <Eigen/Core>
std::size_t JOINT1_ID = 5;
std::size_t JOINT2_ID = 0;
int NV = 10;
int NJOINT = 5;
int JOINT_IDX_V[] = {0, 0, 6, 7, 8, 9};
int JOINT_NV[] = {0, 6, 1, 1, 1, 1};
int JOINT_PARENTS[] = {0, 0, 1, 2, 3, 4};
void printVector(const Eigen::VectorX<bool>& v)
{
std::cout << v.transpose() << std::endl;
}
compile.sh
#! /bin/sh
#
FLAGS="-O3 -DNDEBUG -std=gnu++17"
g++ $FLAGS -c MRE_module.cpp -o MRE_module.o -isystem $CONDA_PREFIX/include/eigen3/
g++ $FLAGS -c MRE.cpp -o MRE.o -isystem $CONDA_PREFIX/include/eigen3/
g++ $FLAGS MRE.o MRE_module.o -o mre |
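For comparison with the output of `mre`, the expected result can be computed without Eigen. This is a minimal sketch, assuming the same arrays as MRE_module.cpp above and relying on JOINT2_ID being 0 there, so that only the first branch of the while loop is ever taken. `mark_ancestor_columns` and `reference_sparsity` are hypothetical helpers, not pinocchio functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: marks the velocity columns of every joint on the path
// from `joint_id` up to (but excluding) the root joint 0.
std::vector<bool> mark_ancestor_columns(
  std::size_t joint_id,
  int nv,
  const std::vector<int> & idx_v,
  const std::vector<int> & joint_nv,
  const std::vector<std::size_t> & parents)
{
  assert(idx_v.size() == parents.size() && joint_nv.size() == parents.size());
  std::vector<bool> sparsity(static_cast<std::size_t>(nv), false);
  for (std::size_t id = joint_id; id > 0; id = parents[id])
  {
    const int n = joint_nv[id]; // loop bound read once, before the loop
    int col = idx_v[id];
    for (int k = 0; k < n; ++k, ++col)
    {
      sparsity[static_cast<std::size_t>(col)] = true;
    }
  }
  return sparsity;
}

// Same tree as in MRE_module.cpp: JOINT1_ID = 5, NV = 10.
std::vector<bool> reference_sparsity()
{
  const std::vector<int> idx_v = {0, 0, 6, 7, 8, 9};
  const std::vector<int> joint_nv = {0, 6, 1, 1, 1, 1};
  const std::vector<std::size_t> parents = {0, 0, 1, 2, 3, 4};
  return mark_ancestor_columns(5, 10, idx_v, joint_nv, parents);
}
```

With these arrays the walk visits joints 5, 4, 3, 2 and 1, which together cover columns 0 to 9, so all ten entries should come out true. A 0 in the vector printed by `mre` would therefore indicate the miscompilation.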
Current devel:
caching nv with:
--- a/include/pinocchio/algorithm/contact-info.hpp
+++ b/include/pinocchio/algorithm/contact-info.hpp
@@ -742,9 +742,10 @@ namespace pinocchio
if (current1_id > current2_id)
{
const JointModel & joint1 = model.joints[current1_id];
+ const int j1nv = joint1.nv();
joint1_span_indexes.push_back((Eigen::DenseIndex)current1_id);
Eigen::DenseIndex current1_col_id = joint1.idx_v();
- for (int k = 0; k < joint1.nv(); ++k, ++current1_col_id)
+ for (int k = 0; k < j1nv; ++k, ++current1_col_id)
{
colwise_joint1_sparsity[current1_col_id] = true;
}
@@ -753,9 +754,10 @@ namespace pinocchio
else
{
const JointModel & joint2 = model.joints[current2_id];
+ const int j2nv = joint2.nv();
joint2_span_indexes.push_back((Eigen::DenseIndex)current2_id);
Eigen::DenseIndex current2_col_id = joint2.idx_v();
- for (int k = 0; k < joint2.nv(); ++k, ++current2_col_id)
+ for (int k = 0; k < j2nv; ++k, ++current2_col_id)
{
colwise_joint2_sparsity[current2_col_id] = true;
}
@@ -770,10 +772,11 @@ namespace pinocchio
while (current_id > 0)
{
const JointModel & joint = model.joints[current_id];
+ const int jnv = joint.nv();
joint1_span_indexes.push_back((Eigen::DenseIndex)current_id);
joint2_span_indexes.push_back((Eigen::DenseIndex)current_id);
Eigen::DenseIndex current_row_id = joint.idx_v();
- for (int k = 0; k < joint.nv(); ++k, ++current_row_id)
+ for (int k = 0; k < jnv; ++k, ++current_row_id)
{
colwise_joint1_sparsity[current_row_id] = true;
colwise_joint2_sparsity[current_row_id] = true;
I get:
There is something fishy in that in |
this seems to fix all those:
|
The issue no longer appears with GCC 14.1, so I will close this issue. I have created a minimal example (but it depends on pinocchio, so it's hard to share with the GCC team).
#include <iostream>
#include "pinocchio/multibody/sample-models.hpp"
#include <boost/test/unit_test.hpp>
BOOST_AUTO_TEST_SUITE(BOOST_TEST_MODULE)
using namespace Eigen;
using namespace pinocchio;
template<typename Scalar>
struct ContactModel
{
JointIndex joint1_id;
JointIndex joint2_id;
Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint1_sparsity;
Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint2_sparsity;
template<int OtherOptions, template<typename, int> class JointCollectionTpl>
ContactModel(
const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model,
const JointIndex joint1_id,
const JointIndex joint2_id)
: joint1_id(joint1_id)
, joint2_id(joint2_id)
, colwise_joint1_sparsity(model.nv)
, colwise_joint2_sparsity(model.nv)
{
init(model);
}
template<int OtherOptions, template<typename, int> class JointCollectionTpl>
void init(const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model)
{
typedef ModelTpl<Scalar, OtherOptions, JointCollectionTpl> Model;
typedef typename Model::JointModel JointModel;
static const bool default_sparsity_value = false;
colwise_joint1_sparsity.fill(default_sparsity_value);
colwise_joint2_sparsity.fill(default_sparsity_value);
JointIndex current1_id = 0;
if (joint1_id > 0)
current1_id = joint1_id;
JointIndex current2_id = 0;
if (joint2_id > 0)
current2_id = joint2_id;
while (current1_id != current2_id)
{
if (current1_id > current2_id)
{
const JointModel & joint1 = model.joints[current1_id];
const int j1nv = joint1.nv();
Eigen::DenseIndex current1_col_id = joint1.idx_v();
for (int k = 0; k < j1nv; ++k, ++current1_col_id)
{
colwise_joint1_sparsity[current1_col_id] = true;
}
current1_id = model.parents[current1_id];
}
else
{
const JointModel & joint2 = model.joints[current2_id];
const int j2nv = joint2.nv();
Eigen::DenseIndex current2_col_id = joint2.idx_v();
for (int k = 0; k < j2nv; ++k, ++current2_col_id)
{
colwise_joint2_sparsity[current2_col_id] = true;
}
current2_id = model.parents[current2_id];
}
}
}
};
template<typename Scalar>
struct ContactModelNotWorking
{
JointIndex joint1_id;
JointIndex joint2_id;
Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint1_sparsity;
Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint2_sparsity;
template<int OtherOptions, template<typename, int> class JointCollectionTpl>
ContactModelNotWorking(
const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model,
const JointIndex joint1_id,
const JointIndex joint2_id)
: joint1_id(joint1_id)
, joint2_id(joint2_id)
, colwise_joint1_sparsity(model.nv)
, colwise_joint2_sparsity(model.nv)
{
init(model);
}
template<int OtherOptions, template<typename, int> class JointCollectionTpl>
void init(const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model)
{
typedef ModelTpl<Scalar, OtherOptions, JointCollectionTpl> Model;
typedef typename Model::JointModel JointModel;
static const bool default_sparsity_value = false;
colwise_joint1_sparsity.fill(default_sparsity_value);
colwise_joint2_sparsity.fill(default_sparsity_value);
JointIndex current1_id = 0;
if (joint1_id > 0)
current1_id = joint1_id;
JointIndex current2_id = 0;
if (joint2_id > 0)
current2_id = joint2_id;
while (current1_id != current2_id)
{
if (current1_id > current2_id)
{
const JointModel & joint1 = model.joints[current1_id];
Eigen::DenseIndex current1_col_id = joint1.idx_v();
for (int k = 0; k < joint1.nv(); ++k, ++current1_col_id)
{
colwise_joint1_sparsity[current1_col_id] = true;
}
current1_id = model.parents[current1_id];
}
else
{
const JointModel & joint2 = model.joints[current2_id];
Eigen::DenseIndex current2_col_id = joint2.idx_v();
for (int k = 0; k < joint2.nv(); ++k, ++current2_col_id)
{
colwise_joint2_sparsity[current2_col_id] = true;
}
current2_id = model.parents[current2_id];
}
}
}
};
BOOST_AUTO_TEST_CASE(test_gcc13_3)
{
using namespace Eigen;
using namespace pinocchio;
typedef JointCollectionDefaultTpl<double, pinocchio::context::Options> JC;
using buildModels::details::addJointAndBody;
Inertia Ijoint(.1, Inertia::Vector3::Zero(), Inertia::Matrix3::Identity() * .01);
Model model;
auto ffidx = model.addJoint(0, typename JC::JointModelFreeFlyer(), SE3::Identity(), "root_joint");
model.lowerPositionLimit.template segment<4>(3).fill(-1.);
model.upperPositionLimit.template segment<4>(3).fill(1.);
model.appendBodyToJoint(ffidx, Ijoint);
model.addJointFrame(ffidx);
buildModels::details::addManipulator(model, ffidx);
const std::string LF = "wrist2_joint";
const Model::JointIndex LF_id = model.getJointId(LF);
ContactModel ci_LF(model, LF_id, 0);
std::cout << ci_LF.colwise_joint1_sparsity.transpose() << std::endl;
ContactModelNotWorking ci_LF_not_work(model, LF_id, 0);
std::cout << ci_LF_not_work.colwise_joint1_sparsity.transpose() << std::endl;
}
BOOST_AUTO_TEST_SUITE_END() |
Hi,
This is a preliminary bug report, I'm trying a few things to check that.
But so far, with nix, since gcc was upgraded from 13.2.0 to 13.3.0, test-cpp-contact-cholesky fails with big differences in some matrices (e.g. 0.207 != 5e-17).
I suspect this is an issue in GCC, so I will deactivate this test on nix, and try to make a proper bug report to GCC.
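To put the reported mismatch in perspective: for quantities of order one, 5e-17 is below double-precision rounding noise, whereas 0.207 is not, so no reasonable tolerance reconciles the two values. A small sketch, assuming a relative comparison of the form |a - b| <= prec * min(|a|, |b|); the function names and the default tolerances are illustrative and are not the checks used by the actual test.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Relative comparison: |a - b| <= prec * min(|a|, |b|).
// Hypothetical helper; `prec` is an illustrative default.
bool is_approx(double a, double b, double prec = 1e-12)
{
  assert(prec >= 0.0);
  return std::abs(a - b) <= prec * std::min(std::abs(a), std::abs(b));
}

// Absolute comparison, appropriate when the expected value is exactly zero.
bool is_close_to_zero(double a, double eps = 1e-12)
{
  assert(eps >= 0.0);
  return std::abs(a) <= eps;
}
```

Here `is_close_to_zero(5e-17)` holds while `is_close_to_zero(0.207)` does not, and `is_approx(0.207, 5e-17)` is false. This is consistent with a structurally wrong matrix entry rather than accumulated floating-point error.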