-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
macos: fix missing pthread mutex init after calloc #21
base: mariadb-4.x
Are you sure you want to change the base?
Conversation
calls constructor for a mutex in a struct value init-ed with gu_calloc. in path `gcs_core_create() -> gcs_group_init()`, the first one allocates `gcs_core_t* core` with gu_calloc() whereas `gcs_code_t` has `gcs_group_t group` with `gu::Mutex memb_mtx_`. After memory allocation gu::Mutex constructor was not called that lead to an error on Darwin in a call to pthread mutex lock. Signed-off-by: Ivan Prisyazhnyy <[email protected]>
67981b9
to
ec7d79d
Compare
created MDEV-34717 to refer to this. |
@@ -61,6 +61,7 @@ gcs_group_init (gcs_group_t* group, gu::Config* const cnf, gcache_t* const cache | |||
int const appl_proto_ver) | |||
{ | |||
// here we also create default node instance. | |||
new (&group->memb_mtx_) gu::Mutex(NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me of the recent MDEV-34625 and MariaDB/server#3408. I understood that macOS would use a clang
based compiler by default.
I think that use of placement new
after a call to a calloc
like operation may lead to surprises when using GCC 6 or later, if the constructor is expecting that some fields were already initialized by a previous write. GCC 6 and later could optimize away such pre-constructor writes, thanks to -flifetime-dse
.
That macOS (as well as AIX) are not happy with zero initialization for pthread_mutex_t
bit me in MariaDB/server#3433.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me of the recent MDEV-34625 and MariaDB/server#3408. I understood that macOS would use a clang based compiler by default.
I think that use of placement new after a call to a calloc like operation may lead to surprises when using GCC 6 or later, if the constructor is expecting that some fields were already initialized by a previous write. GCC 6 and later could optimize away such pre-constructor writes, thanks to -flifetime-dse.
MacOS uses Clang yes. But MDEV-34625 IMHO is different. Moreover, the Godbolt example does not look correct - it is fine that the memset
in https://gcc.godbolt.org/z/5n87z1raG is optimized out because it is expected (I suppose) that after a class constructor is called all class variables are initialized. Thus, the compiler may conclude that if we have void *buf = malloc(size S); s = new (buf) buf;
is equivalent to S *s = new S;
.
But it is different if you have malloc()
buf size bigger than the memory that the constructor() is expected to touch:
https://gcc.godbolt.org/z/Y43YW7vKj
struct S {
int i; // uninitialized in consturctor
S() {};
};
struct A {
char a[256];
S s;
};
int bar() {
A *buf = (A *)malloc(sizeof(A));
memset(buf, 0, sizeof(A));
S* s = new(&buf->s) S;
return s->i;
}
became call calloc
in assembly or rep stosq
. So memset is Not eliminated in this case and will not be eliminated in the case of gcs_core_t* core = GU_CALLOC (1, gcs_core_t);
as far as sizeof(gcs_core_t) >> gcs_group_t
.
it does not eliminate calloc() even if -flifetime-dse=0
(try saving *buf to external volatile var) (default is 2) (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html).
if you still think it is not safe, I can offer that we could replace the mutex there instead of gu::Mutex memb_mtx_
we could write gu_mutex_t mutable value_
and init without calling class constructors. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the constructor is expecting that some fields were already initialized by a previous write.
I think its forbidden by the spec (speculating) - constructor must init all class fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is UB, with no doubt. If all fields of gu::Mutex
are initialized by the constructor, then there should be no issue with the GCC -flifetime-dse
on any platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from what we can see in gu_mutex.hpp they all are init-ed
namespace gu
{
class Mutex
{
public:
Mutex (const wsrep_mutex_key_t* key) : value_()
#ifdef GU_MUTEX_DEBUG
, owned_()
, locked_()
#endif /* GU_MUTEX_DEBUG */
{
if (gu_mutex_init (key, &value_))
gu_throw_fatal;
}
protected:
gu_mutex_t mutable value_;
#ifdef GU_MUTEX_DEBUG
gu_thread_t mutable owned_;
bool mutable locked_;
#endif /* GU_MUTEX_DEBUG */
};
}
This issue is currently worked by Alexey and his comment was "I'd rather make a constructor for group struct." |
@janlindstrom shall we close this PR then? Speaking of "I'd rather make a constructor for group struct." the less invasive thing could be just to replace the |
@sitano Lets wait for Alexey's decision. Galera and MariaDB currently do not support officially MacOS so this is not very high-priority issue currently. |
Sorry, I am a bit short on time. Hope I will do the trials tomorrow. |
tested 78f68e9 on MBP M3Max (
works well! I have managed to execute Galera node and join another node. (@janlindstrom, @ayurchen ) |
calls constructor for a mutex in a struct value init-ed with gu_calloc.
in path
gcs_core_create() -> gcs_group_init()
, the first one allocatesgcs_core_t* core
with gu_calloc() whereasgcs_code_t
hasgcs_group_t group
withgu::Mutex memb_mtx_
. After memory allocation gu::Mutex constructor was not called that lead to an error on Darwin in a call to pthread mutex lock.cherry-pick from #20.
Stacktraces from Darwin
Originally it failed on boot in:
where as init previously called from: