Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] Specialize unique_ptr for the default deleter #96504

Closed
wants to merge 1 commit into from

Conversation

zmodem
Copy link
Collaborator

@zmodem zmodem commented Jun 24, 2024

As mentioned in #93069, the compressed_pair used for storing the deleter in unique_ptr contributes significantly to debug info size.

It seems that the best solution for that will be #76756 ("Use [[no_unique_address]] in place of __compressed_pair") However, that is blocked on supporting old Clang revisions (maybe not for much longer?) and LLDB support (#93809 is in progress, but do we also need to wait for it to ship, and how "far" down the pipe?).

In the meantime, another idea (from @cjdb I think) is to provide a specialization of unique_ptr for the default deleter case, which avoids the need to store the deleter at all.

I've experimented with a patch, and learned a few things:

  • A specialization such as unique_ptr<_Tp, default_delete<_Tp>> would be ambiguous with the unique_ptr<_Tp[], _Dp> specialization for array types (they're "the same amount of specialized"). So in the patch I only specialized for non-array types, which are the most common.
  • We still need to go through the default deleter in reset(), since T may have a private destructor and friend declaration for std::default_delete<T>

Using the patch to build Chromium for Windows,

  • Build time seems about the same (maybe reduced a bit, but I didn't measure carefully)
  • Total .obj file size reduced 1.9%
  • chrome.dll.pdb size reduced 2.5%
  • The type information size in the PDB reduced 4.8%

I had hoped for more, but removing 4.8% of the type info is still a good chunk of bytes, and maybe it's possible to improve things further.

@rnk @cjdb @ldionne @philnik777 what do you think?

@zmodem zmodem requested a review from a team as a code owner June 24, 2024 15:07
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jun 24, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jun 24, 2024

@llvm/pr-subscribers-libcxx

Author: Hans (zmodem)

Changes

As mentioned in #93069, the compressed_pair used for storing the deleter in unique_ptr contributes significantly to debug info size.

It seems that the best solution for that will be #76756 ("Use [[no_unique_address]] in place of __compressed_pair") However, that is blocked on supporting old Clang revisions (maybe not for much longer?) and LLDB support (#93809 is in progress, but do we also need to wait for it to ship, and how "far" down the pipe?).

In the meantime, another idea (from @cjdb I think) is to provide a specialization of unique_ptr for the default deleter case, which avoids the need to store the deleter at all.

I've experimented with a patch, and learned a few things:

  • A specialization such as unique_ptr&lt;_Tp, default_delete&lt;_Tp&gt;&gt; would be ambiguous with the unique_ptr&lt;_Tp[], _Dp&gt; specialization for array types (they're "the same amount of specialized"). So in the patch I only specialized for non-array types, which are the most common.
  • We still need to go through the default deleter in reset(), since T may have a private destructor and friend declaration for std::default_delete&lt;T&gt;

Using the patch to build Chromium for Windows,

  • Build time seems about the same (maybe reduced a bit, but I didn't measure carefully)
  • Total .obj file size reduced 1.9%
  • chrome.dll.pdb size reduced 2.5%
  • The type information size in the PDB reduced 4.8%

I had hoped for more, but removing 4.8% of the type info is still a good chunk of bytes, and maybe it's possible to improve things further.

@rnk @cjdb @ldionne @philnik777 what do you think?


Full diff: https://github.com/llvm/llvm-project/pull/96504.diff

1 Files Affected:

  • (modified) libcxx/include/__memory/unique_ptr.h (+169)
diff --git a/libcxx/include/__memory/unique_ptr.h b/libcxx/include/__memory/unique_ptr.h
index 3bd02a7cc26aa..72702929f7736 100644
--- a/libcxx/include/__memory/unique_ptr.h
+++ b/libcxx/include/__memory/unique_ptr.h
@@ -471,6 +471,175 @@ class _LIBCPP_UNIQUE_PTR_TRIVIAL_ABI _LIBCPP_TEMPLATE_VIS unique_ptr<_Tp[], _Dp>
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 void swap(unique_ptr& __u) _NOEXCEPT { __ptr_.swap(__u.__ptr_); }
 };
 
+
+
+template <class _Tp>
+class _LIBCPP_UNIQUE_PTR_TRIVIAL_ABI _LIBCPP_TEMPLATE_VIS unique_ptr<_Tp, __enable_if_t<!is_array<_Tp>::value, default_delete<_Tp>> > {
+public:
+  typedef _Tp element_type;
+  typedef default_delete<_Tp> _Dp;
+  typedef _Dp deleter_type;
+  typedef _LIBCPP_NODEBUG typename __pointer<_Tp, deleter_type>::type pointer; // XXX: What is this?
+
+  static_assert(!is_rvalue_reference<deleter_type>::value, "the specified deleter type cannot be an rvalue reference");
+
+  // A unique_ptr contains the following members which may be trivially relocatable:
+  // - pointer : this may be trivially relocatable, so it's checked
+  // - deleter_type: this may be trivially relocatable, so it's checked
+  //
+  // This unique_ptr implementation only contains a pointer to the unique object and a deleter, so there are no
+  // references to itself. This means that the entire structure is trivially relocatable if its members are.
+  using __trivially_relocatable = __conditional_t<
+      __libcpp_is_trivially_relocatable<pointer>::value && __libcpp_is_trivially_relocatable<deleter_type>::value,
+      unique_ptr,
+      void>;
+
+private:
+  pointer __ptr_;
+
+  typedef _LIBCPP_NODEBUG __unique_ptr_deleter_sfinae<_Dp> _DeleterSFINAE;
+
+  template <bool _Dummy>
+  using _LValRefType _LIBCPP_NODEBUG = typename __dependent_type<_DeleterSFINAE, _Dummy>::__lval_ref_type;
+
+  template <bool _Dummy>
+  using _GoodRValRefType _LIBCPP_NODEBUG = typename __dependent_type<_DeleterSFINAE, _Dummy>::__good_rval_ref_type;
+
+  template <bool _Dummy>
+  using _BadRValRefType _LIBCPP_NODEBUG = typename __dependent_type<_DeleterSFINAE, _Dummy>::__bad_rval_ref_type;
+
+  template <bool _Dummy, class _Deleter = typename __dependent_type< __type_identity<deleter_type>, _Dummy>::type>
+  using _EnableIfDeleterDefaultConstructible _LIBCPP_NODEBUG =
+      __enable_if_t<is_default_constructible<_Deleter>::value && !is_pointer<_Deleter>::value>;
+
+  template <class _ArgType>
+  using _EnableIfDeleterConstructible _LIBCPP_NODEBUG = __enable_if_t<is_constructible<deleter_type, _ArgType>::value>;
+
+  template <class _UPtr, class _Up>
+  using _EnableIfMoveConvertible _LIBCPP_NODEBUG =
+      __enable_if_t< is_convertible<typename _UPtr::pointer, pointer>::value && !is_array<_Up>::value >;
+
+  template <class _UDel>
+  using _EnableIfDeleterConvertible _LIBCPP_NODEBUG =
+      __enable_if_t< (is_reference<_Dp>::value && is_same<_Dp, _UDel>::value) ||
+                     (!is_reference<_Dp>::value && is_convertible<_UDel, _Dp>::value) >;
+
+  template <class _UDel>
+  using _EnableIfDeleterAssignable = __enable_if_t< is_assignable<_Dp&, _UDel&&>::value >;
+
+public:
+  template <bool _Dummy = true, class = _EnableIfDeleterDefaultConstructible<_Dummy> >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR unique_ptr() _NOEXCEPT : __ptr_() {}
+
+  template <bool _Dummy = true, class = _EnableIfDeleterDefaultConstructible<_Dummy> >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR unique_ptr(nullptr_t) _NOEXCEPT
+      : __ptr_() {}
+
+  template <bool _Dummy = true, class = _EnableIfDeleterDefaultConstructible<_Dummy> >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 explicit unique_ptr(pointer __p) _NOEXCEPT
+      : __ptr_(__p) {}
+
+  template <bool _Dummy = true, class = _EnableIfDeleterConstructible<_LValRefType<_Dummy> > >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr(pointer __p, _LValRefType<_Dummy>) _NOEXCEPT
+      : __ptr_(__p) {}
+
+  template <bool _Dummy = true, class = _EnableIfDeleterConstructible<_GoodRValRefType<_Dummy> > >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr(pointer __p, _GoodRValRefType<_Dummy>) _NOEXCEPT
+      : __ptr_(__p) {
+    static_assert(!is_reference<deleter_type>::value, "rvalue deleter bound to reference");
+  }
+
+  template <bool _Dummy = true, class = _EnableIfDeleterConstructible<_BadRValRefType<_Dummy> > >
+  _LIBCPP_HIDE_FROM_ABI unique_ptr(pointer __p, _BadRValRefType<_Dummy> __d) = delete;
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr(unique_ptr&& __u) _NOEXCEPT
+      : __ptr_(__u.release()) {}
+
+  template <class _Up,
+            class _Ep,
+            class = _EnableIfMoveConvertible<unique_ptr<_Up, _Ep>, _Up>,
+            class = _EnableIfDeleterConvertible<_Ep> >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr(unique_ptr<_Up, _Ep>&& __u) _NOEXCEPT
+      : __ptr_(__u.release()) {}
+
+#if _LIBCPP_STD_VER <= 14 || defined(_LIBCPP_ENABLE_CXX17_REMOVED_AUTO_PTR)
+  template <class _Up,
+            __enable_if_t<is_convertible<_Up*, _Tp*>::value && is_same<_Dp, default_delete<_Tp> >::value, int> = 0>
+  _LIBCPP_HIDE_FROM_ABI unique_ptr(auto_ptr<_Up>&& __p) _NOEXCEPT : __ptr_(__p.release(), __value_init_tag()) {}
+#endif
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr& operator=(unique_ptr&& __u) _NOEXCEPT {
+    reset(__u.release());
+    return *this;
+  }
+
+  template <class _Up,
+            class _Ep,
+            class = _EnableIfMoveConvertible<unique_ptr<_Up, _Ep>, _Up>,
+            class = _EnableIfDeleterAssignable<_Ep> >
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr& operator=(unique_ptr<_Up, _Ep>&& __u) _NOEXCEPT {
+    reset(__u.release());
+    return *this;
+  }
+
+#if _LIBCPP_STD_VER <= 14 || defined(_LIBCPP_ENABLE_CXX17_REMOVED_AUTO_PTR)
+  template <class _Up,
+            __enable_if_t<is_convertible<_Up*, _Tp*>::value && is_same<_Dp, default_delete<_Tp> >::value, int> = 0>
+  _LIBCPP_HIDE_FROM_ABI unique_ptr& operator=(auto_ptr<_Up> __p) {
+    reset(__p.release());
+    return *this;
+  }
+#endif
+
+#ifdef _LIBCPP_CXX03_LANG
+  unique_ptr(unique_ptr const&)            = delete;
+  unique_ptr& operator=(unique_ptr const&) = delete;
+#endif
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 ~unique_ptr() { reset(); }
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unique_ptr& operator=(nullptr_t) _NOEXCEPT {
+    reset();
+    return *this;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 __add_lvalue_reference_t<_Tp> operator*() const {
+    return *__ptr_;
+  }
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 pointer operator->() const _NOEXCEPT { return __ptr_; }
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 pointer get() const _NOEXCEPT { return __ptr_; }
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 deleter_type& get_deleter() _NOEXCEPT {
+    static deleter_type __deleter;
+    return __deleter; // XXX: This is probably not the right way?
+  }
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 const deleter_type& get_deleter() const _NOEXCEPT {
+    static deleter_type __deleter;
+    return __deleter; // XXX: This is probably not the right way?
+  }
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 explicit operator bool() const _NOEXCEPT {
+    return __ptr_ != nullptr;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 pointer release() _NOEXCEPT {
+    pointer __t = __ptr_;
+    __ptr_ = pointer();
+    return __t;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 void reset(pointer __p = pointer()) _NOEXCEPT {
+    pointer __tmp  = __ptr_;
+    __ptr_ = __p;
+    if (__tmp)
+      deleter_type()(__tmp);
+  }
+
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 void swap(unique_ptr& __u) _NOEXCEPT {
+    pointer __tmp  = __ptr_;
+    __ptr_ = __u.__ptr_;
+    __u.__ptr_ = __tmp;
+  }
+};
+
 template <class _Tp, class _Dp, __enable_if_t<__is_swappable_v<_Dp>, int> = 0>
 inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 void
 swap(unique_ptr<_Tp, _Dp>& __x, unique_ptr<_Tp, _Dp>& __y) _NOEXCEPT {

@philnik777
Copy link
Contributor

The plan is to ship #76756 early in the next release cycle. I don't think we need to wait for LLDB. The issue exists only in assertions builds (i.e. most users wouldn't have LLDB crashing on them) and we haven't waited for LLDB before to update their pretty printers. As long as the issue is fixed in the same release cycle this should be absolutely fine. Given that, I don't think we want to walk this path.

What's with the fuzz about replacing __compressed_pair recently anyways? It's not like this is a new issue in any way. Or maybe I just perceive it as recent for some reason, IDK.

@zmodem
Copy link
Collaborator Author

zmodem commented Jun 24, 2024

The plan is to ship #76756 early in the next release cycle. I don't think we need to wait for LLDB.

Great, thanks for confirming.

What's with the fuzz about replacing __compressed_pair recently anyways?

PDB files have a 2^31 byte limit on the type info, which Chromium is approaching (crbug.com/338094922). __compressed_pair seems to account for a large portion of the type info size in our builds.

Plus clang-cl finally getting support for no_unique_address last years means that's not blocking things anymore.

@Michael137
Copy link
Member

Just to update re. LLDB support, there's two issues outstanding:

  1. The assertions during AST lowering that @philnik777 mentioned (this is mainly a problem for the LLDB buildbot maintainers)
  2. The data-formatters don't support the new layout yet (this is a more serious issue, but not hard to address)

I've put up a draft review for the formatters in #96538. The assertion failures should be addressed by #96422. I'm actively working on getting these ready to land by the time #76756 is ready.

@philnik777
Copy link
Contributor

@Michael137 Thanks for all the work on this! I'm pretty sure this whole effort will drastically improve the QoI and maintainability of libc++. BTW sorry for being responsible for a significant amount of the additional complexity due to changed layouts in the pretty printers. Do you have any plans to remove support for old versions of the layouts at any point?

@zmodem Thanks for the info. Would you be able to also provide numbers for #76756? Something in the release notes like "this change resulted in a 5% decrease in debug info size for Chromium" always sounds great.

I hope it's good enough if #76756 lands in ~2 months? It would also be great if you could run the patch through its paces within google.

P.S. I noticed that my wording on "not caring about the pretty printers" was a bit misleading - I meant that we don't care to wait for them to propagate, not that we don't care that they work within the same release.

@Michael137
Copy link
Member

@Michael137 Thanks for all the work on this! I'm pretty sure this whole effort will drastically improve the QoI and maintainability of libc++. BTW sorry for being responsible for a significant amount of the additional complexity due to changed layouts in the pretty printers. Do you have any plans to remove support for old versions of the layouts at any point?

No problem! The way the pretty-printers are written makes them susceptible to these kinds of changes. We'll probably maintain both layouts for some time, but we already do that for other types like std::string too. With some refactoring I don't think it's going to be a big problem. (side-note, a nice benefit from the new simpler layout is that we'll be able to avoid bugs like #68396 (comment)).

@zmodem
Copy link
Collaborator Author

zmodem commented Jun 25, 2024

@zmodem Thanks for the info. Would you be able to also provide numbers for #76756? Something in the release notes like "this change resulted in a 5% decrease in debug info size for Chromium" always sounds great.

Sure! Compared to libc++ head, applying that:

  • Total .obj file size reduced 5.5%
  • chrome.dll.pdb size reduced 5.1%
  • The type information size in the PDB reduced 10.6%

Those are very appealing numbers for us :)

These are my local build time measurements:

With head:

real    37m23.793s
user    4121m5.728s
sys     408m9.989s

With my patch:

real    36m47.368s
user    4051m46.384s
sys     401m42.658s

With #76756:

real    36m4.842s
user    3970m57.689s
sys     398m46.022s

Even if these measurements are not scientific, the numbers point in the right direction.

I hope it's good enough if #76756 lands in ~2 months?

Yes, that's good for us.

It would also be great if you could run the patch through its paces within google.

I've asked folks more familiar with the internal processes to take a look.

@ldionne
Copy link
Member

ldionne commented Jul 3, 2024

Just as a matter of process, let's close this if we are confident that we will not be pursuing it, just to keep the review queue tidy :-)

@zmodem zmodem closed this Jul 3, 2024
@zmodem
Copy link
Collaborator Author

zmodem commented Jul 10, 2024

It would also be great if you could run the patch through its paces within google.

I've asked folks more familiar with the internal processes to take a look.

Following up on this, #76756 seems to work fine internally. The only issues I saw were some tests relying on pretty-printing std::string, but that should get fixed with the LLDB patches.

I don't have any overall statistics, but we're seeing some binaries shrink by 3-7% in non-optimized builds, and 2-4% for optimized. At scale, that's a huge improvement.

@philnik777
Copy link
Contributor

@zmodem Thanks for the info! It's good to hear that there doesn't seem to be any obvious problems with the patch. I'm quite surprised that it makes a difference for optimized builds. Does that show in actual reduced code, or is that other information, like symbol tables? I absolutely agree that this is a huge improvement. I didn't think it would have that much of an impact. And we can do even more cleanups after that, although I don't expect nearly the impact of replacing __compressed_pair itself.

@zmodem
Copy link
Collaborator Author

zmodem commented Jul 10, 2024

I'm quite surprised that it makes a difference for optimized builds. Does that show in actual reduced code, or is that other information, like symbol tables?

Sorry, I should have made that clearer. These are builds with optimizations enabled, but they still have debug info. The size reductions are from the debug sections. (At least if I'm reading things correctly.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants