-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add static build config. #510
add static build config. #510
Conversation
@blapie Please comment and review. |
It seems that, we need this PR to fix pytorch sleef build issue. pytorch/pytorch#118980 |
Hello! I think I understand the issue/suggested solution, although it did not sound extremely clear. I don't really understand how this works as is. While the change to CMakeLists.txt seems to make sense, in Configure.cmake and src/common/CMakeLists.txt occurrences of BUILD_SHARED_LBS are replaced by SLEEF_STATIC_LIBS, which I believe is the opposite of what you want. Btw SLEEF_BUILD_STATIC_LIBS is not defined as a user-facing option, which doesn't help understanding the solution. I'm not very keen on that change anyway, even if it was fixed. The use case is too particular to justify making the build system more complex than it already is. If pytorch needs to setup its submodules differently locally then the solution implementing Save/Restore (https://github.com/pytorch/pytorch/blob/cee16353db384dd33a4804f9f26829cacfcfbab7/aten/src/ATen/CMakeLists.txt#L424-L440) was probably the way to go, or something along those lines. I don't believe it is a relevant feature to add to SLEEF, but happy to be convinced otherwise. |
@blapie Let me analysis the detailed issue:Phenomenon:pytorch Windows can only build it pass first time by save/restoe "BUILD_SHARED_LIBS", the second time(incremental build) will cause the runtime libraries conflict: #509 (comment) Reason:
We need to drop the reserved variable "BUILD_SHARED_LIBS" and use sleef dedicate variable "SLEEF_BUILD_SHARED_LIBS". |
Also, you can check the another static lib mimalloc: https://github.com/pytorch/pytorch/blob/main/CMakeLists.txt#L1060-L1068
Acturlly, save/restore is a workaround for sleef exist "BUILD_SHARED_LIBS" issue. Please check this comment: https://github.com/pytorch/pytorch/blob/cee16353db384dd33a4804f9f26829cacfcfbab7/aten/src/ATen/CMakeLists.txt#L483 |
Finally, it is continued to rename "BUILD_SHARED_LIBS" to "SLEEF_BUILD_SHARED_LIBS", please check my last PR: #509 (comment) |
@blapie Please review it again. |
@blapie could you please priority to review this PR? pytorch/pytorch#118980 this pytorch PR id depends on it. |
Hello @xuhancn! I'm looking into now. I would prefer a slightly different approach though but I will look at how feasible that is. |
CMakeLists.txt
Outdated
@@ -1,6 +1,6 @@ | |||
|
|||
# Options | |||
|
|||
option(SLEEF_BUILD_SHARED_LIBS "Build sleef as static library" ON) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try this change instead and switch back to old name everywhere? I believe this can work according to the documentation (https://cmake.org/cmake/help/latest/variable/BUILD_SHARED_LIBS.html)
-option(SLEEF_BUILD_SHARED_LIBS "Build sleef as static library" ON)
+# Declare BUILD_SHARED_LIBS option to make it accessible to top-level projects.
+option(BUILD_SHARED_LIBS "Build using shared libraries (CMake reserved)" ON)
+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried this method, It not constant work, the reason as above: #510 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pytorch default want setup “BUILD_SHARED_LIBS” as shared, but it want to use sleef as static library. Current pytorch is used a save/restore this variable to WA it.
Save: https://github.com/pytorch/pytorch/blob/cee16353db384dd33a4804f9f26829cacfcfbab7/aten/src/ATen/CMakeLists.txt#L424-L440
Restore: https://github.com/pytorch/pytorch/blob/cee16353db384dd33a4804f9f26829cacfcfbab7/aten/src/ATen/CMakeLists.txt#L480-L485
And the original author think we should use SLEEF_ to pass the sleef build config also: https://github.com/pytorch/pytorch/blob/cee16353db384dd33a4804f9f26829cacfcfbab7/aten/src/ATen/CMakeLists.txt#L483
Right, let me know if I'm missing something but I believe we don't have to rename the variable to get the behaviour you are expecting. Could you try defining the BUILD_SHARED_LIBS option in SLEEF's CMakeLists.txt see comment in review? https://cmake.org/cmake/help/latest/variable/BUILD_SHARED_LIBS.html
https://discourse.cmake.org/t/build-shared-libs/4200
But it might need to be done the right way. In order to fix the parallel build I think you need to put the save/restore stages in a function (since it will have a scope), see |
The original pytorch is use this method, it works on Linux but not works on Windows. If It works, why I do a lot of work to fix it? |
First, what I was getting at is not simply moving back to using BUILD_SHARED_LIBS but also making sure save/add_subdirectory/restore occurs inside a function (or at least a scope) so it builds safely in parallel.
Second, if the above approach does not work, maybe it would help to dive a little bit more into why this actually fails. If PyTorch's use of BUILD_SHARED_LIBS diverges from its definition (by building both static and shared components) then it needs to be handled properly in PyTorch and not in the components. |
From the example, gen sleef_static and sleef(shared) at the same time? do you think it is acceptable solution. |
Not sure I understand. I don't think this will solve the issue, but feel fee to try. I was picturing defining a function encapsulating everything at this link https://github.com/pytorch/pytorch/blob/cee16353db384dd33a4804f9f26829cacfcfbab7/aten/src/ATen/CMakeLists.txt#L442-L478. Then just calling this function.
The benefit of the function is that you don't even have to save and restore BUILD_SHARED_LIBS and BUILD_TESTS, the scope of the function itself protects from overriding BUILD_SHARED_LIBS at the global scope. |
Please read the offical guide. The offical guide said that: Use the |
Hi @blapie
In git-cmake build system, we usual build dependency via We maintains a open source project should consider easy to developer use. Use
|
We have no issue with people using SLEEF as a submodule, if anything that is probably the preferred approach and we provide examples for that.
No, you don't have to define it, but if you need to switch between static and shared libs then that is the flag you should use. Sorry I am going to repeat myself but your PR "solves" the problem from your point of view (or rather hides it) by introducing flaws in our codebase. We cannot accept a change like that simply because it will give unexpected results to SLEEF users (who also use it as a submodule). Instead I suggested a simple alternative, that you don't want to try, which is encapsulating the options and I don't see what's wrong with this approach
|
I tried your code just now, snapshot as below:
blapie_example_output.txt |
* default build sleef in shared mode. * fix install files to dummy folder.
3050f5e
to
f98ce8f
Compare
@blapie some days passed, any update? |
Hi! Sorry, I did not find time to test before. But I just did it on MSYS2 (Unix-like Windows environment) and as expected I get static libs when setting It would be good to go to the bottom of that, as it could simply be that a minor change in SLEEF is needed to work on Windows on x86 with cl. Please note that SLEEF support for MSVC/CL is limited for now, in particular reproducing and testing is not available on the repo. |
Sounds very strange. I don't see the connection between the diff and the error message. |
Should xuhancn/pytorch#8 be rebased first? |
No, it is passed if use SLEEF_BUILD_SHARED_LIB. |
@jgong5 I have update code to compatible to
|
I'm looking into your last commit and will leave comments in patch. Meanwhile let me go back to the initial problem for a second. I just cannot believe that CMake created a variable that is not usable when doing such trivial thing as adding components as submodules. In fact what CMake says is (sorry for quoting again but this is pretty relevant to our position)
The "Failure to do so" refers to the need to add Could it be a simple solution to the issue? Could you please try that? |
My solution is merge this PR. |
@alexreinking Hello! Sorry to bother you, but we stumbled upon your blog (https://alexreinking.com/blog/building-a-dual-shared-and-static-library-with-cmake.html) and wondered if maybe you could shine some light on this situation? |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
if(NOT DEFINED ${SLEEF_BUILD_SHARED_LIBS}) | ||
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) | ||
else() | ||
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | ||
endif() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be done with options instead? For instance,
option(BUILD_SHARED_LIBS "Build sleef as shared library" ON)
option(SLEEF_BUILD_SHARED_LIBS "Build sleef as shared library" ${BUILD_SHARED_LIBS})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blapie -- These are not equivalent because this defines SLEEF_BUILD_SHARED_LIBS
as a cache variable when no variable by that name (normal or cache) exists. So then changing BUILD_SHARED_LIBS
only works as expected on the first configuration.
My blog post suggests the following:
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
if(NOT DEFINED ${SLEEF_BUILD_SHARED_LIBS}) | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) | |
else() | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif() | |
if (DEFINED SLEEF_BUILD_SHARED_LIBS) | |
set(BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif () |
Then you can just look at the truthiness of BUILD_SHARED_LIBS
down the line. If you care about documenting this variable in the CMake CLI/GUI, then this approach works:
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
if(NOT DEFINED ${SLEEF_BUILD_SHARED_LIBS}) | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) | |
else() | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif() | |
set(SLEEF_BUILD_SHARED_LIBS "undefined" | |
CACHE BOOL "Override BUILD_SHARED_LIBS while configuring SLEEF") | |
if (NOT SLEEF_BUILD_SHARED_LIBS STREQUAL "undefined") | |
set(BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got a little carried away with code review outside the question at hand; sorry about this. It's just that CMakeLists.txt that don't start with cmake_minimum_required
+project
are liable to break without warning on any future version.
I don't understand how my blog post was used to justify splatting out _INTERNAL_SLEEF_BUILD_SHARED_LIBS
everywhere. Just use SLEEF_BUILD_SHARED_LIBS
as the value of BUILD_SHARED_LIBS
if and only if it is defined by the user. I repeat the approach in my blog post and give an alternative approach.
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
if(NOT DEFINED ${SLEEF_BUILD_SHARED_LIBS}) | ||
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) | ||
else() | ||
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | ||
endif() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blapie -- These are not equivalent because this defines SLEEF_BUILD_SHARED_LIBS
as a cache variable when no variable by that name (normal or cache) exists. So then changing BUILD_SHARED_LIBS
only works as expected on the first configuration.
My blog post suggests the following:
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
if(NOT DEFINED ${SLEEF_BUILD_SHARED_LIBS}) | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) | |
else() | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif() | |
if (DEFINED SLEEF_BUILD_SHARED_LIBS) | |
set(BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif () |
Then you can just look at the truthiness of BUILD_SHARED_LIBS
down the line. If you care about documenting this variable in the CMake CLI/GUI, then this approach works:
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
if(NOT DEFINED ${SLEEF_BUILD_SHARED_LIBS}) | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) | |
else() | |
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif() | |
set(SLEEF_BUILD_SHARED_LIBS "undefined" | |
CACHE BOOL "Override BUILD_SHARED_LIBS while configuring SLEEF") | |
if (NOT SLEEF_BUILD_SHARED_LIBS STREQUAL "undefined") | |
set(BUILD_SHARED_LIBS ${SLEEF_BUILD_SHARED_LIBS}) | |
endif () |
if(SLEEF_ENABLE_LTO AND _INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
message(FATAL_ERROR "SLEEF_ENABLE_LTO and _INTERNAL_SLEEF_BUILD_SHARED_LIBS cannot be specified at the same time") | ||
endif(SLEEF_ENABLE_LTO AND _INTERNAL_SLEEF_BUILD_SHARED_LIBS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my other suggestions:
if(SLEEF_ENABLE_LTO AND _INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
message(FATAL_ERROR "SLEEF_ENABLE_LTO and _INTERNAL_SLEEF_BUILD_SHARED_LIBS cannot be specified at the same time") | |
endif(SLEEF_ENABLE_LTO AND _INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
if (SLEEF_ENABLE_LTO AND BUILD_SHARED_LIBS) | |
message(FATAL_ERROR "SLEEF_ENABLE_LTO and (SLEEF_)BUILD_SHARED_LIBS cannot be enabled at the same time") | |
endif () |
Note that duplicating the condition in the endif
hasn't been a requirement since at least 2.8.x but it might be as far back as 2.4.x. Definitely not needed in 3.18, which is your minimum.
@@ -305,7 +311,7 @@ if(SLEEF_SHOW_CONFIG) | |||
message(" Native build dir: ${NATIVE_BUILD_DIR}") | |||
endif(CMAKE_CROSSCOMPILING) | |||
message(STATUS "Using option `${SLEEF_C_FLAGS}` to compile libsleef") | |||
message(STATUS "Building shared libs : " ${BUILD_SHARED_LIBS}) | |||
message(STATUS "Building shared libs : " ${_INTERNAL_SLEEF_BUILD_SHARED_LIBS}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
message(STATUS "Building shared libs : " ${_INTERNAL_SLEEF_BUILD_SHARED_LIBS}) | |
message(STATUS "Building shared libs : " ${BUILD_SHARED_LIBS}) |
@@ -7,7 +7,7 @@ include(CheckLanguage) | |||
|
|||
if (SLEEF_BUILD_STATIC_TEST_BINS) | |||
set(CMAKE_FIND_LIBRARY_SUFFIXES ".a") | |||
set(BUILD_SHARED_LIBS OFF) | |||
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set(_INTERNAL_SLEEF_BUILD_SHARED_LIBS OFF) | |
set(BUILD_SHARED_LIBS OFF) |
@@ -834,7 +834,7 @@ endif() | |||
|
|||
# Set common definitions | |||
|
|||
if (NOT BUILD_SHARED_LIBS) | |||
if (NOT _INTERNAL_SLEEF_BUILD_SHARED_LIBS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (NOT _INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
if (NOT BUILD_SHARED_LIBS) |
@@ -157,9 +163,9 @@ separate build directory. Note: Please remove autogenerated file \ | |||
`CMakeCache.txt` and directory `CMakeFiles` in the current directory.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everywhere you should just use SLEEF_SOURCE_DIR
and SLEEF_BINARY_DIR
since those are the standard and expected names. The project
command defines them and the VERSION
variables.
https://cmake.org/cmake/help/v3.18/command/project.html#project
endif(SLEEF_ENABLE_LTO AND BUILD_SHARED_LIBS) | ||
if(SLEEF_ENABLE_LTO AND _INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
message(FATAL_ERROR "SLEEF_ENABLE_LTO and _INTERNAL_SLEEF_BUILD_SHARED_LIBS cannot be specified at the same time") | ||
endif(SLEEF_ENABLE_LTO AND _INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
|
||
if(SLEEF_ENABLE_LTO) | ||
cmake_policy(SET CMP0069 NEW) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmake_policy(SET CMP0069 NEW) |
That policy is already enabled by cmake_minimum_required(VERSION 3.18)
. It's enabled for 3.9+.
@@ -133,7 +133,7 @@ optimized, or any other special set of flags. | |||
- `SLEEF_SHOW_CONFIG` : Show relevant CMake variables upon configuring a build | |||
- `SLEEF_SHOW_ERROR_LOG` : Show the content of CMakeError.log | |||
|
|||
- `BUILD_SHARED_LIBS` : Static libs are built if set to FALSE | |||
- `SLEEF_BUILD_SHARED_LIBS` : Static libs are built if set to FALSE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `SLEEF_BUILD_SHARED_LIBS` : Static libs are built if set to FALSE | |
- `SLEEF_BUILD_SHARED_LIBS` : Static libs are built if set to FALSE. If not manually set, then `BUILD_SHARED_LIBS` is used. |
if(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
add_library(${TARGET_LIBDFT} SHARED $<TARGET_OBJECTS:dftcommon_obj> $<TARGET_OBJECTS:${TARGET_LIBARRAYMAP_OBJ}>) | ||
else() | ||
add_library(${TARGET_LIBDFT} STATIC $<TARGET_OBJECTS:dftcommon_obj> $<TARGET_OBJECTS:${TARGET_LIBARRAYMAP_OBJ}>) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
add_library(${TARGET_LIBDFT} SHARED $<TARGET_OBJECTS:dftcommon_obj> $<TARGET_OBJECTS:${TARGET_LIBARRAYMAP_OBJ}>) | |
else() | |
add_library(${TARGET_LIBDFT} STATIC $<TARGET_OBJECTS:dftcommon_obj> $<TARGET_OBJECTS:${TARGET_LIBARRAYMAP_OBJ}>) | |
endif() | |
add_library(${TARGET_LIBDFT} $<TARGET_OBJECTS:dftcommon_obj> $<TARGET_OBJECTS:${TARGET_LIBARRAYMAP_OBJ}>) |
if(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | ||
add_library(${TARGET_LIBSLEEF} SHARED ${STANDARD_SOURCES}) | ||
else() | ||
add_library(${TARGET_LIBSLEEF} STATIC ${STANDARD_SOURCES}) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(_INTERNAL_SLEEF_BUILD_SHARED_LIBS) | |
add_library(${TARGET_LIBSLEEF} SHARED ${STANDARD_SOURCES}) | |
else() | |
add_library(${TARGET_LIBSLEEF} STATIC ${STANDARD_SOURCES}) | |
endif() | |
add_library(${TARGET_LIBSLEEF} ${STANDARD_SOURCES}) |
I'm going to stop highlighting all the places this needs to be reverted.
With the approach I outlined above, PyTorch should be able to build SLEEF by setting If you can show me with clear, reproducible steps in a self-contained post why my approach doesn't work, then I will be very happy to refine it further. I don't have time to read through 100+ comments across multiple PRs. |
Thanks @alexreinking for the suggestions. @blapie Sounds like this is exactly what we discussed in the meeting. I believe it should work with PyTorch. |
@alexreinking I use |
Oh, it seems I misunderstood the approach proposed by @alexreinking . Thanks @xuhancn for the reminder. I guess the difference is whether to overwrite the global |
@jgong5 what I suggested does not alter the global
This is exactly what |
@xuhancn - the documentation you cite outlines a pitfall of using
My solution does not suffer from this problem because it does not touch the cache. |
Currently, we need to guarantee it works on pytorch's submodule. My PR code is validated on pytorch all PreCI. |
@blapie -- I think you're 100% correct in your assessment that PyTorch could use a function to set up |
Currently,we have two PRs can fix
I made a comparsion table to summary the two PRs. My PR is simpler than alexreinking's. My suggestion is merge my PR into master branch. Any comments? @blapie @jgong5 @alexreinking |
I don't have strong preference on the option. Both options LGTM. |
@xuhancn - I think your comparison is wildly unfair.
My PR consists of three lines that matter for the shared libraries issue. It also fixes a real misuse of CMake and improves some console output. Your PR requires ongoing maintenance to ensure that any new With my PR, nothing special needs to be done going forward.
This is also untrue. One just needs basic facts about how variable scope works in CMake.
I encourage the maintainers here to look at both diffs and draw their own conclusions. |
@alexreinking - Hello! Thanks a lot for the insight, I learned a lot reading through your suggestion. First of all, I agree that SLEEF's cmake was in need for some proper housekeeping, which made it prone to bugs. @jgong5 - I believe this solution meets in the middle while make SLEEF build system way more robust. Thanks for contributing to the discussion and sorry I could not reply before. I will review #531 but from the first couple of messages it sounded all good. |
Happy we finally got to a satisfying solution for everyone, thanks @alexreinking, @xuhancn and @jgong5. |
@blapie @alexreinking Thanks a lot for your help and effort for landing it! |
Why we avoid to use
BUILD_SHARED_LIBS
andadd_library
withoutSTATIC/SHARED
parameter. Please check the offical documentNote
part.Additional, I write a test project to test this PR: https://github.com/xuhancn/sleef_cmake_example
Turn on/off this line to control the build type.
https://github.com/xuhancn/sleef_cmake_example/blob/main/CMakeLists.txt#L14
Static:
Shared: