Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CMake VOL builds with FetchContent from local directory #3455

Merged
merged 11 commits into from
Sep 1, 2023
75 changes: 57 additions & 18 deletions CMakeVOL.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,9 @@ endfunction ()
option (HDF5_VOL_ALLOW_EXTERNAL "Allow building of external HDF5 VOL connectors with FetchContent" OFF)
mark_as_advanced (HDF5_VOL_ALLOW_EXTERNAL)
if (HDF5_VOL_ALLOW_EXTERNAL)
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "NO" OR NOT HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT")
message (FATAL_ERROR "HDF5_ALLOW_EXTERNAL_SUPPORT must be set to 'GIT' to allow building of external HDF5 VOL connectors")
endif ()
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "NO" OR (NOT HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT" AND NOT HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOCAL_DIR is not currently a valid choice - just TGZ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to separate the external support source used for VOLs and the external suppport source used elsewhere in the library? There could be a HDF5_VOL_EXTERNAL_SUPPORT for VOL sources, and HDF5_ALLOW_EXTERNAL_SUPPORT could be unchanged. This would avoid the current weirdness where wanting to get a VOL from github means you also need to get ZLIB and SZIP from github.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For maintenance reasons, I didn't want to create a separate path just for support for VOL connectors, since external VFD plugins are another thing we'll have to think about at some point. There shouldn't be any need to grab ZLIB and SZIP from github just by turning on HDF5_ALLOW_EXTERNAL_SUPPORT though. I think that's just because the options for building SZIP and ZLIB are on by default.

Copy link
Contributor

@byrnHDF byrnHDF Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HDF5_ALLOW_EXTERNAL_SUPPORT is just three states NO: go find it, GIT: use a repo and TGZ: use a file.
Because at that time it was all do the same thing, compression, plugins, etc.
There is a supervariable per item: XXX_USE_EXTERNAL which determines whether to do a find_package or use the HDF5_ALLOW_EXTERNAL_SUPPORT option. See the CMakeFilters.cmake file.
For the vol you could create a VOL_USE_EXTERNAL or even a new variable and leave the last option to follow the global HDF5_ALLOW_EXTERNAL_SUPPORT.
Does this need refactored - yes, but maybe for version 2.0 of HDF5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple lines in CMakeFilters.cmake that tell CMake that ZLIB will come from github if "GIT" is set:

if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT" OR HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "TGZ")
  set (ZLIB_USE_EXTERNAL "Use External Library Building for ZLIB" 1)

You can get around this by disabling ZLIB and SZIP when getting a VOL from Github, or just getting them from Github as well, but it's an unintuitive thing to encounter.

message (FATAL_ERROR "HDF5_ALLOW_EXTERNAL_SUPPORT must be set to 'GIT' or 'LOCAL_DIR' to allow building of external HDF5 VOL connectors")
endif()

# For compatibility, set some variables that projects would
# typically look for after calling find_package(HDF5)
Expand All @@ -51,7 +51,7 @@ if (HDF5_VOL_ALLOW_EXTERNAL)

foreach (vol_idx RANGE 1 ${HDF5_MAX_EXTERNAL_VOLS})
# Generate fixed-width index number prepended with 0s
# so URLs come in order from 1 - HDF5_MAX_EXTERNAL_VOLS
# so VOL sources come in order from 1 - HDF5_MAX_EXTERNAL_VOLS
set (vol_idx_num_digits 2) # Based on HDF5_MAX_EXTERNAL_VOLS
set (vol_idx_fixed "${vol_idx}")
string (LENGTH "${vol_idx_fixed}" vol_idx_len)
Expand All @@ -60,33 +60,55 @@ if (HDF5_VOL_ALLOW_EXTERNAL)
math (EXPR vol_idx_len "${vol_idx_len}+1")
endwhile ()

set (HDF5_VOL_URL${vol_idx_fixed} "" CACHE STRING "Git repository URL of an external HDF5 VOL connector to build")
mark_as_advanced (HDF5_VOL_URL${vol_idx_fixed})
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT")
set (HDF5_VOL_URL${vol_idx_fixed} "" CACHE STRING "Git repository URL of an external HDF5 VOL connector to build")
mark_as_advanced (HDF5_VOL_URL${vol_idx_fixed})
set (HDF5_VOL_SOURCE "${HDF5_VOL_URL${vol_idx_fixed}}")
elseif(HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR")
set (HDF5_VOL_PATH${vol_idx_fixed} "" CACHE STRING "Path to the source directory of an external HDF5 VOL connector to build")
mark_as_advanced (HDF5_VOL_PATH${vol_idx_fixed})
set (HDF5_VOL_SOURCE "${HDF5_VOL_PATH${vol_idx_fixed}}")
else()
jhendersonHDF marked this conversation as resolved.
Show resolved Hide resolved
endif()

if (NOT "${HDF5_VOL_SOURCE}" STREQUAL "")
# Deal with trailing slash in path for LOCAL_DIR case
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR")
# Erase trailing slash
string (REGEX REPLACE "/$" "" HDF5_VOL_SOURCE ${HDF5_VOL_SOURCE})
endif()

if (NOT "${HDF5_VOL_URL${vol_idx_fixed}}" STREQUAL "")
# Extract the name of the VOL connector
string (FIND "${HDF5_VOL_URL${vol_idx_fixed}}" "/" hdf5_vol_name_pos REVERSE)
string (FIND "${HDF5_VOL_SOURCE}" "/" hdf5_vol_name_pos REVERSE)
if (hdf5_vol_name_pos EQUAL -1)
message (SEND_ERROR "Invalid URL '${HDF5_VOL_URL${vol_idx_fixed}}' specified for HDF5_VOL_URL${vol_idx_fixed}")
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT")
message (SEND_ERROR "Invalid URL '${HDF5_VOL_SOURCE}' specified for HDF5_VOL_URL${vol_idx_fixed}")
elseif (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR")
message (SEND_ERROR "Invalid source path '${HDF5_VOL_SOURCE}' specified for HDF5_VOL_PATH${vol_idx_fixed}")
else()
jhendersonHDF marked this conversation as resolved.
Show resolved Hide resolved
endif()
endif ()

math (EXPR hdf5_vol_name_pos "${hdf5_vol_name_pos}+1")

string (SUBSTRING "${HDF5_VOL_URL${vol_idx_fixed}}" ${hdf5_vol_name_pos} -1 hdf5_vol_name)
string (SUBSTRING "${HDF5_VOL_SOURCE}" ${hdf5_vol_name_pos} -1 hdf5_vol_name)
string (REPLACE ".git" "" hdf5_vol_name "${hdf5_vol_name}")
string (STRIP "${hdf5_vol_name}" hdf5_vol_name)
string (TOUPPER "${hdf5_vol_name}" hdf5_vol_name_upper)
string (TOLOWER "${hdf5_vol_name}" hdf5_vol_name_lower)

message (VERBOSE "Building VOL connector '${hdf5_vol_name}' with FetchContent")
message (VERBOSE "Building VOL connector '${hdf5_vol_name}' with FetchContent from source ${HDF5_VOL_SOURCE}")

# Set some cache variables that can be set by users when building
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT")
set ("HDF5_VOL_${hdf5_vol_name_upper}_BRANCH" "main" CACHE STRING "Git branch (or tag) to use when building VOL connector '${hdf5_vol_name}'")
mark_as_advanced ("HDF5_VOL_${hdf5_vol_name_upper}_BRANCH")
endif()

set ("HDF5_VOL_${hdf5_vol_name_upper}_NAME" "" CACHE STRING "Name of VOL connector to set for the HDF5_VOL_CONNECTOR environment variable")
set ("HDF5_VOL_${hdf5_vol_name_upper}_BRANCH" "main" CACHE STRING "Git branch (or tag) to use when building VOL connector '${hdf5_vol_name}'")
option ("HDF5_VOL_${hdf5_vol_name_upper}_TEST_PARALLEL" "Whether to test VOL connector '${hdf5_vol_name}' against the parallel API tests" OFF)

mark_as_advanced ("HDF5_VOL_${hdf5_vol_name_upper}_NAME")
mark_as_advanced ("HDF5_VOL_${hdf5_vol_name_upper}_BRANCH")
mark_as_advanced ("HDF5_VOL_${hdf5_vol_name_upper}_TEST_PARALLEL")

if (HDF5_TEST_API)
Expand All @@ -95,21 +117,38 @@ if (HDF5_VOL_ALLOW_EXTERNAL)
endif ()
endif ()

if ("${HDF5_VOL_${hdf5_vol_name_upper}_BRANCH}" STREQUAL "")
if ((HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT") AND ("${HDF5_VOL_${hdf5_vol_name_upper}_BRANCH}" STREQUAL ""))
message (SEND_ERROR "HDF5_VOL_${hdf5_vol_name_upper}_BRANCH must be set to a valid git branch name (or git tag) to build VOL connector '${hdf5_vol_name}'")
endif ()

FetchContent_Declare (HDF5_VOL_${hdf5_vol_name_lower}
GIT_REPOSITORY "${HDF5_VOL_URL${vol_idx_fixed}}"
if ((HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR")
AND NOT (EXISTS ${HDF5_VOL_SOURCE} AND IS_DIRECTORY ${HDF5_VOL_SOURCE}))
message (FATAL_ERROR "HDF5_VOL_PATH${vol_idx_fixed} must be an absolute path to a valid directory")
endif ()

if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT")
FetchContent_Declare (HDF5_VOL_${hdf5_vol_name_lower}
GIT_REPOSITORY "${HDF5_VOL_SOURCE}"
GIT_TAG "${HDF5_VOL_${hdf5_vol_name_upper}_BRANCH}"
)
)
elseif(HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR")
FetchContent_Declare (HDF5_VOL_${hdf5_vol_name_lower}
SOURCE_DIR "${HDF5_VOL_SOURCE}"
)
else()
jhendersonHDF marked this conversation as resolved.
Show resolved Hide resolved
endif()

FetchContent_GetProperties(HDF5_VOL_${hdf5_vol_name_lower})
if (NOT hdf5_vol_${hdf5_vol_name_lower}_POPULATED)
FetchContent_Populate(HDF5_VOL_${hdf5_vol_name_lower})

if (NOT EXISTS "${hdf5_vol_${hdf5_vol_name_lower}_SOURCE_DIR}/CMakeLists.txt")
message (SEND_ERROR "The git repository branch '${HDF5_VOL_${hdf5_vol_name_upper}_BRANCH}' for VOL connector '${hdf5_vol_name}' does not appear to contain a CMakeLists.txt file")
if (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "GIT")
message (SEND_ERROR "The git repository branch '${HDF5_VOL_${hdf5_vol_name_upper}_BRANCH}' for VOL connector '${hdf5_vol_name}' does not appear to contain a CMakeLists.txt file")
elseif (HDF5_ALLOW_EXTERNAL_SUPPORT MATCHES "LOCAL_DIR")
message(SEND_ERROR "The local directory '${HDF5_VOL_SOURCE}' for VOL connector '${hdf5_vol_name}' does not appear to contain a CMakeLists.txt file")
else()
jhendersonHDF marked this conversation as resolved.
Show resolved Hide resolved
endif()
endif ()

# If there are any calls to find_package(HDF5) in the connector's
Expand Down
56 changes: 42 additions & 14 deletions doc/cmake-vols-fetchcontent.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,10 +26,13 @@ two CMake variables must first be set:

HDF5_ALLOW_EXTERNAL_SUPPORT (Default: "NO")
This variable is a string that specifies the manner in which the source code for
an external VOL connector will be retrieved. Currently, this variable must be set
to "GIT" for building external VOL connectors.
an external VOL connector will be retrieved. This variable must be set
to "GIT" for building external VOL connectors from a Github repository, or
set to "LOCAL_DIR" to build from a local source directory.

Once the `HDF5_VOL_ALLOW_EXTERNAL` option is set to ON and the `HDF5_ALLOW_EXTERNAL_SUPPORT`
### Building

If the `HDF5_VOL_ALLOW_EXTERNAL` option is set to ON and the `HDF5_ALLOW_EXTERNAL_SUPPORT`
variable is set to "GIT", the CMake cache will be populated with a predefined
(currently 10) amount of new variables, named:

Expand All @@ -49,26 +52,44 @@ For each URL specified, HDF5's CMake code will attempt to use CMake's
[FetchContent](https://cmake.org/cmake/help/latest/module/FetchContent.html)
functionality to retrieve the source code for a VOL connector pointed to by
that URL and will try to build that VOL connector as part of the HDF5 library
build process. The VOL connector must be able to be built by CMake and currently
build process.

If `HDF5_ALLOW_EXTERNAL_SUPPORT` is set to "LOCAL_DIR", then the CMake cache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the existing use of HDF5_ALLOW_EXTERNAL_SUPPORT; these changes are overwriting current usage reqs.

will instead be populated with the variables:

HDF5_VOL_PATH01
HDF5_VOL_PATH02
HDF5_VOL_PATH03
...

For each of these variables, an absolute path that points to a local
directory containing source code for an HDF5 VOL connector
can be specified. For example, to specify a local clone of the
REST VOL connector stored under one's home directory, one can provide
the following option to `cmake`:

-DHDF5_VOL_PATH01=/home/vol-rest

Regardless of the method used to obtain the VOL source code,
the VOL connector must be able to be built by CMake and currently
must have a CMakeLists.txt file in the top level of the source tree in order to
be buildable by this process. If the source code for a VOL connector is successfully
retrieved, the HDF5 build's CMake cache will be populated with variables from
the VOL connector's CMake code, as if one were building the connector by itself.
This gives one the ability to customize the build of the connector as usual.

The CMake cache will also be populated with a few new variables for each VOL
connector that was successfully retrieved from a given URL. To generate these
variables, the CMake code first creates an internal name for the VOL connector
connector that was successfully retrieved. To generate these
variables, the CMake code first creates an internal name for the VOL connector.
If the source was retrieved from a URL, then the name is generated
by stripping off the last part of the Git repository URL given for the connector,
removing the ".git" suffix and any whitespace and then upper-casing the result.
For example, the name of the VOL connector located at the URL
https://github.com/hpc-io/vol-async.git would become "VOL-ASYNC". Then, the following
new variables get created:
https://github.com/hpc-io/vol-async.git would become "VOL-ASYNC". If the source was
retrieved from a local directory, then the source directory's name is trimmed of whitespace,
upper-cased, and has any trailing slashes removed.

HDF5_VOL_<VOL name>_BRANCH (Default: "main")
This variable specifies the git branch name or tag to use when fetching
the source code for the VOL connector with the CMake-internal name
'<VOL name>'.
After the VOL's internal name is generated, the following new variables get created:

HDF5_VOL_<VOL name>_NAME (Default: "")
This variable specifies the string that should be used when setting the
Expand All @@ -84,8 +105,15 @@ new variables get created:
This variable determines whether the VOL connector with the CMake-internal
name '<VOL name>' should be tested against HDF5's parallel tests.

If the source was retrieved from a Git URL, then the following variable will additionally be created:

HDF5_VOL_<VOL name>_BRANCH (Default: "main")
This variable specifies the git branch name or tag to use when fetching
the source code for the VOL connector with the CMake-internal name
'<VOL name>'.

As an example, this would create the following variables for the
previously-mentioned VOL connector:
previously-mentioned VOL connector if it is retrieved from a URL:

HDF5_VOL_VOL-ASYNC_BRANCH
HDF5_VOL_VOL-ASYNC_NAME
Expand All @@ -99,7 +127,7 @@ doesn't parse the string as a list. If `cmake` is run from a shell, extra care
may need to be taken when escaping the semicolons depending on how the
shell interprets backslashes.

### Example - Build and test HDF5 Asynchronous I/O VOL connector
### Example - Build and test HDF5 Asynchronous I/O VOL connector from GIT

Assuming that the HDF5 source code has been checked out and a build directory
has been created, running the following cmake command from that build directory
Expand Down