Fix segfault on user compound dtype conversion callback #3842

Merged Nov 22, 2023 (6 commits)

Conversation

mattjala
Contributor

Fix for #3840, with associated test.

The bad memory access occurred while checking if the two datatypes were compound subsets of one another. This check is done using information in cdata->priv, which is never set when a user conversion callback is in use. This fixes the issue by returning NULL, which is already checked for in the macros that use the compound subset information (H5D__SCATGATH_USE_CMPD_OPT_READ, H5D__SCATGATH_USE_CMPD_OPT_WRITE).

mattjala added the labels Merge - To 1.14; Priority - 2. Medium; Component - C Library; Type - Bug / Bugfix on Nov 10, 2023
fortnern
Member

I'm concerned that a user conversion function might use the priv field for its own purposes, and then still trigger the bug because the library code only checks to see if priv is NULL. Instead, we could add a check in H5T_path_compound_subset() to see if it's using an app conversion function (and maybe check to see if the function is H5T__conv_struct() instead of checking for are_compounds) before calling H5T__conv_struct_subset(). H5T_path_compound_subset() is the only place that calls H5T__conv_struct_subset().
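
The guard suggested here could look roughly like the sketch below. It is a self-contained model, not the library's code: `demo_path`, `demo_subset_info`, their fields, and `demo_path_compound_subset()` are hypothetical stand-ins for the conversion path, the data kept in `cdata->priv`, and H5T_path_compound_subset().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the subset information the library's own
 * compound conversion keeps behind the priv pointer. */
typedef struct demo_subset_info {
    int    subset;    /* which type, if any, is a subset of the other */
    size_t copy_size; /* bytes that can be copied without conversion */
} demo_subset_info;

/* Hypothetical, simplified conversion path. */
typedef struct demo_path {
    bool  is_app;        /* true when an application callback is registered */
    bool  are_compounds; /* both source and destination are compound types */
    void *priv;          /* owned by whichever conversion function is in use */
} demo_path;

/* Hand out subset information only when priv is known to hold it.
 * A non-NULL priv is not enough: an application callback may store
 * its own data there. */
static const demo_subset_info *
demo_path_compound_subset(const demo_path *p)
{
    if (p == NULL || p->is_app || !p->are_compounds)
        return NULL;
    return (const demo_subset_info *)p->priv;
}
```

Callers that already treat a NULL result as "no optimization available" need no further changes under this model.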

lrknox merged commit af4c6c4 into HDFGroup:develop Nov 22, 2023
mattjala deleted the conversion_segfault_fix branch November 22, 2023 17:22
mattjala added a commit to mattjala/hdf5 that referenced this pull request Mar 18, 2024
mattjala added a commit to mattjala/hdf5 that referenced this pull request Mar 18, 2024
derobins pushed a commit that referenced this pull request Mar 19, 2024
qkoziol pushed a commit to qkoziol/hdf5 that referenced this pull request Mar 19, 2024
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Mar 21, 2024
lrknox added a commit that referenced this pull request Mar 21, 2024
* Added new H5E with tests. (#4049)

Added Fortran H5E APIs:
h5eregister_class_f, h5eunregister_class_f, h5ecreate_msg_f, h5eclose_msg_f
h5eget_msg_f, h5epush_f, h5eget_num_f, h5ewalk_f, h5eget_class_name_f,
h5eappend_stack_f, h5eget_current_stack_f, h5eset_current_stack_f, h5ecreate_stack_f,
h5eclose_stack_f, h5epop_f, h5eprint_f (C h5eprint v2 signature)

Addresses Issue #3987

* Don't load toolchain file in CMake workflows (#4077)

* Add support for the new MSVC preprocessor (#4078)

Microsoft has added a new, standards-conformant preprocessor
to MSVC, which can be enabled with /Zc:preprocessor. This
preprocessor trips over our HDopen() function-like variadic
macro since it uses a hack that only works with the legacy
MSVC preprocessor.

This fix adds ifdefs to use the correct HDopen() macro
depending on the MSVC preprocessor selected.

Fixes #2515
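
As an illustration of the kind of ifdef described above, here is a minimal sketch. `demo_open()`, `DEMO_OPEN` and `DEMO_CREAT` are hypothetical names, and this is not the library's actual HDopen() definition. It assumes MSVC's documented `_MSVC_TRADITIONAL` macro, which is 1 for the legacy preprocessor and 0 for the conformant one.

```c
#include <stdarg.h>

#define DEMO_CREAT 0x100 /* hypothetical flag: a mode argument follows */

/* Hypothetical variadic target standing in for an open() wrapper.
 * Returns the optional mode argument, or 0 when none was passed. */
static int
demo_open(const char *path, int flags, ...)
{
    int mode = 0;

    (void)path;
    if (flags & DEMO_CREAT) {
        va_list ap;

        va_start(ap, flags);
        mode = va_arg(ap, int);
        va_end(ap);
    }
    return mode;
}

#if defined(_MSC_VER) && (!defined(_MSVC_TRADITIONAL) || _MSVC_TRADITIONAL)
/* Legacy MSVC preprocessor: silently drops the trailing comma when the
 * variadic part is empty. */
#define DEMO_OPEN(S, F, ...) demo_open(S, F, __VA_ARGS__)
#else
/* Conformant MSVC preprocessor, GCC, Clang: the comma is only removed
 * when the variadic argument is missing and marked with ##. */
#define DEMO_OPEN(S, F, ...) demo_open(S, F, ##__VA_ARGS__)
#endif
```

With either branch, both `DEMO_OPEN("f", 0)` and `DEMO_OPEN("f", DEMO_CREAT, 0644)` should expand to valid calls.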

* Increased H5FD_ROS3_MAX_SECRET_TOK_LEN to 4096 to accommodate long AWS secret tokens (#4064)

ros3: increased H5FD_ROS3_MAX_SECRET_TOK_LEN to 4096; stratified the debugging statements so there is more control over the output

* Close and reopen file during dset vlen IO API tests (#4050)

- Close/reopen file and file objects to prevent cache from being used instead of actual I/O.
- Moved vlen io test datasets under the dset container group instead of the root group
- Moved the PASSED() invocation to after individual test cleanup in case an error occurs during H5Treclaim

* New option for building with static CRT in Windows (#4062)

* addressed compilation errors with gfortran 4.8 (#4082)

* Fix bin/trace script w/ out params (#4074)

The bin/trace script adds TRACE macros to public API calls in the main
C library. This script had a parsing bug that caused functions that
were annotated with /*out*/, etc. to be labeled as void pointers
instead of typed pointers.

This is mainly a developer feature and not visible to consumers
of the public API.

The bin/trace script now annotates public API calls properly.

Fixes GH #3733

* Use H5T_STD_I32LE to create datatype in vds examples (#4070)

Fixes issues when VDS examples are tested on BE systems

* Remove printf debugging in H5HL code (#4086)

* Fixed asserts due to H5Pset_est_link_info() values (#4081)

* Fixed asserts due to H5Pset_est_link_info() values

If large values for est_num_entries and/or est_name_len were passed
to H5Pset_est_link_info(), the library would attempt to create an
object header NIL message to reserve enough space to hold the links in
compact form (i.e., concatenated), which could exceed allowable object
header message size limits and trip asserts in the library.

This bug only occurred when using the HDF5 1.8 file format or later and
required the product of the two values to be ~64k more than the size
of any links written to the group, which would cause the library to
write out a too-large NIL spacer message to reserve the space for the
unwritten links.

The library now inspects the phase change values to see if the dataset
is likely to be compact and checks the size to ensure any NIL spacer
messages won't be larger than the library allows.

Fixes GitHub #1632

* Fix copy-paste comments
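
The size check described above can be modeled in a few lines. This is only a sketch under stated assumptions: the function name, the `max_compact` parameter and the 64 KiB cap (`DEMO_MESG_MAX_SIZE`) are hypothetical and chosen for illustration, not taken from the library source.

```c
#include <stddef.h>

#define DEMO_MESG_MAX_SIZE 65536u /* assumed cap on one header message */

/* Decide how many bytes to reserve up front for links stored compactly.
 * Returns 0 when the group is unlikely to stay compact or when the
 * estimate would not fit in a single message. */
static size_t
demo_link_reserve_size(unsigned est_num_entries, unsigned est_name_len,
                       unsigned max_compact)
{
    unsigned long long est;

    /* More links expected than compact storage allows: reserve nothing. */
    if (est_num_entries > max_compact)
        return 0;

    /* Widen before multiplying so the product cannot wrap around. */
    est = (unsigned long long)est_num_entries * est_name_len;
    if (est >= DEMO_MESG_MAX_SIZE)
        return 0;

    return (size_t)est;
}
```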

* update macOS support statement (#4084)

* fixes compilation failures due to H5E additions (#4090)

* Remove extra whitespaces from nvhpc-cmake action. (#4091)

* Remove printf debugging in H5I package (#4088)

* Add subfiling for h5dump filedriver option help message (#3878)

* Merge HDF4 release workflow changes to hdf5 (#4093)

* Update long double test with correct values (#4060)

Update long double test with correct values

* virtual creates must use the same datatype as the main file (#4092)

* Fixed -Wdeprecated-copy-dtor warnings by implementing a copy assignment operator (#3306)

Example warning was:

warning: definition of implicit copy assignment operator for 'Group' is deprecated because it has a user-declared destructor [-Wdeprecated-copy-dtor]

* Expand check for variable-length or reference types when clearing datatype conversion paths (#4085)

When clearing out datatype conversion paths involving variable-length or reference datatypes
on file close, also check for these datatypes inside compound or array datatypes

* Remove H5B debug checks (#4089)

The H5B (version 1 B-tree) package would add some computationally
expensive integrity checks when H5B_DEBUG was defined. Due to their
negative effects on performance, this option was rarely turned on,
making the H5B__assert() check function stale, if not dead, code.

This change:

* Builds H5B__assert() when NDEBUG is not defined (the function
  relies on assert()) so it gets compiled more often.
* Removes some printf debugging statements in the B-tree code
* Removes all H5B "extra debug" checks that are leftover from
  past debugging sessions. Maintainers can add H5B__assert()
  selectively to perform integrity checks when debugging.
* Removes the HDF5_ENABLE_DEBUG_H5B CMake option

H5B_DEBUG now has no effect

* Fix the last C++ stack size warning (#4099)

* Clean up off_t usage (#4095)

* Add comments to C++ and Fortran API calls that use off_t
* Remove noise casts for small integers

* Correct missing change of source path for S3 build (#4100)

* Remove leading / from relative path. (#4101)

* msvc: don't declare `HAVE_TIMEZONE` for older MSVC (#3956)

It was introduced in MSVC 15 (Visual Studio 2017).

* Remove a few H5O printf debugging statements (#4096)

These were in H5Oint.c, were protected by H5O_DEBUG, and only dumped
to stdout if the HDF5_DEBUG environment variable were set to do so.

* Remove H5DEBUG() calls from H5Dmpio.c (#4103)

Just use stdout when a stream is needed.

* Remove printf debugging from H5Smpio.c (#4098)

* Change how stats are printed in H5Z (#4097)

H5Z used the soon-to-be-removed HDEBUG macro to decide if stats
would be dumped and to what stream. This is now handled by a
DUMP_DEBUG_STATS_g variable and the output is always sent to
stdout.

This is an internal change, not normally visible to users.

* Refactor datatype conversion code to use pointers rather than IDs (#4104)

The datatype conversion code previously used IDs for the source and
destination datatypes rather than pointers to the internal structures
for those datatypes. This was mostly due to the need for an ID for these
datatypes that can be passed to an application-registered datatype
conversion function or datatype conversion exception function. However,
using IDs internally caused a lot of unnecessary ID lookups and hurt
performance of datatype conversions in general. This was especially
problematic for compound datatype conversions, where the ID lookups were
occurring on every member of every compound element of a dataset. The
code has now been refactored to use pointers internally and only create
IDs for datatypes when necessary.

Fixed a test issue in dt_arith where a library datatype conversion
function was being cast to an application conversion function. Since the
two have different prototypes, this started failing after the parameters
for a library conversion function changed from hid_t to H5T_t * and an
extra parameter was added. This appears to have worked coincidentally in
the past since the only difference between a library conversion function
and application conversion function was an extra DXPL parameter at the
end of an application conversion function.

Fixed an issue where memory wasn't being freed in the h5fc_chk_idx test
program. Even though the program exits quickly after allocating the
memory, it still causes failures when testing with -fsanitize=address

* Minimize use of abort() (#4110)

The abort() call is used in several places where it probably shouldn't be.

* Clean up a few things in H5T.c (#4105)

* remove (size_t) noise casts
* replace (hid_t)FAIL with H5I_INVALID_HID

* Convert H5B__assert to use error checks (#4109)

Switches assert() calls to HGOTO_ERROR in H5B__assert() so it can be
used in production mode. Also renames it to H5B__verify_structure()
to better reflect what it checks.
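
The general pattern of swapping assert() for returned errors is sketched below. The node layout and `demo_verify_structure()` are hypothetical and far simpler than a real B-tree node. Plain return codes stand in for the library's HGOTO_ERROR machinery.

```c
#include <stddef.h>

/* Hypothetical, simplified node whose keys must be in non-decreasing order. */
typedef struct demo_node {
    size_t     nkeys;
    const int *keys;
} demo_node;

/* Returns 0 if the node is well-formed and -1 otherwise. Because it
 * reports failures instead of asserting, it still works when NDEBUG
 * is defined. */
static int
demo_verify_structure(const demo_node *node)
{
    size_t i;

    if (node == NULL || (node->nkeys > 0 && node->keys == NULL))
        return -1;

    for (i = 1; i < node->nkeys; i++)
        if (node->keys[i - 1] > node->keys[i])
            return -1;

    return 0;
}
```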

* Move common variables out of cache test blocks (#4108)

Fixes a stack size warning w/ XCode

* Remove lint comments (#4107)

* Change compression tests reference files to use masking for compression ratios (#4083)

Rework TEST_FILTER tests to handle slightly different compression ratios

* Add Doxygen for HDFS VFD (#4106)

* Add Doxygen for HDFS VFD

* Fix Doxygen warning

* Update H5FDhdfs.h

* long double tests have problems setting precision with offset (#4102)

* long double tests have problems setting precision with offset

* Disable long double until more development fixes issues

* Fix up dsets test for some platforms with different long double format (#4114)

* Adjust aocc workflow to fit the autotools/cmake pattern of other callable workflows. (#4115)

* Implement ID creation optimization for container datatype conversions (#4113)

Makes the datatype conversion context object available during both the
initialization and conversion processes for a datatype conversion
function, allowing the compound, variable-length and array datatype
conversion functions to avoid creating IDs for the datatypes when they
aren't necessary

Adds internal H5CX_pushed routine to determine if an API context is
available to retrieve values from

Also adds error checking to several places in H5T.c and H5Tconv.c where
the code had previously assumed object close operations would succeed

* Handle IBM long double issues in dsets.c test_floattypes test (#4116)

* Handle IBM long double issues in dsets.c test_floattypes test

* Remove old cmake check (#4117)

* Use AC_SYS_LARGEFILE on Autotools (#4119)

We previously used a hack introduced in 1.8.5 to paper over differences
in off_t and off64_t when determining the type sizes. We no longer explicitly
support off64_t in the library and AC_SYS_LARGEFILE works fine.

* Initialize selection type in chunk struct (#4087)

* Overhaul CMake LFS support (#4122)

Externally visible:
* The HDF_ENABLE_LARGE_FILE option (advanced) has been removed
* We no longer run a test program to determine if LFS works, which
  will help with cross-compiling
* On Linux we now unilaterally set -D_LARGEFILE_SOURCE and
  -D_FILE_OFFSET_BITS=64, regardless of 32/64 bit system. CMake
  doesn't offer a nice equivalent to AC_SYS_LARGEFILE and since
  those options do nothing on 64-bit systems, this seems safe and
  covers all our bases. We don't set -D_LARGEFILE64_SOURCE since
  we don't use any of the POSIX 64-bit specific API calls like
  ftello64, as noted above.
* We didn't test for LFS support on non-Linux platforms. We've added
  comments for how LFS should probably be supported on AIX and Solaris,
  which seem to be alive, though uncommon. PRs would be appreciated if
  anyone wishes to test this.

Internal:
* Drops off64_t size checks since this is unused (as in Autotools)
* Remove HDF_EXTRA_FLAGS, which is now unused
* Remove hack around deprecated LINUX_LFS

Fixes #2395
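
To see what the two defines mentioned above do, a small check like the following can be compiled on a POSIX system. It is a sketch, not part of the build system, and `demo_off_t_bits()` is a hypothetical helper. The macros must be defined before the first system header is included, which is why build systems usually pass them as -D flags.

```c
/* Feature-test macros: only effective if set before any system header. */
#ifndef _LARGEFILE_SOURCE
#define _LARGEFILE_SOURCE 1
#endif
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <stddef.h>
#include <sys/types.h>

/* Compilation fails here if off_t is narrower than 64 bits. */
_Static_assert(sizeof(off_t) >= 8, "off_t must be at least 64 bits wide");

/* Width of off_t in bits, assuming 8-bit bytes. */
static size_t
demo_off_t_bits(void)
{
    return sizeof(off_t) * 8;
}
```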

* Update CMake comment about _POSIX_C_SOURCE (#4124)

Was missing the 2008 pread/write info

* Deprecate bin/cmakehdf5 (#4127)

* Deprecate bin/cmakehdf5

* Add reference text

* Don't set the rpath when linking statically (#4125)

* Remove invalid compile flag (#4129)

* Fix segfault in vlen io API test (#4130)

* Update URLs in RELEASE.txt (#4132)

* Add cygwin CI and update yaml files for consistency and accuracy (#4131)

* Add cygwin CI

* add cygwin packages

* Correct option names

* Cleanup yaml file and synch look and feel

* Synch CI look and feel and correct path issues

* Upgrade oneapi version

* pwsh needs env: for vars

* No continuation char for pwsh

* restore correct pwsh step

* Run subset of tests for cygwin workflow

* Remove space chars in regex

* restore full tests

* Remove ros3 and hdfs VFDs from Autotools VFD list (#4142)

These will never pass `make check` and would require a custom test
suite for more comprehensive testing.

* Skip part of dsets.c test for IBM long double type (#4136)

* Capitalize option message for consistency. (#4141)

* Fixed misc. H5E fortran failures due to previous PR (#4138)

* fixed promotion of integers and reals tests and check-passthrough-vol failure

* fixed cygwin issue

* Fix Autotools -Werror cleanup (#4144)

The Autotools temporarily scrub -Werror(=whatever) from CFLAGS, etc.
  so configure checks don't trip over warnings generated by configure
  check programs. The sed line originally only scrubbed -Werror but not
  -Werror=something, which would cause errors when the '=something' was
  left behind in CFLAGS.

  The sed line has been updated to handle -Werror=something lines.

  Fixes one issue raised in #3872

* Fix doxygen link to example function usage (#4133)

* Remove useless headers (#4145)

Removes unnecessary headers from C library source files.

* Clean up some hbool_t/TRUE/FALSE stragglers (#4143)

It looks like most of these snuck in via selection I/O work

* Fix error when overwriting an indirectly nested vlen with a shorter sequence (#4140)

* defined CMAKE_H5_HAVE_DARWIN (#4146)

* Make the newsletter scheme work like HDF4 (#4149)

* Remove  at the end of list item. (#4151)

* Fix buffer size calculation in the deflate filter (#4147)

* Remove H5O header and friend status from H5A.c (#4150)

* Remove HDF from Fortran 2003 configuration check message. (#4157)

* Suppress H5Dmpio debugging output unless HDF5_DEBUG=d is set (#4155)

* Header cleanup in C library (#4154)

* Ensure H5FL header is included everywhere

* Ensure H5SL header is included everywhere

* Ensure H5MM header is included everywhere

* Add Doxygen to H5FDmirror.h (#4158)

* Remove lseek64 and stat64 symbols from CMake (#4163)

We don't use these in the library.

* Remove HAVE_IOEO checks from CMake (#4160)

This was intended to check for thread-safety functionality on Windows.
The required functionality has been standard since Windows Vista, so
these checks can be removed.

* Fix some minor warnings (#4165)

* Bump the size of the mirror VFD IP field (#4167)

The IP address string isn't big enough to hold an IPv4-mapped IPv6
address.
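
For a sense of the sizes involved, the sketch below measures the longest textual form of an IPv4-mapped IPv6 address. The helper and the sample constant are hypothetical. POSIX defines INET6_ADDRSTRLEN as 46, which matches the 45 characters plus terminator computed here.

```c
#include <stddef.h>
#include <string.h>

/* Fully expanded IPv4-mapped IPv6 address: 45 characters. */
static const char demo_longest_mapped[] =
    "0000:0000:0000:0000:0000:ffff:255.255.255.255";

/* Returns nonzero if a buffer of buf_size bytes can hold addr and its
 * terminating NUL. */
static int
demo_ip_fits(const char *addr, size_t buf_size)
{
    return strlen(addr) + 1 <= buf_size;
}
```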

* Fix mirror VFD script (#4170)

This had directory problems when running locally.

* Fix an issue where the Subfiling VFD's context cache grows too large (#4159)

* Address code page issues w/ Windows file paths (#4172)

On Windows, HDF5 attempted to convert file paths passed to open() and
remove() to UTF-16 in order to handle Unicode file paths. This scheme
does not work when the system uses code pages to handle non-ASCII
file names.

As suggested in the forum post below, we now also try to see if we
can open the file with open(), which should handle systems where
non-ASCII code pages are in use.

https://forum.hdfgroup.org/t/open-create-hdf5-files-with-non-utf8-chars-such-as-shift-jis/11785

* Add Doxygen to API calls in H5VLnative.h (#4173)

* Allow H5Soffset_simple to accept NULL offsets (#4152)

The reference manual states that the offset parameter of H5Soffset_simple()
  can be set to NULL to reset the offset of a simple dataspace to 0. This
  has never been true, and passing NULL was regarded as an error.

  The library will now accept NULL for the offset parameter and will
  correctly set the offset to zero.

  Fixes HDFFV-9299
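
The new behavior can be modeled as below. With the real API this corresponds to passing NULL as the offset argument of H5Soffset_simple(). The `demo_space` struct and `demo_offset_simple()` are hypothetical simplifications that do not use the library.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_RANK 32 /* assumed upper bound on dimensions */

/* Hypothetical, simplified dataspace holding only a selection offset. */
typedef struct demo_space {
    unsigned  rank;
    long long offset[DEMO_MAX_RANK];
} demo_space;

/* Set the selection offset. A NULL offset resets every dimension to 0
 * rather than being rejected as an error. */
static int
demo_offset_simple(demo_space *space, const long long *offset)
{
    size_t nbytes;

    if (space == NULL || space->rank == 0 || space->rank > DEMO_MAX_RANK)
        return -1;

    nbytes = sizeof(space->offset[0]) * space->rank;
    if (offset == NULL)
        memset(space->offset, 0, nbytes);
    else
        memcpy(space->offset, offset, nbytes);

    return 0;
}
```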

* Add filter plugin user guide text. Fix registered URL in docs (#4169)

* Add support for _Float16 16-bit floating point type (#4065)

Fixed some conversion issues with Clang due to problematic undefined
behavior when casting a negative floating-point value to an integer

Fixed a bug in the library's software integer to floating-point
conversion function where a user's conversion exception function
returning H5T_CONV_UNHANDLED in the case of overflows would result in
incorrect data after conversion

Added configure checks for functions and macros related to _Float16
usage since some compilers expose the datatype but not the functions or
macros

Fixed a dt_arith test failure when H5_WANT_DCONV_EXCEPTION isn't defined

Fixed a few warnings from not explicitly casting some _Float16 variables
upwards
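
On the undefined-behavior item in this entry: C leaves a floating-point to integer conversion undefined when the truncated value cannot be represented in the target type, which includes values of -1 or less converted to an unsigned type. One common way to avoid it is to range-check before casting, as in this sketch. `demo_double_to_uint()` is a hypothetical helper, and the saturating policy is an assumption, not what the library's conversion routines necessarily do.

```c
#include <limits.h>

/* Convert a double to unsigned int without an out-of-range cast:
 * NaN and non-positive values map to 0, large values saturate. */
static unsigned
demo_double_to_uint(double v)
{
    if (!(v > 0.0)) /* false for NaN, zero and negatives */
        return 0;
    if (v >= (double)UINT_MAX)
        return UINT_MAX;
    return (unsigned)v; /* in range, so truncation is well defined */
}
```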

* Remove some H5T_copy calls that are now unnecessary (#4164)

Removes some datatype copying calls that are now unnecessary after
refactoring the datatype conversion code to use pointers internally
rather than IDs

Rewrites the enum conversion function so that it uses cached copies
of the source and destination datatypes in order to avoid modifying
the datatypes passed in

Adds a 'recursive' field to the datatype conversion context which
allows the conversion functions for members of a container datatype
to skip unnecessary repetitive conversion setup code

Changes internal datatype conversion callback functions so that the
source and destination datatype structure pointers are const

Removes some unused and unnecessary internal IDs registered with
H5I_register

* Add RELEASE.txt note for cmpd segfault fix (#4175)

RELEASE notice for the fix in #3842

* Clean up CMake direct VFD handling (#4161)

There's no need to build and run programs, or even check the operating
system. We just need to check for O_DIRECT and posix_memalign().

* Capitalize linux for consistency (#4178)

* Reworked H5Epush_f (#4153)

* Add const to new _Float16 conversion routine parameters (#4181)

* Update Release Specific Information link. (#4179)

* Filter plugins updates for registration URL (#4180)

* Update filter plugin URL to new location

* Adjust test array size

* Add daily VFD CI workflow (#4176)

Adds testing of Subfiling VFD

* Exclude shell tests from sanitizers (#4186)

* Add a missing period at the end of sentence. (#4184)

* last-file.txt should not be created for release workflow (#4185)

* Skip part of dtypes.c _Float16 file size check for certain VFDs (#4182)

* Fixes several MinGW + Autotools issues (#4190)

* Fixes detection of various Windows libraries, etc.
* Corrects alarm(2) configure checks
* Uses Win32 threads by default w/ Pthreads override, if desired
* Set _WIN32_WINNT correctly for MinGW
* Fix setenv(3) wrapper for MinGW, which does not have getenv_s()

MinGW Autotools support is still not Amazing, but this at least
allows the library and tools to build and handles thread-safety better

* Add semicolons to the end of HSYS_GOTO_ERROR (#4193)

Looks like we forgot these when we did the other macros.

* Remove broken links (#4187)

* Skip vlen IO API test for cache VOL (#4135)

* Fix cache VOL segfault in vlen io test
* Skip vlen IO API test

* Handle certain empty subfiling environment variables (#4038)

* h5diff compares attribute data like dataset data (#4191)

Updates tools docs to indicate that dataset and attribute data are compared in the same way

* A path component may include a dot with other characters (#4192)

* Add RELEASE.txt note for recent datatype conversion improvements (#4195)

* Add NEWSLETTER item about _Float16 support (#4197)

* Correct download link for develop doxygen (#4196)

* Update version in new .yml files.