diff --git a/_layouts/default.html b/_layouts/default.html index 25f3e83a..4d253cbf 100644 --- a/_layouts/default.html +++ b/_layouts/default.html @@ -36,6 +36,7 @@

{{ site.title | default: site.github.repo
  • Documentation
  • Community Forum
  • Help Desk
  + • FTP
  • Licenses

  • diff --git a/documentation/hdf5-docs/advanced_topics/cp_committed_dt_H5Ocopy.md b/documentation/hdf5-docs/advanced_topics/cp_committed_dt_H5Ocopy.md new file mode 100644 index 00000000..8de9ba82 --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics/cp_committed_dt_H5Ocopy.md @@ -0,0 +1,93 @@ +--- +title: Copying Committed Datatypes with H5Ocopy +redirect\_from: + +--- +##\*\*\* UNDER CONSTRUCTION \*\*\* + +# Copying Committed Datatypes with H5Ocopy + +HDF5 Release 1.8.9 May 2012 + +Contents +1. Copying Committed Datatypes with H5Ocopy +Committed datatypes can be a powerful feature in HDF5. They can be used to share a single datatype description among multiple datasets, to save space or ensure that the datatypes are truly identical, and to assign a name to that datatype within the HDF5 group structure. The object copy API, H5Ocopy, can be used to copy HDF5 objects from one file to another, including committed datatypes and objects that use them. However, problems can occur when a dataset using a committed datatype or an object with an attribute that uses a committed datatype is copied to another file with H5Ocopy. + +When copying a dataset that uses a committed datatype or an object with an attribute that uses a committed datatype between files, the library by default does not look for a matching committed datatype in the destination file. The library creates a new committed datatype in the destination file without any links to it (an anonymous committed datatype) and then links the dataset to the anonymous committed datatype. This means that, when copying multiple datasets in separate calls to H5Ocopy, a new committed datatype is created for each H5Ocopy call. While it is possible to have all of the copied datasets share the same committed datatype by copying them in a single call to H5Ocopy, this is not always attainable. + +For example, imagine that a user has an application that automatically creates many data files, each with many datasets that all use a single committed datatype. At the end of a project, the user wants to merge all of these files into a single file. The HDF5 Library can have all of the datasets in the combined file use the same committed datatype, but the default behavior of the library is to create an anonymous committed datatype for each dataset. + +To make sure that shared committed datatypes in the source are shared in the copy, use the H5Pset\_copy\_object property list API routine to set the H5O\_COPY\_MERGE\_COMMITTED\_DTYPE\_FLAG flag. When this flag is set and H5Ocopy encounters an object or attribute that uses a committed datatype, H5Ocopy will search for a matching committed datatype in the destination file. If a matching committed datatype is found, then it will be used by the copied dataset or attribute. The next few paragraphs describe in more detail the process that H5Ocopy goes through. + +When the H5O\_COPY\_MERGE\_COMMITTED\_DTYPE\_FLAG flag is set, H5Ocopy will search the destination file for committed datatypes and build a temporary list in memory of all the committed datatypes it finds. Then, whenever H5Ocopy encounters a dataset that uses a committed datatype or an object with an attribute that uses a committed datatype in the source, it will check that list to see if it contains a datatype equal to the source datatype. If H5Ocopy finds an equal datatype, it will modify the copied object or attribute to use the found committed datatype as its datatype. 
H5Ocopy will then update the list if a new committed datatype is created in the destination file as a result of the copy. When later datasets and attributes using committed datatypes are encountered, the library will again check to see if the list contains a matching datatype.

To determine whether two committed datatypes are equal, the library compares their descriptions in a manner similar to H5Tequal. In addition, if either committed datatype has one or more attributes, then all attributes must be present in both committed datatypes, and the attributes must all be identical. Each attribute's datatype description, dataspace, and raw data must be identical. However, if an attribute uses a committed datatype, then the attributes of the attribute's committed datatype will not be compared.

When H5Ocopy encounters a committed datatype object in the source file, it will similarly search for a matching committed datatype in the destination file. If a match is found, the library will create a hard link in the destination file to the found datatype. If a match is not found, the library will copy the committed datatype normally and add it to the temporary list of committed datatypes in the destination file.

By default, H5Ocopy will search the entire destination file for a matching committed datatype. It is possible to narrow the search, which should make it faster: if there are locations in the destination file where a matching committed datatype is likely to be found, those locations can be specified with the H5Padd_merge_committed_dtype_path property.

The example below shows how to enable the feature described above for use with H5Ocopy.

Example 1. Setting the object copy property list

hid_t ocpypl_id;
ocpypl_id = H5Pcreate(H5P_OBJECT_COPY);
status = H5Pset_copy_object(ocpypl_id, H5O_COPY_MERGE_COMMITTED_DTYPE_FLAG);
status = H5Ocopy(file1_id, src_name, file2_id, dst_name, ocpypl_id, H5P_DEFAULT);

1.1. Callback Function
If no matching datatype is found in the locations specified by the call to H5Padd_merge_committed_dtype_path, then H5Ocopy will by default search the entire destination file. In some cases, this may not be desirable. For instance, the user may expect the datatype to always have a match in the specified locations and may wish to return an error if a match is not found. The user may also have a very large file for which the full search incurs a substantial performance penalty. In this instance, the user may wish to log these events so that other paths can be added with H5Padd_merge_committed_dtype_path, or the user may wish to abort the search and copy the datatype normally.

To support these use cases, the functions H5Pset_mcdt_search_cb and H5Pget_mcdt_search_cb have been added. These functions allow the user to define a callback function that will be called every time the list of paths added by H5Padd_merge_committed_dtype_path has been exhausted but before beginning the full search of the file. The prototype for the callback function is defined by H5O_mcdt_search_cb_t. The only argument to the callback function is a user-supplied user data pointer, and the return value is an enum, defined by H5O_mcdt_search_ret_t, which tells the library to either continue with the full file search, abort the search and copy the datatype normally (creating a new committed datatype in the destination file), or return an error.
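The original text does not show the search-path and callback routines in use; the fragment below is a minimal sketch, in the style of Example 1, of how they might be combined. The path names "/shared/dtypes" and "/datatypes", the callback name suggest_missing_cb, and the choice to stop the search are illustrative assumptions; file1_id, src_name, file2_id, and dst_name are assumed to be set up as in Example 1.

/* Callback invoked after all suggested paths have been checked without
 * finding a match. Returning H5O_MCDT_SEARCH_STOP skips the full-file
 * search and the datatype is copied normally (a new committed datatype is
 * created in the destination file); H5O_MCDT_SEARCH_CONT would continue
 * with the full search, and H5O_MCDT_SEARCH_ERROR would make H5Ocopy fail. */
static H5O_mcdt_search_ret_t
suggest_missing_cb(void *op_data)
{
    (void)op_data;                     /* no user data needed in this sketch */
    return H5O_MCDT_SEARCH_STOP;
}

hid_t  ocpypl_id;
herr_t status;

ocpypl_id = H5Pcreate(H5P_OBJECT_COPY);
status = H5Pset_copy_object(ocpypl_id, H5O_COPY_MERGE_COMMITTED_DTYPE_FLAG);

/* suggest destination-file locations to search first */
status = H5Padd_merge_committed_dtype_path(ocpypl_id, "/shared/dtypes");
status = H5Padd_merge_committed_dtype_path(ocpypl_id, "/datatypes");

/* decide what to do when none of the suggested paths yields a match */
status = H5Pset_mcdt_search_cb(ocpypl_id, suggest_missing_cb, NULL);

status = H5Ocopy(file1_id, src_name, file2_id, dst_name, ocpypl_id, H5P_DEFAULT);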
1.2. Function Summary
Functions used in committed datatype copying operations are listed below.

Function Listing 1. Committed datatype copying related functions

| C Function | Fortran | Purpose |
| --- | --- | --- |
| H5Ocopy | h5ocopy_f | Allows an application to copy an object within an HDF5 file or to another HDF5 file. |
| H5Pset_copy_object | h5pset_copy_object_f | Allows an application to set properties to be used when an object is copied. |
| H5Padd_merge_committed_dtype_path | (none) | Allows an application to add a path to the list of paths that will be searched in the destination file for a matching committed datatype. |
| H5Pfree_merge_committed_dtype_paths | (none) | Allows an application to clear the list of paths stored in the object copy property list ocpypl_id. |
| H5Pset_mcdt_search_cb | (none) | Allows an application to set the callback function that H5Ocopy will invoke before searching the entire destination file for a matching committed datatype. |
| H5Pget_mcdt_search_cb | (none) | Allows an application to retrieve the callback function from the specified object copy property list. |
| H5O_mcdt_search_cb_t | (none) | Definition of the callback function set by H5Pset_mcdt_search_cb. Provides the mechanism by which a user application may set an action for H5Ocopy to take after checking all suggested paths for a matching committed datatype but before starting the global search of the destination file. |

1.3. Resources
See the following for more information.

See the "HDF5 Datatypes" chapter in the HDF5 User's Guide.

See these entries in the HDF5 Reference Manual:

H5Ocopy
H5Pset_copy_object
H5Padd_merge_committed_dtype_path
H5Pfree_merge_committed_dtype_paths
H5Pset_mcdt_search_cb
H5Pget_mcdt_search_cb

diff --git a/documentation/hdf5-docs/advanced_topics/data_flow_pline_H5Dread.md b/documentation/hdf5-docs/advanced_topics/data_flow_pline_H5Dread.md new file mode 100644 index 00000000..cd942a6f --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics/data_flow_pline_H5Dread.md @@ -0,0 +1,331 @@
---
title: HDF5 Data Flow Pipeline for H5Dread
redirect_from:

---
## \*\*\* UNDER CONSTRUCTION \*\*\*

# HDF5 Data Flow Pipeline for H5Dread

This document describes the HDF5 library's data movement and processing activities when H5Dread is called for a dataset with chunked storage. The document provides an overview of how memory management, filter operations, datatype conversions, and value transformations occur as data is transferred from an HDF5 file to an application's memory buffer. The intended audience includes HDF5 application developers who would like a better understanding of HDF5's handling of filters, datatype conversions, and value transformations, or who are concerned with performance tuning and memory management, in the context of H5Dread.

Contents
1. Introduction
2. Examples
2.1 Data to be read
2.2 Example A
2.3 Example B
3. Data Flow Pipeline
3.1 Step 1: Read chunk from disk
3.2 Step 2: Reverse filter(s)
3.2.1 Compute and verify checksum
3.2.2 Uncompress data
3.3 Step 3: Put chunk in cache or heap
3.4 Step 4: Allocate temporary buffer for datatype conversion / value transformation
3.5 Step 5: Copy array elements to temporary buffer
3.6 Step 6: Perform datatype conversion
3.7 Step 7: Perform value transformation
3.8 Step 8: Copy elements from temporary buffer to application's buffer
3.9 Step 9: Scatter elements from chunk to application's buffer
3.10 Step 10: Free memory
3.11 Step 11: Return from H5Dread
4.
H5Dread Activity Diagram +Acknowledgements +Revision History +Suggested Revisions +1. Introduction +This document outlines how data stored in the chunked storage format moves from an HDF5 file on disk into an application’s memory buffer when H5Dread is called. + +Section 2 introduces two slightly different H5Dread examples that incorporate filters, datatype conversions, and value transformations. Section 3 follows the path of data from disk to application memory, giving a step-by-step explanation of the data flow pipeline for the two examples. + +Sections 2 and 3 include code samples, references to HDF5 APIs, and notes about default buffer sizes that are not crucial to the description of pipeline data movement and processing activities. This information, which appears in highlighted boxes, is included to help readers connect the pipeline description to options provided by the HDF5 library for controlling behavior and tuning performance during the H5Dread call. Note that the code samples do not include error checking. + +A UML activity diagram that summarizes the data movement and processing steps in the pipeline is presented in Section 4. The activity diagram gives a high-level view of the operations that are examined in detail in Section 3. Some readers may prefer to review Section 4 prior to reading Sections 2 and 3. + +2. Examples +Two example H5Dread requests are described in this section. Both examples read the same data from an HDF5 file stored on disk. The data and buffer sizes used in the examples are unrealistically small, but serve the purpose of illustrating the pipeline. + +In the first example, the data read completely fills the application’s memory buffer, overwriting all data that is initially in the buffer. In the second example, the application’s buffer is larger than the data read, so only some of the data in the buffer is overwritten. The first example applies a value transformation to the data being read from disk, while the second does not. + +In both examples, the file datatype is different than the memory datatype, so a datatype conversion is done as part of the read operation. + +2.1 Data to be read +An HDF5 file stored on disk has a dataset, D. In both of the examples, a region of dataset D will be read. The desired region is 4 elements x 4 elements, with the first element to be read at index <1,1>. + +HDF5 dataset D has these characteristics: + +Number of dimensions: 2 +Dimension sizes: 32 x 64 +Chunked storage with chunks of size 4x4 +32-‐bit integer atomic datatype +Little-­endian representation +Compressed with the DEFLATE filter +Fletcher 32 checksum filter applied +Region of Interest: + +Number of dimensions: 2 +Dimension sizes: 4 x 4 +Offset in dataset: <1,1> +Therefore: + +Total cells in array equals: 32 * 64 = 2048 +Size of array (uncompressed) is: 2048 * 4 bytes = 8192 bytes +Each chunk has uncompressed size of: 4*4*4 = 64 bytes +There are 128 chunks in the file +Total cells in region of interest equals: 4 * 4 = 16 +Size of region of interest (uncompressed) is : 16 * 4 bytes = 64 bytes +Hyperslab selection for source + +A hyperslab selection is used to specify the region from dataset D that will be transferred with the H5Dread call. A 4 x 4 array of elements, positioned at <1,1> will be read. 
The C code to do the selection of the hyperslab in the file dataspace is shown here:

/* get the file dataspace */
space_src = H5Dget_space(dataset);

/* define hyperslab in the dataset */
start_src[0] = 1; start_src[1] = 1;
count_src[0] = 4; count_src[1] = 4;
status = H5Sselect_hyperslab(space_src, H5S_SELECT_SET, start_src, NULL,
                             count_src, NULL);

Figure 1 shows a conceptual representation of dataset D with uncompressed data. The desired region and the chunks that contain it are shown in green and yellow, respectively.

Figure 1: Conceptual representation of dataset D

In Figure 1, the chunks and region of interest are represented by the yellow and green areas of the diagram. Figure 2 shows an enlarged view of the region and chunks, with labels added. The dashed lines delineate individual elements in the dataset. Elements in the region of interest have been labeled so they can be traced through the pipeline process.

Figure 2: Conceptual representation of region and chunks in dataset D

Figure 3 shows a more accurate depiction of the chunks and elements in the region as they could be laid out on disk. Note that data in each chunk is stored contiguously on disk, and that the chunks have unequal sizes due to compression of the data.

Figure 3: Conceptual representation of chunks and region elements on disk

2.2 Example A
In the first example, the application's memory buffer is a 4 x 4 array. Every element in the array will be filled with elements read from dataset D, so no hyperslab selection is needed for the destination dataspace.

The application's memory buffer characteristics are:

Number of dimensions: 2
Dimension sizes: 4 x 4
64-bit integer atomic datatype
Big-endian representation
Therefore:

Total cells in array equals: 4 * 4 = 16
Size of array is: 16 * 8 bytes = 128 bytes
In this example, the application includes a value transformation in the data transfer property list for the H5Dread call. The transformation specifies that the integer value "2" should be added to each element in the region of interest before it is copied to the application's memory buffer.

Example A: H5Dread setup and call

The application's memory buffer and dataspace are both 4 x 4. A value transformation is used to add "2" to each element that is read. The C code to allocate the memory buffer, define the memory dataspace, specify the value transformation operation, and make the read call is shown here:

/* memory buffer */
int destA[4][4];

...

/* define memory dataspace */
dims_destA[0] = 4; dims_destA[1] = 4;
space_destA = H5Screate_simple(2, dims_destA, NULL);

/* create data transfer property list and specify value transformation */
dxpl_id_vtrans = H5Pcreate(H5P_DATASET_XFER);
H5Pset_data_transform(dxpl_id_vtrans, "x+2");

/* call H5Dread */
status = H5Dread(dataset, H5T_NATIVE_INT, space_destA, space_src,
                 dxpl_id_vtrans, destA);

2.3 Example B
In the second example, the application's memory buffer is a 2 x 16 array. The 16 elements read will be distributed non-sequentially in the application's buffer, as described by a hyperslab selection in the memory dataspace parameter.

The application memory buffer characteristics are:

Number of dimensions: 2
Dimension sizes: 2 x 16
64-bit integer atomic datatype
Big-endian representation
Therefore:

Total cells in array equals: 2 * 16 = 32
Size of array is: 32 * 8 bytes = 256 bytes
No value transformation is applied in this example.
Example B: H5Dread setup and call

The application's memory buffer and dataspace are both 2 x 16. A hyperslab selection on the memory dataspace specifies the 16 elements that will be updated by the read. The C code to allocate the memory buffer, define the dataspace, select the hyperslab, and make the read call is shown here:

/* memory buffer */
int destB[2][16];

...

/* define memory dataspace */
dims_destB[0] = 2; dims_destB[1] = 16;
space_destB = H5Screate_simple(2, dims_destB, NULL);

/* define memory hyperslab selection */
start_destB[0] = 0; start_destB[1] = 0;
block_destB[0] = 2; block_destB[1] = 1;
count_destB[0] = 1; count_destB[1] = 8;
stride_destB[0] = 2; stride_destB[1] = 2;
status = H5Sselect_hyperslab(space_destB, H5S_SELECT_SET, start_destB,
                             stride_destB, count_destB, block_destB);

/* call H5Dread */
status = H5Dread(dataset, H5T_NATIVE_INT, space_destB, space_src,
                 H5P_DEFAULT, destB);

3. Data Flow Pipeline
The HDF5 library performs a series of steps when H5Dread is called. For datasets with chunked storage, each chunk that contains data to be read is individually processed. After all of the chunks have been read and processed, the library returns from the H5Dread call.

The steps in the data flow processing pipeline for the H5Dread call are detailed here, using the examples outlined in the previous section to illustrate the process.

3.1 Step 1: Read chunk from disk
The HDF5 library reads a chunk of the dataset that contains data in the region of interest from disk. For the given examples, chunk A would be read the first time this step executes. Steps 2-9 are applied to each chunk.

If one or more filters were applied when the dataset was written, as they were in the given examples, processing continues with Step 2. Otherwise, processing continues with Step 3.

3.2 Step 2: Reverse filter(s)
Filters in HDF5

The HDF5 library allows applications to specify filters to apply when a dataset is written to an HDF5 file via the H5Pset_filter call. These filters perform operations such as compression, encryption, and checksum computation. Each filter operation applied when a dataset is written must be "reversed" when the dataset is read. For instance, if a dataset was compressed when written, it must be uncompressed when read.

If multiple filters are specified for the write, the application controls the order in which the filter operations are applied to the dataset. The read operation reverses the filter operations in the opposite order to the one used when the dataset was written. That is, the last filter applied when writing is the first filter reversed when reading, and so on.

In dataset D, two filters were applied when the data was written. The DEFLATE compression filter was applied first, followed by the Fletcher32 checksum filter. The last filter applied when the dataset was written, the checksum filter, is reversed first in the H5Dread processing pipeline.

3.2.1 Compute and verify checksum
Using memory in the application's memory space (heap) that is managed by the HDF5 library, the HDF5 library computes the checksum for the current chunk and compares it to the saved value. If there is a checksum mismatch and error detection is enabled, the H5Dread call will return an error at this point. Otherwise, processing continues.

Checksum error detection

Checksum error detection is enabled by default.
H5Pset\_edc\_check can be used to disable checksum error detection + +3.2.2 Uncompress data +Again using memory in the application’s memory space (heap) that is managed by the HDF5 library, the DEFLATE filter is reversed and the current chunk is uncompressed. + +3.3 Step 3: Put chunk in cache or heap +HDF5 chunk cache + +Every HDF5 dataset with the chunked storage format has an HDF5 chunk cache associated with it. The HDF5 chunk cache for the dataset is allocated from the application’s memory and managed by the HDF5 library. As its name suggests, the cache remains allocated across multiple calls, and is used to provide performance optimizations in accessing data. + +The default size of each HDF5 chunk cache is 1 MB in HDF5 Releases 1.6.x and 1.8.x. H5Pset\_chunk\_cache, introduced in Release 1.8.3, allows control of the chunk cache size on a per‐dataset basis. + +If there is sufficient space in dataset D’s chunk cache, the data for the current chunk is stored there. Otherwise, it is temporarily stored on the heap in memory managed by the HDF5 library. Data in the chunk cache always has the disk datatype representation and is always in the “filters reversed” form. + +In the given examples, the uncompressed data for the current chunk will be stored. A chunk cache of at least 64 bytes is needed to hold a single chunk of uncompressed data for dataset D. + + +Figure 4: Steps 1-3 of data flow pipeline + +Steps 1, 2, and 3 are represented graphically in Figure 4. At this point, all filters have been reversed (checksum and decompression in the given examples), and the datatype matches its representation in the file (32-bit little­‐endian integer in this case). + +If no datatype conversion is needed and no value transformation is specified, processing continues with Step 9 for each chunk. + +Example A involves datatype conversion and value transformation, and Example B involves datatype conversion, so Steps 4­‐8 are performed for both examples. + +3.4 Step 4: Allocate temporary buffer for datatype conversion / value transformation +The first time the HDF5 library is ready to perform a datatype conversion and/or value transformation for a given H5Dread call, HDF5 allocates a temporary buffer in the application’s memory to perform the necessary operations on the array elements in the region of interest. + +The number of bytes needed to process one dataset array element depends on the “larger” of the file and memory datatypes. In the given examples, the memory datatype (64 bit integer) is larger than the disk datatype (32 bit integer). Therefore, it will take 8 bytes (64 bits) in the temporary buffer to process one dataset array element. + +For the purpose of explanation, assume the temporary buffer is 64 bytes. In the given examples, up to eight elements of the dataset (64 bytes in buffer/ 8 bytes per element) can be resident in the temporary buffer at any given time. + +Temporary buffer size + +The default size of the temporary buffer used for datatype conversion and/or value transformation is 1 MB in Release 1.8.2. The size can be controlled with H5Pset\_buffer. + +The following steps are taken for each chunk. + +3.5 Step 5: Copy array elements to temporary buffer +Unprocessed elements in the region of interest are gathered from the chunk cache (or the HDF5-managed memory on the heap if the chunk cache was too small) into the temporary buffer. The size of the temporary buffer determines the maximum number of elements that can be gathered at one time. 
+ +Considering chunk A in the examples, eight of the nine elements that are of interest will fit into the temporary buffer. Figure 5 depicts the temporary buffer at this stage of the pipeline. + + +Figure 5: Temporary buffer with first eight elements in region +3.6 Step 6: Perform datatype conversion +If the memory representation is not the same as the disk representation, datatype conversion is performed by the HDF5 library on the values in the temporary buffer. + +In the examples, the values will be converted from 32-bit little-endian integers into 64­‐bit big­‐endian integers. Figure 6 illustrates the contents of the temporary buffer after the datatype conversion. + + +Figure 6: Temporary buffer with first eight elements after datatype conversion + +3.7 Step 7: Perform value transformation +If the property list used in H5Dread includes a data transformation, as it does in Example A, the algebraic operation specified in the transformation is applied to each element in the temporary buffer by the HDF library. + +In Example A, each of the eight 64­‐bit big­‐endian integers in the temporary buffer will have 2 added to it. For instance, if element c in the array had the value 65 in the HDF5 file on disk, it will have the value 67 in the temporary buffer after Step 7 completes. + +3.8 Step 8: Copy elements from temporary buffer to application’s buffer +The HDF library scatters elements from the temporary buffer into the application’s memory buffer, using offsets computed from the hyperslab selections specified in the dataspace parameters of the H5Dread call. + +Figure 7 represents the contents of the application’s memory buffer for Example A after this step completes the first time. The elements in the application’s memory buffer have been converted into the memory datatype and have had the value transformation applied. + + +Figure 7: Application's memory buffer after first pass through Step 8 for Example A +Figure 8 represents the contents of the application’s memory buffer for Example B after Step 8 completes the first time. The elements in the application’s memory buffer have been converted into the memory datatype. No value transformation is applied in Example B. + + +Figure 8: Application's memory buffer after first pass through step 8 in Example B + +Steps 5-8 are repeated until all elements in the region of interest for the current chunk have been processed and copied into the application’s memory buffer. + +Steps 1-8 are repeated until all chunks containing data in the region of interest have been processed and all requested data has been copied into the application’s memory buffer. + +After all requested data in the region of interest has been processed and copied into the application’s memory buffer, the HDF5 library continues with Step 10. + +3.9 Step 9: Scatter elements from chunk to application’s buffer +This step is not performed for either Example A or Example B. + +If no datatype conversion is needed and no value transformation is specified, this step follows Step 3. In this step, the HDF5 library copies the elements in the region of interest for the current chunk from the memory it manages (chunk cache or heap) into the application’s memory buffer. This gather/scatter operation is based on the chunk and application buffer offsets the library computes from the hyperslab selections specified in the file and memory dataspace parameters used in the H5Dread call. 
+ +Steps 1-3 and 9 are repeated until all chunks containing data in the region of interest have been processed and all requested data has been copied into the application’s memory buffer. + +After all requested data in the region of interest has been processed and copied into the application’s memory buffer, the HDF5 library continues with Step 10. + +3.10 Step 10: Free memory +The HDF5 library frees the memory it allocated in the course of performing Steps 1-9. Note that memory allocated to the chunk cache is not freed until the dataset is closed. + +3.11 Step 11: Return from H5Dread +With the requested data in the application’s memory buffer, and the memory used to perform the processing associated with the read released, the HDF5 library returns from the H5Dread call. + +Figure 9 shows the contents of the application’s memory buffer when H5Dread returns for Example A, and Figure 10 shows the results for Example B. + + +Figure 9: Application's memory buffer when H5Dread returns for Example A + + +Figure 10: Application's memory buffer when H5Dread returns for Example B + + + +4. H5Dread Activity Diagram +Figure 11 shows a UML activity diagram for the H5Dread call when a dataset with chunked storage layout is being read. The diagram shows the activities involved fulfilling the read request, without the step­‐by­‐step detail given in Section 3. + + +Figure 11: H5Dread activity diagram + +Acknowledgements +This document was written as background material for a specific project. The principal author was Ruth Aydt. Quincey Koziol provided information about the HDF5 library’s behavior, patiently answering questions, and correcting technical errors in the document. Mike Folk provided advice on document structure and presentation. + +Revision History +April 8, 2009 Circulated for comment among selected parties. +April 12, 2009 Incorporated feedback; circulated for comment. +April 12, 2009 Incorporated feedback; circulated for comment. +April 15, 2009 Corrected errors in V2; circulated to The HDF Group for comment. +June 8, 2009 Updated Figure 8; posted on website. +December 29, 2010 +Light editorial pass. Modified for inclusion in HDF5 product documentation as draft. Add to collection Advanced Topics in HDF5 + +August 30, 2017 Converted to web document; updated errors in figure numbering (Names for figures 8 and 9 were used twice.) +Suggested Revisions + These suggested revisions were deferred due to time constraints. Readers are encouraged to send additional suggestions for improving the document to docs@hdfgroup.org. + +Add a simpler example with no chunking, no filters, and no subsetting. This would document how things are different without chunked storage, and introduce the concepts more gradually. +Move the Activity Diagram to the beginning of the document, and add text explaining it. Possibly add another activity diagram that shows the data flow pipeline in less detail, then show the detailed version. +Provide rigorous definitions of the terms used, either by referencing definitions provided elsewhere, or including definitions in this document. +Revisit special formatting used for code examples and information about HDF5 library settings and APIs. Current format emphasizes the sections, which were meant to be “technical asides”. +Consider adding explicit discussions about performance issues. +Remove details about default buffer sizes from this document, as they can change with different versions. 
Perhaps put the information in a summary table that will be updated for each release so that it can be referenced from this document. diff --git a/documentation/hdf5-docs/advanced_topics/file_image_ops.html b/documentation/hdf5-docs/advanced_topics/file_image_ops.html new file mode 100644 index 00000000..9169101d --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics/file_image_ops.html @@ -0,0 +1,1011 @@ +

    title: HDF5 File Image Operations +redirect_from: + - /display/HDF5/HDF5+File+Image+Operations

    +

    *** UNDER CONSTRUCTION ***

    +

    HDF5 File Image Operations

    +

    1. Introduction to HDF5 File Image Operations

    + +

    2. C API Call Syntax

    +

    2.1. Low-level C API Routines

    + +

    2.2. High-level C API Routine

    + +

    3. C API Call Semantics

    +

    3.1. File Image Callback Semantics

    + +

    3.2. Initial File Image Semantics

    + +

    4. Examples

    +

    4.1. Reading an In-memory HDF5 File Image
    + 4.2. In-memory HDF5 File Image Construction
    + 4.3. Using HDF5 to Construct and Read a Data Packet
    + 4.4. Using a Template File

    +

    5. Java Signatures for File Image Operations API Calls

    +

    6. Fortran Signatures for File Image Operations API Calls

    +

    6.1. Low-level Fortran API Routines

    + +

    6.2. High-level Fortran API Routine

    + +

    1. Introduction to HDF5 File Image Operations

    +

    File image operations allow users to work with HDF5 files in memory in the same ways that users currently work with HDF5 files on disk. Disk I/O is not required when file images are opened, created, read from, or written to.

    +

    An HDF5 file image is an HDF5 file that is held in a buffer in main memory. Setting up a file image in memory involves using either a buffer in the file access property list or a buffer in the Core (aka Memory) file driver.

    +

    The advantage of working with a file in memory is faster access to the data.

    +

    The challenge of working with files in memory buffers is maximizing performance and minimizing memory footprint while working within the constraints of the property list mechanism. This should be a non-issue for small file images, but may be a major issue for large images.

    +

    If invoked with the appropriate flags, the H5LTopen_file_image() high level library call should deal with these challenges in most cases. However, some applications may require the programmer to address these issues directly.
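    As an illustration of the high-level route, the sketch below (not part of the original text) loads an existing HDF5 file into a buffer and opens the buffer with H5LTopen_file_image(). The file name "data.h5" is a placeholder, error checking is omitted, and the program must be linked against the high-level (hdf5_hl) library.

    #include <stdio.h>
    #include <stdlib.h>
    #include "hdf5.h"
    #include "hdf5_hl.h"

    int main(void)
    {
        FILE   *fp = fopen("data.h5", "rb");
        long    size;
        void   *image;
        hid_t   file_id;

        /* read the whole file into a heap buffer */
        fseek(fp, 0, SEEK_END);
        size = ftell(fp);
        rewind(fp);
        image = malloc((size_t)size);
        fread(image, 1, (size_t)size, fp);
        fclose(fp);

        /* Open the buffer as a read/write HDF5 file image. Because
         * H5LT_FILE_IMAGE_DONT_COPY is not passed, the library works on its
         * own copy, so the application can free its buffer immediately. */
        file_id = H5LTopen_file_image(image, (size_t)size, H5LT_FILE_IMAGE_OPEN_RW);
        free(image);

        /* ... access objects in the in-memory file via file_id ... */

        H5Fclose(file_id);
        return 0;
    }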

    +

    1.1. File Image Operations Function Summary

    +

    Functions used in file image operations are listed below.

    +

    Function Listing 1. File image operations functions

    +

    H5Pset_file_image
    Allows an application to specify an initial file image. For more information, see section 2.1.1.

    H5Pget_file_image
    Allows an application to retrieve a copy of the file image designated for a VFD to use as the initial contents of a file. For more information, see section 2.1.2.

    H5Pset_file_image_callbacks
    Allows an application to manage file image buffer allocation, copying, reallocation, and release. For more information, see section 2.1.3.

    H5Pget_file_image_callbacks
    Allows an application to obtain the current file image callbacks from a file access property list. For more information, see section 2.1.4.

    H5Fget_file_image
    Provides a simple way to retrieve a copy of the image of an existing, open file. For more information, see section 2.1.6.

    H5LTopen_file_image
    Provides a convenient way to open an initial file image with the Core VFD. For more information, see section 2.2.1.

    +

    1.2. Abbreviations

    +

    The following abbreviations are used in this document:

    +

    Table 1. Abbreviations

    +

    FAPL or fapl

    +

    File Access Property List. In code samples, fapl is used.

    +

    VFD

    +

    Virtual File Driver

    +

    VFL

    +

    Virtual File Layer

    +

    1.3. Developer Prerequisites

    +

    Developers who use the file image operations described in this document should be proficient and experienced users of the HDF5 C Library APIs. More specifically, developers should have a working knowledge of property lists, callbacks, and virtual file drivers.

    +

    1.4. Resources

    +

    See the following for more information.

    +

    The “RFC: File Image Operations” is the primary source for the information in this document.

    +

    The “Alternate File Storage Layouts and Low-level File Drivers” section is in “The HDF5 File” chapter of the HDF5 User’s Guide .

    +

    The H5P_SET_FAPL_CORE function call can be used to modify the file access property list so that the Memory virtual file driver, H5FD_CORE, is used. The Memory file driver is also known as the Core file driver.

    +

    Links to the Virtual File Layer and List of VFL Functions documents can be found in the HDF5 Technical Notes.

    +

    2. C API Call Syntax

    +

    The C API function calls described in this chapter fall into two categories: low-level routines that are part of the main HDF5 C Library and one high-level routine that is part of the “lite” API in the high-level wrapper library. The high-level routine uses the low-level routines and presents frequently requested functionality conveniently packaged for application developers’ use.

    +

    2.1. Low-level C API Routines

    +

    The purpose of this section is to describe the low-level C API routines that support file image operations. These routines allow an in-memory image of an HDF5 file to be opened without requiring file system I/O.

    +

    The basic approach to opening an in-memory image of an HDF5 file is to pass the image to the Core file driver, and then tell the Core file driver to open the file. We do this by using the H5Pget/set_file_image calls. These calls allow the user to specify an initial file image.

    +

    A potential problem with the H5Pget/set_file_image calls is the overhead of allocating and copying of large file image buffers. The callback routines enable application programs to avoid this problem. However, the use of these callbacks is complex and potentially hazardous: the particulars are discussed in the semantics and examples chapters below (see section 3.1 and section 4.1 respectively). Fortunately, use of the file image callbacks should seldom be necessary: the H5LTopen_file_image call should address most use cases.

    +

    The property list facility in HDF5 is employed in file image operations. This facility was designed for passing data, not consumable resources, into API calls. The peculiar ways in which the file image allocation callbacks may be used allows us to avoid extending the property list structure to handle consumable resources cleanly and to avoid constructing a new facility for the purpose.

    +

    The sub-sections below describe the low-level C APIs that are used with file image operations.

    +

    2.1.1. H5Pset_file_image

    +

    The H5Pset_file_image routine allows an application to provide an image for a file driver to use as the initial contents of the file. This call was designed initially for use with the Core VFD, but it can be used with any VFD that supports using an initial file image when opening a file. See the “Virtual File Driver Feature Flags” section for more information. Calling this routine makes a copy of the provided file image buffer. See the “H5Pset_file_image_callbacks” section for more information.

    +

    The signature of H5Pset_file_image is defined as follows:

    +

    herr_t H5Pset_file_image(hid_t fapl_id, void *buf_ptr, size_t buf_len) +The parameters of H5Pset_file_image are defined as follows:

    +

    fapl_id contains the ID of the target file access property list. +buf_ptr supplies a pointer to the initial file image, or NULL if no initial file image is desired. +buf_len contains the size of the supplied buffer, or 0 if no initial image is desired. +If either the buf_len parameter is zero, or the buf_ptr parameter is NULL, no file image will be set in the FAPL, and any existing file image buffer in the FAPL will be released. If a buffer is released, the FAPL’s file image buf_len will be set to 0 and buf_ptr will be set to NULL.

    +

    Given the tight interaction between the file image callbacks and the file image, the file image callbacks in a property list cannot be changed while a file image is defined.

    +

    With properly constructed file image callbacks, it is possible to avoid actually copying the file image. The particulars of this are discussed in greater detail in the “C API Call Semantics” chapter and in the “Examples” chapter.
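    The fragment below is a minimal sketch of the pattern described above: configure the Core VFD on a FAPL, set the initial file image, and open the file from memory. It assumes buf points to a valid HDF5 file image of buf_len bytes obtained elsewhere, "image.h5" is only a placeholder name, and error checking is omitted.

    hid_t fapl, file_id;

    fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_core(fapl, (size_t)(64 * 1024), 0);  /* 64 KiB increment, no backing store */
    H5Pset_file_image(fapl, buf, buf_len);           /* the FAPL makes its own copy of buf */

    file_id = H5Fopen("image.h5", H5F_ACC_RDONLY, fapl);

    /* ... read from the in-memory file ... */

    H5Fclose(file_id);
    H5Pclose(fapl);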

    +

    2.1.2. H5Pget_file_image

    +

    The H5Pget_file_image routine allows an application to retrieve a copy of the file image designated for a VFD to use as the initial contents of a file. This routine uses the file image callbacks (if defined) when allocating and loading the buffer to return to the application, or it uses malloc and memcpy if the callbacks are undefined. When malloc and memcpy are used, it will be the caller’s responsibility to discard the returned buffer via a call to free.

    +

    The signature of H5Pget_file_image is defined as follows:

    +

    herr_t H5Pget_file_image(hid_t fapl_id, void **buf_ptr_ptr, size_t *buf_len_ptr) +The parameters of H5Pget_file_image are defined as follows:

    +

    fapl_id contains the ID of the target file access property list. +buf_ptr_ptr contains a NULL or a pointer to a void*. If buf_ptr_ptr is not NULL, on successful return, *buf_ptr_ptr will contain a pointer to a copy of the initial image provided in the last call to H5Pset_file_image for the supplied fapl_id. If no initial image has been set, *buf_ptr_ptr will be NULL. +buf_len_ptr contains a NULL or a pointer to size_t. If buf_len_ptr is not NULL, on successful return, *buf_len_ptr will contain the value of the buf_len parameter for the initial image in the supplied fapl_id. If no initial image is set, the value of *buf_len_ptr will be 0. +As with H5Pset_file_image, appropriately defined file image callbacks can allow this function to avoid buffer allocation and memory copy operations.
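    A minimal sketch of retrieving that image again is shown below; it assumes fapl is the property list from the previous sketch and that no file image callbacks have been set, so the returned copy is allocated with malloc and must be released with free.

    void   *img_ptr = NULL;
    size_t  img_len = 0;

    H5Pget_file_image(fapl, &img_ptr, &img_len);

    if (img_ptr != NULL) {
        /* ... use the img_len-byte copy of the initial file image ... */
        free(img_ptr);
    }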

    +

    2.1.3. H5Pset_file_image_callbacks

    +

    The H5Pset_file_image_callbacks API call exists to allow an application to control the management of file image buffers through user defined callbacks. These callbacks will be used in the management of file image buffers in property lists and in select file drivers. These routines are invoked when a new file image buffer is allocated, when an existing file image buffer is copied or resized, or when a file image buffer is released from use. From the perspective of the HDF5 Library, the operations of the image_malloc, image_memcpy, image_realloc, and image_free callbacks must be identical to those of the corresponding C standard library calls (malloc, memcpy, realloc, and free). While the operations must be identical, the file image callbacks have more parameters. The callbacks and their parameters are described below. The return values of image_malloc and image_realloc are identical to the return values of malloc and realloc. However, the return values of image_memcpy and image_free are different than the return values of memcpy and free: the return values of image_memcpy and image_free can also indicate failure. See the “File Image Callback Semantics” section for more information.

    +

    The signature of H5Pset_file_image_callbacks is defined as follows:

    +

    typedef enum +{ + H5_FILE_IMAGE_OP_PROPERTY_LIST_SET, + H5_FILE_IMAGE_OP_PROPERTY_LIST_COPY, + H5_FILE_IMAGE_OP_PROPERTY_LIST_GET, + H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE, + H5_FILE_IMAGE_OP_FILE_OPEN, + H5_FILE_IMAGE_OP_FILE_RESIZE, + H5_FILE_IMAGE_OP_FILE_CLOSE +} H5_file_image_op_t;

    +

    typedef struct +{ + void *(*image_malloc)(size_t size, H5_file_image_op_t file_image_op, + void *udata); + void *(*image_memcpy)(void *dest, const void *src, size_t size, + H5_file_image_op_t file_image_op, void *udata); + void *(*image_realloc)(void *ptr, size_t size, + H5_file_image_op_t file_image_op, void *udata); + herr_t (*image_free)(void *ptr, H5_file_image_op_t file_image_op, + void *udata); + void *(*udata_copy)(void *udata); + herr_t (*udata_free)(void *udata); + void *udata; +} H5_file_image_callbacks_t;

    +

    herr_t H5Pset_file_image_callbacks(hid_t fapl_id, + H5_file_image_callbacks_t *callbacks_ptr) +The parameters of H5Pset_file_image_callbacks are defined as follows:

    +

    fapl_id contains the ID of the target file access property list. +callbacks_ptr contains a pointer to an instance of the H5_file_image_callbacks_t structure. +The fields of the H5_file_image_callbacks_t structure are defined as follows:

    +

    image_malloc contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library malloc() call. The parameters of the image_malloc callback are defined as follows: +size contains the size in bytes of the image buffer to allocate. +file_image_op contains one of the values of H5_file_image_op_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file_image_op are discussed in Table 2. +udata holds the value passed in for the udata parameter to H5Pset_file_image_callbacks. +Setting image_malloc to NULL indicates that the HDF5 Library should invoke the standard C library malloc() routine when allocating file image buffers.

    +

    image_memcpy contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library memcpy() call except that it returns NULL on failure. Recall that the memcpy C Library routine is defined to return the dest parameter in all cases. The parameters of the image_memcpy callback are defined as follows: +dest contains the address of the destination buffer. +src contains the address of the source buffer. +size contains the number of bytes to copy. +file_image_op contains one of the values of H5_file_image_op_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file_image_op are discussed in Table 2. +udata holds the value passed in for the udata parameter to H5Pset_file_image_callbacks. +Setting image_memcpy to NULL indicates that the HDF5 Library should invoke the standard C library memcpy() routine when copying buffers.

    +

    image_realloc contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library realloc() call. The parameters of the image_realloc callback are defined as follows: +ptr contains the pointer to the buffer being reallocated. +size contains the desired size in bytes of the buffer after realloc. +file_image_op contains one of the values of H5_file_image_op_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file_image_op are discussed in Table 2. +udata holds the value passed in for the udata parameter to H5Pset_file_image_callbacks. +Setting image_realloc to NULL indicates that the HDF5 Library should invoke the standard C library realloc() routine when resizing file image buffers.

    +

    image_free contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library free() call except that it will return 0 (SUCCEED) on success and -1 (FAIL) on failure. The parameters of the image_free callback are defined as follows: +ptr contains the pointer to the buffer being released. +file_image_op contains one of the values of H5_file_image_op_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file_image_op are discussed in Table 2 . +udata holds the value passed in for the udata parameter to H5Pset_file_image_callbacks. +Setting image_free to NULL indicates that the HDF5 Library should invoke the standard C library free() routine when releasing file image buffers.

    +

    udata_copy contains a pointer to a function that (from the perspective of HDF5) allocates a buffer of suitable size, copies the contents of the supplied udata into the new buffer, and returns the address of the new buffer. The function returns NULL on failure. This function is necessary if a non-NULL udata parameter is supplied, so that property lists containing the image callbacks can be copied. If the udata parameter (below) is NULL, then this parameter should be NULL as well. The parameter of the udata_copy callback is defined as follows: +udata contains the pointer to the user data block being copied. +udata_free contains a pointer to a function that (from the perspective of HDF5) frees a user data block. This function is necessary if a non-NULL udata parameter is supplied so that property lists containing image callbacks can be discarded without a memory leak. If the udata parameter (below) is NULL, this parameter should be NULL as well. The parameter of the udata_free callback is defined as follows: +udata contains the pointer to the user data block to be freed. +udata_free returns 0 (SUCCEED) on success and -1 (FAIL) on failure.

    +

    udata contains a pointer value, potentially to user-defined data, that will be passed to the image_malloc, image_memcpy, image_realloc, and image_free callbacks. +The semantics of the values that can be set for the file_image_op parameter to the above callbacks are described in the table below:

    +

    Table 2. Values for the file_image_op parameter

    +

    H5_FILE_IMAGE_OP_PROPERTY_LIST_SET
    This value is passed to the image_malloc and image_memcpy callbacks when an image buffer is being copied while being set in a FAPL.

    H5_FILE_IMAGE_OP_PROPERTY_LIST_COPY
    This value is passed to the image_malloc and image_memcpy callbacks when an image buffer is being copied when a FAPL is copied.

    H5_FILE_IMAGE_OP_PROPERTY_LIST_GET
    This value is passed to the image_malloc and image_memcpy callbacks when an image buffer is being copied while being retrieved from a FAPL.

    H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE
    This value is passed to the image_free callback when an image buffer is being released during a FAPL close operation.

    H5_FILE_IMAGE_OP_FILE_OPEN
    This value is passed to the image_malloc and image_memcpy callbacks when an image buffer is copied during a file open operation. While the image being opened will typically be copied from a FAPL, this need not always be the case. An example of an exception is when the Core file driver takes its initial image from a file.

    H5_FILE_IMAGE_OP_FILE_RESIZE
    This value is passed to the image_realloc callback when a file driver needs to resize an image buffer.

    H5_FILE_IMAGE_OP_FILE_CLOSE
    This value is passed to the image_free callback when an image buffer is being released during a file close operation.

    +

    In closing our discussion of H5Pset_file_image_callbacks(), we note the interaction between this call and the H5Pget/set_file_image() calls above: since the malloc, memcpy, and free callbacks defined in the instance of H5_file_image_callbacks_t are used by H5Pget/set_file_image(), H5Pset_file_image_callbacks() will fail if a file image is already set in the target property list.

    +

    For more information on writing the file image to disk, set the backing_store parameter. See the H5Pset_fapl_core entry in the HDF5 Reference Manual.
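    The following sketch, which is not part of the original text, shows one way the callback structure above might be filled in: the callbacks simply wrap the standard C library routines while counting allocations and releases, and no udata is used, so udata, udata_copy, and udata_free are all NULL. The function names, counters, and the FAPL variable fapl are illustrative assumptions; the struct fields and parameter lists follow the typedefs shown above.

    #include <stdlib.h>
    #include <string.h>

    static size_t n_image_allocs = 0;
    static size_t n_image_frees  = 0;

    static void *my_image_malloc(size_t size, H5_file_image_op_t op, void *udata)
    {
        (void)op; (void)udata;
        n_image_allocs++;
        return malloc(size);
    }

    static void *my_image_memcpy(void *dest, const void *src, size_t size,
                                 H5_file_image_op_t op, void *udata)
    {
        (void)op; (void)udata;
        return memcpy(dest, src, size);   /* returning NULL would signal failure */
    }

    static void *my_image_realloc(void *ptr, size_t size,
                                  H5_file_image_op_t op, void *udata)
    {
        (void)op; (void)udata;
        return realloc(ptr, size);
    }

    static herr_t my_image_free(void *ptr, H5_file_image_op_t op, void *udata)
    {
        (void)op; (void)udata;
        n_image_frees++;
        free(ptr);
        return 0;                         /* 0 = SUCCEED, -1 = FAIL */
    }

    /* inside application code, with fapl a file access property list: */
    H5_file_image_callbacks_t callbacks = {
        my_image_malloc, my_image_memcpy, my_image_realloc, my_image_free,
        NULL /* udata_copy */, NULL /* udata_free */, NULL /* udata */
    };

    /* Register before any call to H5Pset_file_image: the callbacks in a FAPL
     * cannot be changed while a file image is set. */
    herr_t status = H5Pset_file_image_callbacks(fapl, &callbacks);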

    +

    2.1.4. H5Pget_file_image_callbacks

    +

    The H5Pget_file_image_callbacks routine is designed to obtain the current file image callbacks from a file access property list.

    +

    The signature of H5Pget_file_image_callbacks() is defined as follows:

    +

    herr_t H5Pget_file_image_callbacks(hid_t fapl_id, + H5_file_image_callbacks_t *callbacks_ptr) +The parameters of H5Pget_file_image_callbacks are defined as follows:

    +

    fapl_id contains the ID of the target file access property list. +callbacks_ptr contains a pointer to an instance of the H5_file_image_callbacks_t structure. All fields should be initialized to NULL. See the “H5Pset_file_image_callbacks” section for more information on the H5_file_image_callbacks_t structure. +Upon successful return, the fields of *callbacks_ptr shall contain values as defined below:

    +

    Upon successful return, the image_malloc, image_memcpy, image_realloc, image_free, udata_copy, udata_free, and udata fields of *callbacks_ptr will each contain the value passed in the corresponding field of the H5_file_image_callbacks_t instance pointed to by the callbacks_ptr parameter of the last call to H5Pset_file_image_callbacks() for the specified FAPL, or NULL if there has been no such call.

    2.1.5. Virtual File Driver Feature Flags

    Implementation of the H5Pget/set_file_image_callbacks() and H5Pget/set_file_image() function calls requires a pair of virtual file driver feature flags. The flags are H5FD_FEAT_ALLOW_FILE_IMAGE and H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS. Both of these are defined in H5FDpublic.h.

    +

    The first flag, H5FD_FEAT_ALLOW_FILE_IMAGE, allows a file driver to indicate whether or not it supports file images. A VFD that sets this flag when its ‘query’ callback is invoked indicates that the file image set in the FAPL will be used as the initial contents of a file. Support for setting an initial file image is designed primarily for use with the Core VFD. However, any VFD can indicate support for this feature by setting the flag and copying the image in an appropriate way for the VFD (possibly by writing the image to a file and then opening the file). However, such a VFD need not employ the file image after file open time. In such cases, the VFD will not make an in-memory copy of the file image and will not employ the file image callbacks.

    +

    File drivers that maintain a copy of the file in memory (only the Core file driver at present) can be constructed to use the initial image callbacks (if defined). Those that do must set the H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS flag, the second flag, when their ‘query’ callbacks are invoked.

    +

    Thus file drivers that set the H5FD_FEAT_ALLOW_FILE_IMAGE flag but not the H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS flag may read the supplied image from the property list (if present) and use it to initialize the contents of the file. However, they will not discard the image when done, nor will they make any use of any file image callbacks (if defined).

    +

    If an initial file image appears in a file allocation property list that is used in an H5Fopen() call, and if the underlying file driver does not set the H5FD_FEAT_ALLOW_FILE_IMAGE flag, then the open will fail.

    +

    If a driver sets both the H5FD_FEAT_ALLOW_FILE_IMAGE flag and the H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS flag, then that driver will allocate a buffer of the required size, copy the contents of the initial image buffer from the file access property list, and then open the copy as if it had just loaded it from file. If the file image allocation callbacks are defined, the driver shall use them for all memory management tasks. Otherwise it will use the standard malloc, memcpy, realloc, and free C library calls for this purpose.

    +

    If the VFD sets the H5FD_FEAT_ALLOW_FILE_IMAGE flag, and an initial file image is defined by an application, the VFD should ensure that file creation operations (as opposed to file open operations) bypass use of the file image, and create a new, empty file.

    +

    Finally, it is logically possible that a file driver would set the H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS flag, but not the H5FD_FEAT_ALLOW_FILE_IMAGE flag. While it is hard to think of a situation in which this would be desirable, setting the flags this way will not cause any problems: the two capabilities are logically distinct.
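As a rough illustration of the discussion above, the sketch below shows how a hypothetical VFD's ‘query’ callback might advertise both capabilities. The driver and callback names are assumptions; only the two feature flags come from H5FDpublic.h, and the callback signature shown follows the HDF5 1.8-era H5FD_class_t ‘query’ member.

```c
#include "hdf5.h"   /* pulls in H5FDpublic.h for the feature flags */

/* Hypothetical 'query' callback for a custom VFD that both accepts an
 * initial file image and honors the file image callbacks. */
static herr_t
my_vfd_query(const H5FD_t *_file, unsigned long *flags /* out */)
{
    (void)_file;   /* not needed to report static capabilities */

    if (flags) {
        *flags = 0;
        *flags |= H5FD_FEAT_ALLOW_FILE_IMAGE;             /* accepts an initial image */
        *flags |= H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS; /* will use image callbacks */
    }
    return 0;
}
```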

    +

    2.1.6. H5Fget_file_image

    +

    The purpose of the H5Fget_file_image routine is to provide a simple way to retrieve a copy of the image of an existing, open file. This routine can be used with files opened using the SEC2 (aka POSIX), STDIO, and Core (aka Memory) VFDs.

    +

    The signature of H5Fget_file_image is defined as follows:

    +

ssize_t H5Fget_file_image(hid_t file_id, void *buf_ptr, size_t buf_len)

The parameters of H5Fget_file_image are defined as follows:

    +

- file_id contains the ID of the target file.
- buf_ptr contains a pointer to the buffer into which the image of the HDF5 file is to be copied. If buf_ptr is NULL, no data will be copied, but the return value will still indicate the buffer size required (or a negative value on error).
- buf_len contains the size of the supplied buffer.

If the return value of H5Fget_file_image is a positive value, then it will be the length of buffer required to store the file image (in other words, the length of the file). A negative value might be returned if the file is too large to store in the supplied buffer or on failure.

    +

    The current file size can be obtained via a call to H5Fget_filesize(). Note that this function returns the value of the end of file (EOF) and not the end of address space (EOA). While these values are frequently the same, it is possible for the EOF to be larger than the EOA. Since H5Fget_file_image() will only obtain a copy of the file from the beginning of the superblock to the EOA, it will be best to use H5Fget_file_image() to determine the size of the buffer required to contain the image.
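For example, the sizing pattern just described might look like the following fragment, where file_id is assumed to identify an open HDF5 file:

```c
/* Sketch: size the buffer with a NULL pointer, then fetch the image. */
ssize_t image_len = H5Fget_file_image(file_id, NULL, 0);

if (image_len > 0) {
    void *image_buf = malloc((size_t)image_len);

    if (image_buf != NULL) {
        H5Fget_file_image(file_id, image_buf, (size_t)image_len);
        /* ... use the image ... */
        free(image_buf);
    }
}
```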

    +

    Other Design Considerations

    +

    Here are some other notes regarding the design and implementation of H5Fget_file_image.

    +

    The H5Fget_file_image call should be part of the high-level library. However, a file driver agnostic implementation of the routine requires access to data structures that are hidden within the HDF5 Library. We chose to implement the call in the library proper rather than expose those data structures.

    +

    There is no reason why the H5Fget_file_image() API call could not work on files opened with any file driver. However, the Family, Multi, and Split file drivers have issues that make the call problematic. At present, files opened with the Family file driver are marked as being created with that file driver in the superblock, and the HDF5 Library refuses to open files so marked with any other file driver. This negates the purpose of the H5Fget_file_image() call. While this mark can be removed from the image, the necessary code is not trivial.

    +

    Thus we will not support the Family file driver in H5Fget_file_image() unless there is demand for it. Files created with the Multi and Split file drivers are also marked in the superblock. In addition, they typically use a very sparse address space. A sparse address space would require the use of an impractically large buffer for an image, and most of the buffer would be empty. So, we see no point in supporting the Multi and Split file drivers in H5Fget_file_image() under any foreseeable circumstances.

    +

    2.2. High-level C API Routine

    +

    The H5LTopen_file_image high-level routine encapsulates the capabilities of routines in the main HDF5 Library with conveniently accessible abstractions.

    +

    2.2.1. H5LTopen_file_image

    +

    The H5LTopen_file_image routine is designed to provide an easier way to open an initial file image with the Core VFD. Flags to H5LTopen_file_image allow for various file image buffer ownership policies to be requested. See the HDF5 Reference Manual for more information on high-level APIs.

    +

    The signature of H5LTopen_file_image is defined as follows:

    +

hid_t H5LTopen_file_image(void *buf_ptr, size_t buf_len, unsigned flags)

The parameters of H5LTopen_file_image are defined as follows:

    +

- buf_ptr contains a pointer to the supplied initial image. A NULL value is invalid and will cause H5LTopen_file_image to fail.
- buf_len contains the size of the supplied buffer. A value of 0 is invalid and will cause H5LTopen_file_image to fail.
- flags contains a set of flags indicating whether the image is to be opened read/write, whether HDF5 is to take control of the buffer, and how long the application promises to maintain the buffer. Possible flags are described in the table below.

Table 3. Flags for H5LTopen_file_image

    +

H5LT_FILE_IMAGE_OPEN_RW
Indicates that the HDF5 Library should open the image read/write instead of the default read-only.

H5LT_FILE_IMAGE_DONT_COPY
Indicates that the HDF5 Library should not copy the file image buffer provided, but should use it directly. The HDF5 Library will release the file image when finished. The supplied buffer must have been allocated via a call to the standard C library malloc() or calloc() routines. The HDF5 Library will call free() to release the buffer. In the absence of this flag, the HDF5 Library will copy the buffer provided. The H5LT_FILE_IMAGE_DONT_COPY flag provides an application with the ability to “give ownership” of a file image buffer to the HDF5 Library.

    +

    The HDF5 Library will modify the buffer on write if the image is opened read/write and the H5LT_FILE_IMAGE_DONT_COPY flag is set.

    +

The H5LT_FILE_IMAGE_DONT_RELEASE flag (see below) is invalid unless the H5LT_FILE_IMAGE_DONT_COPY flag is set.

    +

H5LT_FILE_IMAGE_DONT_RELEASE
Indicates that the HDF5 Library should not attempt to release the buffer when the file is closed. This implies that the application will tend to this detail and that the application will not discard the buffer until after the file image is closed.

    +

    Since there is no way to return a changed buffer base address to the application, and since realloc can change this value, calls to realloc() must be barred when this flag is set. As a result, any write that requires an increased buffer size will fail.

    +

    This flag is invalid unless the H5LT_FILE_IMAGE_DONT_COPY flag, see above, is set.

    +

    If the H5LT_FILE_IMAGE_DONT_COPY flag is set and this flag is not set, the HDF5 Library will release the file image buffer after the file is closed using the standard C library free() routine.

    +

    Using this flag and the H5LT_FILE_IMAGE_DONT_COPY flag provides a way for the application to specify a buffer that the HDF5 Library can use for opening and accessing as a file image while letting the application retain ownership of the buffer.

    +

    The following table is intended to summarize the semantics of the H5LT_FILE_IMAGE_DONT_COPY and H5LT_FILE_IMAGE_DONT_RELEASE flags (shown as “Don’t Copy Flag” and “Don’t Release Flag” respectively in the table):

    +

    Table 4. Summary of Don’t Copy and Don’t Release Flag Actions

    +

| Don’t Copy Flag | Don’t Release Flag | Make Copy of User Supplied Buffer | Pass User Supplied Buffer to File Driver | Release User Supplied Buffer When Done | Permit realloc of Buffer Used by File Driver |
|---|---|---|---|---|---|
| False | Don’t care | True | False | False | True |
| True | False | False | True | True | True |
| True | True | False | True | False | False |

    +

    The return value of H5LTopen_file_image will be a file ID on success or a negative value on failure. The file ID returned should be closed with H5Fclose.

    +

Note that there is currently no way to specify a “backing store” file name in this definition of H5LTopen_file_image.

    +

    3. C API Call Semantics

    +

    The purpose of this chapter is to describe some issues that developers should consider when using file image buffers, property lists, and callback APIs.

    +

    3.1. File Image Callback Semantics

    +

    The H5Fget/set_file_image_callbacks() API calls allow an application to hook the memory management operations used when allocating, duplicating, and discarding file images in the property list, in the Core file driver, and potentially in any in-memory file driver developed in the future.

    +

    From the perspective of the HDF5 Library, the supplied image_malloc(), image_memcpy(), image_realloc(), and image_free() callback routines must function identically to the C standard library malloc(), memcpy(), realloc(), and free() calls. What happens on the application side can be much more nuanced, particularly with the ability to pass user data to the callbacks. However, whatever the application does with these calls, it must maintain the illusion that the calls have had the expected effect. Maintaining this illusion requires some understanding of how the property list structure works, and what HDF5 will do with the initial images passed to it.

    +

    At the beginning of this document, we talked about the need to work within the constraints of the property list mechanism. When we said “from the perspective of the HDF5 Library…” in the paragraph above, we are making reference to this point.

    +

    The property list mechanism was developed as a way to add parameters to functions without changing the parameter list and breaking existing code. However, it was designed to use only “call by value” semantics, not “call by reference”. The decision to use “call by value” semantics requires that the values of supplied variables be copied into the property list. This has the advantage of simplifying the copying and deletion of property lists. However, if the value to be copied is large (say a 2 GB file image), the overhead can be unacceptable.

    +

    The usual solution to this problem is to use “call by reference” where only a pointer to an object is placed in a parameter list rather than a copy of the object itself. However, use of “call by reference” semantics would greatly complicate the property list mechanism: at a minimum, it would be necessary to maintain reference counts to dynamically allocated objects so that the owner of the object would know when it was safe to free the object.

    +

    After much discussion, we decided that the file image operations calls were sufficiently specialized that it made no sense to rework the property list mechanism to support “call by reference.” Instead we provided the file image callback mechanism to allow the user to implement some version of “call by reference” when needed. It should be noted that we expect this mechanism to be used rarely if at all. For small file images, the copying overhead should be negligible, and for large images, most use cases should be addressed by the H5LTopen_file_image call.

    +

    In the (hopefully) rare event that use of the file image callbacks is necessary, the fundamental point to remember is that the callbacks must be constructed and used in such a way as to maintain the library’s illusion that it is using “call by value” semantics.

    +

    Thus the property list mechanism must think that it is allocating a new buffer and copying the supplied buffer into it when the file image property is set. Similarly, it must think that it is allocating a new buffer and copying the contents of the existing buffer into it when it copies a property list that contains a file image. Likewise, it must think it is de-allocating a buffer when it discards a property list that contains a file image.

    +

    Similar illusions must be maintained when a file image buffer is copied into the Core file driver (or any future driver that uses the file image callbacks) when the file driver re-sizes the buffer containing the image and finally when the driver discards the buffer.

    +

    3.1.1. Buffer Ownership

    +

    The owner of a file image in a buffer is the party that has the responsibility to discard the file image buffer when it is no longer needed. In this context, the owner is either the HDF5 Library or the application program.

    +

    We implemented the image_* callback facility to allow efficient management of large file images. These facilities can be used to allow sharing of file image buffers between the application and the HDF5 library, and also transfer of ownership in either direction. In such operations, care must be taken to ensure that ownership is clear and that file image buffers are not discarded before all references to them are discarded by the non-owning party.

    +

    Ownership of a file image buffer will only be passed to the application program if the file image callbacks are designed to do this. In such cases, the application program must refrain from freeing the buffer until the library has deleted all references to it. This in turn will happen after all property lists (if any) that refer to the buffer have been discarded, and the file driver (if any) that used the buffer has closed the file and thinks it has discarded the buffer.

    +

    3.1.2. Sharing a File image Buffer with the HDF5 Library

    +

    As mentioned above, the HDF5 property lists are a mechanism for passing values into HDF5 Library calls. They were created to allow calls to be extended with new parameters without changing the actual API or breaking existing code. They were designed based on the assumption that all new parameters would be “call by value” and not “call by reference.” Having “call by value” parameters means property lists can be copied, reused, and discarded with ease.

    +

    Suppose an application wished to share a file image buffer with the HDF5 Library. This means the library would be allowed to read the file image, but not free it. The file image callbacks might be constructed as follows to share a buffer:

    +

- Construct the image_malloc() call so that it returns the address of the buffer instead of allocating new space. This keeps the library thinking that the buffers are distinct even when they are not. Support this by including the address of the buffer in the user data. As a sanity check, include the buffer’s size in the user data as well, and require image_malloc() to fail if the requested buffer size is unexpected. Finally, include a reference counter in the user data, and increment the reference counter on each call to image_malloc().
- Construct the image_memcpy() call so that it does nothing. As a sanity check, make it fail if the source and destination pointers do not match the buffer address in the user data or if the size is unexpected.
- Construct the image_free() routine so that it does nothing. As a sanity check, make it compare the supplied pointer with the expected pointer in the user data. Also, make it decrement the reference counter and notify the application that the HDF5 Library is done with the buffer when the reference count drops to 0.

As the property list code will never resize a buffer, we do not discuss the image_realloc() call here. The behavior of image_realloc() in this scenario depends on what the application wants to do with the file image after it has been opened; we discuss this issue in the next section. Note also that the operation value passed into the file image callbacks allows the callbacks to behave differently depending on the context in which they are used. A sketch of callbacks constructed along these lines follows.
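The sketch below is illustrative only: shared_udata_t, its fields, and the notification hook are assumptions, and the type names (H5_file_image_op_t, herr_t) follow the usage elsewhere in this document.

```c
/* Minimal sketch of buffer-sharing callbacks (see the list above).
 * Assumes the HDF5 headers for herr_t and H5_file_image_op_t. */
typedef struct shared_udata_t {
    void   *buf_ptr;    /* application-owned buffer            */
    size_t  buf_size;   /* size of that buffer                 */
    int     ref_count;  /* outstanding references held by HDF5 */
} shared_udata_t;

static void *
shared_image_malloc(size_t size, H5_file_image_op_t op, void *_udata)
{
    shared_udata_t *udata = (shared_udata_t *)_udata;

    (void)op;
    if (size != udata->buf_size)   /* sanity check: unexpected size */
        return NULL;
    udata->ref_count++;
    return udata->buf_ptr;         /* hand back the shared buffer   */
}

static void *
shared_image_memcpy(void *dest, const void *src, size_t size,
                    H5_file_image_op_t op, void *_udata)
{
    shared_udata_t *udata = (shared_udata_t *)_udata;

    (void)op;
    /* no copy needed; sanity checks only */
    if (dest != udata->buf_ptr || src != udata->buf_ptr || size != udata->buf_size)
        return NULL;
    return dest;
}

static herr_t
shared_image_free(void *ptr, H5_file_image_op_t op, void *_udata)
{
    shared_udata_t *udata = (shared_udata_t *)_udata;

    (void)op;
    if (ptr != udata->buf_ptr)
        return -1;
    if (--udata->ref_count == 0) {
        /* notify the application that HDF5 is done with the buffer */
    }
    return 0;
}
```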

    +

    For more information on user defined data, see the “H5Pset_file_image_callbacks” section.

    +

    3.1.3. File Driver Considerations

    +

    When a file image is opened by a driver that sets both the H5FD_FEAT_ALLOW_FILE_IMAGE and the H5FD_FEAT_CAN_USE_FILE_IMAGE_CALLBACKS flags, the driver will allocate a buffer large enough for the initial file image and then copy the image from the property list into this buffer. As processing progresses, the driver will reallocate the image as necessary to increase its size and will eventually discard the image at file close. If defined, the driver will use the file image callbacks for these operations; otherwise, the driver will use the standard C library calls. See the "H5Pset_file_image_callbacks” section for more information.

    +

    As described above, the file image callbacks can be constructed so as to avoid the overhead of buffer allocations and copies while allowing the HDF5 Library to maintain its illusions on the subject. There are two possible complications involving the file driver. The complications are the possibility of reallocation calls from the driver and the possibility of the continued existence of property lists containing references to the buffer.

    +

    Suppose an application wishes to share a file image buffer with the HDF5 Library. The application allows the library to read (and possibly write) the image, but not free it. We must first decide whether the image is to be opened read-only or read/write.

    +

    If the image will be opened read-only (or if we know that any writes will not change the size of the image), the image_realloc() call should never be invoked. Thus the image_realloc() routine can be constructed so as to always fail, and the image_malloc(), image_memcpy(), and image_free() routines can be constructed as described in the section above.

    +

    Suppose, however, that the file image will be opened read/write and may grow during the computation. We must now allow for the base address of the buffer to change due to reallocation calls, and we must employ the user data structure to communicate any change in the buffer base address and size to the application. We pass buffer changes to the application so that the application will be able to eventually free the buffer. To this end, we might define a user data structure as shown in the example below:

    +
    typedef struct udata {
        void   *init_ptr;
        size_t  init_size;
        int     init_ref_count;
        void   *mod_ptr;
        size_t  mod_size;
        int     mod_ref_count;
    } udata_t;
    +
    +

    Example 1. Using a user data structure to communicate with an application

    +

    We initialize an instance of the structure so that init_ptr points to the buffer to be shared, init_size contains the initial size of the buffer, and all other fields are initialized to either NULL or 0 as indicated by their type. We then pass a pointer to the instance of the user data structure to the HDF5 Library along with allocation callback functions constructed as follows:

    +

- Construct the image_malloc() call so that it returns the value in the init_ptr field of the user data structure and increments init_ref_count. As a sanity check, the function should fail if the requested size does not match the init_size field in the user data structure or if any of the modified fields have values other than their initial values.
- Construct the image_memcpy() call so that it does nothing. As a sanity check, it should be made to fail if the source, destination, and size parameters do not match the init_ptr and init_size fields as appropriate.
- Construct the image_realloc() call so that it performs a standard realloc. Sanity checking, assuming that the realloc is successful, should be as follows:
  - If the mod_ptr, mod_size, and mod_ref_count fields of the user data structure still have their initial values, verify that the supplied pointer matches the init_ptr field and that the supplied size does not match the init_size field. Decrement init_ref_count, set mod_ptr equal to the address returned by realloc, set mod_size equal to the supplied size, and set mod_ref_count to 1.
  - If the mod_ptr, mod_size, and mod_ref_count fields of the user data structure have been modified, verify that the supplied pointer matches the value of mod_ptr and that the supplied size does not match mod_size. Set mod_ptr equal to the value returned by realloc, and set mod_size equal to the supplied size.
  - In both cases, if all sanity checks pass, return the value returned by the realloc call. Otherwise, return NULL.

    +

Construct the image_free() routine so that it does nothing. Perform sanity checks as follows:

- If file_image_op is H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE, decrement the init_ref_count field of the user data structure. Flag an error if init_ref_count drops below zero.
- If file_image_op is H5_FILE_IMAGE_OP_FILE_CLOSE, check to see whether the mod_ptr, mod_size, or mod_ref_count fields of the user data structure have been modified from their initial values. If they have, verify that mod_ref_count contains 1 and then set that field to zero. If they have not been modified, proceed as in the H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE case.
- In either case, if both the init_ref_count and mod_ref_count fields have dropped to zero, notify the application that the HDF5 Library is done with the buffer. If the mod_ptr or mod_size fields have been modified, pass these values on to the application as well.

A sketch of an image_realloc() routine constructed to these rules follows.
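The sketch below is a minimal illustration, not library code: it uses the udata_t structure from Example 1, most error handling is reduced to returning NULL, and the size checks suggested above are only noted in comments.

```c
/* Minimal sketch of an image_realloc() following the rules above.
 * Assumes <stdlib.h> and the udata_t structure from Example 1.
 * The rules above also suggest verifying that the supplied size
 * differs from init_size (first case) or mod_size (second case). */
static void *
image_realloc(void *ptr, size_t size, H5_file_image_op_t file_image_op, void *_udata)
{
    udata_t *u       = (udata_t *)_udata;
    void    *new_ptr = NULL;

    (void)file_image_op;   /* not needed for these checks */

    if (u->mod_ptr == NULL && u->mod_size == 0 && u->mod_ref_count == 0) {
        /* first reallocation: must be applied to the initial buffer */
        if (ptr != u->init_ptr)
            return NULL;
        if ((new_ptr = realloc(ptr, size)) == NULL)
            return NULL;
        u->init_ref_count--;
        u->mod_ptr       = new_ptr;
        u->mod_size      = size;
        u->mod_ref_count = 1;
    } else {
        /* subsequent reallocations: must be applied to the modified buffer */
        if (ptr != u->mod_ptr)
            return NULL;
        if ((new_ptr = realloc(ptr, size)) == NULL)
            return NULL;
        u->mod_ptr  = new_ptr;
        u->mod_size = size;
    }
    return new_ptr;
}
```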

    +

    3.2. Initial File Image Semantics

    +

    One can argue whether creating a file with an initial file image is closer to creating a file or opening a file. The consensus seems to be that it is closer to a file open, and thus we shall require that the initial image only be used for calls to H5Fopen().

    +

    Whatever our convention, from an internal perspective, opening a file with an initial file image is a bit of both creating a file and opening a file. Conceptually, we will create a file on disk, write the supplied image to the file, close the file, open the file as an HDF5 file, and then proceed as usual (of course, the Core VFD will not write to the file system unless it is configured to do so). This process is similar to a file create: we are creating a file that did not exist on disk to begin with and writing data to it. Also, we must verify that no file of the supplied name is open. However, this process is also similar to a file open: we must read the superblock and handle the usual file open tasks.

    +

    Implementing the above sequence of actions has a number of implications on the behavior of the H5Fopen() call when an initial file image is supplied:

    +

- H5Fopen() must fail if the target file driver does not set the H5FD_FEAT_ALLOW_FILE_IMAGE flag and a file image is specified in the FAPL.
- If the target file driver supports the H5FD_FEAT_ALLOW_FILE_IMAGE flag, then H5Fopen() must fail if the file is already open or if a file of the specified name exists.
- Even if the above constraints are satisfied, H5Fopen() must still fail if the image does not contain a valid (or perhaps just plausibly valid) image of an HDF5 file. In particular, the superblock must be processed, and the file structure set up accordingly.

See the “Virtual File Driver Feature Flags” section for more information.

    +

    As we indicated earlier, if an initial file image appears in the property list of an H5Fcreate() call, it is ignored.

    +

    While the above section on the semantics of the file image callbacks may seem rather gloomy, we get the payback here. The above says everything that needs to be said about initial file image semantics in general. The sub-section below has a few more observations on the Core file driver.

    +

    3.2.1. Applying Initial File Image Semantics to the Core File Driver

    +

    At present, the Core file driver uses the open() and read() system calls to load an HDF5 file image from the file system into RAM. Further, if the backing_store flag is set in the FAPL entry specifying the use of the Core file driver, the Core file driver’s internal image will be used to overwrite the source file on either flush or close. See the H5Pset_fapl_core entry in the HDF5 Reference Manual for more information.

    +

    This results in the following observations. In all cases assume that use of the Core file driver has been specified in the FAPL.

    +

- If the file specified in the H5Fopen() call does not exist, and no initial image is specified in the FAPL, the open must fail because there is no source for the initial image needed by the Core file driver.
- If the file specified in the H5Fopen() call does exist, and an initial image is specified in the FAPL, the open must fail because the source of the needed initial image is ambiguous: the image could be taken either from the file or from the FAPL.
- If the file specified in the H5Fopen() call does not exist, and an initial image is specified in the FAPL, the open will succeed. This assumes that the supplied image is valid. Further, if the backing store flag is set, the file specified in the H5Fopen() call will be created, and the contents of the Core file driver’s internal buffer will be written to the new file on flush or close.

Thus a call to H5Fopen() can result in the creation of a new HDF5 file in the file system. A sketch of this last case follows.
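The following sketch illustrates that last case: opening a supplied image with the Core file driver and a backing store, so that the image is written out to a new file on flush or close. The file name, buffer variables, and increment value are illustrative assumptions.

```c
/* Sketch: open an in-memory image with the Core VFD and a backing store.
 * 'buf' and 'buf_len' are assumed to hold a valid HDF5 file image. */
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);

H5Pset_fapl_core(fapl_id, (size_t)(64 * 1024), 1 /* backing_store */);
H5Pset_file_image(fapl_id, buf, buf_len);        /* supply the initial image */

/* Succeeds only if "new_file.h5" does not already exist; on flush or close
 * the Core VFD writes its internal buffer to the newly created file. */
hid_t file_id = H5Fopen("new_file.h5", H5F_ACC_RDWR, fapl_id);

/* ... read and/or write the file as desired ... */

H5Fclose(file_id);
H5Pclose(fapl_id);
```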

    +

    4. Examples

    +

    The purpose of this chapter is to provide examples of how to read or build an in-memory HDF5 file image.

    +

    4.1. Reading an In-memory HDF5 File Image

    +

    The H5Pset_file_image() function call allows the Core file driver to be initialized from an application provided buffer. The following pseudo code illustrates its use:

    +

    <allocate and initialize buf_len and buf>
    <allocate fapl_id>

    H5Pset_file_image(fapl_id, buf, buf_len);

    <open file using fapl_id>

    <read and/or write file as desired, close>

Example 2. Using H5Pset_file_image to initialize the Core file driver

    +

    This solution is easy to code, but the supplied buffer is duplicated twice. The first time is in the call to H5Pset_file_image() when the image is duplicated and the duplicate inserted into the property list. The second time is when the file is opened: the image is copied from the property list into the initial buffer allocated by the Core file driver. This is a non-issue for small images, but this could become a significant performance hit for large images.

    +

If we want to avoid the extra malloc and memcpy calls, we must decide whether the application should retain ownership of the buffer or pass ownership to the HDF5 Library.

    +

The following pseudo code illustrates opening the image read-only using the H5LTopen_file_image() routine. In this example, the application retains ownership of the buffer and avoids extra buffer allocations and memcpy calls.

    +

    <allocate and initialize buf_len and buf>

    hid_t file_id;
    unsigned flags = H5LT_FILE_IMAGE_DONT_COPY | H5LT_FILE_IMAGE_DONT_RELEASE;

    file_id = H5LTopen_file_image(buf, buf_len, flags);

    <read file as desired, and then close>

Example 3. Using H5LTopen_file_image to open a read-only file image where the application retains ownership of the buffer

If the application wants to transfer ownership of the buffer to the HDF5 Library, and the standard C library routine free() is an acceptable way of discarding it, the above example can be modified as follows:

    +

    <allocate and initialize buf_len and buf>

    hid_t file_id;
    unsigned flags = H5LT_FILE_IMAGE_DONT_COPY;

    file_id = H5LTopen_file_image(buf, buf_len, flags);

    <read file as desired, and then close>

    +

Example 4. Using H5LTopen_file_image to open a read-only file image where the application transfers ownership of the buffer

Again, file access is read-only. Read/write access can be obtained via the H5LTopen_file_image() call, but we will explore that in the section below.

    +

    4.2. In-memory HDF5 File Image Construction

    +

    Before the implementation of file image operations, HDF5 supported construction of an image of an HDF5 file in memory with the Core file driver. The H5Fget_file_image() function call allows an application access to the file image without first writing it to disk. See the following code fragment:

    H5Fflush(fid);
    size = H5Fget_file_image(fid, NULL, 0);
    buffer_ptr = malloc(size);
    H5Fget_file_image(fid, buffer_ptr, size);

Example 5. Accessing the image of a file in memory

    The use of H5Fget_file_image() may be acceptable for small images. For large images, the cost of the malloc() and memcpy() operations may be excessive. To address this issue, the H5Pset_file_image_callbacks() call allows an application to manage dynamic memory allocation for file images and memory-based file drivers (only the Core file driver at present). The following code fragment illustrates its use. Note that most error checking is omitted for simplicity and that H5Pset_file_image is not used to set the initial file image.

    +

    struct udata_t {
        void   *image_ptr;
        size_t  image_size;
    } udata = {NULL, 0};

    void *image_malloc(size_t size, H5_file_image_op_t file_image_op, void *udata)
    {
        ((struct udata_t *)udata)->image_size = size;
        return(malloc(size));
    }

    void *image_memcpy(void *dest, const void *src, size_t size,
                       H5_file_image_op_t file_image_op, void *udata)
    {
        assert(FALSE); /* Should never be invoked in this scenario. */
        return(NULL);  /* always fails */
    }

    void *image_realloc(void *ptr, size_t size, H5_file_image_op_t file_image_op,
                        void *udata)
    {
        ((struct udata_t *)udata)->image_size = size;
        return(realloc(ptr, size));
    }

    herr_t image_free(void *ptr, H5_file_image_op_t file_image_op, void *udata)
    {
        assert(file_image_op == H5_FILE_IMAGE_OP_FILE_CLOSE);
        ((struct udata_t *)udata)->image_ptr = ptr;
        return(0); /* if we get here, we must have been successful */
    }

    void *udata_copy(void *udata)
    {
        return(udata);
    }

    herr_t udata_free(void *udata)
    {
        return(0);
    }

    H5_file_image_callbacks_t callbacks = {image_malloc, image_memcpy,
                                           image_realloc, image_free,
                                           udata_copy, udata_free,
                                           (void *)(&udata)};

    <allocate fapl_id>

    H5Pset_file_image_callbacks(fapl_id, &callbacks);

    <open core file using fapl_id, write file, close it>

    assert(udata.image_ptr != NULL);

    /* udata now contains the base address and length of the final version
     * of the core file */

    <use image of file, and then discard it via free()>

Example 6. Using H5Pset_file_image_callbacks to improve memory allocation

    +

    The above code fragment gives the application full ownership of the buffer used by the Core file driver after the file is closed, and it notifies the application that the HDF5 Library is done with the buffer by setting udata.image_ptr to something other than NULL. If read access to the buffer is sufficient, the H5Fget_vfd_handle() call can be used as an alternate solution to get access to the base address of the Core file driver’s buffer.
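For example, a read-only look at the Core file driver's buffer might be obtained as in the sketch below; file_id and fapl_id are assumed to refer to a file opened with the Core VFD, and the buffer remains owned by the library.

```c
/* Sketch: obtain the base address of the Core VFD's in-memory image.
 * The handle must be treated as read-only; do not free or resize it. */
void *image_base = NULL;

if (H5Fget_vfd_handle(file_id, fapl_id, &image_base) >= 0 && image_base != NULL) {
    /* inspect the image in place */
}
```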

    +

The above solution avoids some unnecessary malloc and memcpy calls and should be quite adequate if an image of an HDF5 file is constructed only occasionally. However, if an HDF5 file image must be constructed regularly, and if we can put a strong and tight upper bound on the size of the necessary buffer, then the following pseudo code demonstrates a method of avoiding memory allocation completely. The downside, however, is that the buffer is allocated statically. Again, much error checking is omitted for clarity.

    +

    char buf[BIG_ENOUGH];

    struct udata_t {
        void   *image_ptr;
        size_t  image_size;
        size_t  max_image_size;
        int     ref_count;
    } udata = {(void *)(&(buf[0])), 0, BIG_ENOUGH, 0};

    void *image_malloc(size_t size, H5_file_image_op_t file_image_op, void *udata)
    {
        assert(size <= ((struct udata_t *)udata)->max_image_size);
        assert(((struct udata_t *)udata)->ref_count == 0);
        ((struct udata_t *)udata)->image_size = size;
        (((struct udata_t *)udata)->ref_count)++;
        return(((struct udata_t *)udata)->image_ptr);
    }

    void *image_memcpy(void *dest, const void *src, size_t size,
                       H5_file_image_op_t file_image_op, void *udata)
    {
        assert(FALSE); /* Should never be invoked in this scenario. */
        return(NULL);  /* always fails */
    }

    void *image_realloc(void *ptr, size_t size, H5_file_image_op_t file_image_op,
                        void *udata)
    {
        assert(ptr == ((struct udata_t *)udata)->image_ptr);
        assert(size <= ((struct udata_t *)udata)->max_image_size);
        assert(((struct udata_t *)udata)->ref_count == 1);
        ((struct udata_t *)udata)->image_size = size;
        return(((struct udata_t *)udata)->image_ptr);
    }

    herr_t image_free(void *ptr, H5_file_image_op_t file_image_op, void *udata)
    {
        assert(file_image_op == H5_FILE_IMAGE_OP_FILE_CLOSE);
        assert(ptr == ((struct udata_t *)udata)->image_ptr);
        assert(((struct udata_t *)udata)->ref_count == 1);
        (((struct udata_t *)udata)->ref_count)--;
        return(0); /* if we get here, we must have been successful */
    }

    void *udata_copy(void *udata)
    {
        return(udata);
    }

    herr_t udata_free(void *udata)
    {
        return(0);
    }

    H5_file_image_callbacks_t callbacks = {image_malloc, image_memcpy,
                                           image_realloc, image_free,
                                           udata_copy, udata_free,
                                           (void *)(&udata)};
    /* end of initialization */

    <allocate fapl_id>

    H5Pset_file_image_callbacks(fapl_id, &callbacks);

    <open core file using fapl_id>

    <write the file, flush it, and then close it>

    assert(udata.ref_count == 0);

    /* udata now contains the base address and length of the final version
     * of the core file */

    <reinitialize udata, and repeat the above from the end of initialization
     onwards to write a new file image>

Example 7. Using H5Pset_file_image_callbacks with a static buffer

    +

    If we can further arrange matters so that only the contents of the datasets in the HDF5 file image change, but not the structure of the file itself, we can optimize still further by re-using the image and changing only the contents of the datasets after the initial write to the buffer. The following pseudo code shows how this might be done. Note that the code assumes that buf already contains the image of the HDF5 file whose dataset contents are to be overwritten. Again, much error checking is omitted for clarity. Also, observe that the file image callbacks do not support the H5Pget_file_image() call.

    void *image_malloc(size_t size, H5_file_image_op_t file_image_op, void *udata)
    {
        assert(size <= ((struct udata_t *)udata)->max_image_size);
        assert(size == ((struct udata_t *)udata)->image_size);
        assert(((struct udata_t *)udata)->ref_count >= 0);
        ((struct udata_t *)udata)->image_size = size;
        (((struct udata_t *)udata)->ref_count)++;
        return(((struct udata_t *)udata)->image_ptr);
    }

    void *image_memcpy(void *dest, const void *src, size_t size,
                       H5_file_image_op_t file_image_op, void *udata)
    {
        assert(dest == ((struct udata_t *)udata)->image_ptr);
        assert(src == ((struct udata_t *)udata)->image_ptr);
        assert(size <= ((struct udata_t *)udata)->max_image_size);
        assert(size == ((struct udata_t *)udata)->image_size);
        assert(((struct udata_t *)udata)->ref_count >= 1);
        return(dest); /* if we get here, we must have been successful */
    }

    void *image_realloc(void *ptr, size_t size, H5_file_image_op_t file_image_op,
                        void *udata)
    {
        /* One would think that this function is not needed in this scenario, as
         * only the contents of the HDF5 file is being changed, not its size or
         * structure. However, the Core file driver calls realloc() just before
         * close to clip the buffer to the size indicated by the end of the
         * address space.
         *
         * While this call must be supported in this case, the size of
         * the image should never change. Hence the function can limit itself
         * to performing sanity checks, and returning the base address of the
         * statically allocated buffer.
         */
        assert(ptr == ((struct udata_t *)udata)->image_ptr);
        assert(size <= ((struct udata_t *)udata)->max_image_size);
        assert(((struct udata_t *)udata)->ref_count >= 1);
        assert(((struct udata_t *)udata)->image_size == size);
        return(((struct udata_t *)udata)->image_ptr);
    }

    herr_t image_free(void *ptr, H5_file_image_op_t file_image_op, void *udata)
    {
        assert((file_image_op == H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE) ||
               (file_image_op == H5_FILE_IMAGE_OP_FILE_CLOSE));
        assert(((struct udata_t *)udata)->ref_count >= 1);
        (((struct udata_t *)udata)->ref_count)--;
        return(0); /* if we get here, we must have been successful */
    }

    void *udata_copy(void *udata)
    {
        return(udata);
    }

    herr_t udata_free(void *udata)
    {
        return(0);
    }

    H5_file_image_callbacks_t callbacks = {image_malloc, image_memcpy,
                                           image_realloc, image_free,
                                           udata_copy, udata_free,
                                           (void *)(&udata)};
    /* end of initialization */

    H5Pset_file_image_callbacks(fapl_id, &callbacks);
    H5Pset_file_image(fapl_id, udata.image_ptr, udata.image_size);

    <open core file using fapl_id>

    <update the datasets, flush the file, and then close it>

    assert(udata.ref_count == 0);

    /* udata now contains the base address and length of the final version
     * of the core file */

Example 8. Using H5Pset_file_image_callbacks where only the datasets change

    Before we go on, we should note that the above pseudo code can be written more compactly, albeit with fewer sanity checks, using the H5LTopen_file_image() call. See the example below:

    hid_t file_id;
    unsigned flags = H5LT_FILE_IMAGE_OPEN_RW | H5LT_FILE_IMAGE_DONT_COPY |
                     H5LT_FILE_IMAGE_DONT_RELEASE;
    /* end initialization */

    file_id = H5LTopen_file_image(udata.image_ptr, udata.image_size, flags);

    /* udata now contains the base address and length of the final version
     * of the core file */

Example 9. Using H5LTopen_file_image where only the datasets change

The above pseudo code allows updates of a file image about as cheaply as possible. We assume the application has enough RAM for the image and that the HDF5 file structure is constant after the first write.

    +

While the scenario above is plausible, we will finish this section with a more general scenario. In the pseudo code below, we assume sufficient RAM to retain the HDF5 file image between uses, but we do not assume that the HDF5 file structure remains constant or that we can place a hard upper bound on the image size.

    +

Since we must use malloc, realloc, and free in this example, and since realloc can change the base address of a buffer, we must maintain two ptr/size/ref_count triples in the udata structure. The first triple is for the property list (which will never change the buffer), and the second triple is for the file driver. As shall be seen, this complicates the file image callbacks considerably. Note also that while we do not use H5Pget_file_image() in this example, we do include support for it in the file image callbacks. As usual, much error checking is omitted in favor of clarity.

    +

    struct udata_t {
        void   *fapl_image_ptr;
        size_t  fapl_image_size;
        int     fapl_ref_count;
        void   *vfd_image_ptr;
        size_t  vfd_image_size;
        int     vfd_ref_count;
    } udata = {NULL, 0, 0, NULL, 0, 0};

    hbool_t initial_file_open = TRUE;

    void *image_malloc(size_t size, H5_file_image_op_t file_image_op, void *udata)
    {
        void *return_value = NULL;

        switch ( file_image_op ) {
            case H5_FILE_IMAGE_OP_PROPERTY_LIST_SET:
            case H5_FILE_IMAGE_OP_PROPERTY_LIST_COPY:
                assert(((struct udata_t *)udata)->fapl_image_ptr != NULL);
                assert(((struct udata_t *)udata)->fapl_image_size == size);
                assert(((struct udata_t *)udata)->fapl_ref_count >= 0);
                return_value = ((struct udata_t *)udata)->fapl_image_ptr;
                (((struct udata_t *)udata)->fapl_ref_count)++;
                break;

            case H5_FILE_IMAGE_OP_PROPERTY_LIST_GET:
                assert(((struct udata_t *)udata)->fapl_image_ptr != NULL);
                assert(((struct udata_t *)udata)->fapl_image_size == size);
                assert(((struct udata_t *)udata)->fapl_ref_count >= 1);
                return_value = ((struct udata_t *)udata)->fapl_image_ptr;
                /* don't increment ref count */
                break;

            case H5_FILE_IMAGE_OP_FILE_OPEN:
                assert(((struct udata_t *)udata)->vfd_image_ptr == NULL);
                assert(((struct udata_t *)udata)->vfd_image_size == 0);
                assert(((struct udata_t *)udata)->vfd_ref_count == 0);
                if ( ((struct udata_t *)udata)->fapl_image_ptr == NULL ) {
                    ((struct udata_t *)udata)->vfd_image_ptr = malloc(size);
                    ((struct udata_t *)udata)->vfd_image_size = size;
                } else {
                    assert(((struct udata_t *)udata)->fapl_image_size == size);
                    assert(((struct udata_t *)udata)->fapl_ref_count >= 1);
                    ((struct udata_t *)udata)->vfd_image_ptr =
                        ((struct udata_t *)udata)->fapl_image_ptr;
                    ((struct udata_t *)udata)->vfd_image_size = size;
                }
                return_value = ((struct udata_t *)udata)->vfd_image_ptr;
                (((struct udata_t *)udata)->vfd_ref_count)++;
                break;

            default:
                assert(FALSE);
        }
        return(return_value);
    }

    void *image_memcpy(void *dest, const void *src, size_t size,
                       H5_file_image_op_t file_image_op, void *udata)
    {
        switch ( file_image_op ) {
            case H5_FILE_IMAGE_OP_PROPERTY_LIST_SET:
            case H5_FILE_IMAGE_OP_PROPERTY_LIST_COPY:
            case H5_FILE_IMAGE_OP_PROPERTY_LIST_GET:
                assert(dest == ((struct udata_t *)udata)->fapl_image_ptr);
                assert(src == ((struct udata_t *)udata)->fapl_image_ptr);
                assert(size == ((struct udata_t *)udata)->fapl_image_size);
                assert(((struct udata_t *)udata)->fapl_ref_count >= 1);
                break;

            case H5_FILE_IMAGE_OP_FILE_OPEN:
                assert(dest == ((struct udata_t *)udata)->vfd_image_ptr);
                assert(src == ((struct udata_t *)udata)->fapl_image_ptr);
                assert(size == ((struct udata_t *)udata)->fapl_image_size);
                assert(size == ((struct udata_t *)udata)->vfd_image_size);
                assert(((struct udata_t *)udata)->fapl_ref_count >= 1);
                assert(((struct udata_t *)udata)->vfd_ref_count == 1);
                break;

            default:
                assert(FALSE);
                break;
        }
        return(dest); /* if we get here, we must have been successful */
    }

    void *image_realloc(void *ptr, size_t size, H5_file_image_op_t file_image_op,
                        void *udata)
    {
        assert(ptr == ((struct udata_t *)udata)->vfd_image_ptr);
        assert(((struct udata_t *)udata)->vfd_ref_count == 1);
        ((struct udata_t *)udata)->vfd_image_ptr = realloc(ptr, size);
        ((struct udata_t *)udata)->vfd_image_size = size;
        return(((struct udata_t *)udata)->vfd_image_ptr);
    }

    herr_t image_free(void *ptr, H5_file_image_op_t file_image_op, void *udata)
    {
        switch ( file_image_op ) {
            case H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE:
                assert(ptr == ((struct udata_t *)udata)->fapl_image_ptr);
                assert(((struct udata_t *)udata)->fapl_ref_count >= 1);
                (((struct udata_t *)udata)->fapl_ref_count)--;
                break;

            case H5_FILE_IMAGE_OP_FILE_CLOSE:
                assert(ptr == ((struct udata_t *)udata)->vfd_image_ptr);
                assert(((struct udata_t *)udata)->vfd_ref_count == 1);
                (((struct udata_t *)udata)->vfd_ref_count)--;
                break;

            default:
                assert(FALSE);
                break;
        }
        return(0); /* if we get here, we must have been successful */
    }

    void *udata_copy(void *udata)
    {
        return(udata);
    }

    herr_t udata_free(void *udata)
    {
        return(0);
    }

    H5_file_image_callbacks_t callbacks = {image_malloc, image_memcpy,
                                           image_realloc, image_free,
                                           udata_copy, udata_free,
                                           (void *)(&udata)};
    /* end of initialization */

    <allocate fapl_id>

    H5Pset_file_image_callbacks(fapl_id, &callbacks);

    if ( initial_file_open ) {
        initial_file_open = FALSE;
    } else {
        assert(udata.vfd_image_ptr != NULL);
        assert(udata.vfd_image_size > 0);
        assert(udata.vfd_ref_count == 0);
        assert(udata.fapl_ref_count == 0);
        udata.fapl_image_ptr = udata.vfd_image_ptr;
        udata.fapl_image_size = udata.vfd_image_size;
        udata.vfd_image_ptr = NULL;
        udata.vfd_image_size = 0;
        H5Pset_file_image(fapl_id, udata.fapl_image_ptr, udata.fapl_image_size);
    }

    <open core file using fapl_id>

    <write/update the file, and then close it>

    assert(udata.fapl_ref_count == 0);
    assert(udata.vfd_ref_count == 0);

    /* udata.vfd_image_ptr and udata.vfd_image_size now contain the base
     * address and length of the final version of the core file */

    +

Example 10. Using H5Pset_file_image_callbacks where the file structure and image size might not be constant

    +

    The above pseudo code shows how a buffer can be passed back and forth between the application and the HDF5 Library. The code also shows the application having control of the actual allocation, reallocation, and freeing of the buffer.

    +

    4.3. Using HDF5 to Construct and Read a Data Packet

    +

    Using the file image operations described in this document, we can bundle up data in an image of an HDF5 file on one process, transmit the image to a second process, and then open and read the image on the second process without any mandatory file system I/O.

    +

    We have already demonstrated the construction and reading of such buffers above, but it may be useful to offer an example of the full operation. We do so in the example below using as simple a set of calls as possible. The set of calls in the example has extra buffer allocations. To reduce extra buffer allocations, see the sections above.

    +

    In the following example, we construct an HDF5 file image on process A and then transmit the image to process B where we then open the image and extract the desired data. Note that no file system I/O is performed: all the processing is done in memory with the Core file driver.

    +

    *** Process A ***

    H5Fflush(fid);
    size = H5Fget_file_image(fid, NULL, 0);
    buffer_ptr = malloc(size);
    H5Fget_file_image(fid, buffer_ptr, size);

    <transmit *buffer_ptr>

    free(buffer_ptr);

    *** Process B ***

    hid_t file_id;

    <receive the image size in buf_len>

    buffer_ptr = malloc(buf_len);

    <receive the image into *buffer_ptr>

    file_id = H5LTopen_file_image(buffer_ptr,
                                  buf_len,
                                  H5LT_FILE_IMAGE_DONT_COPY);

    <read data from the file as desired, and then close it>

Example 11. Building and passing a file image from one process to another
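As one purely illustrative way to fill in the transmit/receive placeholders above, the sketch below assumes MPI is used as the transport, with process A on rank 0 and process B on rank 1; the ranks, tags, and use of MPI_BYTE are assumptions, not part of the HDF5 API.

```c
#include <mpi.h>
#include "hdf5.h"
#include "hdf5_hl.h"

/* Process A (rank 0): send the image length, then the image itself.
 * 'size' and 'buffer_ptr' come from the H5Fget_file_image() calls above. */
long long img_len = (long long)size;
MPI_Send(&img_len, 1, MPI_LONG_LONG, 1, 0, MPI_COMM_WORLD);
MPI_Send(buffer_ptr, (int)size, MPI_BYTE, 1, 1, MPI_COMM_WORLD);

/* Process B (rank 1): receive the length, allocate, receive the image,
 * and open it directly as an HDF5 file image. */
long long recv_len = 0;
MPI_Recv(&recv_len, 1, MPI_LONG_LONG, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

void *recv_buf = malloc((size_t)recv_len);
MPI_Recv(recv_buf, (int)recv_len, MPI_BYTE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

hid_t file_id = H5LTopen_file_image(recv_buf, (size_t)recv_len,
                                    H5LT_FILE_IMAGE_DONT_COPY);
/* ... read the desired data ...; H5Fclose(file_id) will also release
 * recv_buf, since H5LT_FILE_IMAGE_DONT_RELEASE is not set. */
```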

    4.4. Using a Template File

    +

    After the above examples, an example of the use of a template file might seem anti-climactic. A template file might be used to enforce consistency on file structure between files or in parallel HDF5 to avoid long sequences of collective operations to create the desired groups, datatypes, and possibly datasets. The following pseudo code outlines a potential use:

    +

    <allocate and initialize buf and buf_len, with buf containing the desired
     initial image (which in turn contains the desired group, datatype, and
     dataset definitions), and buf_len containing the size of buf>

    <allocate fapl_id>

    H5Pset_file_image(fapl_id, buf, buf_len);

    <open file using fapl_id>

    <read and/or write file as desired, close>

Example 12. Using a template file

    +

    Observe that the above pseudo code includes an unnecessary buffer allocation and copy in the call to H5Pset_file_image(). As we have already discussed ways of avoiding this, we will not address that issue here.

    +

    What is interesting in this case is to consider why the application would find this use case attractive.

    +

    In the serial case, at first glance there seems little reason to use the initial image facility at all. It is easy enough to use standard C calls to duplicate a template file, rename it as desired, and then open it as an HDF5 file.

    +

    However, this assumes that the template file will always be available and in the expected place. This is a questionable assumption for an application that will be widely distributed. Thus, we can at least make an argument for either keeping an image of the template file in the executable or for including code for writing the desired standard definitions to new HDF5 files.

    +

    Assuming the image is relatively small, we can further make an argument for the image in place of the code, as, quite simply, the image should be easier to maintain and modify with an HDF5 file editor.

    +

    However, there remains the question of why one should pass the image to the HDF5 Library instead of writing it directly with standard C calls and then using HDF5 to open it. Other than convenience and a slight reduction in code size, we are hard pressed to offer a reason.

    +

In contrast, the argument is stronger in the parallel case, since group, datatype, and dataset creations are all expensive collective operations. However, it is also weakened there: simply copying an existing template file and then opening it should lose many of its disadvantages in the HPC context, although we would imagine that it is always useful to reduce the number of files in a deployment.

    +

    In closing, we would like to consider one last point. In the parallel case, we would expect template files to be quite large. Parallel HDF5 requires eager space allocation for chunked datasets. For similar reasons, we would expect template files in this context to contain long sequences of zeros with a scattering of metadata here and there. Such files would compress well, and the compressed images would be cheap to distribute across the available processes if necessary. Once distributed, each process could uncompress the image and write to file those sections containing actual data that lay within the section of the file assigned to the process. This approach might be significantly faster than a simple copy as it would allow sparse writes, and thus it might provide a compelling use case for template files. However, this approach would require extending our current API to allow compressed images. We would also have to add the H5Pget/set_image_decompression_callback() API calls. We see no problem in doing this. However, it is beyond the scope of the current effort, and thus we will not pursue the matter further unless there is interest in our doing so.

    +

    5. Java Signatures for File Image Operations API Calls

    +

    Potential Java function call signatures for the file image operation APIs are described in this section. These have not yet been implemented, and there are no immediate plans for implementation.

    +

    Note that the H5LTopen_file_image() call is omitted. Our practice has been to not support high-level library calls in Java.

    +

    H5Pset_file_image

    +

    int H5Pset_file_image(int fapl_id, final byte[] buf_ptr);

H5Pget_file_image

    int H5Pget_file_image(int fapl_id, byte[] buf_ptr_ptr);

H5_file_image_op_t

    public enum H5_file_image_op_t
    {
        H5_FILE_IMAGE_OP_PROPERTY_LIST_SET,
        H5_FILE_IMAGE_OP_PROPERTY_LIST_COPY,
        H5_FILE_IMAGE_OP_PROPERTY_LIST_GET,
        H5_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE,
        H5_FILE_IMAGE_OP_FILE_OPEN,
        H5_FILE_IMAGE_OP_FILE_RESIZE,
        H5_FILE_IMAGE_OP_FILE_CLOSE
    }

H5_file_image_malloc_cb

    public interface H5_file_image_malloc_cb extends Callbacks {
        buf[] callback(H5_file_image_op_t file_image_op, CBuserdata udata);
    }

H5_file_image_memcpy_cb

    public interface H5_file_image_memcpy_cb extends Callbacks {
        buf[] callback(buf[] dest, final buf[] src,
                       H5_file_image_op_t file_image_op, CBuserdata udata);
    }

H5_file_image_realloc_cb

    public interface H5_file_image_realloc_cb extends Callbacks {
        buf[] callback(buf[] ptr, H5_file_image_op_t file_image_op, CBuserdata udata);
    }

H5_file_image_free_cb

    public interface H5_file_image_free_cb extends Callbacks {
        void callback(buf[] ptr, H5_file_image_op_t file_image_op, CBuserdata udata);
    }

H5_file_udata_copy_cb

    public interface H5_file_udata_copy_cb extends Callbacks {
        buf[] callback(CBuserdata udata);
    }

H5_file_udata_free_cb

    public interface H5_file_udata_free_cb extends Callbacks {
        void callback(CBuserdata udata);
    }

H5_file_image_callbacks_t

    public abstract class H5_file_image_callbacks_t
    {
        H5_file_image_malloc_cb  image_malloc;
        H5_file_image_memcpy_cb  image_memcpy;
        H5_file_image_realloc_cb image_realloc;
        H5_file_image_free_cb    image_free;
        H5_file_udata_copy_cb    udata_copy;
        H5_file_udata_free_cb    udata_free;
        CBuserdata               udata;

        public H5_file_image_callbacks_t(H5_file_image_malloc_cb  image_malloc,
                                         H5_file_image_memcpy_cb  image_memcpy,
                                         H5_file_image_realloc_cb image_realloc,
                                         H5_file_image_free_cb    image_free,
                                         H5_file_udata_copy_cb    udata_copy,
                                         H5_file_udata_free_cb    udata_free,
                                         CBuserdata               udata) {
            this.image_malloc  = image_malloc;
            this.image_memcpy  = image_memcpy;
            this.image_realloc = image_realloc;
            this.image_free    = image_free;
            this.udata_copy    = udata_copy;
            this.udata_free    = udata_free;
            this.udata         = udata;
        }
    }

H5Pset_file_image_callbacks

    int H5Pset_file_image_callbacks(int fapl_id,
                                    H5_file_image_callbacks_t callbacks_ptr);

H5Pget_file_image_callbacks

    int H5Pget_file_image_callbacks(int fapl_id,
                                    H5_file_image_callbacks_t[] callbacks_ptr);

H5Fget_file_image

    long H5Fget_file_image(int file_id, byte[] buf_ptr);

    +

    6. Fortran Signatures for File Image Operations API Calls

    +

    Potential Fortran function call signatures for the file image operation APIs are described in this section. These have not yet been implemented, and there are no immediate plans for implementation.

    +

    6.1. Low-level Fortran API Routines

    +

The Fortran low-level APIs make use of Fortran 2003’s ISO_C_BINDING module in order to achieve portable, standard-conforming interoperability with the C APIs. The C pointer (C_PTR) and function pointer (C_FUN_PTR) types are returned from the intrinsic procedures C_LOC(X) and C_FUNLOC(X), respectively, defined in the ISO_C_BINDING module. The argument X is the data or function to which the C pointers point, and it must have the TARGET attribute in the calling program. Note that the variable names of the Fortran equivalents of the predefined C constants were shortened to fewer than 31 characters in order to be Fortran standard compliant.

    +

    6.1.1. H5Pset_file_image_f

    +

    The signature of H5Pset_file_image_f is defined as follows:

    +

SUBROUTINE H5Pset_file_image_f(fapl_id, buf_ptr, buf_len, hdferr)

The parameters of H5Pset_file_image_f are defined as follows:

    +

INTEGER(hid_t), INTENT(IN) :: fapl_id
Will contain the ID of the target file access property list.

TYPE(C_PTR), INTENT(IN) :: buf_ptr
Will supply the C pointer to the initial file image, or C_NULL_PTR if no initial file image is desired.

INTEGER(size_t), INTENT(IN) :: buf_len
Will contain the size of the supplied buffer, or 0 if no initial image is desired.

INTEGER, INTENT(OUT) :: hdferr
Will return the error status: 0 for success and -1 for failure.

    +

    6.1.2. H5Pget_file_image_f

    +

    The signature of H5Pget_file_image_f is defined as follows:

    +

SUBROUTINE H5Pget_file_image_f(fapl_id, buf_ptr, buf_len, hdferr)

The parameters of H5Pget_file_image_f are defined as follows:

    +

INTEGER(hid_t), INTENT(IN) :: fapl_id
Will contain the ID of the target file access property list.

TYPE(C_PTR), INTENT(INOUT), VALUE :: buf_ptr
Will hold either a C_NULL_PTR or a scalar of type c_ptr. If buf_ptr is not C_NULL_PTR, on successful return buf_ptr shall contain a C pointer to a copy of the initial image provided in the last call to H5Pset_file_image_f for the supplied fapl_id, or a C_NULL_PTR if there is no initial image set. The Fortran pointer can be obtained using the intrinsic C_F_POINTER.

INTEGER(size_t), INTENT(OUT) :: buf_len
Will contain the size of the initial image in the supplied fapl_id. The value will be 0 if no initial image is set.

INTEGER, INTENT(OUT) :: hdferr
Will return the error status: 0 for success and -1 for failure.

    +

    6.1.3. H5Pset_file_image_callbacks_f

    +

The constants and the callbacks derived type used by H5Pset_file_image_callbacks_f are defined as follows:

    +

    INTEGER :: H5_IMAGE_OP_PROPERTY_LIST_SET_F   = 0, &
               H5_IMAGE_OP_PROPERTY_LIST_COPY_F  = 1, &
               H5_IMAGE_OP_PROPERTY_LIST_GET_F   = 2, &
               H5_IMAGE_OP_PROPERTY_LIST_CLOSE_F = 3, &
               H5_IMAGE_OP_FILE_OPEN_F           = 4, &
               H5_IMAGE_OP_FILE_RESIZE_F         = 5, &
               H5_IMAGE_OP_FILE_CLOSE_F          = 6

    TYPE, BIND(C) :: H5_file_image_callbacks_t
        TYPE(C_FUN_PTR), VALUE :: image_malloc
        TYPE(C_FUN_PTR), VALUE :: image_memcpy
        TYPE(C_FUN_PTR), VALUE :: image_realloc
        TYPE(C_FUN_PTR), VALUE :: image_free
        TYPE(C_FUN_PTR), VALUE :: udata_copy
        TYPE(C_FUN_PTR), VALUE :: udata_free
        TYPE(C_PTR), VALUE :: udata
    END TYPE H5_file_image_callbacks_t

The semantics of the above values will be the same as those defined in the C enum. See Section 2.1.3 for more information.

    +

    Fortran Callback APIs

    +

    The Fortran callback APIs are shown below.

    +

FUNCTION op_func(size, file_image_op, udata) RESULT(image_malloc)

INTEGER(size_t) :: size
Will contain the size, in bytes, of the image buffer to allocate.

INTEGER :: file_image_op
Will be set to one of the H5_IMAGE_OP_* values indicating the operation being performed on the file image when this callback is invoked.

TYPE(C_PTR), VALUE :: udata
Will be set to the value passed in for the udata parameter to H5Pset_file_image_callbacks_f.

TYPE(C_FUN_PTR), VALUE :: image_malloc
Shall contain a pointer to a function with functionality identical to the standard C library malloc() call.

FUNCTION op_func(dest, src, size, file_image_op, udata) RESULT(image_memcpy)

TYPE(C_PTR), VALUE :: dest
Will contain the address of the buffer into which to copy.

TYPE(C_PTR), VALUE :: src
Will contain the address of the buffer from which to copy.

INTEGER(size_t) :: size
Will contain the number of bytes to copy.

INTEGER :: file_image_op
Will be set to one of the H5_IMAGE_OP_* values indicating the operation being performed on the file image when this callback is invoked.

TYPE(C_PTR), VALUE :: udata
Will be set to the value passed in for the udata parameter to H5Pset_file_image_callbacks_f.

TYPE(C_FUN_PTR), VALUE :: image_memcpy
Shall contain a pointer to a function with functionality identical to the standard C library memcpy() call.

FUNCTION op_func(ptr, size, file_image_op, udata) RESULT(image_realloc)

TYPE(C_PTR), VALUE :: ptr
Will contain the pointer to the buffer being reallocated.

INTEGER(size_t) :: size
Will contain the desired size, in bytes, of the buffer after realloc.

INTEGER :: file_image_op
Will be set to one of the H5_IMAGE_OP_* values indicating the operation being performed on the file image when this callback is invoked.

TYPE(C_PTR), VALUE :: udata
Will be set to the value passed in for the udata parameter to H5Pset_file_image_callbacks_f.

TYPE(C_FUN_PTR), VALUE :: image_realloc
Shall contain a pointer to a function with functionality identical to the standard C library realloc() call.

FUNCTION op_func(ptr, file_image_op, udata) RESULT(image_free)

TYPE(C_PTR), VALUE :: ptr
Will contain the pointer to the buffer being released.

INTEGER :: file_image_op
Will be set to one of the H5_IMAGE_OP_* values indicating the operation being performed on the file image when this callback is invoked.

TYPE(C_PTR), VALUE :: udata
Will be set to the value passed in for the udata parameter to H5Pset_file_image_callbacks_f.

TYPE(C_FUN_PTR), VALUE :: image_free
Shall contain a pointer to a function with functionality identical to the standard C library free() call.

FUNCTION op_func(udata) RESULT(udata_copy)

TYPE(C_PTR), VALUE :: udata
Will be set to the value passed in for the udata parameter to H5Pset_file_image_callbacks_f.

TYPE(C_FUN_PTR), VALUE :: udata_copy
Shall contain a pointer to a function that will allocate a buffer of suitable size, copy the contents of the supplied udata into the new buffer, and return the address of the new buffer. The function will return C_NULL_PTR on failure.

FUNCTION op_func(udata) RESULT(udata_free)

TYPE(C_PTR), VALUE :: udata
Shall contain a pointer value, potentially to user-defined data, that will be passed to the image_malloc, image_memcpy, image_realloc, and image_free callbacks.

    +

The signature of H5Pset_file_image_callbacks_f is defined as follows:

SUBROUTINE H5Pset_file_image_callbacks_f(fapl_id, callbacks_ptr, hdferr)

The parameters are defined as follows:

INTEGER(hid_t), INTENT(IN) :: fapl_id
Will contain the ID of the target file access property list.

TYPE(H5_file_image_callbacks_t), INTENT(IN) :: callbacks_ptr
Will contain the callback derived type. Each function pointer member shall be set to the corresponding Fortran callback via the intrinsic function C_FUNLOC(X), and the udata member via C_LOC(X).

INTEGER, INTENT(OUT) :: hdferr
Will return the error status: 0 for success and -1 for failure.
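The following pseudo code is a minimal sketch of populating the derived type and registering it on a FAPL. It assumes the derived type and constants are exactly as defined above, that the function pointer and udata members are the ISO_C_BINDING function pointer and C pointer types (so C_NULL_FUNPTR and C_NULL_PTR can be assigned to them), and, by analogy with the C semantics in section 2.1.3, that leaving a callback unset directs the library to fall back on the standard C malloc, memcpy, realloc, and free routines.

PROGRAM set_callbacks_sketch
  USE HDF5
  USE ISO_C_BINDING
  IMPLICIT NONE
  INTEGER(hid_t) :: fapl_id
  INTEGER        :: hdferr
  TYPE(H5_file_image_callbacks_t) :: callbacks

  CALL h5open_f(hdferr)
  CALL h5pcreate_f(H5P_FILE_ACCESS_F, fapl_id, hdferr)

  ! Leaving every callback unset requests the library defaults; a real
  ! application would instead supply C_FUNLOC(my_image_malloc) and so on.
  callbacks%image_malloc  = C_NULL_FUNPTR
  callbacks%image_memcpy  = C_NULL_FUNPTR
  callbacks%image_realloc = C_NULL_FUNPTR
  callbacks%image_free    = C_NULL_FUNPTR
  callbacks%udata_copy    = C_NULL_FUNPTR
  callbacks%udata_free    = C_NULL_FUNPTR
  callbacks%udata         = C_NULL_PTR

  CALL h5pset_file_image_callbacks_f(fapl_id, callbacks, hdferr)

  ! ... set a file image and open the file as in section 6.1.1 ...

  CALL h5pclose_f(fapl_id, hdferr)
  CALL h5close_f(hdferr)
END PROGRAM set_callbacks_sketch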

6.1.4. H5Pget_file_image_callbacks_f

The H5Pget_file_image_callbacks_f routine is designed to obtain the current file image callbacks from a file access property list.

The signature is defined as follows:

SUBROUTINE H5Pget_file_image_callbacks_f(fapl_id, callbacks_ptr, hdferr)

The parameters are defined as follows:

INTEGER(hid_t), INTENT(IN) :: fapl_id
Will contain the ID of the target file access property list.

TYPE(H5_file_image_callbacks_t), INTENT(OUT) :: callbacks_ptr
Will contain the callback derived type. Each member of the derived type shall have the same meaning as its C counterpart. See section 2.1.4 for more information.

INTEGER, INTENT(OUT) :: hdferr
Will return the error status: 0 for success and -1 for failure.
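Continuing the sketch above, under the same assumptions about the derived type, the callbacks currently stored in a FAPL can be read back and inspected with the intrinsic C_ASSOCIATED; the helper name below is purely illustrative.

SUBROUTINE report_image_malloc(fapl_id)
  USE HDF5
  USE ISO_C_BINDING
  IMPLICIT NONE
  INTEGER(hid_t), INTENT(IN) :: fapl_id
  TYPE(H5_file_image_callbacks_t) :: current
  INTEGER :: hdferr

  CALL h5pget_file_image_callbacks_f(fapl_id, current, hdferr)

  ! C_ASSOCIATED is .FALSE. for C_NULL_FUNPTR, so this distinguishes a
  ! user-supplied image_malloc from the library default (malloc).
  IF (C_ASSOCIATED(current%image_malloc)) THEN
     PRINT *, 'A user-supplied image_malloc callback is set.'
  ELSE
     PRINT *, 'The library default (malloc) will be used.'
  END IF
END SUBROUTINE report_image_malloc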

6.1.5. Fortran Virtual File Driver Feature Flags

Implementation of the H5Pget/set_file_image_callbacks_f() and H5Pget/set_file_image_f() APIs requires a pair of new virtual file driver feature flags:

H5FD_FEAT_LET_IMAGE_F
H5FD_FEAT_LET_IMAGE_CALLBACK_F

See the “Virtual File Driver Feature Flags” section for more information.

6.1.6. H5Fget_file_image_f

The signature of H5Fget_file_image_f shall be defined as follows:

SUBROUTINE H5Fget_file_image_f(file_id, buf_ptr, buf_len, hdferr, buf_size)

The parameters of H5Fget_file_image_f are defined as follows:

INTEGER(hid_t), INTENT(IN) :: file_id
Will contain the ID of the target file.

TYPE(C_PTR), INTENT(IN) :: buf_ptr
Will contain a C pointer to the buffer into which the image of the HDF5 file is to be copied. If buf_ptr is C_NULL_PTR, no data will be copied.

INTEGER(size_t), INTENT(IN) :: buf_len
Will contain the size in bytes of the supplied buffer.

INTEGER(ssize_t), INTENT(OUT), OPTIONAL :: buf_size
Will indicate the buffer size required to store the file image (in other words, the length of the file). If only buf_size is needed, then buf_ptr should also be set to C_NULL_PTR.

INTEGER, INTENT(OUT) :: hdferr
Returns the error status: 0 for success and -1 for failure.

See the “H5Fget_file_image” section for more information.
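Mirroring Example 5 in section 4.2, the following pseudo code sketches how the optional buf_size argument can be used from Fortran to size the buffer before retrieving the image. The helper name is illustrative only, and error checking is omitted.

SUBROUTINE get_image_of_open_file(file_id)
  USE HDF5
  USE ISO_C_BINDING
  IMPLICIT NONE
  INTEGER(hid_t), INTENT(IN) :: file_id
  INTEGER          :: hdferr
  INTEGER(ssize_t) :: buf_size
  INTEGER(size_t)  :: buf_len
  CHARACTER(LEN=1), ALLOCATABLE, TARGET :: image(:)
  TYPE(C_PTR) :: buf_ptr

  ! Flush, then ask only for the required size (buf_ptr = C_NULL_PTR).
  CALL h5fflush_f(file_id, H5F_SCOPE_GLOBAL_F, hdferr)
  buf_ptr = C_NULL_PTR
  CALL h5fget_file_image_f(file_id, buf_ptr, 0_size_t, hdferr, buf_size=buf_size)

  ! Allocate a buffer of that size and retrieve the image itself.
  ALLOCATE(image(buf_size))
  buf_len = INT(buf_size, size_t)
  buf_ptr = C_LOC(image(1))
  CALL h5fget_file_image_f(file_id, buf_ptr, buf_len, hdferr)

  ! ... use image(1:buf_size), for example transmit it or write it out ...

  DEALLOCATE(image)
END SUBROUTINE get_image_of_open_file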

6.2. High-level Fortran API Routine

The new Fortran high-level routine H5LTopen_file_image_f will provide a wrapper for the high-level H5LTopen_file_image function. Consequently, the high-level Fortran API will not be implemented using low-level HDF5 Fortran APIs.

6.2.1. H5LTopen_file_image_f

The signature of H5LTopen_file_image_f is defined as follows:

SUBROUTINE H5LTopen_file_image_f(buf_ptr, buf_len, flags, file_id, hdferr)

The parameters of H5LTopen_file_image_f are defined as follows:

TYPE(C_PTR), INTENT(IN), VALUE :: buf_ptr
Will contain a pointer to the supplied initial image. A C_NULL_PTR value is invalid and will cause H5LTopen_file_image_f to fail.

INTEGER(size_t), INTENT(IN) :: buf_len
Will contain the size of the supplied buffer. A value of 0 is invalid and will cause H5LTopen_file_image_f to fail.

INTEGER, INTENT(IN) :: flags
Will contain a set of flags indicating whether the image is to be opened read/write, whether HDF5 is to take control of the buffer, and how long the application promises to maintain the buffer. Possible flags are as follows: H5LT_IMAGE_OPEN_RW_F, H5LT_IMAGE_DONT_COPY_F, and H5LT_IMAGE_DONT_RELEASE_F. The C equivalent flags are defined in the “H5LTopen_file_image” section.

INTEGER(hid_t), INTENT(OUT) :: file_id
Will be a file ID on success.

INTEGER, INTENT(OUT) :: hdferr
Returns the error status: 0 for success and -1 for failure.
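By analogy with Example 3 in section 4.1, the following pseudo code sketches opening a buffer that already holds an HDF5 file image with H5LTopen_file_image_f while the application retains ownership of the buffer. It assumes the routine and flag names defined in this section, that the buffer was filled elsewhere, and omits error checking; the helper name is illustrative only.

SUBROUTINE open_image_readonly(buffer, image_len)
  USE HDF5
  USE H5LT
  USE ISO_C_BINDING
  IMPLICIT NONE
  CHARACTER(LEN=1), INTENT(IN), TARGET :: buffer(*)
  INTEGER(size_t),  INTENT(IN)         :: image_len
  INTEGER(hid_t) :: file_id
  INTEGER        :: hdferr, flags

  ! Keep ownership of the buffer: the library must neither copy nor free it.
  flags = IOR(H5LT_IMAGE_DONT_COPY_F, H5LT_IMAGE_DONT_RELEASE_F)

  CALL h5ltopen_file_image_f(C_LOC(buffer(1)), image_len, flags, file_id, hdferr)

  ! ... read from file_id as usual ...

  CALL h5fclose_f(file_id, hdferr)
END SUBROUTINE open_image_readonly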

    diff --git a/documentation/hdf5-docs/advanced_topics/file_image_ops.md b/documentation/hdf5-docs/advanced_topics/file_image_ops.md new file mode 100644 index 00000000..8b61252b --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics/file_image_ops.md @@ -0,0 +1,1304 @@ +--- +title: HDF5 File Image Operations +redirect_from: + - /display/HDF5/HDF5+File+Image+Operations +--- +## \*\*\* UNDER CONSTRUCTION \*\*\* + +# HDF5 File Image Operations + +**[1. Introduction to HDF5 File Image Operations](#introduction-to-hdf5-file-image-operations)** + + * [File Image Operations Function Summary](#file-image-operations-function-summary) + * [Abbreviations](#abbreviations) + * [Developer Prerequisites](#developer-prerequisites) + * [Resources](#resources) + +**[2. C API Call Syntax](#c-api-call-syntax)** + + [2.1. Low-level C API Routines](#low-level-c-api-routines) + + * [H5Pset\_file\_image](#h5pset_file_image)
    + * [H5Pget\_file\_image](#h5pget_file_image)
    + * [H5Pset\_file\_image\_callbacks](#h5pset_file_image_callbacks)
    + * [H5Pget\_file\_image\_callbacks](#h5pget_file_image_callbacks)
    + * [Virtual File Driver Feature Flags](#virtual-file-driver-feature-flags)
+ * [H5Fget\_file\_image](#h5fget_file_image) + + [2.2. High-level C API Routine](#high-level-c-api-routine) + * [H5LTopen\_file\_image](#h5ltopen_file_image) + +**[3. C API Call Semantics](#c-api-call-semantics)** + + [3.1. File Image Callback Semantics](#file-image-callback-semantics) + + * [Buffer Ownership](#buffer-ownership)
    + * [H5LTopen\_file\_image](#h5ltopen_file_image) + +**[3. C API Call Semantics](#c-api-call-semantics)** + + [3.1. File Image Callback Semantics](#file-image-callback-semantics) + + * [Buffer Ownership](#buffer-ownership)
    + * [Sharing a File image Buffer with the HDF5 Library](#sharing-a-file-image-buffer-with-the-hdf5-library)
    + * [File Driver Considerations](#file-driver-considerations) + + [3.2. Initial File Image Semantics](#initial-file-image-semantics)
    + * [Applying Initial File Image Semantics to the Core File Driver](#applying-initial-file-image-semantics) + +**[4. Examples](#examples)** + + [4.1. Reading an In-memory HDF5 File Image](#reading-an-in-memory-hdf5-file-image)
    + [4.2. In-memory HDF5 File Image Construction](#in-memory-hdf5-file-image-construction)
    + [4.3. Using HDF5 to Construct and Read a Data Packet](#using-hdf5-to-construct-and-read-a-data-packet)
    + [4.4. Using a Template File](#using-template-file)
    + +**[5. Java Signatures for File Image Operations API Calls](#java-signatures-for-file-image-operations-api-calls)** + +**[6. Fortran Signatures for File Image Operations API Calls](#fortran-signatures-for-file-image-operations-api-calls)** + + [6.1. Low-level Fortran API Routines](#) + + * [H5Pset\_file\_image\_f](#)
    + * [H5Pget\_file\_image\_f](#)
    + * [H5Pset\_file\_image\_callbacks\_f](#)
    + * [H5Pget\_file\_image\_callbacks\_f](#)
    + * [Fortran Virtual File Driver Feature Flags](#)
    + * [H5Fget\_file\_image\_f](#) + + [6.2. High-level Fortran API Routine](#) + + * [H5LTopen\_file\_image\_f](#) + + +## 1. Introduction to HDF5 File Image Operations +File image operations allow users to work with HDF5 files in memory in the same ways that users currently work with HDF5 files on disk. Disk I/O is not required when file images are opened, created, read from, or written to. + +An HDF5 file image is an HDF5 file that is held in a buffer in main memory. Setting up a file image in memory involves using either a buffer in the file access property list or a buffer in the Core (aka Memory) file driver. + +The advantage of working with a file in memory is faster access to the data. + +The challenge of working with files in memory buffers is maximizing performance and minimizing memory footprint while working within the constraints of the property list mechanism. This should be a non-issue for small file images, but may be a major issue for large images. + +If invoked with the appropriate flags, the H5LTopen\_file\_image() high level library call should deal with these challenges in most cases. However, some applications may require the programmer to address these issues directly. + +### 1.1. File Image Operations Function Summary +Functions used in file image operations are listed below. + +Function Listing 1. File image operations functions + +| C Function | Purpose | +| ---------- | --------- | +| H5Pset\_file\_image | Allows an application to specify an initial file image. For more information, see section 2.1.1. | +| H5Pget\_file\_image | Allows an application to retrieve a copy of the file image designated for a VFD to use as the initial contents of a file. For more information, see section 2.1.2. | +| H5Pset\_file\_image\_callbacks | Allows an application to manage file image buffer allocation, copying, reallocation, and release. For more information, see section 2.1.3. | +| H5Pget\_file\_image\_callbacks | Allows an application to obtain the current file image callbacks from a file access property list. For more information, see section 2.1.4. | +| H5Fget\_file\_image | Provides a simple way to retrieve a copy of the image of an existing, open file. For more information, see section 2.1.6. | +| H5LTopen\_file\_image | Provides a convenient way to open an initial file image with the Core VFD. For more information, see section 2.2.1. | + +### 1.2. Abbreviations + +The abbreviations in Table 1 are used in this document. + +Table 1. Abbreviations + +| Abbreviation | This abbreviation is short for | +| ------------ | ------------------------------ | +| FAPL or fapl | File Access Property List. In code samples, fapl is used. | +| VFD | Virtual File Driver | +| VFL | Virtual File Layer | + +### 1.3. Developer Prerequisites +Developers who use the file image operations described in this document should be proficient and experienced users of the HDF5 C Library APIs. More specifically, developers should have a working knowledge of property lists, callbacks, and virtual file drivers. + +### 1.4. Resources +See the following for more information. + +The [RFC: File Image Operations](https://docs.hdfgroup.org/hdf5/rfc/HDF5FileImageOperations.pdf) is the primary source for the information in this document. + +The [Alternate File Storage Layouts and Low-level File Drivers](https://docs.hdfgroup.org/hdf5/develop/_h5_f__u_g.html#subsec_file_alternate_drivers) section is in “The HDF5 File” chapter of the [HDF5 User’s Guide](https://docs.hdfgroup.org/hdf5/develop/_u_g.html). 
+ +The H5P\_SET\_FAPL\_CORE function call can be used to modify the file access property list so that the Memory virtual file driver, H5FD\_ CORE, is used. The Memory file driver is also known as the Core file driver. + +Links to the [Virtual File Layer](https://docs.hdfgroup.org/hdf5/develop/_v_f_l.html) and List of VFL Functions documents can be found in the HDF5 Technical Notes. + +## 2. C API Call Syntax +The C API function calls described in this chapter fall into two categories: low-level routines that are part of the main HDF5 C Library and one high-level routine that is part of the “lite” API in the high-level wrapper library. The high-level routine uses the low-level routines and presents frequently requested functionality conveniently packaged for application developers’ use. + +### 2.1. Low-level C API Routines +The purpose of this section is to describe the low-level C API routines that support file image operations. These routines allow an in-memory image of an HDF5 file to be opened without requiring file system I/O. + +The basic approach to opening an in-memory image of an HDF5 file is to pass the image to the Core file driver, and then tell the Core file driver to open the file. We do this by using the H5Pget/set\_file\_image calls. These calls allow the user to specify an initial file image. + +A potential problem with the H5Pget/set\_file\_image calls is the overhead of allocating and copying of large file image buffers. The callback routines enable application programs to avoid this problem. However, the use of these callbacks is complex and potentially hazardous: the particulars are discussed in the semantics and examples chapters below (see section 3.1 and section 4.1 respectively). Fortunately, use of the file image callbacks should seldom be necessary: the H5LTopen\_file\_image call should address most use cases. + +The property list facility in HDF5 is employed in file image operations. This facility was designed for passing data, not consumable resources, into API calls. The peculiar ways in which the file image allocation callbacks may be used allows us to avoid extending the property list structure to handle consumable resources cleanly and to avoid constructing a new facility for the purpose. + +The sub-sections below describe the low-level C APIs that are used with file image operations. + +#### 2.1.1. H5Pset\_file\_image +The H5Pset\_file\_image routine allows an application to provide an image for a file driver to use as the initial contents of the file. This call was designed initially for use with the Core VFD, but it can be used with any VFD that supports using an initial file image when opening a file. See the “Virtual File Driver Feature Flags” section for more information. Calling this routine makes a copy of the provided file image buffer. See the “H5Pset\_file\_image\_callbacks” section for more information. + +The signature of H5Pset\_file\_image is defined as follows: + +herr\_t H5Pset\_file\_image(hid\_t fapl\_id, void \*buf\_ptr, size\_t buf\_len) +The parameters of H5Pset\_file\_image are defined as follows: + +fapl\_id contains the ID of the target file access property list. +buf\_ptr supplies a pointer to the initial file image, or NULL if no initial file image is desired. +buf\_len contains the size of the supplied buffer, or 0 if no initial image is desired. +If either the buf\_len parameter is zero, or the buf\_ptr parameter is NULL, no file image will be set in the FAPL, and any existing file image buffer in the FAPL will be released. 
If a buffer is released, the FAPL’s file image buf\_len will be set to 0 and buf\_ptr will be set to NULL. + +Given the tight interaction between the file image callbacks and the file image, the file image callbacks in a property list cannot be changed while a file image is defined. + +With properly constructed file image callbacks, it is possible to avoid actually copying the file image. The particulars of this are discussed in greater detail in the “C API Call Semantics” chapter and in the “Examples” chapter. + +#### 2.1.2. H5Pget\_file\_image +The H5Pget\_file\_image routine allows an application to retrieve a copy of the file image designated for a VFD to use as the initial contents of a file. This routine uses the file image callbacks (if defined) when allocating and loading the buffer to return to the application, or it uses malloc and memcpy if the callbacks are undefined. When malloc and memcpy are used, it will be the caller’s responsibility to discard the returned buffer via a call to free. + +The signature of H5Pget\_file\_image is defined as follows: + +herr\_t H5Pget\_file\_image(hid\_t fapl\_id, void \*\*buf\_ptr\_ptr, size\_t \*buf\_len\_ptr) +The parameters of H5Pget\_file\_image are defined as follows: + +fapl\_id contains the ID of the target file access property list. +buf\_ptr\_ptr contains a NULL or a pointer to a void\*. If buf\_ptr\_ptr is not NULL, on successful return, \*buf\_ptr\_ptr will contain a pointer to a copy of the initial image provided in the last call to H5Pset\_file\_image for the supplied fapl\_id. If no initial image has been set, \*buf\_ptr\_ptr will be NULL. +buf\_len\_ptr contains a NULL or a pointer to size\_t. If buf\_len\_ptr is not NULL, on successful return, \*buf\_len\_ptr will contain the value of the buf\_len parameter for the initial image in the supplied fapl\_id. If no initial image is set, the value of \*buf\_len\_ptr will be 0. +As with H5Pset\_file\_image, appropriately defined file image callbacks can allow this function to avoid buffer allocation and memory copy operations. + +#### 2.1.3. H5Pset\_file\_image\_callbacks +The H5Pset\_file\_image\_callbacks API call exists to allow an application to control the management of file image buffers through user defined callbacks. These callbacks will be used in the management of file image buffers in property lists and in select file drivers. These routines are invoked when a new file image buffer is allocated, when an existing file image buffer is copied or resized, or when a file image buffer is released from use. From the perspective of the HDF5 Library, the operations of the image\_malloc, image\_memcpy, image\_realloc, and image\_free callbacks must be identical to those of the corresponding C standard library calls (malloc, memcpy, realloc, and free). While the operations must be identical, the file image callbacks have more parameters. The callbacks and their parameters are described below. The return values of image\_malloc and image\_realloc are identical to the return values of malloc and realloc. However, the return values of image\_memcpy and image\_free are different than the return values of memcpy and free: the return values of image\_memcpy and image\_free can also indicate failure. See the “File Image Callback Semantics” section for more information. 
+ +The signature of H5Pset\_file\_image\_callbacks is defined as follows: + +typedef enum +{ + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_SET, + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_COPY, + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_GET, + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE, + H5\_FILE\_IMAGE\_OP\_FILE\_OPEN, + H5\_FILE\_IMAGE\_OP\_FILE\_RESIZE, + H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE +} H5\_file\_image\_op\_t; + +typedef struct +{ + void \*(\*image\_malloc)(size\_t size, H5\_file\_image\_op\_t file\_image\_op, + void \*udata); + void \*(\*image\_memcpy)(void \*dest, const void \*src, size\_t size, + H5\_file\_image\_op\_t file\_image\_op, void \*udata); + void \*(\*image\_realloc)(void \*ptr, size\_t size, + H5\_file\_image\_op\_t file\_image\_op, void \*udata); + herr\_t (\*image\_free)(void \*ptr, H5\_file\_image\_op\_t file\_image\_op, + void \*udata); + void \*(\*udata\_copy)(void \*udata); + herr\_t (\*udata\_free)(void \*udata); + void \*udata; +} H5\_file\_image\_callbacks\_t; + +herr\_t H5Pset\_file\_image\_callbacks(hid\_t fapl\_id, + H5\_file\_image\_callbacks\_t \*callbacks\_ptr) +The parameters of H5Pset\_file\_image\_callbacks are defined as follows: + +fapl\_id contains the ID of the target file access property list. +callbacks\_ptr contains a pointer to an instance of the H5\_file\_image\_callbacks\_t structure. +The fields of the H5\_file\_image\_callbacks\_t structure are defined as follows: + +image\_malloc contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library malloc() call. The parameters of the image\_malloc callback are defined as follows: +size contains the size in bytes of the image buffer to allocate. +file\_image\_op contains one of the values of H5\_file\_image\_op\_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file\_image\_op are discussed in Table 2. +udata holds the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks. +Setting image\_malloc to NULL indicates that the HDF5 Library should invoke the standard C library malloc() routine when allocating file image buffers. + +image\_memcpy contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library memcpy() call except that it returns NULL on failure. Recall that the memcpy C Library routine is defined to return the dest parameter in all cases. The parameters of the image\_memcpy callback are defined as follows: +dest contains the address of the destination buffer. +src contains the address of the source buffer. +size contains the number of bytes to copy. +file\_image\_op contains one of the values of H5\_file\_image\_op\_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file\_image\_op are discussed in Table 2. +udata holds the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks. +Setting image\_memcpy to NULL indicates that the HDF5 Library should invoke the standard C library memcpy() routine when copying buffers. + +image\_realloc contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library realloc() call. The parameters of the image\_realloc callback are defined as follows: +ptr contains the pointer to the buffer being reallocated. +size contains the desired size in bytes of the buffer after realloc. 
+file\_image\_op contains one of the values of H5\_file\_image\_op\_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file\_image\_op are discussed in Table 2. +udata holds the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks. +Setting image\_realloc to NULL indicates that the HDF5 Library should invoke the standard C library realloc() routine when resizing file image buffers. + +image\_free contains a pointer to a function with (from the perspective of HDF5) functionality identical to the standard C library free() call except that it will return 0 (SUCCEED) on success and -1 (FAIL) on failure. The parameters of the image\_free callback are defined as follows: +ptr contains the pointer to the buffer being released. +file\_image\_op contains one of the values of H5\_file\_image\_op\_t. These values indicate the operation being performed on the file image when this callback is invoked. Possible values for file\_image\_op are discussed in Table 2 . +udata holds the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks. +Setting image\_free to NULL indicates that the HDF5 Library should invoke the standard C library free() routine when releasing file image buffers. + +udata\_copy contains a pointer to a function that (from the perspective of HDF5) allocates a buffer of suitable size, copies the contents of the supplied udata into the new buffer, and returns the address of the new buffer. The function returns NULL on failure. This function is necessary if a non-NULL udata parameter is supplied, so that property lists containing the image callbacks can be copied. If the udata parameter (below) is NULL, then this parameter should be NULL as well. The parameter of the udata\_copy callback is defined as follows: +udata contains the pointer to the user data block being copied. +udata\_free contains a pointer to a function that (from the perspective of HDF5) frees a user data block. This function is necessary if a non-NULL udata parameter is supplied so that property lists containing image callbacks can be discarded without a memory leak. If the udata parameter (below) is NULL, this parameter should be NULL as well. The parameter of the udata\_free callback is defined as follows: +udata contains the pointer to the user data block to be freed. +udata\_free returns 0 (SUCCEED) on success and -1 (FAIL) on failure. + +udata contains a pointer value, potentially to user-defined data, that will be passed to the image\_malloc, image\_memcpy, image\_realloc, and image\_free callbacks. +The semantics of the values that can be set for the file\_image\_op parameter to the above callbacks are described in the table below: + +Table 2. Values for the file\_image\_op parameter + +H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_SET This value is passed to the image\_malloc and image\_memcpy callbacks when an image buffer is being copied while being set in a FAPL. +H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_COPY This value is passed to the image\_malloc and image\_memcpy callbacks when an image buffer is being copied when a FAPL is copied. +H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_GET This value is passed to the image\_malloc and image\_memcpy callbacks when an image buffer is being copied while being retrieved from a FAPL +H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE This value is passed to the image\_free callback when an image buffer is being released during a FAPL close operation. 
+H5\_FILE\_IMAGE\_OP\_FILE\_OPEN This value is passed to the image\_malloc and image\_memcpy callbacks when an image buffer is copied during a file open operation. While the image being opened will typically be copied from a FAPL, this need not always be the case. An example of an exception is when the Core file driver takes its initial image from a file. +H5\_FILE\_IMAGE\_OP\_FILE\_RESIZE This value is passed to the image\_realloc callback when a file driver needs to resize an image buffer. +H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE This value is passed to the image\_free callback when an image buffer is being released during a file close operation. + + +In closing our discussion of H5Pset\_file\_image\_callbacks(), we note the interaction between this call and the H5Pget/set\_file\_image() calls above: since the malloc, memcpy, and free callbacks defined in the instance of H5\_file\_image\_callbacks\_t are used by H5Pget/set\_file\_image(), H5Pset\_file\_image\_callbacks() will fail if a file image is already set in the target property list. + +For more information on writing the file image to disk, set the backing\_store parameter. See the H5Pset\_fapl\_core entry in the HDF5 Reference Manual. + +#### 2.1.4. H5Pget\_file\_image\_callbacks +The H5Pget\_file\_image\_callbacks routine is designed to obtain the current file image callbacks from a file access property list. + +The signature of H5Pget\_file\_image\_callbacks() is defined as follows: + +herr\_t H5Pget\_file\_image\_callbacks(hid\_t fapl\_id, + H5\_file\_image\_callbacks\_t \*callbacks\_ptr) +The parameters of H5Pget\_file\_image\_callbacks are defined as follows: + +fapl\_id contains the ID of the target file access property list. +callbacks\_ptr contains a pointer to an instance of the H5\_file\_image\_callbacks\_t structure. All fields should be initialized to NULL. See the “H5Pset\_file\_image\_callbacks” section for more information on the H5\_file\_image\_callbacks\_t structure. +Upon successful return, the fields of \*callbacks\_ptr shall contain values as defined below: + +Upon successful return, callbacks\_ptr->image\_malloc will contain the pointer passed as the image\_malloc field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. +Upon successful return, callbacks\_ptr->image\_memcpy will contain the pointer passed as the image\_memcpy field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. +Upon successful return, callbacks\_ptr->image\_realloc will contain the pointer passed as the image\_realloc field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. +Upon successful return, callbacks\_ptr->image\_free\_ptr will contain the pointer passed as the image\_free field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. 
+Upon successful return, callbacks\_ptr->udata\_copy will contain the pointer passed as the udata\_copy field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. +Upon successful return, callbacks\_ptr-> udata\_free will contain the pointer passed as the udata\_free field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. +Upon successful return, callbacks\_ptr->udata will contain the pointer passed as the udata field of the instance of H5\_file\_image\_callbacks\_t pointed to by the callbacks\_ptr parameter of the last call to H5Pset\_file\_image\_callbacks() for the specified FAPL, or NULL if there has been no such call. +2.1.5. Virtual File Driver Feature Flags +Implementation of the H5Pget/set\_file\_image\_callbacks() and H5Pget/set\_file\_image() function calls requires a pair of virtual file driver feature flags. The flags are H5FD\_FEAT\_ALLOW\_FILE\_IMAGE and H5FD\_FEAT\_CAN\_USE\_FILE\_IMAGE\_CALLBACKS. Both of these are defined in H5FDpublic.h. + +The first flag, H5FD\_FEAT\_ALLOW\_FILE\_IMAGE, allows a file driver to indicate whether or not it supports file images. A VFD that sets this flag when its ‘query’ callback is invoked indicates that the file image set in the FAPL will be used as the initial contents of a file. Support for setting an initial file image is designed primarily for use with the Core VFD. However, any VFD can indicate support for this feature by setting the flag and copying the image in an appropriate way for the VFD (possibly by writing the image to a file and then opening the file). However, such a VFD need not employ the file image after file open time. In such cases, the VFD will not make an in-memory copy of the file image and will not employ the file image callbacks. + +File drivers that maintain a copy of the file in memory (only the Core file driver at present) can be constructed to use the initial image callbacks (if defined). Those that do must set the H5FD\_FEAT\_CAN\_USE\_FILE\_IMAGE\_CALLBACKS flag, the second flag, when their ‘query’ callbacks are invoked. + +Thus file drivers that set the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag but not the H5FD\_FEAT\_CAN\_USE\_FILE\_IMAGE\_CALLBACKS flag may read the supplied image from the property list (if present) and use it to initialize the contents of the file. However, they will not discard the image when done, nor will they make any use of any file image callbacks (if defined). + +If an initial file image appears in a file allocation property list that is used in an H5Fopen() call, and if the underlying file driver does not set the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag, then the open will fail. + +If a driver sets both the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag and the H5FD\_FEAT\_CAN\_USE\_FILE\_IMAGE\_CALLBACKS flag, then that driver will allocate a buffer of the required size, copy the contents of the initial image buffer from the file access property list, and then open the copy as if it had just loaded it from file. If the file image allocation callbacks are defined, the driver shall use them for all memory management tasks. Otherwise it will use the standard malloc, memcpy, realloc, and free C library calls for this purpose. 
+ +If the VFD sets the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag, and an initial file image is defined by an application, the VFD should ensure that file creation operations (as opposed to file open operations) bypass use of the file image, and create a new, empty file. + +Finally, it is logically possible that a file driver would set the H5FD\_FEAT\_CAN\_USE\_FILE\_IMAGE\_CALLBACKS flag, but not the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag. While it is hard to think of a situation in which this would be desirable, setting the flags this way will not cause any problems: the two capabilities are logically distinct. + + + +#### 2.1.6. H5Fget\_file\_image +The purpose of the H5Fget\_file\_image routine is to provide a simple way to retrieve a copy of the image of an existing, open file. This routine can be used with files opened using the SEC2 (aka POSIX), STDIO, and Core (aka Memory) VFDs. + +The signature of H5Fget\_file\_image is defined as follows: + +ssize\_t H5Fget\_file\_image(hid\_t file\_id, void \*buf\_ptr, size\_t buf\_len) +The parameters of H5Fget\_file\_image are defined as follows: + +file\_id contains the ID of the target file. +buf\_ptr contains a pointer to the buffer into which the image of the HDF5 file is to be copied. If buf\_ptr is NULL, no data will be copied, but the return value will still indicate the buffer size required (or a negative value on error). +buf\_len contains the size of the supplied buffer. +If the return value of H5Fget\_file\_image is a positive value, then the value will be the length of buffer required to store the file image (in other words, the length of the file). A negative value might be returned if the file is too large to store in the supplied buffer or on failure. + +The current file size can be obtained via a call to H5Fget\_filesize(). Note that this function returns the value of the end of file (EOF) and not the end of address space (EOA). While these values are frequently the same, it is possible for the EOF to be larger than the EOA. Since H5Fget\_file\_image() will only obtain a copy of the file from the beginning of the superblock to the EOA, it will be best to use H5Fget\_file\_image() to determine the size of the buffer required to contain the image. + +Other Design Considerations + +Here are some other notes regarding the design and implementation of H5Fget\_file\_image. + +The H5Fget\_file\_image call should be part of the high-level library. However, a file driver agnostic implementation of the routine requires access to data structures that are hidden within the HDF5 Library. We chose to implement the call in the library proper rather than expose those data structures. + +There is no reason why the H5Fget\_file\_image() API call could not work on files opened with any file driver. However, the Family, Multi, and Split file drivers have issues that make the call problematic. At present, files opened with the Family file driver are marked as being created with that file driver in the superblock, and the HDF5 Library refuses to open files so marked with any other file driver. This negates the purpose of the H5Fget\_file\_image() call. While this mark can be removed from the image, the necessary code is not trivial. + +Thus we will not support the Family file driver in H5Fget\_file\_image() unless there is demand for it. Files created with the Multi and Split file drivers are also marked in the superblock. In addition, they typically use a very sparse address space. 
A sparse address space would require the use of an impractically large buffer for an image, and most of the buffer would be empty. So, we see no point in supporting the Multi and Split file drivers in H5Fget\_file\_image() under any foreseeable circumstances. + + + +### 2.2. High-level C API Routine +The H5LTopen\_file\_image high-level routine encapsulates the capabilities of routines in the main HDF5 Library with conveniently accessible abstractions. + +#### 2.2.1. H5LTopen\_file\_image +The H5LTopen\_file\_image routine is designed to provide an easier way to open an initial file image with the Core VFD. Flags to H5LTopen\_file\_image allow for various file image buffer ownership policies to be requested. See the HDF5 Reference Manual for more information on high-level APIs. + +The signature of H5LTopen\_file\_image is defined as follows: + +hid\_t H5LTopen\_file\_image(void \*buf\_ptr, size\_t buf\_len, unsigned flags) +The parameters of H5LTopen\_file\_image are defined as follows: + +buf\_ptr contains a pointer to the supplied initial image. A NULL value is invalid and will cause H5LTopen\_file\_image to fail. +buf\_len contains the size of the supplied buffer. A value of 0 is invalid and will cause H5LTopen\_file\_image to fail. +flags contains a set of flags indicating whether the image is to be opened read/write, whether HDF5 is to take control of the buffer, and how long the application promises to maintain the buffer. Possible flags are described in the table below: +Table 3. Flags for H5LTopen\_file\_image + +H5LT\_FILE\_IMAGE\_OPEN\_RW Indicates that the HDF5 Library should open the image read/write instead of the default read-only. +H5LT\_FILE\_IMAGE\_DONT\_COPY +Indicates that the HDF5 Library should not copy the file image buffer provided, but should use it directly. The HDF5 Library will release the file image when finished. The supplied buffer must have been allocated via a call to the standard C library malloc() or calloc() routines. The HDF5 Library will call free() to release the buffer. In the absence of this flag, the HDF5 Library will copy the buffer provided. The H5LT\_FILE\_IMAGE\_DONT\_COPY flag provides an application with the ability to “give ownership” of a file image buffer to the HDF5 Library. + +The HDF5 Library will modify the buffer on write if the image is opened read/write and the H5LT\_FILE\_IMAGE\_DONT\_COPY flag is set. + +The H5LT\_FILE\_IMAGE\_DONT\_RELEASE flag, see below, is invalid unless the H5LT\_FILE\_IMAGE\_DONT\_COPY flag is set + +H5LT\_FILE\_IMAGE\_DONT\_RELEASE +Indicates that the HDF5 Library should not attempt to release the buffer when the file is closed. This implies that the application will tend to this detail and that the application will not discard the buffer until after the file image is closed. + +Since there is no way to return a changed buffer base address to the application, and since realloc can change this value, calls to realloc() must be barred when this flag is set. As a result, any write that requires an increased buffer size will fail. + +This flag is invalid unless the H5LT\_FILE\_IMAGE\_DONT\_COPY flag, see above, is set. + +If the H5LT\_FILE\_IMAGE\_DONT\_COPY flag is set and this flag is not set, the HDF5 Library will release the file image buffer after the file is closed using the standard C library free() routine. 
+ +Using this flag and the H5LT\_FILE\_IMAGE\_DONT\_COPY flag provides a way for the application to specify a buffer that the HDF5 Library can use for opening and accessing as a file image while letting the application retain ownership of the buffer. + +The following table is intended to summarize the semantics of the H5LT\_FILE\_IMAGE\_DONT\_COPY and H5LT\_FILE\_IMAGE\_DONT\_RELEASE flags (shown as “Don’t Copy Flag” and “Don’t Release Flag” respectively in the table): + +Table 4. Summary of Don’t Copy and Don’t Release Flag Actions + +Don’t Copy Flag + +Don’t Release Flag + +Make Copy of User Supplied Buffer + +Pass User Supplied Buffer to File Driver + +Release User Supplied Buffer When Done + +Permit realloc of Buffer Used by File Driver + +False + +Don’t care + +True + +False + +False + +True + +True + +False + +False + +True + +True + +True + +True + +True + +False + +True + +False + +False + +The return value of H5LTopen\_file\_image will be a file ID on success or a negative value on failure. The file ID returned should be closed with H5Fclose. + +Note that there is no way currently to specify a “backing store” file name in this definition of H5LTopen\_image. + + + +## 3. C API Call Semantics +The purpose of this chapter is to describe some issues that developers should consider when using file image buffers, property lists, and callback APIs. + +### 3.1. File Image Callback Semantics +The H5Fget/set\_file\_image\_callbacks() API calls allow an application to hook the memory management operations used when allocating, duplicating, and discarding file images in the property list, in the Core file driver, and potentially in any in-memory file driver developed in the future. + +From the perspective of the HDF5 Library, the supplied image\_malloc(), image\_memcpy(), image\_realloc(), and image\_free() callback routines must function identically to the C standard library malloc(), memcpy(), realloc(), and free() calls. What happens on the application side can be much more nuanced, particularly with the ability to pass user data to the callbacks. However, whatever the application does with these calls, it must maintain the illusion that the calls have had the expected effect. Maintaining this illusion requires some understanding of how the property list structure works, and what HDF5 will do with the initial images passed to it. + +At the beginning of this document, we talked about the need to work within the constraints of the property list mechanism. When we said “from the perspective of the HDF5 Library…” in the paragraph above, we are making reference to this point. + +The property list mechanism was developed as a way to add parameters to functions without changing the parameter list and breaking existing code. However, it was designed to use only “call by value” semantics, not “call by reference”. The decision to use “call by value” semantics requires that the values of supplied variables be copied into the property list. This has the advantage of simplifying the copying and deletion of property lists. However, if the value to be copied is large (say a 2 GB file image), the overhead can be unacceptable. + +The usual solution to this problem is to use “call by reference” where only a pointer to an object is placed in a parameter list rather than a copy of the object itself. 
However, use of “call by reference” semantics would greatly complicate the property list mechanism: at a minimum, it would be necessary to maintain reference counts to dynamically allocated objects so that the owner of the object would know when it was safe to free the object. + +After much discussion, we decided that the file image operations calls were sufficiently specialized that it made no sense to rework the property list mechanism to support “call by reference.” Instead we provided the file image callback mechanism to allow the user to implement some version of “call by reference” when needed. It should be noted that we expect this mechanism to be used rarely if at all. For small file images, the copying overhead should be negligible, and for large images, most use cases should be addressed by the H5LTopen\_file\_image call. + +In the (hopefully) rare event that use of the file image callbacks is necessary, the fundamental point to remember is that the callbacks must be constructed and used in such a way as to maintain the library’s illusion that it is using “call by value” semantics. + +Thus the property list mechanism must think that it is allocating a new buffer and copying the supplied buffer into it when the file image property is set. Similarly, it must think that it is allocating a new buffer and copying the contents of the existing buffer into it when it copies a property list that contains a file image. Likewise, it must think it is de-allocating a buffer when it discards a property list that contains a file image. + +Similar illusions must be maintained when a file image buffer is copied into the Core file driver (or any future driver that uses the file image callbacks) when the file driver re-sizes the buffer containing the image and finally when the driver discards the buffer. + +#### 3.1.1. Buffer Ownership +The owner of a file image in a buffer is the party that has the responsibility to discard the file image buffer when it is no longer needed. In this context, the owner is either the HDF5 Library or the application program. + +We implemented the image\_\* callback facility to allow efficient management of large file images. These facilities can be used to allow sharing of file image buffers between the application and the HDF5 library, and also transfer of ownership in either direction. In such operations, care must be taken to ensure that ownership is clear and that file image buffers are not discarded before all references to them are discarded by the non-owning party. + +Ownership of a file image buffer will only be passed to the application program if the file image callbacks are designed to do this. In such cases, the application program must refrain from freeing the buffer until the library has deleted all references to it. This in turn will happen after all property lists (if any) that refer to the buffer have been discarded, and the file driver (if any) that used the buffer has closed the file and thinks it has discarded the buffer. + +#### 3.1.2. Sharing a File image Buffer with the HDF5 Library +As mentioned above, the HDF5 property lists are a mechanism for passing values into HDF5 Library calls. They were created to allow calls to be extended with new parameters without changing the actual API or breaking existing code. They were designed based on the assumption that all new parameters would be “call by value” and not “call by reference.” Having “call by value” parameters means property lists can be copied, reused, and discarded with ease. 
+ +Suppose an application wished to share a file image buffer with the HDF5 Library. This means the library would be allowed to read the file image, but not free it. The file image callbacks might be constructed as follows to share a buffer: + +Construct the image\_malloc() call so that it returns the address of the buffer instead of allocating new space. This will keep the library thinking that the buffers are distinct even when they are not. Support this by including the address of the buffer in the user data. As a sanity check, include the buffer’s size in the user data as well, and require image\_malloc() to fail if the requested buffer size is unexpected. Finally, include a reference counter in the user data, and increment the reference counter on each call to image\_malloc(). +Construct the image\_memcpy() call so that it does nothing. As a sanity check, make it fail if the source and destination pointers do not match the buffer address in the user data or if the size is unexpected. +Construct the image\_free() routine so that it does nothing. As a sanity check, make it compare the supplied pointer with the expected pointer in the user data. Also, make it decrement the reference counter and notify the application that the HDF5 Library is done with the buffer when the reference count drops to 0. +As the property list code will never resize a buffer, we do not discuss the image\_realloc() call here. The behavior of image\_realloc() in this scenario depends on what the application wants to do with the file image after it has been opened. We discuss this issue in the next section. Note also that the operation passed into the file image callbacks allow the callbacks to behave differently depending on the context in which they are used. + +For more information on user defined data, see the “H5Pset\_file\_image\_callbacks” section. + +#### 3.1.3. File Driver Considerations +When a file image is opened by a driver that sets both the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE and the H5FD\_FEAT\_CAN\_USE\_FILE\_IMAGE\_CALLBACKS flags, the driver will allocate a buffer large enough for the initial file image and then copy the image from the property list into this buffer. As processing progresses, the driver will reallocate the image as necessary to increase its size and will eventually discard the image at file close. If defined, the driver will use the file image callbacks for these operations; otherwise, the driver will use the standard C library calls. See the "H5Pset\_file\_image\_callbacks” section for more information. + +As described above, the file image callbacks can be constructed so as to avoid the overhead of buffer allocations and copies while allowing the HDF5 Library to maintain its illusions on the subject. There are two possible complications involving the file driver. The complications are the possibility of reallocation calls from the driver and the possibility of the continued existence of property lists containing references to the buffer. + +Suppose an application wishes to share a file image buffer with the HDF5 Library. The application allows the library to read (and possibly write) the image, but not free it. We must first decide whether the image is to be opened read-only or read/write. + +If the image will be opened read-only (or if we know that any writes will not change the size of the image), the image\_realloc() call should never be invoked. 
Thus the image\_realloc() routine can be constructed so as to always fail, and the image\_malloc(), image\_memcpy(), and image\_free() routines can be constructed as described in the section above. + +Suppose, however, that the file image will be opened read/write and may grow during the computation. We must now allow for the base address of the buffer to change due to reallocation calls, and we must employ the user data structure to communicate any change in the buffer base address and size to the application. We pass buffer changes to the application so that the application will be able to eventually free the buffer. To this end, we might define a user data structure as shown in the example below: + + + + typedef struct udata { + void \*init\_ptr; + size\_t init\_size; + int init\_ref\_count; + void \*mod\_ptr; + size\_t mod\_size; + int mod\_ref\_count; + } +Example 1. Using a user data structure to communicate with an application + +We initialize an instance of the structure so that init\_ptr points to the buffer to be shared, init\_size contains the initial size of the buffer, and all other fields are initialized to either NULL or 0 as indicated by their type. We then pass a pointer to the instance of the user data structure to the HDF5 Library along with allocation callback functions constructed as follows: + +Construct the image\_malloc() call so that it returns the value in the init\_ptr field of the user data structure and increments the init\_ref\_count. As a sanity check, the function should fail if the requested size does not match the init\_size field in the user data structure or if any of the modified fields have values other than their initial values. +Construct the image\_memcpy() call so that it does nothing. As a sanity check, it should be made to fail if the source, destination, and size parameters do not match the init\_ptr and init\_size fields as appropriate. +Construct the image\_realloc() call so that it performs a standard realloc. Sanity checking, assuming that the realloc is successful, should be as follows: +If the mod\_ptr, mod\_size, or mod\_ref\_count fields of the user data structure still have their initial values, verify that the supplied pointer matches the init\_ptr field and that the supplied size does not match the init\_size field. Decrement init\_ref\_count, set mod\_ptr equal to the address returned by realloc, set mod\_size equal to the supplied size, and set mod\_ref\_count to 1. +If the mod\_ptr, mod\_size, or mod\_ref\_count fields of the user data structure are defined, verify that the supplied pointer matches the value of mod\_ptr and that the supplied size does not match mod\_size. Set mod\_ptr equal to the value returned by realloc, and set mod\_size equal to the supplied size. +In both cases, if all sanity checks pass, return the value returned by the realloc call. Otherwise, return NULL. + +Construct the image\_free() routine so that it does nothing. Perform sanity checks as follows: +If the H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE flag is set, decrement the init\_ref\_count field of the user data structure. Flag an error if init\_ref\_count drops below zero. +If the H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE flag is set, check to see if the mod\_ptr, mod\_size, or mod\_ref\_count fields of the user data structure have been modified from their initial values. If they have, verify that mod\_ref\_count contains 1 and then set that field to zero. If they have not been modified, proceed as per the H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE case. 
+In either case, if both the init\_ref\_count and mod\_ref\_count fields have dropped to zero, notify the application that the HDF5 Library is done with the buffer. If the mod\_ptr or mod\_size fields have been modified, pass these values on to the application as well. + +### 3.2. Initial File Image Semantics +One can argue whether creating a file with an initial file image is closer to creating a file or opening a file. The consensus seems to be that it is closer to a file open, and thus we shall require that the initial image only be used for calls to H5Fopen(). + +Whatever our convention, from an internal perspective, opening a file with an initial file image is a bit of both creating a file and opening a file. Conceptually, we will create a file on disk, write the supplied image to the file, close the file, open the file as an HDF5 file, and then proceed as usual (of course, the Core VFD will not write to the file system unless it is configured to do so). This process is similar to a file create: we are creating a file that did not exist on disk to begin with and writing data to it. Also, we must verify that no file of the supplied name is open. However, this process is also similar to a file open: we must read the superblock and handle the usual file open tasks. + +Implementing the above sequence of actions has a number of implications on the behavior of the H5Fopen() call when an initial file image is supplied: + +H5Fopen() must fail if the target file driver does not set the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag and a file image is specified in the FAPL. +If the target file driver supports the H5FD\_FEAT\_ALLOW\_FILE\_IMAGE flag, then H5Fopen() must fail if the file is already open or if a file of the specified name exists. +Even if the above constraints are satisfied, H5Fopen() must still fail if the image does not contain a valid (or perhaps just plausibly valid) image of an HDF5 file. In particular, the superblock must be processed, and the file structure be set up accordingly. +See the “Virtual File Driver Feature Flags” section for more information. + +As we indicated earlier, if an initial file image appears in the property list of an H5Fcreate() call, it is ignored. + +While the above section on the semantics of the file image callbacks may seem rather gloomy, we get the payback here. The above says everything that needs to be said about initial file image semantics in general. The sub-section below has a few more observations on the Core file driver. + +#### 3.2.1. Applying Initial File Image Semantics to the Core File Driver +At present, the Core file driver uses the open() and read() system calls to load an HDF5 file image from the file system into RAM. Further, if the backing\_store flag is set in the FAPL entry specifying the use of the Core file driver, the Core file driver’s internal image will be used to overwrite the source file on either flush or close. See the H5Pset\_fapl\_core entry in the HDF5 Reference Manual for more information. + +This results in the following observations. In all cases assume that use of the Core file driver has been specified in the FAPL. + +If the file specified in the H5Fopen() call does not exist, and no initial image is specified in the FAPL, the open must fail because there is no source for the initial image needed by the Core file driver. 
+If the file specified in the H5Fopen() call does exist, and an initial image is specified in the FAPL, the open must fail because the source of the needed initial image is ambiguous: the file image could be taken either from file or from the FAPL. +If the file specified in the H5Fopen() call does not exist, and an initial image is specified in the FAPL, the open will succeed. This assumes that the supplied image is valid. Further, if the backing store flag is set, the file specified in the H5Fopen() call will be created, and the contents of the Core file driver’s internal buffer will be written to the new file on flush or close. +Thus a call to H5Fopen() can result in the creation of a new HDF5 file in the file system. + +## 4. Examples +The purpose of this chapter is to provide examples of how to read or build an in-memory HDF5 file image. + +### 4.1. Reading an In-memory HDF5 File Image +The H5Pset\_file\_image() function call allows the Core file driver to be initialized from an application provided buffer. The following pseudo code illustrates its use: + + + + +H5Pset\_file\_image(fapl\_id, buf, buf\_len); + + + + +Example 2. Using H5Pset\_file\_image to initialize the Core file driver + +This solution is easy to code, but the supplied buffer is duplicated twice. The first time is in the call to H5Pset\_file\_image() when the image is duplicated and the duplicate inserted into the property list. The second time is when the file is opened: the image is copied from the property list into the initial buffer allocated by the Core file driver. This is a non-issue for small images, but this could become a significant performance hit for large images. + +If we want to avoid the extra malloc and memcpycalls, we must decide whether the application should retain ownership of the buffer or pass ownership to the HDF5 Library. + +The following pseudo code illustrates opening the image read -only using the H5LTopen\_file\_image() routine. In this example, the application retains ownership of the buffer and avoids extra buffer allocations and memcpy calls. + + + +hid\_t file\_id; +unsigned flags = H5LT\_FILE\_IMAGE\_DONT\_COPY | H5LT\_FILE\_IMAGE\_DONT\_RELEASE; +file\_id = H5LTopen\_file\_image(buf, buf\_len, flags); + + +Example 3. Using H5LTopen\_file\_image to open a read-only file image where the application retains ownership of the buffer +If the application wants to transfer ownership of the buffer to the HDF5 Library, and the standard C library routine free is an acceptable way of discarding it, the above example can be modified as follows: + + +hid\_t file\_id; +unsigned flags = H5LT\_FILE\_IMAGE\_DONT\_COPY; +file\_id = H5LTopen\_file\_image(buf, buf\_len, flags); + + +Example 4. Using H5LTopen\_file\_image to open a read-only file image where the application transfers ownership of the buffer +Again, file access is read-only. Read/write access can be obtained via the H5LTopen\_file\_image() call, but we will explore that in the section below. + + + +### 4.2. In-memory HDF5 File Image Construction +Before the implementation of file image operations, HDF5 supported construction of an image of an HDF5 file in memory with the Core file driver. The H5Fget\_file\_image() function call allows an application access to the file image without first writing it to disk. See the following code fragment: + + +H5Fflush(fid); +size = H5Fget\_file\_image(fid, NULL, 0); +buffer\_ptr = malloc(size); +H5Fget\_file\_image(fid, buffer\_ptr, size); +Example 5. 
Accessing the image of a file in memory
+
+The use of H5Fget\_file\_image() may be acceptable for small images. For large images, the cost of the malloc() and memcpy() operations may be excessive. To address this issue, the H5Pset\_file\_image\_callbacks() call allows an application to manage dynamic memory allocation for file images and memory-based file drivers (only the Core file driver at present). The following code fragment illustrates its use. Note that most error checking is omitted for simplicity and that H5Pset\_file\_image is not used to set the initial file image.
+
+struct udata\_t {
+    void \*image\_ptr;
+    size\_t image\_size;
+} udata = {NULL, 0};
+void \*image\_malloc(size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    ((struct udata\_t \*)udata)->image\_size = size;
+    return(malloc(size));
+}
+void \*image\_memcpy(void \*dest, const void \*src, size\_t size,
+    H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(FALSE); /\* Should never be invoked in this scenario. \*/
+    return(NULL); /\* always fails \*/
+}
+void \*image\_realloc(void \*ptr, size\_t size, H5\_file\_image\_op\_t file\_image\_op,
+    void \*udata)
+{
+    ((struct udata\_t \*)udata)->image\_size = size;
+    return(realloc(ptr, size));
+}
+herr\_t image\_free(void \*ptr, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(file\_image\_op == H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE);
+    ((struct udata\_t \*)udata)->image\_ptr = ptr;
+    return(0); /\* if we get here, we must have been successful \*/
+}
+void \*udata\_copy(void \*udata)
+{
+    return(udata);
+}
+herr\_t udata\_free(void \*udata)
+{
+    return(0);
+}
+H5\_file\_image\_callbacks\_t callbacks = {image\_malloc, image\_memcpy,
+    image\_realloc, image\_free,
+    udata\_copy, udata\_free,
+    (void \*)(&udata)};
+
+H5Pset\_file\_image\_callbacks(fapl\_id, &callbacks);
+
+assert(udata.image\_ptr != NULL);
+/\* udata now contains the base address and length of the final version of the core file \*/
+
+Example 6. Using H5Pset\_file\_image\_callbacks to improve memory allocation
+
+The above code fragment gives the application full ownership of the buffer used by the Core file driver after the file is closed, and it notifies the application that the HDF5 Library is done with the buffer by setting udata.image\_ptr to something other than NULL. If read access to the buffer is sufficient, the H5Fget\_vfd\_handle() call can be used as an alternate solution to get access to the base address of the Core file driver’s buffer.
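+
+Example 6 leaves out the file operations that actually drive the callbacks. As a rough sketch only (the Core file driver settings, the 64 KB increment, and the file name below are illustrative assumptions rather than part of the example above), the surrounding application code might look something like this:
+
+    hid\_t fapl\_id, fid;
+
+    fapl\_id = H5Pcreate(H5P\_FILE\_ACCESS);
+    /\* illustrative settings: Core file driver, no backing store \*/
+    H5Pset\_fapl\_core(fapl\_id, (size\_t)(64 \* 1024), FALSE);
+    H5Pset\_file\_image\_callbacks(fapl\_id, &callbacks);
+
+    fid = H5Fcreate("in\_memory.h5", H5F\_ACC\_TRUNC, H5P\_DEFAULT, fapl\_id);
+    /\* ... create and write groups and datasets as usual ... \*/
+    H5Fclose(fid);     /\* image\_free() hands the buffer to the application \*/
+    H5Pclose(fapl\_id);
+
+    assert(udata.image\_ptr != NULL);
+    /\* the application now owns the buffer and must eventually free it \*/
+    free(udata.image\_ptr);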
+
+The above solution avoids some unnecessary malloc and memcpy calls and should be quite adequate if an image of an HDF5 file is constructed only occasionally. However, if an HDF5 file image must be constructed regularly, and if we can put a strong and tight upper bound on the size of the necessary buffer, then the following pseudo code demonstrates a method of avoiding memory allocation completely. The downside, however, is that the buffer is allocated statically. Again, much error checking is omitted for clarity.
+
+char buf[BIG\_ENOUGH];
+struct udata\_t {
+    void \*image\_ptr;
+    size\_t image\_size;
+    size\_t max\_image\_size;
+    int ref\_count;
+} udata = {(void \*)(&(buf[0])), 0, BIG\_ENOUGH, 0};
+void \*image\_malloc(size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(size <= ((struct udata\_t \*)udata)->max\_image\_size);
+    assert(((struct udata\_t \*)udata)->ref\_count == 0);
+    ((struct udata\_t \*)udata)->image\_size = size;
+    (((struct udata\_t \*)udata)->ref\_count)++;
+    return(((struct udata\_t \*)udata)->image\_ptr);
+}
+void \*image\_memcpy(void \*dest, const void \*src, size\_t size,
+    H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(FALSE); /\* Should never be invoked in this scenario. \*/
+    return(NULL); /\* always fails \*/
+}
+void \*image\_realloc(void \*ptr, size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(ptr == ((struct udata\_t \*)udata)->image\_ptr);
+    assert(size <= ((struct udata\_t \*)udata)->max\_image\_size);
+    assert(((struct udata\_t \*)udata)->ref\_count == 1);
+    ((struct udata\_t \*)udata)->image\_size = size;
+    return(((struct udata\_t \*)udata)->image\_ptr);
+}
+herr\_t image\_free(void \*ptr, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(file\_image\_op == H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE);
+    assert(ptr == ((struct udata\_t \*)udata)->image\_ptr);
+    assert(((struct udata\_t \*)udata)->ref\_count == 1);
+    (((struct udata\_t \*)udata)->ref\_count)--;
+    return(0); /\* if we get here, we must have been successful \*/
+}
+void \*udata\_copy(void \*udata)
+{
+    return(udata);
+}
+herr\_t udata\_free(void \*udata)
+{
+    return(0);
+}
+H5\_file\_image\_callbacks\_t callbacks = {image\_malloc, image\_memcpy,
+    image\_realloc, image\_free,
+    udata\_copy, udata\_free,
+    (void \*)(&udata)};
+/\* end of initialization \*/
+
+H5Pset\_file\_image\_callbacks(fapl\_id, &callbacks);
+
+
+assert(udata.ref\_count == 0);
+/\* udata now contains the base address and length of the final version of the core file \*/
+
+
+Example 7. Using H5Pset\_file\_image\_callbacks with a static buffer
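+
+As with the previous example, the file operations that exercise these callbacks are elided. A rough sketch of a driver loop, under the assumption that fapl\_id has already been set up with the Core file driver and the callbacks above (as in the sketch following Example 6), and where N\_IMAGES, deliver\_image(), and the declarations of fid and i are hypothetical placeholders, might read:
+
+    for (i = 0; i < N\_IMAGES; i++) {
+        fid = H5Fcreate("in\_core.h5", H5F\_ACC\_TRUNC, H5P\_DEFAULT, fapl\_id);
+        /\* ... write this iteration's data ... \*/
+        H5Fclose(fid);   /\* image\_free() leaves the finished image in buf \*/
+
+        assert(udata.ref\_count == 0);
+        deliver\_image(buf, udata.image\_size);   /\* hypothetical hand-off step \*/
+    }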
+
+If we can further arrange matters so that only the contents of the datasets in the HDF5 file image change, but not the structure of the file itself, we can optimize still further by re-using the image and changing only the contents of the datasets after the initial write to the buffer. The following pseudo code shows how this might be done. Note that the code assumes that buf already contains the image of the HDF5 file whose dataset contents are to be overwritten. Again, much error checking is omitted for clarity. Also, observe that the file image callbacks do not support the H5Pget\_file\_image() call.
+
+void \*image\_malloc(size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(size <= ((struct udata\_t \*)udata)->max\_image\_size);
+    assert(size == ((struct udata\_t \*)udata)->image\_size);
+    assert(((struct udata\_t \*)udata)->ref\_count >= 0);
+    ((struct udata\_t \*)udata)->image\_size = size;
+    (((struct udata\_t \*)udata)->ref\_count)++;
+    return(((struct udata\_t \*)udata)->image\_ptr);
+}
+void \*image\_memcpy(void \*dest, const void \*src, size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert(dest == ((struct udata\_t \*)udata)->image\_ptr);
+    assert(src == ((struct udata\_t \*)udata)->image\_ptr);
+    assert(size <= ((struct udata\_t \*)udata)->max\_image\_size);
+    assert(size == ((struct udata\_t \*)udata)->image\_size);
+    assert(((struct udata\_t \*)udata)->ref\_count >= 1);
+    return(dest); /\* if we get here, we must have been successful \*/
+}
+void \*image\_realloc(void \*ptr, size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    /\* One would think that this function is not needed in this scenario, as
+     \* only the contents of the HDF5 file are being changed, not its size or
+     \* structure. However, the Core file driver calls realloc() just before
+     \* close to clip the buffer to the size indicated by the end of the
+     \* address space.
+     \*
+     \* While this call must be supported in this case, the size of
+     \* the image should never change. Hence the function can limit itself
+     \* to performing sanity checks, and returning the base address of the
+     \* statically allocated buffer.
+     \*/
+    assert(ptr == ((struct udata\_t \*)udata)->image\_ptr);
+    assert(size <= ((struct udata\_t \*)udata)->max\_image\_size);
+    assert(((struct udata\_t \*)udata)->ref\_count >= 1);
+    assert(((struct udata\_t \*)udata)->image\_size == size);
+    return(((struct udata\_t \*)udata)->image\_ptr);
+}
+herr\_t image\_free(void \*ptr, H5\_file\_image\_op\_t file\_image\_op, void \*udata)
+{
+    assert((file\_image\_op == H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE) ||
+           (file\_image\_op == H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE));
+    assert(((struct udata\_t \*)udata)->ref\_count >= 1);
+    (((struct udata\_t \*)udata)->ref\_count)--;
+    return(0); /\* if we get here, we must have been successful \*/
+}
+void \*udata\_copy(void \*udata)
+{
+    return(udata);
+}
+herr\_t udata\_free(void \*udata)
+{
+    return(0);
+}
+H5\_file\_image\_callbacks\_t callbacks = {image\_malloc, image\_memcpy,
+    image\_realloc, image\_free,
+    udata\_copy, udata\_free,
+    (void \*)(&udata)};
+/\* end of initialization \*/
+
+H5Pset\_file\_image\_callbacks(fapl\_id, &callbacks);
+H5Pset\_file\_image(fapl\_id, udata.image\_ptr, udata.image\_size);
+
+
+assert(udata.ref\_count == 0);
+/\* udata now contains the base address and length of the final version of the core file \*/
+
+
+Example 8. Using H5Pset\_file\_image\_callbacks where only the datasets change
+
+Before we go on, we should note that the above pseudo code can be written more compactly, albeit with fewer sanity checks, using the H5LTopen\_file\_image() call. See the example below:
+
+hid\_t file\_id;
+unsigned flags = H5LT\_FILE\_IMAGE\_OPEN\_RW | H5LT\_FILE\_IMAGE\_DONT\_COPY | H5LT\_FILE\_IMAGE\_DONT\_RELEASE;
+/\* end initialization \*/
+file\_id = H5LTopen\_file\_image(udata.image\_ptr, udata.image\_size, flags);
+
+/\* udata now contains the base address and length of the final version of the core file \*/
+
+
+Example 9.
Using H5LTopen\_file\_image where only the datasets change +The above pseudo code allows updates of a file image about as cheaply as possible. We assume the application has enough RAM for the image and that the HDF5 file structure is constant after the first write. + +While the scenario above is plausible, we will finish this section with a more general scenario. In the pseudo code below, we assume sufficient RAM to retain the HDF5 file image between uses, but we do not assume that the HDF5 file structure remains constant or that we can place a hard pper bound on the image size. + +Since we must use malloc, realloc, and free in this example, and since realloc can change the base address of a buffer, we must maintain two of ptr, size, and ref\_count triples in the udata structure. The first triple is for the property list (which will never change the buffer), and the second triple is for the file driver. As shall be seen, this complicates the file image callbacks considerably. Note also that while we do not use H5Pget\_file\_image() in this example, we do include support for it in the file image callbacks. As usual, much error checking is omitted in favor of clarity. + +struct udata\_t { +void \* fapl\_image\_ptr; +size\_t fapl\_image\_size; +int fapl\_ref\_count; +void \* vfd\_image\_ptr; +size\_t vfd\_image\_size; +nt vfd\_ref\_count; +} udata = {NULL, 0, 0, NULL, 0, 0}; +boolean initial\_file\_open = TRUE; +void \*image\_malloc(size\_t size, H5\_file\_image\_op\_t file\_image\_op, void \*udata) +{ + void \* return\_value = NULL; + switch ( file\_image\_op ) { + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_SET: + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_COPY: + assert(((struct udata\_t \*)udata)->fapl\_image\_ptr != NULL); + assert(((struct udata\_t \*)udata)->fapl\_image\_size == size); + assert(((struct udata\_t \*)udata)->fapl\_ref\_count >= 0); + return\_value = ((struct udata\_t \*)udata)->fapl\_image\_ptr; + (((struct udata\_t \*)udata)->fapl\_ref\_count)++; + break; + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_GET: + assert(((struct udata\_t \*)udata)->fapl\_image\_ptr != NULL); + assert(((struct udata\_t \*)udata)->vfd\_image\_size == size); + assert(((struct udata\_t \*)udata)->fapl\_ref\_count >= 1); + return\_value = ((struct udata\_t \*)udata)->fapl\_image\_ptr; + /\* don’t increment ref count \*/ + break; + case H5\_FILE\_IMAGE\_OP\_FILE\_OPEN: + assert(((struct udata\_t \*)udata)->vfd\_image\_ptr == NULL); + assert(((struct udata\_t \*)udata)->vfd\_image\_size == 0); + assert(((struct udata\_t \*)udata)->vfd\_ref\_count == 0); +if (((struct udata\_t \*)udata)->fapl\_image\_ptr == NULL ) { + ((struct udata\_t \*)udata)->vfd\_image\_ptr = +malloc(size); + ((struct udata\_t \*)udata)->vfd\_image\_size = size; + } else { + assert(((struct udata\_t \*)udata)->fapl\_image\_size == +size); + assert(((struct udata\_t \*)udata)->fapl\_ref\_count >= +1); + ((struct udata\_t \*)udata)->vfd\_image\_ptr = +((struct udata\_t \*)udata)->fapl\_image\_ptr; + ((struct udata\_t \*)udata)->vfd\_image\_size = size; + } + return\_value = ((struct udata\_t \*)udata)->vfd\_image\_ptr; + (((struct udata\_t \*)udata)->vfd\_ref\_count)++; + break; + default: + assert(FALSE); + } + return(return\_value); +} +void \*image\_memcpy)(void \*dest, const void \*src, size\_t size, + H5\_file\_image\_op\_t file\_image\_op, void \*udata) +{ + switch(file\_image\_op) { + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_SET: + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_COPY: + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_GET: 
+assert(dest == ((struct udata\_t \*)udata)->fapl\_image\_ptr); +assert(src == ((struct udata\_t \*)udata)->fapl\_image\_ptr); +assert(size == ((struct udata\_t \*)udata)->fapl\_image\_size); +assert(((struct udata\_t \*)udata)->fapl\_ref\_count >= 1); +break; +case H5\_FILE\_IMAGE\_OP\_FILE\_OPEN: +assert(dest == ((struct udata\_t \*)udata)->vfd\_image\_ptr); +assert(src == ((struct udata\_t \*)udata)->fapl\_image\_ptr); +assert(size == ((struct udata\_t \*)udata)->fapl\_image\_size); +assert(size == ((struct udata\_t \*)udata)->vfd\_image\_size); +assert(((struct udata\_t \*)udata)->fapl\_ref\_count >= 1); +assert(((struct udata\_t \*)udata)->vfd\_ref\_count == 1); +break; + default: + assert(FALSE); + break; + } + return(dest); /\* if we get here, we must have been successful \*/ + } +void \*image\_realloc(void \*ptr, size\_t size, H5\_file\_image\_op\_t file\_image\_op, + void \*udata) +{ + assert(ptr == ((struct udata\_t \*)udata)->vfd\_image\_ptr); | +assert(((struct udata\_t \*)udata)->vfd\_ref\_count == 1); +((struct udata\_t \*)udata)->vfd\_image\_ptr = realloc(ptr, size); + ((struct udata\_t \*)udata)->vfd\_image\_size = size; +return((((struct udata\_t \*)udata)->vfd\_image\_ptr); +} +herr\_t image\_free(void \*ptr, H5\_file\_image\_op\_t file\_image\_op, void \*udata) +{ + switch(file\_image\_op) { + case H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE: + assert(ptr == ((struct udata\_t \*)udata)->fapl\_image\_ptr); + assert(((struct udata\_t \*)udata)->fapl\_ref\_count >= 1); + (((struct udata\_t \*)udata)->fapl\_ref\_count)--; + break; + case H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE: + assert(ptr == ((struct udata\_t \*)udata)->vfd\_image\_ptr); + assert(((struct udata\_t \*)udata)->vfd\_ref\_count == 1); + (((struct udata\_t \*)udata)->vfd\_ref\_count)--; + break; + default: + assert(FALSE); + break; + } + return(0); /\* if we get here, we must have been successful \*/ +} +void \*udata\_copy(void \*udata) +{ + return(udata); +} +herr\_t udata\_free(void \*udata) +{ + return(0); +} +H5\_file\_image\_callbacks\_t callbacks = {image\_malloc, image\_memcpy, + image\_realloc, image\_free, + udata\_copy, udata\_free, + (void \*)(&udata)}; +/\* end of initialization \*/ + +H5Pset\_file\_image\_callbacks(fapl\_id, &callbacks); +if ( initial\_file\_open ) { + initial\_file\_open = FALSE; +} else { + assert(udata.vfd\_image\_ptr != NULL); + assert(udata.vfd\_image\_size > 0); + assert(udata.vfd\_ref\_count == 0); + assert(udata.fapl\_ref\_count == 0); + udata.fapl\_image\_ptr = udata.vfd\_image\_ptr; + udata.fapl\_image\_size = udata.vfd\_image\_size; + udata.vfd\_image\_ptr = NULL; + udata.vfd\_image\_size = 0; + H5Pset\_file\_image(fapl\_id, udata.fapl\_image\_ptr, udata.fapl\_image\_size); +} + + + +assert(udata.fapl\_ref\_count == 0); +assert(udata.vfd\_ref\_count == 0); + +/\* udata.vfd\_image\_ptr and udata.vfd\_image\_size now contain the base address and length of the final version of the core file \*/ + + + +Example 10. Using H5LTopen\_file\_image where only the datasets change and where the file structure and image size might not be constant + +The above pseudo code shows how a buffer can be passed back and forth between the application and the HDF5 Library. The code also shows the application having control of the actual allocation, reallocation, and freeing of the buffer. + +### 4.3. 
Using HDF5 to Construct and Read a Data Packet +Using the file image operations described in this document, we can bundle up data in an image of an HDF5 file on one process, transmit the image to a second process, and then open and read the image on the second process without any mandatory file system I/O. + +We have already demonstrated the construction and reading of such buffers above, but it may be useful to offer an example of the full operation. We do so in the example below using as simple a set of calls as possible. The set of calls in the example has extra buffer allocations. To reduce extra buffer allocations, see the sections above. + +In the following example, we construct an HDF5 file image on process A and then transmit the image to process B where we then open the image and extract the desired data. Note that no file system I/O is performed: all the processing is done in memory with the Core file driver. + +\*\*\* Process A \*\*\* + +H5Fflush(fid); +size = H5Fget\_file\_image(fid, NULL, 0); +buffer\_ptr = malloc(size); +H5Fget\_file\_image(fid, buffer\_ptr, size); + + +free(buffer\_ptr); + +\*\*\* Process B \*\*\* +hid\_t file\_id; + + + + +buffer\_ptr = malloc(size) + +file\_id = H5LTopen\_file\_image(buf, + buf\_len, + H5LT\_FILE\_IMAGE\_DONT\_COPY); + +Example 11. Building and passing a file image from one process to another + +### 4.4. Using a Template File +After the above examples, an example of the use of a template file might seem anti-climactic. A template file might be used to enforce consistency on file structure between files or in parallel HDF5 to avoid long sequences of collective operations to create the desired groups, datatypes, and possibly datasets. The following pseudo code outlines a potential use: + + + + +H5Pset\_file\_image(fapl\_id, buf, buf\_len); + + + + +Example 12. Using a template file + +Observe that the above pseudo code includes an unnecessary buffer allocation and copy in the call to H5Pset\_file\_image(). As we have already discussed ways of avoiding this, we will not address that issue here. + +What is interesting in this case is to consider why the application would find this use case attractive. + +In the serial case, at first glance there seems little reason to use the initial image facility at all. It is easy enough to use standard C calls to duplicate a template file, rename it as desired, and then open it as an HDF5 file. + +However, this assumes that the template file will always be available and in the expected place. This is a questionable assumption for an application that will be widely distributed. Thus, we can at least make an argument for either keeping an image of the template file in the executable or for including code for writing the desired standard definitions to new HDF5 files. + +Assuming the image is relatively small, we can further make an argument for the image in place of the code, as, quite simply, the image should be easier to maintain and modify with an HDF5 file editor. + +However, there remains the question of why one should pass the image to the HDF5 Library instead of writing it directly with standard C calls and then using HDF5 to open it. Other than convenience and a slight reduction in code size, we are hard pressed to offer a reason. + +In contrast, the argument is stronger in the parallel case since group, datatype, and dataset creations are all expensive collective operations. 
The argument is also weaker than it may first appear: in the HPC context, simply copying an existing template file and then opening it loses many of its disadvantages, although we would imagine that it is always useful to reduce the number of files in a deployment.
+
+In closing, we would like to consider one last point. In the parallel case, we would expect template files to be quite large, since Parallel HDF5 requires eager space allocation for chunked datasets. For the same reason, we would expect template files in this context to contain long sequences of zeros with a scattering of metadata here and there. Such files would compress well, and the compressed images would be cheap to distribute across the available processes if necessary. Once distributed, each process could uncompress the image and write to file those sections containing actual data that lie within the section of the file assigned to the process. This approach might be significantly faster than a simple copy as it would allow sparse writes, and thus it might provide a compelling use case for template files. However, this approach would require extending our current API to allow compressed images. We would also have to add the H5Pget/set\_image\_decompression\_callback() API calls. We see no problem in doing this. However, it is beyond the scope of the current effort, and thus we will not pursue the matter further unless there is interest in our doing so.
+
+## 5. Java Signatures for File Image Operations API Calls
+Potential Java function call signatures for the file image operation APIs are described in this section. These have not yet been implemented, and there are no immediate plans for implementation.
+
+Note that the H5LTopen\_file\_image() call is omitted. Our practice has been not to support high-level library calls in Java.
+ +H5Pset\_file\_image + +int H5Pset\_file\_image(int fapl\_id, const byte[] buf\_ptr); +H5Pget\_file\_image + +herr\_t H5Pget\_file\_image(hid\_t fapl\_id, byte[] buf\_ptr\_ptr); +H5\_file\_image\_op\_t + +public static H5\_file\_image\_op\_t +{ + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_SET, + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_COPY, + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_GET, + H5\_FILE\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE, + H5\_FILE\_IMAGE\_OP\_FILE\_OPEN, + H5\_FILE\_IMAGE\_OP\_FILE\_RESIZE, + H5\_FILE\_IMAGE\_OP\_FILE\_CLOSE +} +H5\_file\_image\_malloc\_cb + +public interface H5\_file\_image\_malloc\_cb extends Callbacks { + buf[] callback(H5\_file\_image\_op\_t file\_image\_op, CBuserdata udata); +} +H5\_file\_image\_memcpy\_cb + +public interface H5\_file\_image\_memcpy\_cb extends Callbacks { +buf[] callback(buf[] dest, const buf[] src, H5\_file\_image\_op\_t file\_image\_op, CBuserdata +udata); +} +H5\_file\_image\_realloc\_cb + +public interface H5\_file\_image\_realloc\_cb extends Callbacks { + buf[] callback(buf[] ptr, H5\_file\_image\_op\_t file\_image\_op, CBuserdata udata); +} +H5\_file\_image\_free\_cb + +public interface H5\_file\_image\_free\_cb extends Callbacks { + void callback(buf[] ptr, H5\_file\_image\_op\_t file\_image\_op, CBuserdata udata); +} +H5\_file\_udata\_copy\_cb + +public interface H5\_file\_udata\_copy\_cb extends Callbacks { + buf[] callback(CBuserdata udata); +} +H5\_file\_udata\_free\_cb + +public interface H5\_file\_udata\_free\_cb extends Callbacks { + void callback(CBuserdata udata); +} +H5\_file\_image\_callbacks\_t + +public abstract class H5\_file\_image\_callbacks\_t +{ + H5\_file\_image\_malloc\_cb image\_malloc; + H5\_file\_image\_memcpy\_cb image\_memcpy; + H5\_file\_image\_realloc\_cb image\_realloc; + H5\_file\_image\_free\_cb image\_free; + H5\_file\_udata\_copy\_cb udata\_copy; + H5\_file\_udata\_free\_cb udata\_free; + CBuserdata udata; + public H5\_file\_image\_callbacks\_t( + H5\_file\_image\_malloc\_cb image\_malloc, + H5\_file\_image\_memcpy\_cb image\_memcpy, + H5\_file\_image\_realloc\_cb image\_realloc, + H5\_file\_image\_free\_cb image\_free, + H5\_file\_udata\_copy\_cb udata\_copy, + H5\_file\_udata\_free\_cb udata\_free, + CBuserdata udata) { + this.image\_malloc = image\_malloc; + this.image\_memcpy = image\_memcpy; + this.image\_realloc = image\_realloc; + this.image\_free = image\_free; + this.udata\_copy = udata\_copy; + this.udata\_free = udata\_free; + this.udata = udata; + } +} +H5Pset\_file\_image\_callbacks + +int H5Pset\_file\_image\_callbacks(int fapl\_id, + H5\_file\_image\_callbacks\_t callbacks\_ptr); +H5Pget\_file\_image\_callbacks + +int H5Pget\_file\_image\_callbacks(int fapl\_id, + H5\_file\_image\_callbacks\_t[] callbacks\_ptr); +H5Fget\_file\_image + +long H5Fget\_file\_image(int file\_id, byte[] buf\_ptr); +## 6. Fortran Signatures for File Image Operations API Calls +Potential Fortran function call signatures for the file image operation APIs are described in this section. These have not yet been implemented, and there are no immediate plans for implementation. + +### 6.1. Low-level Fortran API Routines +The Fortran low-level APIs make use of Fortran 2003’s ISO\_C\_BINDING module in order to achieve portable and standard conforming interoperability with the C APIs. The C pointer (C\_PTR) and function pointer (C\_FUN\_PTR) types are returned from the intrinsic procedures C\_LOC(X) and C\_FUNLOC(X), respectively, defined in the ISO\_C\_BINDING module. 
The argument X is the data or function to which the C pointers point to and must have the TARGET attribute in the calling program. Note that the variable name lengths of the Fortran equivalent of the predefined C constants were shortened to less than 31 characters in order to be Fortran standard compliant. + +#### 6.1.1. H5Pset\_file\_image\_f +The signature of H5Pset\_file\_image\_f is defined as follows: + +SUBROUTINE H5Pset\_file\_image\_f(fapl\_id, buf\_ptr, buf\_len, hdferr) +The parameters of H5Pset\_file\_image are defined as follows: + +INTEGER(hid\_t), INTENT(IN):: fapl\_id +Will contain the ID of the target file access property list. + +TYPE(C\_PTR), INTENT(IN):: buf\_ptr +Will supply the C pointer to the initial file image or C\_NULL\_PTR if no initial file image is desired. + +INTEGER(size\_t), INTENT(IN):: buf\_len +Will contain the size of the supplied buffer or 0 if no initial image is desired. + +INTEGER, INTENT(OUT) :: hdferr +Will return the error status: 0 for success and -1 for failure. + + + +#### 6.1.2. H5Pget\_file\_image\_f +The signature of H5Pget\_file\_image\_f is defined as follows: + +SUBROUTINE H5Pget\_file\_image\_f(fapl\_id, buf\_ptr, buf\_len, hdferr) +The parameters of H5Pget\_file\_image\_f are defined as follows: + +INTEGER(hid\_t), INTENT(IN) :: fapl\_id +Will contain the ID of the target file access property list + +TYPE(C\_PTR), INTENT(INOUT), VALUE :: buf\_ptr +Will hold either a C\_NULL\_PTR or a scalar of type c\_ptr. If buf\_ptr is not C\_NULL\_PTR, on successful return, buf\_ptr shall contain a C pointer to a copy of the initial image provided in the last call to H5Pset\_file\_image\_f for the supplied fapl\_id, or buf\_ptr shall contain a C\_NULL\_PTR if there is no initial image set. The Fortran pointer can be obtained using the intrinsic C\_F\_POINTER. + +INTEGER(size\_t), INTENT(OUT) :: buf\_len +Will contain the value of the buffer parameter for the initial image in the supplied fapl\_id. The value will be 0 if no initial image is set. + +INTEGER, INTENT(OUT) :: hdferr +Will return the error status: 0 for success and -1 for failure. + + + +#### 6.1.3. H5Pset\_file\_image\_callbacks\_f +The signature of H5Pset\_file\_image\_callbacks\_f is defined as follows: + +INTEGER :: H5\_IMAGE\_OP\_PROPERTY\_LIST\_SET\_F=0, + H5\_IMAGE\_OP\_PROPERTY\_LIST\_COPY\_F=1, + H5\_IMAGE\_OP\_PROPERTY\_LIST\_GET\_F=2, + H5\_IMAGE\_OP\_PROPERTY\_LIST\_CLOSE\_F=3, + H5\_IMAGE\_OP\_FILE\_OPEN\_F=4, + H5\_IMAGE\_OP\_FILE\_RESIZE\_F=5, + H5\_IMAGE\_OP\_FILE\_CLOSE\_F=6 +TYPE, BIND(C) :: H5\_file\_image\_callbacks\_t + TYPE(C\_FUN\_PTR), VALUE :: image\_malloc + TYPE(C\_FUN\_PTR), VALUE :: image\_memcpy + TYPE(C\_FUN\_PTR), VALUE :: image\_realloc + TYPE(C\_FUN\_PTR), VALUE :: image\_free + TYPE(C\_FUN\_PTR), VALUE :: udata + TYPE(C\_FUN\_PTR), VALUE :: udata\_copy + TYPE(C\_FUN\_PTR), VALUE :: udata\_free + TYPE(C\_PTR), VALUE :: udata +END TYPE H5\_file\_image\_callbacks\_t +The semantics of the above values will be the same as those defined in the C enum. See Section 2.1.3 for more information. + +Fortran Callback APIs + +The Fortran callback APIs are shown below. + +FUNCTION op\_func(size, file\_image\_op, udata,) RESULT(image\_malloc) +INTEGER(size\_t) :: size +Will contain the size of the image buffer to allocate in bytes. +INTEGER :: file\_image\_op +Will be set to one of the values of H5\_IMAGE\_OP\_\* indicating the operation being performed on the file image when this callback is invoked. 
+TYPE(C\_PTR), VALUE :: udata
+Will be set to the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks\_f.
+TYPE(C\_FUN\_PTR), VALUE :: image\_malloc
+Shall contain a pointer to a function with functionality identical to the standard C library malloc() call.
+
+
+FUNCTION op\_func(dest, src, size, file\_image\_op, udata) RESULT(image\_memcpy)
+TYPE(C\_PTR), VALUE :: dest
+Will contain the address of the buffer into which to copy.
+TYPE(C\_PTR), VALUE :: src
+Will contain the address of the buffer from which to copy.
+INTEGER(size\_t) :: size
+Will contain the number of bytes to copy.
+INTEGER :: file\_image\_op
+Will be set to one of the values of H5\_IMAGE\_OP\_\* indicating the operation being performed on the file image when this callback is invoked.
+TYPE(C\_PTR), VALUE :: udata
+Will be set to the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks\_f.
+TYPE(C\_FUN\_PTR), VALUE :: image\_memcpy
+Shall contain a pointer to a function with functionality identical to the standard C library memcpy() call.
+
+
+FUNCTION op\_func(ptr, size, file\_image\_op, udata) RESULT(image\_realloc)
+TYPE(C\_PTR), VALUE :: ptr
+Will contain the pointer to the buffer being reallocated.
+INTEGER(size\_t) :: size
+Will contain the desired size of the buffer after realloc in bytes.
+INTEGER :: file\_image\_op
+Will be set to one of the values of H5\_IMAGE\_OP\_\* indicating the operation being performed on the file image when this callback is invoked.
+TYPE(C\_PTR), VALUE :: udata
+Will be set to the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks\_f.
+TYPE(C\_FUN\_PTR), VALUE :: image\_realloc
+Shall contain a pointer to a function with functionality identical to the standard C library realloc() call.
+
+
+FUNCTION op\_func(ptr, file\_image\_op, udata) RESULT(image\_free)
+TYPE(C\_PTR), VALUE :: ptr
+Will contain the pointer to the buffer being released.
+INTEGER :: file\_image\_op
+Will be set to one of the values of H5\_IMAGE\_OP\_\* indicating the operation being performed on the file image when this callback is invoked.
+TYPE(C\_PTR), VALUE :: udata
+Will be set to the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks\_f.
+TYPE(C\_FUN\_PTR), VALUE :: image\_free
+Shall contain a pointer to a function with functionality identical to the standard C library free() call.
+
+
+FUNCTION op\_func(udata) RESULT(udata\_copy)
+TYPE(C\_PTR), VALUE :: udata
+Will be set to the value passed in for the udata parameter to H5Pset\_file\_image\_callbacks\_f.
+TYPE(C\_FUN\_PTR), VALUE :: udata\_copy
+Shall contain a pointer to a function that will allocate a buffer of suitable size, copy the contents of the supplied udata into the new buffer, and return the address of the new buffer. The function will return C\_NULL\_PTR on failure.
+
+
+FUNCTION op\_func(udata) RESULT(udata\_free)
+TYPE(C\_PTR), VALUE :: udata
+Shall contain a pointer value, potentially to user-defined data, that will be passed to the image\_malloc, image\_memcpy, image\_realloc, and image\_free callbacks.
+
+
+The signature of H5Pset\_file\_image\_callbacks\_f is defined as follows:
+
+SUBROUTINE H5Pset\_file\_image\_callbacks\_f(fapl\_id, callbacks\_ptr, hdferr)
+The parameters are defined as follows:
+
+
+
+INTEGER(hid\_t), INTENT(IN) :: fapl\_id
+Will contain the ID of the target file access property list.
+
+TYPE(H5\_file\_image\_callbacks\_t), INTENT(IN) :: callbacks\_ptr
+Will contain the callback derived type.
callbacks\_ptr shall contain a pointer to the Fortran function via the intrinsic functions C\_LOC(X) and C\_FUNLOC(X). + +INTEGER, INTENT(OUT) :: hdferr +Will return the error status: 0 for success and -1 for failure. + + + +#### 6.1.4. H5Pget\_file\_image\_callbacks\_f +The H5Pget\_file\_image\_callbacks\_f routine is designed to obtain the current file image callbacks from a file access property list. + +The signature is defined as follows + +SUBROUTINE H5Pget\_file\_image\_callbacks\_f(fapl\_id, callbacks\_ptr, hdferr) +The parameters are defined as follows: + +INTEGER(hid\_t), INTENT(IN) :: fapl\_id +Will contain the ID of the target file access property list. + +TYPE(H5\_file\_image\_callbacks\_t), INTENT(OUT) :: callbacks\_ptr +Will contain the callback derived type. Each member of the derived type shall have the same meaning as its C counterpart. See section 2.1.4 for more information. + +INTEGER, INTENT(OUT) :: hdferr +Will return the error status: 0 for success and -1 for failure. + + + +#### 6.1.5. Fortran Virtual File Driver Feature Flags +Implementation of the H5Pget/set\_file\_image\_callbacks\_f() and H5Pget/set\_file\_image\_f() APIs requires a pair of new virtual file driver feature flags: + +H5FD\_FEAT\_LET\_IMAGE\_F +H5FD\_FEAT\_LET\_IMAGE\_CALLBACK\_F +See the “Virtual File Driver Feature Flags” section for more information. + +#### 6.1.6. H5Fget\_file\_image\_f +The signature of H5Fget\_file\_image\_f shall be defined as follows: + +SUBROUTINE H5Fget\_file\_image\_f(file\_id, buf\_ptr, buf\_len, hdferr, buf\_size) +The parameters of H5Fget\_file\_image\_f are defined as follows: + +INTEGER(hid\_t), INTENT(IN) :: file\_id +Will contain the ID of the target file. + +TYPE(C\_PTR), INTENT(IN) :: buf\_ptr +Will contain a C pointer to the buffer into which the image of the HDF5 file is to be copied. If buf\_ptr is C\_NULL\_PTR, no data will be copied. + +INTEGER(size\_t), INTENT(IN) :: buf\_len +Will contain the size in bytes of the supplied buffer. + +INTEGER(ssizet\_t), INTENT(OUT), OPTIONAL :: buf\_size +Will indicate the buffer size required to store the file image (in other words, the length of the file). If only the buf\_size is needed, then buf\_ptr should be also be set to C\_NULL\_PTR + +INTEGER, INTENT(OUT) :: hdferr +Returns the error status: 0 for success and -1 for failure. + + + +See the “H5Fget\_file\_image” section for more information. + +6.2. High-level Fortran API Routine +The new Fortran high-level routine H5LTopen\_file\_image\_f will provide a wrapper for the high-level H5LTopen\_file\_image function. Consequently, the high-level Fortran API will not be implemented using low-level HDF5 Fortran APIs. +6.2.1. H5LTopen\_file\_image\_f +The signature of H5LTopen\_file\_image\_f is defined as follows: + +SUBROUTINE H5LTopen\_file\_image\_f(buf\_ptr, buf\_len, flags, file\_id, hdferr) +The parameters of H5LTopen\_file\_image\_f are defined as follows: + +TYPE(C\_PTR), INTENT(IN), VALUE :: buf\_ptr +Will contain a pointer to the supplied initial image. A C\_NULL\_PTR value is invalid and will cause H5LTopen\_file\_image\_f to fail. + +INTEGER(size\_t), INTENT(IN) :: buf\_len +Will contain the size of the supplied buffer. A value of 0 is invalid and will cause H5LTopen\_file\_image\_f to fail. + +INTEGER, INTENT(IN) :: flags +Will contain a set of flags indicating whether the image is to be opened read/write, whether HDF5 is to take control of the buffer, and how long the application promises to maintain the buffer. 
Possible flags are as follows: H5LT\_IMAGE\_OPEN\_RW\_F, H5LT\_IMAGE\_DONT\_COPY\_F, and H5LT\_IMAGE\_DONT\_RELEASE\_F. The C equivalent flags are defined in the “H5LTopen\_file\_image” section. + +INTEGER(hid\_t), INTENT(IN) :: file\_id +Will be a file ID on success. + +INTEGER, INTENT(OUT) :: hdferr +Returns the error status: 0 for success and -1 for failure. diff --git a/documentation/hdf5-docs/advanced_topics/images/tutr-swmr1.png b/documentation/hdf5-docs/advanced_topics/images/tutr-swmr1.png new file mode 100644 index 00000000..71124161 Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutr-swmr1.png differ diff --git a/documentation/hdf5-docs/advanced_topics/images/tutr-swmr2.png b/documentation/hdf5-docs/advanced_topics/images/tutr-swmr2.png new file mode 100644 index 00000000..15c6b453 Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutr-swmr2.png differ diff --git a/documentation/hdf5-docs/advanced_topics/images/tutr-swmr3.png b/documentation/hdf5-docs/advanced_topics/images/tutr-swmr3.png new file mode 100644 index 00000000..973be56b Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutr-swmr3.png differ diff --git a/documentation/hdf5-docs/advanced_topics/images/tutrvds-ex.png b/documentation/hdf5-docs/advanced_topics/images/tutrvds-ex.png new file mode 100644 index 00000000..c9867c9f Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutrvds-ex.png differ diff --git a/documentation/hdf5-docs/advanced_topics/images/tutrvds-map.png b/documentation/hdf5-docs/advanced_topics/images/tutrvds-map.png new file mode 100644 index 00000000..9fb2013c Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutrvds-map.png differ diff --git a/documentation/hdf5-docs/advanced_topics/images/tutrvds-multimgs.png b/documentation/hdf5-docs/advanced_topics/images/tutrvds-multimgs.png new file mode 100644 index 00000000..68328cd6 Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutrvds-multimgs.png differ diff --git a/documentation/hdf5-docs/advanced_topics/images/tutrvds-snglimg.png b/documentation/hdf5-docs/advanced_topics/images/tutrvds-snglimg.png new file mode 100644 index 00000000..249e56c8 Binary files /dev/null and b/documentation/hdf5-docs/advanced_topics/images/tutrvds-snglimg.png differ diff --git a/documentation/hdf5-docs/advanced_topics/intro_SWMR.md b/documentation/hdf5-docs/advanced_topics/intro_SWMR.md new file mode 100644 index 00000000..4a83ce86 --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics/intro_SWMR.md @@ -0,0 +1,154 @@ +--- +title: Introduction to Single-Writer\\_Multiple-Reader (SWMR) +redirect\_from: + +--- +##\*\*\* UNDER CONSTRUCTION \*\*\* + +# Introduction to Single-Writer\\_Multiple-Reader (SWMR) + +Introduction to SWMR +The Single-Writer / Multiple-Reader (SWMR) feature enables multiple processes to read an HDF5 file while it is being written to (by a single process) without using locks or requiring communication between processes. + + + +All communication between processes must be performed via the HDF5 file. The HDF5 file under SWMR access must reside on a system that complies with POSIX write() semantics. + +The basic engineering challenge for this to work was to ensure that the readers of an HDF5 file always see a coherent (though possibly not up to date) HDF5 file. 
+ +The issue is that when writing data there is information in the metadata cache in addition to the physical file on disk: + +However, the readers can only see the state contained in the physical file: + + + +The SWMR solution implements dependencies on when the metadata can be flushed to the file. This ensures that metadata cache flush operations occur in the proper order, so that there will never be internal file pointers in the physical file that point to invalid (unflushed) file addresses. + +A beneficial side effect of using SWMR access is better fault tolerance. It is more difficult to corrupt a file when using SWMR. + + +Documentation +SWMR User's Guide + + PDF + +HDF5 Library APIs +Page: +H5F\_START\_SWMR\_WRITE — Enables SWMR writing mode for a file +Page: +H5DO\_APPEND — Appends data to a dataset along a specified dimension +Page: +H5P\_SET\_OBJECT\_FLUSH\_CB — Sets a callback function to invoke when an object flush occurs in the file +Page: +H5P\_GET\_OBJECT\_FLUSH\_CB — Retrieves the object flush property values from the file access property list +Page: +H5O\_DISABLE\_MDC\_FLUSHES — Prevents metadata entries for an HDF5 object from being flushed from the metadata cache to storage +Page: +H5O\_ENABLE\_MDC\_FLUSHES — Enables flushing of dirty metadata entries from a file’s metadata cache +Page: +H5O\_ARE\_MDC\_FLUSHES\_DISABLED — Determines if an HDF5 object has had flushes of metadata entries disabled +Tools +Page: +h5watch — Outputs new records appended to a dataset as the dataset grows +Page: +h5format\_convert — Converts the layout format version and chunked indexing types of datasets created with HDF5-1.10 so that applications built with HDF5-1.8 can access them +Page: +h5clear — Clears superblock status\_flags field, removes metadata cache image, prints EOA and EOF, or sets EOA of a file +Design Documents +Error while fetching page properties report data: + +Programming Model +Please be aware that the SWMR feature requires that an HDF5 file be created with the latest file format. See H5P\_SET\_LIBVER\_BOUNDS for more information. + +To use SWMR follow the the general programming model for creating and accessing HDF5 files and objects along with the steps described below. + +SWMR Writer: +The SWMR writer either opens an existing file and objects or creates them as follows. + +Open an existing file: + +Call H5Fopen using the H5F\_ACC\_SWMR\_WRITE flag. +Begin writing datasets. +Periodically flush data. +Create a new file: + +Call H5Fcreate using the latest file format. +Create groups, datasets and attributes, and then close the attributes. +Call H5F\_START\_SWMR\_WRITE to start SWMR access to the file. +Periodically flush data. +Example Code: + +Create the file using the latest file format property: + + fapl = H5Pcreate (H5P\_FILE\_ACCESS); + status = H5Pset\_libver\_bounds (fapl, H5F\_LIBVER\_LATEST, H5F\_LIBVER\_LATEST); + fid = H5Fcreate (filename, H5F\_ACC\_TRUNC, H5P\_DEFAULT, fapl); +[Create objects (files, datasets, ...). Close any attributes and named datatype objects. Groups and datasets may remain open before starting SWMR access to them.] + +Start SWMR access to the file: + + status = H5Fstart\_swmr\_write (fid); +Reopen the datasets and start writing, periodically flushing data: + + status = H5Dwrite (dset\_id, ...); + status = H5Dflush (dset\_id); +SWMR Reader: +The SWMR reader must continually poll for new data: + + + +Call H5Fopen using the H5F\_ACC\_SWMR\_READ flag. +Poll, checking the size of the dataset to see if there is new data available for reading. 
+Read new data, if any. +Example Code: + +Open the file using the SWMR read flag: + + fid = H5Fopen (filename, H5F\_ACC\_RDONLY | H5F\_ACC\_SWMR\_READ, H5P\_DEFAULT); +Open the dataset and then repeatedly poll the dataset, by getting the dimensions, reading new data, and refreshing: + + dset\_id = H5Dopen (...); + space\_id = H5Dget\_space (...); + while (...) { + status = H5Dread (dset\_id, ...); + status = H5Drefresh (dset\_id); + space\_id = H5Dget\_space (...); + } + +Limitations and Scope +An HDF5 file under SWMR access must reside on a system that complies with POSIX write() semantics. It is also limited in scope as follows: + +The writer process is only allowed to modify raw data of existing datasets by; + +Appending data along any unlimited dimension. +Modifying existing data +The following operations are not allowed (and the corresponding HDF5 files will fail): + +The writer cannot add new objects to the file. +The writer cannot delete objects in the file. +The writer cannot modify or append data with variable length, string or region reference datatypes. +File space recycling is not allowed. As a result the size of a file modified by a SWMR writer may be larger than a file modified by a non-SWMR writer. + +Tools for Working with SWMR +Two new tools, h5watch and h5clear, are available for use with SWMR. The other HDF5 utilities have also been modified to recognize SWMR: + +The h5watch tool allows a user to monitor the growth of a dataset. +The h5clear tool clears the status flags in the superblock of an HDF5 file. +The rest of the HDF5 tools will exit gracefully but not work with SWMR otherwise. + +Programming Example +A good example of using SWMR is included with the HDF5 tests in the source code. You can run it while reading the file it creates. If you then interrupt the application and reader and look at the resulting file, you will see that the file is still valid. Follow these steps: + +Download the HDF5-1.10 source code to a local directory on a filesystem (that complies with POSIX write() semantics). Build the software. No special configuration options are needed to use SWMR. + +Invoke two command terminal windows. In one window go into the bin/ directory of the built binaries. In the other window go into the test/ directory of the HDF5-1.10 source code that was just built. + +In the window in the test/ directory compile and run use\_append\_chunk.c. The example writes a three dimensional dataset by planes (with chunks of size 1 x 256 x 256). + +In the other window (in the bin/ directory) run h5watch on the file created by use\_append\_chunk.c (use\_append\_chunk.h5). It should be run while use\_append\_chunk is executing and you will see valid data displayed with h5watch. + +Interrupt use\_append\_chunk while it is running, and stop h5watch. + +Use h5clear to clear the status flags in the superbock of the HDF5 file (use\_append\_chunk.h5). + +View the file with h5dump. You will see that it is a valid file even though the application did not close properly. It will contain data up to the point that it was interrupted. 
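+
+For convenience, the writer-side calls described in the programming model above can be combined into a minimal sketch. This is only an illustration, not one of the library's shipped examples: the file name, dataset name, chunk size, and loop bounds below are arbitrary assumptions, and all error checking is omitted.
+
+   hid\_t   fapl, fid, sid, dcpl, did, fspace, mspace;
+   hsize\_t dims[1] = {0}, maxdims[1] = {H5S\_UNLIMITED}, chunk[1] = {256};
+   hsize\_t size[1], start[1], count[1] = {256};
+   int     buf[256];
+   int     i;
+
+   fapl = H5Pcreate (H5P\_FILE\_ACCESS);
+   H5Pset\_libver\_bounds (fapl, H5F\_LIBVER\_LATEST, H5F\_LIBVER\_LATEST);
+   fid = H5Fcreate ("swmr.h5", H5F\_ACC\_TRUNC, H5P\_DEFAULT, fapl);
+
+   /\* create a chunked, extensible dataset before starting SWMR access \*/
+   sid  = H5Screate\_simple (1, dims, maxdims);
+   dcpl = H5Pcreate (H5P\_DATASET\_CREATE);
+   H5Pset\_chunk (dcpl, 1, chunk);
+   did = H5Dcreate2 (fid, "data", H5T\_NATIVE\_INT, sid, H5P\_DEFAULT, dcpl, H5P\_DEFAULT);
+
+   H5Fstart\_swmr\_write (fid);   /\* no new objects may be created after this call \*/
+
+   for (i = 0; i < 100; i++) {
+       /\* ... fill buf with 256 new records ... \*/
+       size[0]  = (hsize\_t)(i + 1) \* 256;
+       start[0] = (hsize\_t)i \* 256;
+       H5Dset\_extent (did, size);               /\* grow along the unlimited dimension \*/
+       fspace = H5Dget\_space (did);
+       H5Sselect\_hyperslab (fspace, H5S\_SELECT\_SET, start, NULL, count, NULL);
+       mspace = H5Screate\_simple (1, count, NULL);
+       H5Dwrite (did, H5T\_NATIVE\_INT, mspace, fspace, H5P\_DEFAULT, buf);
+       H5Dflush (did);                          /\* make the new records visible to readers \*/
+       H5Sclose (mspace);
+       H5Sclose (fspace);
+   }
+   /\* ... close the dataset, dataspaces, property lists, and file when done ... \*/
+
+A reader would open the same file with H5F\_ACC\_RDONLY | H5F\_ACC\_SWMR\_READ and poll with H5Drefresh as shown in the reader example above.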
diff --git a/documentation/hdf5-docs/advanced_topics/intro_VDS.md b/documentation/hdf5-docs/advanced_topics/intro_VDS.md new file mode 100644 index 00000000..95485d5e --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics/intro_VDS.md @@ -0,0 +1,109 @@ +--- +title: Introduction to the Virtual Dataset - VDS +redirect\_from: + +--- +##\*\*\* UNDER CONSTRUCTION \*\*\* + +# Introduction to the Virtual Dataset - VDS + +The HDF5 Virtual Dataset (VDS) feature enables users to access data in a collection of HDF5 files as a single HDF5 dataset and to use the HDF5 APIs to work with that dataset. + +For example, your data may be collected into four files: + + + +You can map the datasets in the four files into a single VDS that can be accessed just like any other dataset: + + + + + +The mapping between a VDS and the HDF5 source datasets is persistent and transparent to an application. If a source file is missing the fill value will be displayed. + +See the Virtual (VDS) Documentation for complete details regarding the VDS feature. + +The VDS feature was implemented using hyperslab selection (H5S\_SELECT\_HYPERSLAB). See the tutorial on Reading From or Writing to a Subset of a Dataset for more information on selecting hyperslabs. + +Programming Model +To create a Virtual Dataset you simply follow the HDF5 programming model and add a few additional API calls to map the source code datasets to the VDS. + +Following are the steps for creating a Virtual Dataset: + +Create the source datasets that will comprise the VDS +Create the VDS: ‐ Define a datatype and dataspace (can be unlimited) +‐ Define the dataset creation property list (including fill value) +‐ (Repeat for each source dataset) Map elements from the source dataset to elements of the VDS: +Select elements in the source dataset (source selection) +Select elements in the virtual dataset (destination selection) +Map destination selections to source selections (see Functions for Working with a VDS) + +‐ Call H5Dcreate using the properties defined above +Access the VDS as a regular HDF5 dataset +Close the VDS when finished + +Functions for Working with a VDS +The H5P\_SET\_VIRTUAL API sets the mapping between virtual and source datasets. This is a dataset creation property list. Using this API will change the layout of the dataset to H5D\_VIRTUAL. As with specifying any dataset creation property list, an instance of the property list is created, modified, passed into the dataset creation call and then closed: + + dcpl = H5Pcreate (H5P\_DATASET\_CREATE); + + src\_space = H5screate\_simple ... + status = H5Sselect\_hyperslab (space, ... + status = H5Pset\_virtual (dcpl, space, SRC\_FILE[i], SRC\_DATASET[i], src\_space); + + dset = H5Dcreate2 (file, DATASET, H5T\_NATIVE\_INT, space, H5P\_DEFAULT, dcpl, H5P\_DEFAULT); + + status = H5Pclose (dcpl); +There are several other APIs introduced with Virtual Datasets, including query functions. For details see the complete list of HDF5 library APIs that support Virtual Datasets + + +Limitations +This feature requires HDF5-1.10. +The number of source datasets is unlimited. However, there is a limit on the size of each source dataset. + + +Programming Examples +Example 1 +This example creates three HDF5 files, each with a one-dimensional dataset of 6 elements. The datasets in these files are the source datasets that are then used to create a 4 x 6 Virtual Dataset with a fill value of -1. 
The first three rows of the VDS are mapped to the data from the three source datasets as shown below: + + + + + +In this example the three source datasets are mapped to the VDS with this code: + + src\_space = H5Screate\_simple (RANK1, dims, NULL); + for (i = 0; i < 3; i++) { + start[0] = (hsize\_t)i; + /* Select i-th row in the virtual dataset; selection in the source datasets is the same. */ + status = H5Sselect\_hyperslab (space, H5S\_SELECT\_SET, start, NULL, count, block); + status = H5Pset\_virtual (dcpl, space, SRC\_FILE[i], SRC\_DATASET[i], src\_space); + } +After the VDS is created and closed, it is reopened. The property list is then queried to determine the layout of the dataset and its mappings, and the data in the VDS is read and printed. + +This example is in the HDF5 source code and can be obtained from here: + +C Example + +For details on compiling an HDF5 application: [ Compiling HDF5 Applications ] + +Example 2 +This example shows how to use a C-style printf statement for specifying multiple source datasets as one virtual dataset. Only one mapping is required. In other words only one H5P\_SET\_VIRTUAL call is needed to map multiple datasets. It creates a 2-dimensional unlimited VDS. Then it re-opens the file, makes queries, and reads the virtual dataset. + +The source datasets are specified as A-0, A-1, A-2, and A-3. These are mapped to the virtual dataset with one call: + + status = H5Pset\_virtual (dcpl, vspace, SRCFILE, "/A-%b", src\_space); + +The %b indicates that the block count of the selection in the dimension should be used. + +C Example + +For details on compiling an HDF5 application: [ Compiling HDF5 Applications ] + + +Using h5dump with a VDS +The h5dump utility can be used to view a VDS. The h5dump output for a VDS looks exactly like that for any other dataset. If h5dump cannot find a source dataset then the fill value will be displayed. + +You can determine that a dataset is a VDS by looking at its properties with h5dump -p. It will display each source dataset mapping, beginning with Mapping 0. 
Below is an excerpt of the output of h5dump -p on the vds.h5 file created in Example 1.You can see that the entire source file a.h5 is mapped to the first row of the /VDS dataset: + + diff --git a/documentation/hdf5-docs/advanced_topics_list.md b/documentation/hdf5-docs/advanced_topics_list.md new file mode 100644 index 00000000..a9326fa8 --- /dev/null +++ b/documentation/hdf5-docs/advanced_topics_list.md @@ -0,0 +1,13 @@ +--- +title: HDF5 Features +redirect_from: display/HDF5/Advanced+Topics+in+HDF5 + +--- + +# Advanced Topics in HDF5 + +### [HDF5 File Image Operations](https://docs.hdfgroup.org/hdf5/rfc/HDF5FileImageOperations.pdf) +### [Copying Committed Datatypes with H5Ocopy](https://docs.hdfgroup.org/hdf5/develop/group___o_c_p_p_l.html) +### [HDF5 Data Flow Pipeline for H5Dread](advanced_topics/data_flow_pline_H5Dread.md) +### [Introduction to Single-Writer_Multiple-Reader (SWMR)](advanced_topics/intro_SWMR.md) +### [Introduction to the Virtual Dataset - VDS](advanced_topics/intro_VDS.md) diff --git a/documentation/hdf5-docs/chunking_in_hdf5.md b/documentation/hdf5-docs/chunking_in_hdf5.md new file mode 100644 index 00000000..77f137a3 --- /dev/null +++ b/documentation/hdf5-docs/chunking_in_hdf5.md @@ -0,0 +1,236 @@ +--- +title: Chunking in HDF5 +redirect_from: + - display/HDF5/Chunking+in+HDF5 +--- + +# Chunking in HDF5 + +## Introduction + +Datasets in HDF5 not only provide a convenient, structured, and self-describing way to store data, but are also designed to do so with good performance. In order to maximize performance, the HDF5 library provides ways to specify how the data is stored on disk, how it is accessed, and how it should be held in memory. + +## What are Chunks? +Datasets in HDF5 can represent arrays with any number of dimensions (up to 32). However, in the file this dataset must be stored as part of the 1-dimensional stream of data that is the low-level file. The way in which the multidimensional dataset is mapped to the serial file is called the layout. The most obvious way to accomplish this is to simply flatten the dataset in a way similar to how arrays are stored in memory, serializing the entire dataset into a monolithic block on disk, which maps directly to a memory buffer the size of the dataset. This is called a contiguous layout. + +An alternative to the contiguous layout is the chunked layout. Whereas contiguous datasets are stored in a single block in the file, chunked datasets are split into multiple chunks which are all stored separately in the file. The chunks can be stored in any order and any position within the HDF5 file. Chunks can then be read and written individually, improving performance when operating on a subset of the dataset. + +The API functions used to read and write chunked datasets are exactly the same functions used to read and write contiguous datasets. The only difference is a single call to set up the layout on a property list before the dataset is created. In this way, a program can switch between using chunked and contiguous datasets by simply altering that call. Example 1, below, creates a dataset with a size of 12x12 and a chunk size of 4x4. The example could be changed to create a contiguous dataset instead by simply commenting out the call to H5Pset\_chunk and changing dcpl\_id in the H5Dcreate call to H5P\_DEFAULT. 
+ +Example 1: Creating a chunked dataset + +~~~ +#include "hdf5.h" +#define FILENAME "file.h5" +#define DATASET "dataset" + +int main() { + + hid_t file_id, dset_id, space_id, dcpl_id; + hsize_t chunk_dims[2] = {4, 4}; + hsize_t dset_dims[2] = {12, 12}; + herr_t status; + int i, j; + int buffer[12][12]; + + /* Create the file */ + file_id = H5Fcreate(FILENAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT); + + /* Create a dataset creation property list and set it to use chunking */ + dcpl_id = H5Pcreate(H5P_DATASET_CREATE); + status = H5Pset_chunk(dcpl_id, 2, chunk_dims); + + /* Create the dataspace and the chunked dataset */ + space_id = H5Screate_simple(2, dset_dims, NULL); + dset_id = H5Dcreate(file_id, DATASET, H5T_STD_I32BE, space_id, H5P_DEFAULT, dcpl_id, H5P_DEFAULT); + + /* Initialize dataset */ + for (i = 0; i < 12; i++) + for (j = 0; j < 12; j++) + buffer[i][j] = i + j + 1; + + /* Write to the dataset */ + status = H5Dwrite(dset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer); + + /* Close */ + status = H5Dclose(dset_id); + status = H5Sclose(space_id); + status = H5Pclose(dcpl_id); + status = H5Fclose(file_id); +} +~~~ + +The chunks of a chunked dataset are split along logical boundaries in the dataset's representation as an array, not along boundaries in the serialized form. Suppose a dataset has a chunk size of 2x2. In this case, the first chunk would go from (0,0) to (2,2), the second from (0,2) to (2,4), and so on. By selecting the chunk size carefully, it is possible to fine tune I/O to maximize performance for any access pattern. Chunking is also required to use advanced features such as compression and dataset resizing. + +![Contiguous and chunked datasets](images/chunking1and2.PNG) + +## Data Storage Order +To understand the effects of chunking on I/O performance it is necessary to understand the order in which data is actually stored on disk. When using the C interface, data elements are stored in "row-major" order, meaning that, for a 2- dimensional dataset, rows of data are stored in-order on the disk. This is equivalent to the storage order of C arrays in memory. + +Suppose we have a 10x10 contiguous dataset B. The first element stored on disk is B[0][0], the second B[0][1], the eleventh B[1][0], and so on. If we want to read the elements from B[2][3] to B[2][7], we have to read the elements in the 24th, 25th, 26th, 27th, and 28th positions. Since all of these positions are contiguous, or next to each other, this can be done in a single read operation: read 5 elements starting at the 24th position. This operation is illustrated in figure 3: the pink cells represent elements to be read and the solid line represents a read operation. Now suppose we want to read the elements in the column from B[3][2] to B[7][2]. In this case we must read the elements in the 33rd, 43rd, 53rd, 63rd, and 73rd positions. Since these positions are not contiguous, this must be done in 5 separate read operations. This operation is illustrated in figure 4: the solid lines again represent read operations, and the dotted lines represent seek operations. An alternative would be to perform a single large read operation , in this case 41 elements starting at the 33rd position. This is called a sieve buffer and is supported by HDF5 for contiguous datasets, but not for chunked datasets. By setting the chunk sizes correctly, it is possible to greatly exceed the performance of the sieve buffer scheme. 
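To make the position arithmetic above concrete, here is a small illustrative sketch (not one of the HDF5 examples; the 10x10 dataset B and the 1-based element positions are taken from the paragraph above). It prints the serialized positions touched by the row read B[2][3]..B[2][7] and by the column read B[3][2]..B[7][2]:

~~~
#include <stdio.h>

/* 1-based position of element B[row][col] in the row-major, serialized
 * form of a 10x10 contiguous dataset. */
static int position(int row, int col) { return row * 10 + col + 1; }

int main(void) {
    int row, col;

    /* Row read B[2][3]..B[2][7]: positions 24..28, contiguous, one read. */
    for (col = 3; col <= 7; col++)
        printf("B[2][%d] -> position %d\n", col, position(2, col));

    /* Column read B[3][2]..B[7][2]: positions 33, 43, 53, 63, 73,
     * 10 elements apart, so 5 separate reads (with seeks in between). */
    for (row = 3; row <= 7; row++)
        printf("B[%d][2] -> position %d\n", row, position(row, 2));

    return 0;
}
~~~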
+ + +![Reading part of row and reading part of column from a contiguous dataset](images/chunking3and4.PNG) + +Likewise, in higher dimensions, the last dimension specified is the fastest changing on disk. So if we have a four-dimensional dataset A, then the first element on disk would be A[0][0][0][0], the second A[0][0][0][1], the third A[0][0][0][2], and so on. + +## Chunking and Partial I/O +The issues outlined above regarding data storage order help to illustrate one of the major benefits of dataset chunking: its ability to improve the performance of partial I/O. Partial I/O is an I/O operation (read or write) which operates on only one part of the dataset. To maximize the performance of partial I/O, the data elements selected for I/O must be contiguous on disk. As we saw above, with a contiguous dataset, this means that the selection must always equal the extent in all but the slowest changing dimension, unless the selection in the slowest changing dimension is a single element. With a 2-d dataset in C, this means that the selection must be as wide as the entire dataset unless only a single row is selected. With a 3-d dataset, this means that the selection must be as wide and as deep as the entire dataset, unless only a single row is selected, in which case it must still be as deep as the entire dataset, unless only a single column is also selected. + +Chunking allows the user to modify the conditions for maximum performance by changing the regions in the dataset which are contiguous. For example, reading a 20x20 selection in a contiguous dataset with a width greater than 20 would require 20 separate and non-contiguous read operations. If the same operation were performed on a dataset that was created with a chunk size of 20x20, the operation would require only a single read operation. In general, if your selections are always the same size (or multiples of the same size), and start at multiples of that size, then the chunk size should be set to the selection size, or an integer divisor of it. This recommendation is subject to the guidelines in the pitfalls section; specifically, it should not be too small or too large. + +Using this strategy, we can greatly improve the performance of the operation shown in figure 4. If we create the dataset with a chunk size of 10x1, each column of the dataset will be stored separately and contiguously. The read of a partial column can then be done in a single operation. This is illustrated in figure 5, and the code to implement a similar operation is shown in example 2. For simplicity, example 2 implements writing to this dataset instead of reading from it.
+ +![Reading part of a column from a chunked dataset](images/chunking5.PNG) + + +Example 2: Writing part of a column to a chunked dataset + +~~~ +#include "hdf5.h" +#define FILENAME "file.h5" +#define DATASET "dataset" + +int main() { + + hid_t file_id, dset_id, fspace_id, mspace_id, dcpl_id; + hsize_t chunk_dims[2] = {10, 1}; + hsize_t dset_dims[2] = {10, 10}; + hsize_t mem_dims[1] = {5}; + hsize_t start[2] = {3, 2}; + hsize_t count[2] = {5, 1}; + herr_t status; + int buffer[5], i; + + /* Create the file */ + file_id = H5Fcreate(FILENAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT); + + /* Create a dataset creation property list to use chunking with a chunk size of 10x1 */ + dcpl_id = H5Pcreate(H5P_DATASET_CREATE); + + status = H5Pset_chunk(dcpl_id, 2, chunk_dims); + + /* Create the dataspace and the chunked dataset */ + fspace_id = H5Screate_simple(2, dset_dims, NULL); + + dset_id = H5Dcreate(file_id, DATASET, H5T_STD_I32BE, fspace_id, H5P_DEFAULT, dcpl_id, H5P_DEFAULT); + + /* Select the elements from 3, 2 to 7, 2 */ + status = H5Sselect_hyperslab(fspace_id, H5S_SELECT_SET, start, NULL, count, NULL); + + /* Create the memory dataspace */ + mspace_id = H5Screate_simple(1, mem_dims, NULL); + + /* Initialize dataset */ + for (i = 0; i < 5; i++) + buffer[i] = i+1; + + /* Write to the dataset */ + status = H5Dwrite(dset_id, H5T_NATIVE_INT, mspace_id, fspace_id, H5P_DEFAULT, buffer); + + /* Close */ + status = H5Dclose(dset_id); + status = H5Sclose(fspace_id); + status = H5Sclose(mspace_id); + status = H5Pclose(dcpl_id); + status = H5Fclose(file_id); +} +~~~ + +## Chunk Caching +Another major feature of the dataset chunking scheme is the chunk cache. As it sounds, this is a cache of the chunks in the dataset. This cache can greatly improve performance whenever the same chunks are read from or written to multiple times, by preventing the library from having to read from and write to disk multiple times. However, the current implementation of the chunk cache does not adjust its parameters automatically, and therefore the parameters must be adjusted manually to achieve optimal performance. In some rare cases it may be best to completely disable the chunk caching scheme. Each open dataset has its own chunk cache, which is separate from the caches for all other open datasets. + +When a selection is read from a chunked dataset, the chunks containing the selection are first read into the cache, and then the selected parts of those chunks are copied into the user's buffer. The cached chunks stay in the cache until they are evicted, which typically occurs because more space is needed in the cache for new chunks, but they can also be evicted if hash values collide (more on this later). Once the chunk is evicted it is written to disk if necessary and freed from memory. + +This process is illustrated in figures 6 and 7. In figure 6, the application requests a row of values, and the library responds by bringing the chunks containing that row into cache, and retrieving the values from cache. In figure 7, the application requests a different row that is covered by the same chunks, and the library retrieves the values directly from cache without touching the disk.
+ +![Reading a row from a chunked dataset with the chunk cache enabled](images/chunking6.PNG) + +![Reading a row from a chunked dataset with the chunks already cached](images/chunking7.PNG) + +In order to allow the chunks to be looked up quickly in cache, each chunk is assigned a unique hash value that is used to look up the chunk. The cache contains a simple array of pointers to chunks, which is called a hash table. A chunk's hash value is simply the index into the hash table of the pointer to that chunk. While the pointer at this location might instead point to a different chunk or to nothing at all, no other locations in the hash table can contain a pointer to the chunk in question. Therefore, the library only has to check this one location in the hash table to tell if a chunk is in cache or not. This also means that if two or more chunks share the same hash value, then only one of those chunks can be in the cache at the same time. When a chunk is brought into cache and another chunk with the same hash value is already in cache, the second chunk must be evicted first. Therefore it is very important to make sure that the size of the hash table, also called the nslots parameter in H5Pset\_cache and H5Pset\_chunk\_cache, is large enough to minimize the number of hash value collisions. + +Prior to 1.10, the library determines the hash value for a chunk by assigning a unique index that is a linear index into a hypothetical array of chunks. That is, the upper-left chunk has an index of 0, the one to the right of that has an index of 1, and so on. + +For example, the algorithm prior to 1.10 simply incremented the index by one along the fastest growing dimension. The diagram below illustrates the indices for a 5 x 3 chunk prior to HDF5 1.10: + +0 1 2 +3 4 5 +6 7 8 +9 10 11 +12 13 14 + +As of HDF5 1.10, the library uses a more complicated way to determine the chunk index. Each dimension gets a fixed number of bits for the number of chunks in that dimension. When creating the dataset, the library first determines the number of bits needed to encode the number of chunks in each dimension individually by using the log2 function. It then partitions the chunk index into bitfields, one for each dimension, where the size of each bitfield is as computed above. The fastest changing dimension is the least significant bit. To compute the chunk index for an individual chunk, for each dimension, the coordinates of that chunk in an array of chunks is placed into the corresponding bitfield. The 5 x 3 chunk example above needs 5 bits for its indices (as shown below, the 3 bits in blue are for the row, and the 2 bits in green are for the column): + +![5 bits](images/chunking8.PNG) + +Therefore, the indices for the 5 x 3 chunks become like this: + +0 1 2 +4 5 6 +8 9 10 +12 13 14 +16 17 18 + +This index is then divided by the size of the hash table, nslots, and the remainder, or modulus, is the hash value. Because this scheme can result in regularly spaced indices being used frequently, it is important that nslots be a prime number to minimize the chance of collisions. In general, nslots should probably be set to a number approximately 100 times the number of chunks that can fit in nbytes bytes, unless memory is extremely limited. There is of course no advantage in setting nslots to a number larger than the total number of chunks in the dataset. + +The w0 parameter affects how the library decides which chunk to evict when it needs room in the cache. 
If w0 is set to 0, then the library will always evict the least recently used chunk in cache. If w0 is set to 1, the library will always evict the least recently used chunk which has been fully read or written, and if none have been fully read or written, it will evict the least recently used chunk. If w0 is between 0 and 1, the behavior will be a blend of the two. Therefore, if the application will access the same data more than once, w0 should be set closer to 0, and if the application does not, w0 should be set closer to 1. + +It is important to remember that chunk caching will only give a benefit when reading or writing the same chunk more than once. If, for example, an application is reading an entire dataset, with only whole chunks selected for each operation, then chunk caching will not help performance, and it may be preferable to completely disable the chunk cache in order to save memory. It may also be advantageous to disable the chunk cache when writing small amounts to many different chunks, if memory is not large enough to hold all those chunks in cache at once. + +## I/O Filters and Compression + +Dataset chunking also enables the use of I/O filters, including compression. The filters are applied to each chunk individually, and the entire chunk is processed at once. The filter must be applied every time the chunk is loaded into cache, and every time the chunk is flushed to disk. These facts all make choosing the proper settings for the chunk cache and chunk size even more critical for the performance of filtered datasets. + +Because the entire chunk must be filtered every time disk I/O occurs, it is no longer a viable option to disable the chunk cache when writing small amounts of data to many different chunks. To achieve acceptable performance, it is critical to minimize the chance that a chunk will be flushed from cache before it is completely read or written. This can be done by increasing the size of the chunk cache, adjusting the size of the chunks, or adjusting I/O patterns. + +## Chunk Maximum Limits + +Chunks have some maximum limits. They are: + +* The maximum number of elements in a chunk is 2<sup>32</sup> - 1, which is equal to 4,294,967,295. +* The maximum size for any chunk is 4GB. +* The size of a chunk cannot exceed the size of a fixed-size dataset. For example, a dataset consisting of a 5x4 fixed-size array cannot be defined with 10x10 chunks. + +For more information, see the entry for H5P\_SET\_CHUNK in the HDF5 Reference Manual. + +## Pitfalls + +Inappropriate chunk size and cache settings can dramatically reduce performance. There are a number of ways this can happen. Some of the more common issues include: + +* Chunks are too small +There is a certain amount of overhead associated with finding chunks. When chunks are made smaller, there are more of them in the dataset. When performing I/O on a dataset, if there are many chunks in the selection, it will take extra time to look up each chunk. In addition, since the chunks are stored independently, more chunks result in more I/O operations, further compounding the issue. The extra metadata needed to locate the chunks also causes the file size to increase as chunks are made smaller. Making chunks larger results in fewer chunk lookups, smaller file size, and fewer I/O operations in most cases. + +* Chunks are too large +It may be tempting to simply set the chunk size to be the same as the dataset size in order to enable compression on a contiguous dataset. However, this can have unintended consequences.
Because the entire chunk must be read from disk and decompressed before performing any operations, this will impose a great performance penalty when operating on a small subset of the dataset if the cache is not large enough to hold the one-chunk dataset. In addition, if the dataset is large enough, since the entire chunk must be held in memory while compressing and decompressing, the operation could cause the operating system to page memory to disk, slowing down the entire system. + +* Cache is not big enough +Similarly, if the chunk cache is not set to a large enough size for the chunk size and access pattern, poor performance will result. In general, the chunk cache should be large enough to fit all of the chunks that contain part of a hyperslab selection used to read or write. When the chunk cache is not large enough, all of the chunks in the selection will be read into cache, written to disk (if writing), and evicted. If the application then revisits the same chunks, they will have to be read and possibly written again, whereas if the cache were large enough they would only have to be read (and possibly written) once. However, if selections for I/O always coincide with chunk boundaries, this does not matter as much, as there is no wasted I/O and the application is unlikely to revisit the same chunks soon after. + +If the total size of the chunks involved in a selection is too big to practically fit into memory, and neither the chunk nor the selection can be resized or reshaped, it may be better to disable the chunk cache. Whether this is better depends on the storage order of the selected elements. It will also make little difference if the dataset is filtered, as entire chunks must be brought into memory anyways in that case. When the chunk cache is disabled and there are no filters, all I/O is done directly to and from the disk. If the selection is mostly along the fastest changing dimension (i.e. rows), then the data will be more contiguous on disk, and direct I/O will be more efficient than reading entire chunks, and hence the cache should be disabled. If however the selection is mostly along the slowest changing dimension (columns), then the data will not be contiguous on disk, and direct I/O will involve a large number of small operations, and it will probably be more efficient to just operate on the entire chunk, therefore the cache should be set large enough to hold at least 1 chunk. To disable the chunk cache, either nbytes or nslots should be set to 0. + +* Improper hash table size +Because only one chunk can be present in each slot of the hash table, it is possible for an improperly set hash table size (nslots) to severely impact performance. For example, if there are 100 columns of chunks in a dataset, and the hash table size is set to 100, then all the chunks in each row will have the same hash value. Attempting to access a row of elements will result in each chunk being brought into cache and then evicted to allow the next one to occupy its slot in the hash table, even if the chunk cache is large enough, in terms of nbytes, to hold all of them. Similar situations can arise when nslots is a factor or multiple of the number of rows of chunks, or equivalent situations in higher dimensions. + +Luckily, because each slot in the hash table only occupies the size of the pointer for the system, usually 4 or 8 bytes, there is little reason to keep nslots small. 
Again, a general rule is that nslots should be set to a prime number at least 100 times the number of chunks that can fit in nbytes, or simply set to the number of chunks in the dataset. + +## Additional Resources + +The slide set “HDF5 Advanced Topics: Chunking in HDF5” (PDF), a tutorial from HDF and HDF-EOS Workshop XIII (2009) provides additional HDF5 chunking use cases and examples. + +The page HDF5 Examples lists many code examples that are regularly tested with the HDF5 library. Several illustrate the use of chunking in HDF5, particularly “Read/Write Chunked Dataset” and any examples demonstrating filters. + +“Dataset Chunking Issues” provides additional information regarding chunking that has not yet been incorporated into this document. + +Directions for Future Development +As seen above, the HDF5 chunk cache currently requires careful control of the parameters in order to achieve optimal performance. In the future, we plan to improve the chunk cache to be more foolproof in many ways, and deliver acceptable performance in most cases even when no thought is given to the chunking parameters. + +One way to make the chunk cache more user-friendly is to automatically resize the chunk cache as needed for each operation. The cache should be able to detect when the cache should be skipped or when it needs to be enlarged based on the pattern of I/O operations. At a minimum, it should be able to detect when the cache would severely hurt performance for a single operation and disable the cache for that operation. This would of course be optional. + +Another way is to allow chaining of entries in the hash table. This would make the hash table size much less of an issue, as chunks could share the same hash value by making a linked list. + +Finally, it may even be desirable to set some reasonable default chunk size based on the dataset size and possibly some other information on the intended access pattern. This would probably be a high-level routine. + +Other features planned for chunking include new index methods (besides b-trees), disabling filters for chunks that are partially over the edge of a dataset, only storing the used portions of these edge chunks, and allowing multiple reader processes to read the same dataset as a single writer process writes to it. 
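To tie together the cache parameters discussed in the Chunk Caching and Pitfalls sections, here is a minimal sketch of opening a dataset with a per-dataset cache configured through H5Pset_chunk_cache. The file and dataset names and the specific numbers are illustrative assumptions only; appropriate values depend on the chunk size and access pattern:

~~~
#include "hdf5.h"

int main(void) {
    /* nslots should be a prime roughly 100 times the number of chunks that
     * fit in nbytes; w0 near 0 favors keeping chunks that will be revisited. */
    size_t nslots = 10007;             /* prime hash table size */
    size_t nbytes = 32 * 1024 * 1024;  /* 32 MiB raw data chunk cache */
    double w0     = 0.25;              /* data will be read more than once */

    hid_t file = H5Fopen("file.h5", H5F_ACC_RDONLY, H5P_DEFAULT);

    /* Set the chunk cache on a dataset access property list */
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    H5Pset_chunk_cache(dapl, nslots, nbytes, w0);

    /* The cache settings apply to this open instance of the dataset only */
    hid_t dset = H5Dopen2(file, "dataset", dapl);

    /* ... perform hyperslab reads here ... */

    H5Dclose(dset);
    H5Pclose(dapl);
    H5Fclose(file);
    return 0;
}
~~~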
+ + diff --git a/documentation/hdf5-docs/images/chunking1and2.PNG b/documentation/hdf5-docs/images/chunking1and2.PNG new file mode 100644 index 00000000..d0828675 Binary files /dev/null and b/documentation/hdf5-docs/images/chunking1and2.PNG differ diff --git a/documentation/hdf5-docs/images/chunking3and4.PNG b/documentation/hdf5-docs/images/chunking3and4.PNG new file mode 100644 index 00000000..b17c7bac Binary files /dev/null and b/documentation/hdf5-docs/images/chunking3and4.PNG differ diff --git a/documentation/hdf5-docs/images/chunking5.PNG b/documentation/hdf5-docs/images/chunking5.PNG new file mode 100644 index 00000000..130a85ec Binary files /dev/null and b/documentation/hdf5-docs/images/chunking5.PNG differ diff --git a/documentation/hdf5-docs/images/chunking6.PNG b/documentation/hdf5-docs/images/chunking6.PNG new file mode 100644 index 00000000..495d55f4 Binary files /dev/null and b/documentation/hdf5-docs/images/chunking6.PNG differ diff --git a/documentation/hdf5-docs/images/chunking7.PNG b/documentation/hdf5-docs/images/chunking7.PNG new file mode 100644 index 00000000..ce73cf82 Binary files /dev/null and b/documentation/hdf5-docs/images/chunking7.PNG differ diff --git a/documentation/hdf5-docs/images/chunking8.PNG b/documentation/hdf5-docs/images/chunking8.PNG new file mode 100644 index 00000000..f21a46f4 Binary files /dev/null and b/documentation/hdf5-docs/images/chunking8.PNG differ diff --git a/documentation/hdf5-docs/registered_filter_plugins.md b/documentation/hdf5-docs/registered_filter_plugins.md deleted file mode 100644 index b9428123..00000000 --- a/documentation/hdf5-docs/registered_filter_plugins.md +++ /dev/null @@ -1,76 +0,0 @@ ---- -title: Registered Filter Plugins ---- - -# Registered Filter Plugins - -Please be sure to see HDF5 Filter Plugins, a convenience software that packages together many of the commonly used filters that users have created and registered. - -## Information on Registered Filter Plugins -Members of the HDF5 user community can create and register Third-Party (compression or other) filters for use with HDF5. See Example Code to Enable BZIP2 Compression in HDF5 for how to create a filter. - -To register a filter please contact The HDF Helpdesk with the following information: -* Contact information for the developer requesting a new identifier -* Short description of the new filter -* Links to any relevant information including licensing information - -Here is the current policy regarding filter identifier assignment: -* The filter identifier is designed to be a unique identifier for the filter. Values from zero through 32,767 are reserved for filters supported by The HDF Group in the HDF5 library and for filters requested and supported by the 3rd party. -* Values from 32768 to 65535 are reserved for non-distributed uses (e.g., internal company usage) or for application usage when testing a feature. The HDF Group does not track or document the usage of filters with identifiers from this range. - -Please contact the maintainer of a filter for help with the filter/compression support in HDF5. 
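For reference, a minimal sketch of how an application might use one of the registered identifiers below: it checks at run time whether the filter is available with H5Zfilter_avail and, if so, requests it on a chunked dataset with H5Pset_filter. The identifier 307 (BZIP2) comes from the table below; the dataset shape, chunk size, and client-data value are assumptions for illustration only:

~~~
#include "hdf5.h"

#define BZIP2_FILTER_ID 307            /* registered identifier (see table) */

int main(void) {
    hsize_t dims[2]  = {1024, 1024};
    hsize_t chunk[2] = {128, 128};
    unsigned int cd_values[1] = {9};   /* assumed filter parameter */

    hid_t file  = H5Fcreate("filtered.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);      /* filters require a chunked layout */

    /* Only request the filter if its plugin can be found at run time */
    if (H5Zfilter_avail(BZIP2_FILTER_ID) > 0)
        H5Pset_filter(dcpl, BZIP2_FILTER_ID, H5Z_FLAG_OPTIONAL, 1, cd_values);

    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_INT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
~~~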
- -## List of Filters Registered with The HDF Group - -| Filter | Identifier Name | Short Description | -| --- | --- | --- | -| 305 | LZO | LZO lossless compression used by PyTables -| 307 | BZIP2 | BZIP2 lossless compression used by PyTables -| 32000 | LZF | LZF lossless compression used by H5Py project -| 32001 | BLOSC | Blosc lossless compression used by PyTables -| 32002 | MAFISC | Modified LZMA compression filter, MAFISC (Multidimensional Adaptive Filtering Improved Scientific data Compression) -| 32003 | Snappy | Snappy lossless compression -| 32004 | LZ4 | LZ4 fast lossless compression algorithm -| 32005 | APAX | Samplify's APAX Numerical Encoding Technology -| 32006 | CBF | All imgCIF/CBF compressions and decompressions, including Canonical, Packed, Packed Version 2, Byte Offset and Nibble Offset -| 32007 | JPEG-XR | Enables images to be compressed/decompressed with JPEG-XR compression -| 32008 | bitshuffle | Extreme version of shuffle filter that shuffles data at bit level instead of byte level -| 32009 | SPDP | SPDP fast lossless compression algorithm for single- and double-precision floating-point data -| 32010 | LPC-Rice | LPC-Rice multi-threaded lossless compression -| 32011 | CCSDS-123 | ESA CCSDS-123 multi-threaded compression filter -| 32012 | JPEG-LS | CharLS JPEG-LS multi-threaded compression filter -| [32013](https://h5z-zfp.readthedocs.io/en/latest/) | [zfp](https://zfp.readthedocs.io/en/latest/) | Lossy & lossless compression of floating point and integer datasets to meet rate, accuracy, and/or precision targets. -| 32014 | fpzip | Fast and Efficient Lossy or Lossless Compressor for Floating-Point Data -| 32015 | Zstandard | Real-time compression algorithm with wide range of compression / speed trade-off and fast decoder -| 32016 | B³D | GPU based image compression method developed for light-microscopy applications -| 32017 | SZ | An error-bounded lossy compressor for scientific floating-point data -| 32018 | FCIDECOMP | EUMETSAT CharLS compression filter for use with netCDF -| 32019 | JPEG | Jpeg compression filter -| 32020 | VBZ | Compression filter for raw dna signal data used by Oxford Nanopore -| 32021 | FAPEC | Versatile and efficient data compressor supporting many kinds of data and using an outlier-resilient entropy coder -| 32022 | BitGroom| The BitGroom quantization algorithm -| 32023 | Granular BitRound (GBR) | The GBG quantization algorithm is a significant improvement to the BitGroom filter -| 32024 | SZ3 | A modular error-bounded lossy compression framework for scientific datasets -| 32025 | Delta-Rice | Lossless compression algorithm optimized for digitized analog signals based on delta encoding and rice coding -| 32026 | BLOSC | The recent new-generation version of the Blosc compression library -| 32027 | FLAC | FLAC audio compression filter in HDF5 - -## Example Code to enable BZIP2 Compression in HDF5 - -Please be aware that compression filters require that the library not use `H5_MEMORY_ALLOC_SANITY_CHECK`. Building in debug mode automatically enables this feature in earlier releases, which causes memory allocation and free problems in filter applications. Future versions of HDF5 will not enable this feature. - -The [`bz_example.tar.gz`](/documentation/hdf5-docs/bz_example.tar.gz) file contains an example of implementing the BZIP2 filter to enable BZIP2 compression in HDF5. (This example is based on PyTables code that uses BZIP2 compression.). 
Download and uncompress this file as follows: - - gzip -cd bz_example.tar.gz | tar xvf - - -To compile the example, you will need to install the HDF5 library and use the h5cc compile script found in the bin/ directory of the HDF5 installation. - -For information on h5cc, see [Compiling Your HDF5 Application](https://docs.hdfgroup.org/hdf5/develop/_l_b_compiling.html). - -Please note that tools like h5dump that display information in an HDF5 file will not be able to display data that is compressed with BZIP2 compression, since BZIP2 is not implemented in HDF5. - -However, as of HDF5-1.8.11, a new HDF5 feature will enable the `h5dump` tool to determine that the data is compressed with an external compression filter such as BZIP2, and will automatically load the appropriate library and display the uncompressed data. - -The bz_example example code can be used for modifying the HDF5 source to "include" BZIP2 as one of the "internal" filters. For information on how to do this, see how ZLIB (the deflate filter) is implemented in the HDF5 source code. Specifically look at these files: - - `H5Z.c, H5Zdeflate.c and H5Pocpl.c` diff --git a/documentation/hdf5-docs/release_specific_info.md b/documentation/hdf5-docs/release_specific_info.md index 7de6acfd..45f7ae4c 100644 --- a/documentation/hdf5-docs/release_specific_info.md +++ b/documentation/hdf5-docs/release_specific_info.md @@ -26,4 +26,4 @@ redirect_from: * New Features * [Software Changes from Release to Release](release_specifics/sw_changes_1.8.md) -### [API compatibility Macros in HDF5](release_specifics/api_comp_macros.md) +### [API Compatibility Macros in HDF5](release_specifics/api_comp_macros.md) diff --git a/documentation/hdf5-docs/release_specific_info.md_worked b/documentation/hdf5-docs/release_specific_info.md_worked deleted file mode 100644 index 0804823e..00000000 --- a/documentation/hdf5-docs/release_specific_info.md_worked +++ /dev/null @@ -1,27 +0,0 @@ ---- -title: Release Specific Information -redirect_from: - - display/HDF5/Release+Specific+Information ---- - -### [HDF5 1.14](/documentation/hdf5-docs/release_specifics/hdf5_1_14.md) -* [New Features](/documentation/hdf5-docs/release_specifics/new_features_1_14.md) -* [Software Changes from Release to Release](/documentation/hdf5-docs/release_specifics/sw_changes_1.14.md) -* [Migrating from HDF5 1.12 to HDF5 1.14](/documentation/hdf5-docs/release_specifics/Migrating_from_HDF5_1.12_to_HDF5_1.14.md) - -### [HDF5 1.12](/documentation/hdf5-docs/release_specifics/hdf5_1_12.md) -* [New Features](/documentation/hdf5-docs/release_specifics/new_features_1_12.md) -* [Software Changes from Release to Release](/documentation/hdf5-docs/release_specifics/sw_changes_1.12.md) -* [Migrating from HDF5 1.10 to HDF5 1.12](/documentation/hdf5-docs/release_specifics/Migrating_from_HDF5_1.10_to_HDF5_1.14.md) - -### [HDF5 1.10](/documentation/hdf5-docs/release_specifics/hdf5_1_10.md) -* [New Features](/documentation/hdf5-docs/release_specifics/new_features_1_10.md) -* [Why should I care about the HDF5-1.10.2 release? 
(blog)]() -* [Software Changes from Release to Release](/documentation/hdf5-docs/release_specifics/sw_changes_1.10.md) -* [Migrating from HDF5 1.8 to HDF5 1.10](/documentation/hdf5-docs/release_specifics/) - -### [HDF5 1.8](/documentation/hdf5-docs/release_specifics/hdf5_1_8.md) -* New Features -* [Software Changes from Release to Release](/documentation/hdf5-docs/release_specifics/sw_changes_1.8.md) - -### [API compatibility Macros in HDF5](documentation/hdf5-docs/release_specifics/api_comp_macros.md) diff --git a/documentation/hdf5-docs/release_specifics/api_comp_macros.md b/documentation/hdf5-docs/release_specifics/api_comp_macros.md deleted file mode 100644 index 9e3c06cc..00000000 --- a/documentation/hdf5-docs/release_specifics/api_comp_macros.md +++ /dev/null @@ -1,384 +0,0 @@ ---- -title: API Compatibility Macros -redirect_from: - - display/HDF5/API+Compatibility+Macros ---- - -# API Compatibility Macros - -## Audience -The target audience for this document has existing applications that use the HDF5 library, and is considering moving to the latest HDF5 release to take advantage of the latest library features and enhancements. - -## Compatibility Issues -With each major release of HDF5, such as 1.12 or 1.10, certain compatibility issues must be considered when migrating applications from an earlier major release. - -This document describes the approach taken by The HDF Group to help existing users of HDF5 address compatibility issues in the HDF5 API. - -## Summary and Motivation -In response to new and evolving requirements for the library and data format, several basic functions have changed since HDF5 was first released. To allow existing applications to continue to compile and run properly, all versions of these functions have been retained in the later releases of the HDF5 library. - -Given the scope of changes available with each major release of HDF5, and recognizing the potentially time-consuming task of editing all the affected calls in user applications, The HDF Group has created a set of macros that can be used to flexibly and easily map existing API calls to previous release functions. We refer to these as the API compatibility macros. - -The HDF Group generally encourages users to update applications to work with the latest HDF5 library release so that all new features and enhancements are available to them. At the same time, The HDF Group understands that, under some circumstances, updating applications may not be feasible or necessary. The API compatibility macros, described in this document, provide a bridge from old APIs to new and can be particularly helpful in situations such as these: - -Source code is not available - only binaries are available; updating the application is not feasible. -Source code is available, but there are no resources to update it. -Source code is available, as are resources to update it, but the old version works quite well so updates are not a priority. At the same time, it is desirable to take advantage of certain efficiencies in the newer HDF5 release that do not require code changes. -Source code is available, as are resources to update it, but the applications are large or complex, and must continue to run while the code updates are carried out. - -## Understanding and Using the Macros -As part of latest HDF5 release, several functions that existed in previous versions of the library were updated with new calling parameters and given new names. 
The updated versions of the functions have a number (for eg '2') at the end of the original function name. The original versions of these functions were retained and renamed to have an earlier number (for eg '1') at the end of the original function name. - -For example, consider the function H5Lvisit in HDF5 release 1.10 as compared with 1.12: - -| | | -| ------------------------------------- | ------------------------------------------------------------------------------------------------- | -| Original function name and signature in 1.10.0 | herr_t H5Lvisit ( hid_t grp_id, H5_index_t idx_type, H5_iter_order_t order, H5L_iterate_t op, void \*op_data ) -| Updated function and signature, introduced in release 1.12.0 | herr_t H5Lvisit2 ( hid_t group_id, H5_index_t idx_type, H5_iter_order_t order, H5L_iterate2_t op, void \*op_data ) | -| Original function and signature, renamed in release 1.12.0 | herr_t H5Lvisit1 ( hid_t group_id, H5_index_t idx_type, H5_iter_order_t order, H5L_iterate1_t op, void \*op_data ) | -| API compatibility macro, introduced in release 1.12.0 | H5Lvisit
    The macro, H5Lvisit, will be mapped to either H5Lvisit1 or H5Lvisit2. The mapping is determined by a combination of the configuration options use to build the HDF5 library and compile-time options used to build the application. The calling parameters used with the H5Lvisit compatibility macro should match the number and type of the function the macros will be mapped to (H5Lvisit1 or H5Lvisit2).
    The function names ending in <91>1<92> or <91>2<92> are referred to as versioned names, and the corresponding functions are referred to as versioned functions. For new code development, The HDF Group recommends use of the compatibility macro mapped to the latest version of the function. The original version of the function should be considered deprecated and, in general, should not be used when developing new code. | - -## Compatibility Macro Mapping Options To determine the mapping for a given API compatibility macro in a given application, a combination of user-controlled selections, collectively referred to as the compatibility macro mapping options, is considered in the following sequence: - -What compatibility macro configuration option was used to build the HDF5 library? We refer to this selection as the library mapping. - -Was a compatibility macro global compile-time option specified when the application was built? We refer to this (optional) selection as the application mapping. If an application mapping exists, it overrides the library mapping. (See adjacent notes.) - Were any compatibility macro function-level compile-time options specified when the application was built? We refer to these (optional) selections as function mappings. If function mappings exist, they override library and application mappings for the relevant API compatibility macros. (See adjacent notes.) - -Notes: An application mapping can map APIs to the same version or to a version older than the configured library mapping. When the application attempts to map APIs to a newer version of the API than the library was configured with, it will fail to <93>upgrade<94> the mapping (and may fail silently). -When it is necessary to <93>upgrade<94> the macro mappings from those set in the library mapping, it must be done at the per-function level, using the function-level mappings. As long as one does not try to map a function to a version that was compiled out in the library mapping, individual functions can be upgraded or downgraded freely. - - -## Library Mapping Options - -When the HDF5 library is built, configure flags can be used to control the API compatibility macro mapping behavior exhibited by the library. This behavior can be overridden by application and function mappings. One configure flag excludes deprecated functions from the HDF5 library, making them unavailable to applications linked with the library. - - - -Table 1: Library Mapping Options - -| configure flag | Macros map to release
    (versioned function; H5Lvisit shown) | Deprecated functions available?
    (H5Lvisit1) | -| -------------- | ------------------------------------------------------------- | ---------------------------------------------- | -| --with-default-api-version=v112
    (the default in 1.12) | 1.12.x (H5Lvisit2) | yes | -| --with-default-api-version=v110 | 1.10.x (H5Lvisit1) | yes | -| --with-default-api-version=v18 | 1.8.x (H5Lvisit1) | yes | -| --with-default-api-version=v16 | 1.6.x (H5Lvisit1) | yes | -| --disable-deprecated-symbols | 1.12.x (H5Lvisit2) | no | - - -Refer to the file libhdf5.settings in the directory where the HDF5 library is installed to determine the configure flags used to build the library. In particular, look for the two lines shown here under Features: - - Default API mapping: v112 - - With deprecated public symbols: yes - -## Application Mapping Options - -When an application using HDF5 APIs is built and linked with the HDF5 library, compile-time options to h5cc can be used to control the API compatibility macro mapping behavior exhibited by the application. The application mapping overrides the behavior specified by the library mapping, and can be overridden on a function-by-function basis by the function mappings. - -If the HDF5 library was configured with the --disable-deprecated-symbols flag, then the deprecated functions will not be available, regardless of the application mapping options. - Table 2: Application Mapping Options - -| h5cc option | Macros map to release
    (versioned function; H5Lvisit shown) | Deprecated functions available?
    (H5Lvisit1) | -| -------------- | ------------------------------------------------------------- | ---------------------------------------------- | -| -DH5_USE_112_API
    (Default behavior if no option specified.) | 1.12.x (H5Lvisit2) | yes
    \*if available in library | -| -DH5_USE_110_API | 1.10.x (HLvisit1) | yes\*
    \*if available in library | -| -DH5_USE_18_API | 1.8.x (H5Lvisit1) | yes\* -\*if available in library | -| -DH5_USE_16_API | 1.6.x (H5Lvisit1) | yes\* -\*if available in library | -| -DH5_NO_DEPRECATED_SYMBOLS | 1.10.x (H5Lvisit1) | no | - - -## Function Mapping Options - -Function mappings are specified when the application is built. These mappings can be used to control the mapping of the API compatibility macros to underlying functions on a function-by-function basis. The function mappings override the library and application mappings discussed earlier. - -If the HDF5 library was configured with the --disable-deprecated-symbols flag, or -DH5_NO_DEPRECATED_SYMBOLS is used to compile the application, then the deprecated functions will not be available, regardless of the function mapping options. - -For every function with multiple available versions, a compile-time version flag can be defined to selectively map the function macro to the desired versioned function. The function mapping consists of the function name followed by "\_vers" which is mapped by number to a specific function or struct: - -| Macro | Function Mapping | Mapped to function or struct | -| ----- | --------------------- | -------------------------- | -| H5xxx | H5xxx_vers=1 | H5xxx1 | -| | H5xxx_vers=2 | H5xxx2 | - -For example, in version 1.10 the H5Rreference macro can be mapped to either H5Rreference1 or H5Rreference2. When used, the value of the H5Rreference_vers compile-time version flag determines which function will be called: - -* When H5Rreference_vers is set to 1, the macro H5Rreference will be mapped to H5Rreference1. - h5cc ... -DH5Rreference_vers=1 ... - - -* When H5Rdereference_vers is set to 2, the macro H5Rdereference will be mapped to H5Rdereference2. - h5cc ... -DH5Rreference_vers=2 ... - - -* When H5Rreference_vers is not set, the macro H5Rreference will be mapped to either H5Rreference1 or H5Rreference2, based on the application mapping, if one was specified, or on the library mapping. - h5cc ... - -~~~ -Please be aware that some function mappings use mapped structures, as well. If compiling an application with a function mapping that uses a mapped structure, you must include each function and mapped structure plus EVERY function that uses the mapped structure, whether or not that function is used in the application. In 1.12, mappings of structures are used by the H5L and H5O function mappings. - -For example, the application h5ex_g_iterate.c (found on the Examples by API page under "Groups") only calls H5Lvisit , H5Ovisit , and H5Oget_info_by_name . To compile this application with 1.10 APIs in 1.12 with the function specific mappings, then not only must H5Lvisit_vers, H5Ovisit_vers, and H5Oget_info_by_name_vers be specified on the command line, but the mapped structures and every function that uses the mapped structures must be included, as well. 
The full compile line is shown below: - h5cc -DH5Lvisit_vers=1 -DH5Ovisit_vers=1 -DH5Oget_info_by_name_vers=1 -DH5Lvisit_by_name_vers=1 -DH5Literate_vers=1 -DH5Literate_by_name_vers=1 -DH5O_info_t_vers=1 -DH5L_info_t_vers=1 -DH5L_iterate_t_vers=1 -DH5Lget_info_by_idx_vers=1 -DH5Lget_info_vers=1 h5ex_g_visit.c -~~~ - -Function Mapping Options in Releases 1.12.x - - -H5L_GET_INFO -H5L_GET_INFO2 - -Function mapping: H5Lget_info_vers=2 -Struct mapping: H5L_info_t_vers=2 -H5L_GET_INFO1 - -Function mapping H5Lget_info_vers=1 -Struct mapping: H5L_info_t_vers=1 -H5L_GET_INFO_BY_IDX -H5L_GET_INFO_BY_IDX2 - -Function mapping: H5Lget_info_by_idx_vers=2 -Struct mapping: H5L_info_t_vers=2 -H5L_GET_INFO_BY_IDX1 - -Function mapping: H5Lget_info_by_idx_vers=1 -Struct mapping: H5L_info_t_vers=1 -H5L_ITERATE -H5L_ITERATE2 - -Function mapping: H5Literate_vers=2 -Struct mapping: H5L_iterate_t_vers=2 -H5L_ITERATE1 - -Function mapping: H5Literate_vers=1 -Struct mapping: H5L_iterate_t_vers=1 -H5L_ITERATE_BY_NAME -H5L_ITERATE_BY_NAME2 - -Function mapping: H5Literate_by_name_vers=2 -Struct mapping: H5L_iterate_t_vers=2 -H5L_ITERATE_BY_NAME1 - -Function mapping: H5Literate_by_name_vers=1 -Struct mapping: H5L_iterate_t_vers=1 -H5L_VISIT -H5L_VISIT2 - -Function mapping: H5Lvisit_vers=2 -Struct mapping: H5L_iterate_t_vers=2 -H5L_VISIT1 - -Function mapping: H5Lvisit_vers=1 -Struct mapping: H5L_iterate_t_vers=1 -H5L_VISIT_BY_NAME -H5L_VISIT_BY_NAME2 - -Function mapping: H5Lvisit_by_name_vers=2 -Struct mapping: H5L_iterate_t_vers=2 -H5L_VISIT_BY_NAME1 - -Function mapping: H5Lvisit_by_name_vers=1 -Struct mapping: H5L_iterate_t_vers=1 -H5O_GET_INFO -H5O_GET_INFO3 - -Function mapping: H5Oget_info_vers=3 -Struct mapping: H5O_info_t_vers=2 -H5O_GET_INFO1 - -Function mapping: H5Oget_info_vers=1 -Struct mapping: H5O_info_t_vers=1 -H5O_GET_INFO_BY_IDX -H5O_GET_INFO_BY_IDX3 - -Function mapping: H5Oget_info_by_idx_vers=3 -Struct mapping: H5O_info_t_vers=2 -H5O_GET_INFO_BY_IDX1 - -Function mapping: H5Oget_info_by_idx_vers=1 -Struct mapping: H5O_info_t_vers=1 -H5O_GET_INFO_BY_NAME -H5O_GET_INFO_BY_NAME3 - -Function mapping: H5O_get_info_by_name_vers=3 -Struct mapping: H5O_info_t_vers=2 -H5O_GET_INFO_BY_NAME1 - -Function mapping: H5O_get_info_by_name_vers=1 -Struct mapping: H5O_info_t_vers=1 -H5O_VISIT -H5O_VISIT3 - -Function mapping: H5Ovisit_vers=3 -Struct mapping: H5O_iterate_t_vers=2 -H5O_VISIT1 - -Function mapping: H5Ovisit_vers=1 -Struct mapping: H5O_iterate_t_vers=1 -H5O_VISIT_BY_NAME -H5O_VISIT_BY_NAME3 - -Function mapping: H5Ovisit_by_name_vers=3 -Struct mapping: H5O_iterate_t_vers=2 -H5O_VISIT_BY_NAME1 - -Function mapping: H5Ovisit_by_name_vers=1 -Struct mapping: H5O_iterate_t_vers=1 -H5P_ENCODE -H5P_ENCODE2 - -Function mapping: H5Pencode_vers=2 -H5P_ENCODE1 - -Function mapping: H5Pencode_vers=1 -H5S_ENCODE -H5S_ENCODE2 - -Function mapping: H5Sencode_vers=2 -H5S_ENCODE1 - -Function mapping: H5Sencode_vers=1 -Function Mapping Options in Releases 1.10.x - -The versioned H5Oget_info functions (H5Oget_info1 and H5Oget_info2) were added in 1.10.3, and H5Oget_info was replaced by a macro to invoke H5Oget_info1 or H5Oget_info2. However, this broke compatibility and caused problems for users because there was no longer a function H5Oget_info. In 1.10.4 and subsequent 1.10.x versions the macro was removed, H5Oget_info1 was deprecated, and H5Oget_info was resurrected as a function. H5Oget_info2 remained as a function, but is not a versioned alternative to the original H5Oget_info. 
The same is true for H5Oget_info_by name, H5Oget_info_by_idx, H5Ovisit, and H5Ovisit_by_name. The version 2 functions were added to improve performance. - -The unversioned originals and version 2 of those functions exist in 1.10 because having released them in 1.10.3, it would break compatibility to remove them, so the original and version 2 functions remained in the source but without any macro to map to one or the other. Therefore, version 2 functions are available, but only when invoked directly. - -In 1.12 there is a version 3 of all 5 functions which uses version 2 H5Oinfo2_t or H5Oiterate2_t structures. Both versions 1 and 2 are deprecated and macros replace the unversioned functions, mapping to version 1 for 18 and 110 default apis and to version 3 for 112 default api. Version 2 is available, but will only be invoked if invoked directly. - - -Macro -Default function used - -(if no macro specified) - -Introduced in - -h5cc version flag and value Mapped to function or struct -H5Rdereference - -H5Rdereference2 HDF5-1.10.0 -DH5Rdereference_vers=1 H5Rdereference1 --DH5Rdereference_vers=2 H5Rdereference2 -H5Fget_info - -H5Fget_info2 HDF5-1.10.0 -DH5Fget_info_vers=1 H5Fget_info1 with struct H5F_info1_t --DH5Fget_info_vers=2 H5Fget_info2 with struct H5F_info2_t -H5Oget_info - -H5Oget_info1 HDF5-1.10.3 -DH5Oget_info_vers=1 H5Oget_info1 --DH5Oget_info_vers=2 H5Oget_info2 -H5Oget_info_by_idx - -H5Oget_info_by_idx1 HDF5-1.10.3 -DH5Oget_info_by_idx_vers=1 H5Oget_info_by_idx1 --DH5Oget_info_by_idx_vers=2 H5Oget_info_by_idx2 -H5Oget_info_by_name H5Oget_info_by_name1 HDF5-1.10.3 -DH5Oget_info_by_name_vers=1 H5Oget_info_by_name1 --DH5Oget_info_by_name_vers=2 H5Oget_info_by_name2 -H5Ovisit H5Ovisit1 HDF5-1.10.3 -DH5Ovisit_vers=1 H5Ovisit1 --DH5Ovisit_vers=2 -H5Ovisit2 - -H5Ovisit_by_name H5Ovisit_by_name1 HDF5-1.10.3 -DH5Ovisit_by_name_vers=1 H5Ovisit_by_name1 --DH5Ovisit_by_name_vers=2 H5Ovisit_by_name2 -Function Mapping Options in Releases 1.8.x - -At release 1.8.0, the API compatibility macros, function mapping compile-time version flags and values, and corresponding versioned functions listed in the following table were introduced. If the application being compiled to run with any 1.10.x release was written to use any 1.6.x release of HDF5, you must also consider these macros and mapping options. 
- - - -Table 5: Function Mapping Options in Releases 1.8.x -Macro h5cc version flag and value Mapped to function -or struct -H5Acreate -DH5Acreate_vers=1 H5Acreate1 --DH5Acreate_vers=2 H5Acreate2 -H5Aiterate -DH5Aiterate_vers=1 H5Aiterate1 -with struct H5A_operator1_t --DH5Aiterate_vers=2 H5Aiterate2 -with struct H5A_operator2_t -H5Dcreate -DH5Dcreate_vers=1 H5Dcreate1 --DH5Dcreate_vers=2 H5Dcreate2 -H5Dopen -DH5Dopen_vers=1 H5Dopen1 --DH5Dopen_vers=2 H5Dopen2 -H5Eclear -DH5Eclear_vers=1 H5Eclear1 --DH5Eclear_vers=2 H5Eclear2 -H5Eprint -DH5Eprint_vers=1 H5Eprint1 --DH5Eprint_vers=2 H5Eprint2 -H5Epush -DH5Epush_vers=1 H5Epush1 --DH5Epush_vers=2 H5Epush2 -H5Eset_auto -DH5Eset_auto_vers=1 H5Eset_auto1 --DH5Eset_auto_vers=2 H5Eset_auto2 -H5Eget_auto -DH5Eget_auto_vers=1 H5Eget_auto1 --DH5Eget_auto_vers=2 H5Eget_auto2 -H5E_auto_t -Struct for H5Eset_auto -and H5Eget_auto -DH5E_auto_t_vers=1 H5E_auto1_t --DH5E_auto_t_vers=2 H5E_auto2_t -H5Ewalk -DH5Ewalk_vers=1 H5Ewalk1 -with callback H5E_walk1_t -and struct H5E_error1_t --DH5Ewalk_vers=2 H5Ewalk2 -with callback H5E_walk2_t -and struct H5E_error2_t -H5Gcreate -DH5Gcreate_vers=1 H5Gcreate1 --DH5Gcreate_vers=2 H5Gcreate2 -H5Gopen -DH5Gopen_vers=1 H5Gopen1 --DH5Gopen_vers=2 H5Gopen2 -H5Pget_filter -DH5Pget_filter_vers=1 H5Pget_filter1 --DH5Pget_filter_vers=2 H5Pget_filter2 -H5Pget_filter_by_id -DH5Pget_filter_by_id_vers=1 H5Pget_filter_by_id1 --DH5Pget_filter_by_id_vers=2 H5Pget_filter_by_id2 -H5Pinsert -DH5Pinsert_vers=1 H5Pinsert1 --DH5Pinsert_vers=2 H5Pinsert2 -H5Pregister -DH5Pregister_vers=1 H5Pregister1 --DH5Pregister_vers=2 H5Pregister2 -H5Rget_obj_type -DH5Rget_obj_typevers=1 H5Rget_obj_type1 --DH5Rget_obj_type_vers=2 H5Rget_obj_type2 -H5Tarray_create -DH5Tarray_create_vers=1 H5Tarray_create1 --DH5Tarray_create_vers=2 H5Tarray_create2 -H5Tcommit -DH5Tcommit_vers=1 H5Tcommit1 --DH5Tcommit_vers=2 H5Tcommit2 -H5Tget_array_dims -DH5Tget_array_dims_vers=1 H5Tget_array_dims1 --DH5Tget_array_dims_vers=2 H5Tget_array_dims2 -H5Topen -DH5Topen_vers=1 H5Topen1 --DH5Topen_vers=2 H5Topen2 -H5Z_class_t Struct for H5Zregister -DH5Z_class_t_vers=1 H5Z_class1_t --DH5Z_class_t_vers=2 H5Z_class2_t - -Further Information - -See the HDF5 Reference Manual for complete descriptions of all API compatibility macros and versioned functions shown. - -It is possible to specify multiple function mappings for a single application build: - -h5cc ... -DH5Rdereference_vers=1 -DH5Fget_info_vers=2 ...As a result of the function and struct mappings in this compile example, all occurrences of the macro H5Rdereference will be mapped to H5Rdereference1 and all occurrences of the macro H5Fget_info will be mapped to H5Fget_info2 for the application being built. - -The function and struct mappings can be used to guarantee that a given API compatibility macro will be mapped to the desired underlying function or struct version regardless of the library or application mappings. In cases where an application may benefit greatly from features offered by some of the later APIs, or must continue to use some earlier API versions for compatibility reasons, this fine-grained control may be very important. - -As noted earlier, the function mappings can only reference versioned functions that are included in the HDF5 library, as determined by the configure flag used to build the library. 
For example, if the HDF5 library being linked with the application was built with the --disable-deprecated-symbols option, version 1 of the underlying functions would not be available, and the example above that defined H5Rdereference_vers=1 would not be supported. - -The function mappings do not negate any available functions. If H5Rdereference1 is available in the installed version of the HDF5 library, and the application was not compiled with the -DH5_NO_DEPRECATED_SYMBOLS flag, the function H5Rdereference1 will remain available to the application through its versioned name. Similarly, H5Rdereference2 will remain available to the application as H5Rdereference2. The function mapping version flag H5Rdereference_vers only controls the mapping of the API compatibility macro H5Rdereference to one of the two available functions. - -This can be especially useful in any case where the programmer does not have direct control over global macro definitions, such as when writing code meant to be copied to multiple applications or when writing code in a header file. - -## Compatibility Macros in HDF5 1.6.8 and Later -A series of similar compatibility macros were introduced into the release 1.6 series of the library, starting with release 1.6.8. These macros simply alias the <91>1<92> version functions, callbacks, and typedefs listed above to their original non-numbered names. - -These macros were strictly a forward-looking feature at that time; they were not necessary for compatibility in 1.6.x. These macros were created at that time to enable writing code that could be used with any version of the library after 1.6.8 and any library compilation options except H5_NO_DEPRECATED_SYMBOLS, by always using the <91>1<92> version of versioned functions and types. For example, H5Dopen1 will always be interpreted in exactly the same manner by any version of the library since 1.6.8. - -## Common Use Case -A common scenario where the API compatibility macros may be helpful is the migration of an existing application to a new HDF5 release An incremental migration plan is outlined here: - -Build the HDF5 library without specifying any library mapping configure flag. In this default mode, the 1.6.x, 1.8.x, and 1.10.x versions of the underlying functions are available, and the API compatibility macros will be mapped to the current HDF5 versioned functions. - - -Compile the application with the -DH5_USE_NN_API application mapping option if it was written for use with an earlier HDF5 library. Because the application mapping overrides the library mapping, the macros will all be mapped to the earlier versions of the functions. - - -Remap one API compatibility macro at a time (or sets of macros), to use the current HDF5 versions. At each stage, use the function mappings to map the macros being worked on to the current versions. For example, use the -DH5Rdereference_vers=2 version flag setting to remap the H5Rdereference macro to H5Rdereference2, the 1.10.x version. -During this step, the application code will need to be modified to change the calling parameters used with the API compatibility macros to match the number and type of the 1.10.x versioned functions. The macro name, for example H5Rdereference, should continue to be used in the code, to allow for possible re-mappings to later versioned functions in a future release. - - - After all macros have been migrated to the latest versioned functions in step 3, compile the application without any application or function mappings. 
This build uses the library mappings set in step 1, and maps API compatibility macros to the latest versions. - - -Finally, compile the application with the application mapping -DH5_NO_DEPRECATED_SYMBOLS, and address any failures to complete the application migration process. diff --git a/documentation/hdf5-docs/release_specifics/new_features_1_14.md b/documentation/hdf5-docs/release_specifics/new_features_1_14.md index 25ef387a..52910444 100644 --- a/documentation/hdf5-docs/release_specifics/new_features_1_14.md +++ b/documentation/hdf5-docs/release_specifics/new_features_1_14.md @@ -6,7 +6,15 @@ redirect_from: # New Features in HDF5 1.14 -HDF5 Release 1.14.0 is the final released version of all the features that were released in 1.13.0-1.13.3. Thus, the new features in the HDF4 1.14 release include: +The new features in the HDF5 1.14 series include: + +* [16 bit floating point and Complex number datatypes](https://github.com/HDFGroup/hdf5doc/blob/master/RFCs/HDF5_Library/Float16/RFC__Adding_support_for_16_bit_floating_point_and_Complex_number_datatypes_to_HDF5.pdf) +Support for the 16-bit floating-point \_Float16 C type has been added to +HDF5. On platforms where this type is available, this can enable more +efficient storage of floating-point data when an application doesn't +need the precision of larger floating-point datatypes. It can also allow +for improved performance when converting between 16-bit floating-point +data and data of another HDF5 datatype. * [Asynchronous I/O operations](asyn_ops_wHDF5_VOL_connectors.md) HDF5 provides asynchronous APIs for the HDF5 VOL connectors that @@ -14,7 +22,7 @@ support asynchronous HDF5 operations using the HDF5 Event Set (H5ES) API. This allows I/O to proceed in the background while the application is performing other tasks. -* [Subfiling VFD](http://docs.hdfgroup.org/hdf5/rfc/RFC_VFD_subfiling_200424.pdf) +* [Subfiling VFD](https://docs.hdfgroup.org/hdf5/rfc/RFC_VFD_subfiling_200424.pdf) The basic idea behind sub-filing is to find the middle ground between single shared file and one file per process - thereby avoiding some of the complexity of one file per process, and minimizing the locking @@ -44,3 +52,5 @@ also including critical logging capabilities to capture outputs from applying the serial tools over large collections of HDF5 files. + +Note that HDF5 Release 1.14.0 is the final released version of all the features +that were released in 1.13.0-1.13.3.
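As a brief illustration of the \_Float16 support described above, the sketch below writes a small half-precision dataset. It assumes a 1.14 build on a platform where H5_HAVE__FLOAT16 is defined, which provides the H5T_IEEE_F16LE and H5T_NATIVE_FLOAT16 datatypes; the file and dataset names are made up:

~~~
#include "hdf5.h"

int main(void) {
#ifdef H5_HAVE__FLOAT16
    _Float16 data[4] = {1.0, 0.5, 0.25, 0.125};
    hsize_t  dims[1] = {4};

    hid_t file  = H5Fcreate("half.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, NULL);

    /* Store 16-bit IEEE floats on disk; convert from the native type in memory */
    hid_t dset = H5Dcreate2(file, "half_data", H5T_IEEE_F16LE, space,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_FLOAT16, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
#endif
    return 0;
}
~~~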
diff --git a/documentation/hdf5-docs/release_specifics/release_specific_info.md b/documentation/hdf5-docs/release_specifics/release_specific_info.md deleted file mode 100644 index 3cda8d66..00000000 --- a/documentation/hdf5-docs/release_specifics/release_specific_info.md +++ /dev/null @@ -1,29 +0,0 @@ ---- -title: Release Specific Information -redirect_from: - - display/HDF5/Release+Specific+Information ---- - -# Release Specific Information - -### [HDF5 1.14](hdf5_1_14.md) -* [New Features](new_features_1_14.md) -* [Software Changes from Release to Release](sw_changes_1.14.md) -* [Migrating to HDF5 1.14 from previous releases](Migrating_from_HDF5_1.12_to_HDF5_1.14.md) - -### [HDF5 1.12](hdf5_1_12.md) -* [New Features](documentation/hdf5-docs/release_specifics/new_features_1_12.md) -* [Software Changes from Release to Release](sw_changes_1.12.md) -* [Migrating from HDF5 1.10 to HDF5 1.12](Migrating_from_HDF5_1.10_to_HDF5_1.12.md) - -### [HDF5 1.10](hdf5_1_10.md) -* [New Features](documentation/hdf5-docs/release_specifics/new_features_1_10.md) -* [Why should I care about the HDF5-1.10.2 release? (blog)]() -* [Software Changes from Release to Release](sw_changes_1.10.md) -* [Migrating from HDF5 1.8 to HDF5 1.10](Migrating_from_HDF5_1.8_to_HDF5_1.10.md) - -### [HDF5 1.8](hdf5_1_8.md) -* New Features -* [Software Changes from Release to Release](sw_changes_1.8.md) - -### [API compatibility Macros in HDF5](documentation/hdf5-docs/release_specifics/api_comp_macros.md) diff --git a/documentation/hdf5-docs/release_specifics/sw_changes_1.10.md b/documentation/hdf5-docs/release_specifics/sw_changes_1.10.md index 9b2ef3d2..9841a80c 100644 --- a/documentation/hdf5-docs/release_specifics/sw_changes_1.10.md +++ b/documentation/hdf5-docs/release_specifics/sw_changes_1.10.md @@ -1,6 +1,6 @@ --- title: Software+Changes+from+Release+to+Release+for+HDF5+1.10 -redirect_from: +redirect from: - display/HDF5/Software+Changes+from+Release+to+Release+for+HDF5+1.10 --- @@ -17,20 +17,22 @@ Note that bug fixes and performance enhancements in the C library are automatica The following information is included below. -* [Compatiblity and Performance Issues](#compatiblity-and-performance-issues) -* [Release 1.10.9 versus 1.10.8](#release-1.10.9-versus-1.10.8) -* [Release 1.10.8 versus 1.10.7](#release-1.10.8-versus-1.10.7) -* [Release 1.10.7 versus 1.10.6](#release-1.10.7-versus-1.10.6) -* [Release 1.10.6 versus 1.10.5](#release-1.10.6-versus-1.10.5) -* [Release 1.10.5 versus 1.10.4, 1.10.3, and 1.10.2](#release-1.10.5-versus-1.10.4) -* [Release 1.10.4 versus Release 1.10.3](#release-1.10.4-versus-1.10.3) -* [Release 1.10.3 versus Release 1.10.2](#release-1.10.3-versus-1.10.2) -* [Release 1.10.2 versus Release 1.10.1](#release-1.10.2-versus-1.10.1) -* [Release 1.10.1 versus Release 1.10.0 (and 1.10.0-patch1)](#release-1.10.1-versus-1.10.0) -* [Release 1.10.0 of March 2016 versus Release 1.8.16](#release-1.10.0-versus-1.8.16) +* Compatiblity and Performance Issues +* Release 1.10.9 versus 1.10.8] +* Release 1.10.8 versus 1.10.7 +* Release 1.10.7 versus 1.10.6 +* Release 1.10.6 versus 1.10.5 +* Release 1.10.5 versus 1.10.4, 1.10.3, and 1.10.2 +* Release 1.10.4 versus Release 1.10.3 +* Release 1.10.3 versus Release 1.10.2 +* Release 1.10.2 versus Release 1.10.1 +* Release 1.10.1 versus Release 1.10.0 (and 1.10.0-patch1) +* Release 1.10.0 of March 2016 versus Release 1.8.16 +See [API Compatibility Reports for 1.10]() for information regarding compatibility with previous releases. + +

    Compatibility and Performance Issues

    -## Compatibility and Performance Issues Not all HDF5-1.10 releases are compatible. Users should NOT be using 1.10 releases prior to HDF5-1.10.3. See the compatibility matrix below for details on compatibility between 1.10 releases: | Release | 1.10.5+ | 1.10.4 | 1.10.3 | 1.10.2 | 1.10.1 | 1.10.0-patch1 | 1.10.0 | @@ -44,119 +46,125 @@ The following images show how performance has changed from release to release. [cgns, HDF5 versions](images/cgns.png) +[writeLgNumDsets](images/writeLgNumDsets.png) -The release notes also list changes made to the library, but these notes tend to be more at a more detail-oriented level. The release notes may include new features, bugs fixed, supported configuration features, platforms on which the library has been tested, and known problems. The release note files are listed below and can be found at the top level of the HDF5 source code tree in the release_docs directory. +The release notes also list changes made to the library, but these notes tend to be more at a more detail-oriented level. The release notes may include new features, bugs fixed, supported configuration features, platforms on which the library has been tested, and known problems. The release note files are listed below and can be found at the top level of the HDF5 source code tree in the release\_docs directory. -| | | -| ------------------------ | ------------------------------------------------------------ | -| Release Notes | Technical notes regarding the current release of the HDF5 library (RELEASE.txt in the source code) | -| HISTORY-1_10.txt | Release information for all HDF5-1.10 releases | -| HISTORY-1_8_0-1_10_0.txt | Development history between the HDF5-1.8.0 and HDF5-1.10.0 releases | -| HISTORY-1_8.txt | Release information for HDF5-1.8.0 through HDF5-1.8.21 releases | -| HISTORY-1_0-1_8_0_rc3.txt | Technical notes starting with HDF5-1.0.0 and ending with HDF5-1.8.0-rc3 (the state of the code prior to the HDF5-1.8.0 release) | +| | | +| ----------------------------- | ------------------------------------------------------------ | +| **Release Notes** | Technical notes regarding the current release of the HDF5 library (RELEASE.txt in the source code) | +| **HISTORY-1_10.txt** | Release information for all HDF5-1.10 releases | +| **HISTORY-1_8_0-1_10_0.txt** | Development history between the HDF5-1.8.0 and HDF5-1.10.0 releases | +| **HISTORY-1_8.txt** | Release information for HDF5-1.8.0 through HDF5-1.8.21 releases | +| **HISTORY-1_0-1_8_0_rc3.txt** | Technical notes starting with HDF5-1.0.0 and ending with HDF5-1.8.0-rc3 (the state of the code prior to the HDF5-1.8.0 release) | +

    Release 1.10.9 versus 1.10.8

    -## Release 1.10.9 versus 1.10.8 ### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros -In the Java Wrapper +In the Java API One Java wrapper was added: -H5.H5export_dataset +H5.H5export\_dataset ### Compatibility Notes and Reports -See the API compatibility report for the HDF5 library between 1.10.8 and 1.10.9 for information regarding compatibility with the previous release. The API Compatibility Report page includes all 1.10 compatibility reports. +See the [API compatibility report between 1.10.8 and 1.10.9]() for information regarding compatibility with the previous release. -## Release 1.10.8 versus 1.10.7 -New and Changed Functions, Classes, Subroutines, Wrappers, and Macros -In the C++ Wrapper +

    Release 1.10.8 versus 1.10.7

    + +### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros + +#### In the C++ API One C++ wrapper was added: DataSet::operator= -Compatibility Notes and Reports -See the API compatibility report for the HDF5 library between 1.10.7 and 1.10.8 for information regarding compatibility with the previous release. The API Compatibility Report page includes all 1.10 compatibility reports. +### Compatibility Notes and Reports +See the [API compatibility report between 1.10.7 and 1.10.8]() for information regarding compatibility with the previous release. -## Release 1.10.7 versus 1.10.6 -New and Changed Functions, Classes, Subroutines, Wrappers, and Macros -In the C Interface (main library) +

    Release 1.10.7 versus 1.10.6

    -The following are new C functions in this release: +### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros +#### In the C Interface (main library) -H5P_GET_FAPL_SPLITTER -Retrieves information for a splitter file access property list -H5P_SET_FAPL_SPLITTER Sets the file access property list to use the splitter driver -H5P_GET_FILE_LOCKING Gets the file locking property values -H5P_SET_FILE_LOCKING Sets the file locking property values -H5_GET_ALLOC_STATS -Gets the memory allocation statistics for the library -H5_GET_FREE_LIST_SIZES Gets the current size of the free lists used to manage memory -H5S_COMBINE_HYPERSLAB Performs an operation on a hyperslab and an existing selection and returns the resulting selection -H5S_COMBINE_SELECT Combines two hyperslab selections with an operation, returning a dataspace with the resulting selection -H5S_MODIFY_SELECT Refines a hyperslab selection with an operation using a second hyperslab to modify it -H5S_SELECT_ADJUST Adjusts a selection by subtracting an offset -H5S_SELECT_COPY Copies a selection from one dataspace to another -H5S_SELECT_INTERSECT_BLOCK Checks if current selection intersects with a block -H5S_SELECT_PROJECT_INTERSECTION Projects the intersection of two source selections to a destination selection -H5S_SELECT_SHAPE_SAME -Checks if two selections are the same shape +The following are new C functions in this release: +| Function | Description | +| ----------------------------- | ------------------------------------------------------------ | +| H5P\_GET\_FAPL\_SPLITTER | Retrieves information for a splitter file access property list | +| H5P\_SET\_FAPL\_SPLITTER | Sets the file access property list to use the splitter driver | +| H5P\_GET\_FILE\_LOCKING | Gets the file locking property values | +| H5P\_SET\_FILE\_LOCKING | Sets the file locking property values | +| H5\_GET\_ALLOC\_STATS | Gets the memory allocation statistics for the library | +| H5\_GET\_FREE\_LIST\_SIZES | Gets the current size of the free lists used to manage memory | +| H5S\_COMBINE\_HYPERSLAB | Performs an operation on a hyperslab and an existing selection and returns the resulting selection | +| H5S\_COMBINE\_SELECT | Combines two hyperslab selections with an operation, returning a dataspace with the resulting selection | +| H5S\_MODIFY\_SELECT | Refines a hyperslab selection with an operation using a second hyperslab to modify it | +| H5S\_SELECT\_ADJUST | Adjusts a selection by subtracting an offset | +| H5S\_SELECT\_COPY | Copies a selection from one dataspace to another | +| H5S\_SELECT\_INTERSECT\_BLOCK | Checks if current selection intersects with a block | +| H5S\_SELECT\_PROJECT\_INTERSECTION | Projects the intersection of two source selections to a destination selection | +| H5S\_SELECT\_SHAPE\_SAME | Checks if two selections are the same shape | + +#### In the C++ API +The following C++ wrappers were added: -In the C++ Wrapper +FileAccPropList::getFileLocking See H5P\_GET\_FILE\_LOCKING for details +FileAccPropList::setFileLocking See H5P\_SET\_FILE\_LOCKING for details -The following C++ wrappers were added: +### Compatibility Notes and Reports +See the [API compatibility report between 1.10.7 and 1.10.8]() for information regarding compatibility with the previous release. 
-FileAccPropList::getFileLocking See H5P_GET_FILE_LOCKING for details -FileAccPropList::setFileLocking See H5P_SET_FILE_LOCKING for details -Compatibility Notes and Reports -See the API compatibility report for the HDF5 library between 1.10.6 and 1.10.7 for information regarding compatibility with the previous release. The API Compatibility Report page includes all 1.10 compatibility reports. +
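As an illustration of the new 1.10.7 file locking properties listed above (H5Pset_file_locking / H5Pget_file_locking), here is a minimal sketch; the file name is invented for the example and error checking is omitted.

```c
#include "hdf5.h"

int main(void)
{
    hbool_t use_locking = 1, ignore_disabled = 0;

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Disable file locking for this access, and ignore locking failures
       when the file system has locking disabled. */
    H5Pset_file_locking(fapl, 0, 1);

    /* Read the settings back; they take effect at file open/create time. */
    H5Pget_file_locking(fapl, &use_locking, &ignore_disabled);

    hid_t file = H5Fcreate("locking_example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}
```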

    Release 1.10.6 versus 1.10.5

    -## Release 1.10.6 versus 1.10.5 -New and Changed Functions, Classes, Subroutines, Wrappers, and Macros -In the C Interface (main library) +### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros +#### In the C Interface (main library) The following are new C functions in this release: -H5P_GET_FAPL_HDFS Gets the information of the given Read-Only HDFS virtual file driver -H5P_GET_FAPL_ROS3 Gets the information of the given Read-Only S3 virtual file driver -H5P_SET_FAPL_HDFS Sets up Read-Only HDFS virtual file driver -H5P_SET_FAPL_ROS3 Sets up Read-Only S3 virtual file driver +| Function | Description | +| --------------------- | ------------------------------------------------------------ | +| H5P\_GET\_FAPL\_HDFS | Gets the information of the given Read-Only HDFS virtual file driver | +| H5P\_GET\_FAPL\_ROS3 | Gets the information of the given Read-Only S3 virtual file driver | +| H5P\_SET\_FAPL\_HDFS | Sets up Read-Only HDFS virtual file driver | +| H5P\_SET\_FAPL\_ROS3 | Sets up Read-Only S3 virtual file driver | +#### In the C++ API + +The following C++ wrappers were added: -In the C++ Wrapper +LinkCreatPropList::getCreateIntermediateGroup() const +See H5P\_GET\_CREATE\_INTERMEDIATE\_GROUP +LinkCreatPropList::setCreateIntermediateGroup(bool crt\_intmd\_group) const +See H5P\_SET\_CREATE\_INTERMEDIATE\_GROUP -The following C++ wrapper was added: +### Compatibility Notes and Reports +See the [API compatibility report between 1.10.7 and 1.10.8]() for information regarding compatibility with the previous release. -LinkCreatPropList::getCreateIntermediateGroup ( ) const -See H5P_GET_CREATE_INTERMEDIATE_GROUP -LinkCreatPropList::setCreateIntermediateGroup ( bool crt_intmd_group ) const -See H5P_SET_CREATE_INTERMEDIATE_GROUP -Compatibility Notes and Reports -See the API Compatibility Report for information regarding compatibility with previous releases. +

    Release 1.10.5 versus 1.10.4, 1.10.3, and 1.10.2

    -## Release 1.10.5 versus 1.10.4, 1.10.3, and 1.10.2 -New and Changed Functions, Classes, Subroutines, Wrappers, and Macros -In the C Interface (main library) +### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros +#### In the C Interface (main library) The following are new C functions in this release: -H5D_GET_CHUNK_INFO Retrieves information about a chunk specified by the chunk index -H5D_GET_CHUNK_INFO_BY_COORD Retrieves information about a chunk specified by its coordinates -H5D_GET_NUM_CHUNKS Retrieves number of chunks that have nonempty intersection with a specified selection -H5F_GET_DSET_NO_ATTRS_HINT +H5D\_GET\_CHUNK\_INFO Retrieves information about a chunk specified by the chunk index +H5D\_GET\_CHUNK\_INFO\_BY\_COORD Retrieves information about a chunk specified by its coordinates +H5D\_GET\_NUM\_CHUNKS Retrieves number of chunks that have nonempty intersection with a specified selection +H5F\_GET\_DSET\_NO\_ATTRS\_HINT Retrieves the setting for determining whether the specified file does or does not create minimized dataset object headers -H5F_SET_DSET_NO_ATTRS_HINT +H5F\_SET\_DSET\_NO\_ATTRS\_HINT Sets the flag to create minimized dataset object headers -H5P_GET_DSET_NO_ATTRS_HINT +H5P\_GET\_DSET\_NO\_ATTRS\_HINT Retrieves the setting for determining whether the specified DCPL does or does not create minimized dataset object headers -H5P_SET_DSET_NO_ATTRS_HINT +H5P\_SET\_DSET\_NO\_ATTRS\_HINT Sets the flag to create minimized dataset object headers @@ -164,59 +172,59 @@ Sets the flag to create minimized dataset object headers The following changed in this release: -H5O_GET_INFO, H5O_GET_INFO_BY_NAME, H5O_GET_INFO_BY_IDX, H5O_VISIT, H5O_VISIT_BY_NAME +H5O\_GET\_INFO, H5O\_GET\_INFO\_BY\_NAME, H5O\_GET\_INFO\_BY\_IDX, H5O\_VISIT, H5O\_VISIT\_BY\_NAME -In 1.10.3 the original functions were versioned to H5Oget_info*1 and H5Ovisit*1 and the macros H5Oget_info* and H5Ovisit* were created. This broke the API compatibility for a maintenance release. In HDF5-1.10.5, the macros introduced in HDF5-1.10.3 were removed. The H5Oget_info*1 and H5Ovisit*1 APIs were copied to H5Oget_Info* and H5Ovisit*. As an example, H5Oget_info and H5Oget_info1 are identical in this release. +In 1.10.3 the original functions were versioned to H5Oget\_info*1 and H5Ovisit*1 and the macros H5Oget\_info* and H5Ovisit* were created. This broke the API compatibility for a maintenance release. In HDF5-1.10.5, the macros introduced in HDF5-1.10.3 were removed. The H5Oget\_info*1 and H5Ovisit*1 APIs were copied to H5Oget\_Info* and H5Ovisit*. As an example, H5Oget\_info and H5Oget\_info1 are identical in this release. -In the C++ Wrapper +In the C++ API The following C++ wrapper was added: H5Object::visit() -Wrapper for the C API H5O_VISIT2. Recursively visit elements reachable from an HDF5 object and perform a common set of operations across all of those elements. See H5O_VISIT2 for more information on this function. +Wrapper for the C API H5O\_VISIT2. Recursively visit elements reachable from an HDF5 object and perform a common set of operations across all of those elements. See H5O\_VISIT2 for more information on this function. 
-In the Fortran Wrapper +In the Fortran API The following Fortran wrappers were added or changed: -h5fget_dset_no_attrs_hint_f +h5fget\_dset\_no\_attrs\_hint\_f -h5fset_dset_no_attrs_hint_f +h5fset\_dset\_no\_attrs\_hint\_f -h5pget_dset_no_attrs_hint_f +h5pget\_dset\_no\_attrs\_hint\_f -h5pset_dset_no_attrs_hint_f +h5pset\_dset\_no\_attrs\_hint\_f -Wrappers for the dataset object header minimization calls. See H5F_GET_DSET_NO_ATTRS_HINT, H5F_SET_DSET_NO_ATTRS_HINT, H5P_GET_DSET_NO_ATTRS_HINT, and H5P_SET_DSET_NO_ATTRS_HINT. -h5ovisit_f +Wrappers for the dataset object header minimization calls. See H5F\_GET\_DSET\_NO\_ATTRS\_HINT, H5F\_SET\_DSET\_NO\_ATTRS\_HINT, H5P\_GET\_DSET\_NO\_ATTRS\_HINT, and H5P\_SET\_DSET\_NO\_ATTRS\_HINT. +h5ovisit\_f -h5oget_info_by_name_f +h5oget\_info\_by\_name\_f -h5oget_info +h5oget\_info -h5oget_info_by_idx +h5oget\_info\_by\_idx -h5ovisit_by_name_f +h5ovisit\_by\_name\_f -Added new Fortran 'fields' optional parameter. See H5O_VISIT2, H5O_GET_INFO_BY_NAME2, H5O_GET_INFO2, H5O_GET_INFO_BY_IDX2, and H5O_VISIT_BY_NAME2. +Added new Fortran 'fields' optional parameter. See H5O\_VISIT2, H5O\_GET\_INFO\_BY\_NAME2, H5O\_GET\_INFO2, H5O\_GET\_INFO\_BY\_IDX2, and H5O\_VISIT\_BY\_NAME2. The following Fortran utility function was added: -h5gmtime converts (C) 'time_t' structure to Fortran DATE AND TIME storage format +h5gmtime converts (C) 'time\_t' structure to Fortran DATE AND TIME storage format A new Fortran derived type was added: -c_h5o_info_t -This is interoperable with C's h5o_info_t. This is needed for callback functions which pass C's h5o_info_t data type definition. +c\_h5o\_info\_t +This is interoperable with C's h5o\_info\_t. This is needed for callback functions which pass C's h5o\_info\_t data type definition. -See the Fortran signature for H5O_GET_INFO2. +See the Fortran signature for H5O\_GET\_INFO2. @@ -224,87 +232,87 @@ In the Java wrapper The following Java wrappers were added or changed: -H5Fset_libver_bounds See the C API H5F_SET_LIBVER_BOUNDS for information on this function -H5Fget_dset_no_attrs_hint +H5Fset\_libver\_bounds See the C API H5F\_SET\_LIBVER\_BOUNDS for information on this function +H5Fget\_dset\_no\_attrs\_hint -H5Fset_dset_no_attrs_hint +H5Fset\_dset\_no\_attrs\_hint -H5Pget_dset_no_attrs_hint +H5Pget\_dset\_no\_attrs\_hint -H5Pset_dset_no_attrs_hint +H5Pset\_dset\_no\_attrs\_hint -Wrappers for the dataset object header minimization calls See H5F_GET_DSET_NO_ATTRS_HINT, H5F_SET_DSET_NO_ATTRS_HINT, H5P_GET_DSET_NO_ATTRS_HINT, and H5P_SET_DSET_NO_ATTRS_HINT for more information on these APIs. -Compatibility Notes and Reports -See these API Compatibility Reports for 1.10 for information regarding compatibility with previous releases. Reports are available comparing HDF5-1.10.5 vs 1.10.2, HDF5-1.10.5 vs 1.10.3, and HDF5-1.10.5 vs 1.10.4. +Wrappers for the dataset object header minimization calls See H5F\_GET\_DSET\_NO\_ATTRS\_HINT, H5F\_SET\_DSET\_NO\_ATTRS\_HINT, H5P\_GET\_DSET\_NO\_ATTRS\_HINT, and H5P\_SET\_DSET\_NO\_ATTRS\_HINT for more information on these APIs. - +### Compatibility Notes and Reports +See the [API compatibility report between 1.10.5 and 1.10.2/1.10.3/1.10.4]() for details. -## Release 1.10.4 versus Release 1.10.3 -See the API compatibility Report for information regarding compatibility with previous releases +
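The chunk query functions added in 1.10.5 (H5Dget_num_chunks and H5Dget_chunk_info) can be combined to walk the chunks of an existing dataset. A minimal sketch, assuming a chunked dataset /data in a file chunked.h5 (both names invented for the example) and omitting error checks:

```c
#include <stdio.h>
#include "hdf5.h"

int main(void)
{
    hid_t   file  = H5Fopen("chunked.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    hid_t   dset  = H5Dopen2(file, "/data", H5P_DEFAULT);
    hid_t   space = H5Dget_space(dset);
    hsize_t nchunks = 0;

    /* Number of chunks intersecting the given selection (here: all). */
    H5Dget_num_chunks(dset, space, &nchunks);

    for (hsize_t i = 0; i < nchunks; i++) {
        hsize_t  offset[H5S_MAX_RANK];
        unsigned filter_mask;
        haddr_t  addr;
        hsize_t  size;

        /* Logical offset, applied filters, file address, and size of chunk i. */
        H5Dget_chunk_info(dset, space, i, offset, &filter_mask, &addr, &size);
        printf("chunk %llu: %llu bytes at address %llu\n",
               (unsigned long long)i, (unsigned long long)size,
               (unsigned long long)addr);
    }

    H5Sclose(space);
    H5Dclose(dset);
    H5Fclose(file);
    return 0;
}
```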

    Release 1.10.4 versus 1.10.3

    - +See the [API compatibility report between 1.10.4 and 1.10.3]() for details. + +

    Release 1.10.3 versus 1.10.2

    -## Release 1.10.3 versus Release 1.10.2 New and Changed Functions, Classes, Subroutines, Wrappers, and Macros In the C Interface (main library) The following are new C functions in this release: -H5D_READ_CHUNK Moved from HDF5 High Level Optimizations library to core library -H5D_WRITE_CHUNK Moved from HDF5 High Level Optimizations library to core library -H5O_GET_INFO +H5D\_READ\_CHUNK Moved from HDF5 High Level Optimizations library to core library +H5D\_WRITE\_CHUNK Moved from HDF5 High Level Optimizations library to core library +H5O\_GET\_INFO -H5O_GET_INFO1 +H5O\_GET\_INFO1 -H5O_GET_INFO2 +H5O\_GET\_INFO2 -The function H5O_GET_INFO was moved to H5O_GET_INFO1, and the macro H5O_GET_INFO was created that can be mapped to either H5O_GET_INFO1 or H5O_GET_INFO2. For HDF5-1.10 and earlier releases, H5O_GET_INFO is mapped to H5O_GET_INFO1 by default. -H5O_GET_INFO_BY_IDX +The function H5O\_GET\_INFO was moved to H5O\_GET\_INFO1, and the macro H5O\_GET\_INFO was created that can be mapped to either H5O\_GET\_INFO1 or H5O\_GET\_INFO2. For HDF5-1.10 and earlier releases, H5O\_GET\_INFO is mapped to H5O\_GET\_INFO1 by default. +H5O\_GET\_INFO\_BY\_IDX -H5O_GET_INFO_BY_IDX1 +H5O\_GET\_INFO\_BY\_IDX1 -H5O_GET_INFO_BY_IDX2 +H5O\_GET\_INFO\_BY\_IDX2 -The function H5O_GET_INFO_BY_IDX was moved to H5O_GET_INFO_BY_IDX1, and the macro H5O_GET_INFO_BY_IDX was created that can be mapped to either H5O_GET_INFO_BY_IDX1 or H5O_GET_INFO_BY_IDX2. For HDF5-1.10 and earlier releases, H5O_GET_INFO_BY_IDX is mapped to H5O_GET_INFO_BY_IDX1 by default. -H5O_GET_INFO_BY_NAME +The function H5O\_GET\_INFO\_BY\_IDX was moved to H5O\_GET\_INFO\_BY\_IDX1, and the macro H5O\_GET\_INFO\_BY\_IDX was created that can be mapped to either H5O\_GET\_INFO\_BY\_IDX1 or H5O\_GET\_INFO\_BY\_IDX2. For HDF5-1.10 and earlier releases, H5O\_GET\_INFO\_BY\_IDX is mapped to H5O\_GET\_INFO\_BY\_IDX1 by default. +H5O\_GET\_INFO\_BY\_NAME -H5O_GET_INFO_BY_NAME1 +H5O\_GET\_INFO\_BY\_NAME1 -H5O_GET_INFO_BY_NAME2 +H5O\_GET\_INFO\_BY\_NAME2 -The function H5O_GET_INFO_BY_NAME was moved to H5O_GET_INFO_BY_NAME1, and the macro H5O_GET_INFO_BY_NAME was created that can be mapped to either H5O_GET_INFO_BY_NAME1 or H5O_GET_INFO_BY_NAME2. For HDF5-1.10 and earlier releases, H5O_GET_INFO_BY_NAME is mapped to H5O_GET_INFO_BY_NAME1 by default. -H5O_VISIT +The function H5O\_GET\_INFO\_BY\_NAME was moved to H5O\_GET\_INFO\_BY\_NAME1, and the macro H5O\_GET\_INFO\_BY\_NAME was created that can be mapped to either H5O\_GET\_INFO\_BY\_NAME1 or H5O\_GET\_INFO\_BY\_NAME2. For HDF5-1.10 and earlier releases, H5O\_GET\_INFO\_BY\_NAME is mapped to H5O\_GET\_INFO\_BY\_NAME1 by default. +H5O\_VISIT -H5O_VISIT1 +H5O\_VISIT1 -H5O_VISIT2 +H5O\_VISIT2 -The function H5O_VISIT was moved to H5O_VISIT1, and the macro H5O_VISIT was created that can be mapped to either H5O_VISIT1 or H5O_VISIT2. For HDF5-1.10 and earlier releases, H5O_VISIT is mapped to H5O_VISIT1 by default. -H5O_VISIT_BY_NAME +The function H5O\_VISIT was moved to H5O\_VISIT1, and the macro H5O\_VISIT was created that can be mapped to either H5O\_VISIT1 or H5O\_VISIT2. For HDF5-1.10 and earlier releases, H5O\_VISIT is mapped to H5O\_VISIT1 by default. +H5O\_VISIT\_BY\_NAME -H5O_VISIT_BY_NAME1 +H5O\_VISIT\_BY\_NAME1 -H5O_VISIT_BY_NAME2 +H5O\_VISIT\_BY\_NAME2 -The function H5O_VISIT_BY_NAME was moved to H5O_VISIT_BY_NAME1, and the macro H5O_VISIT_BY_NAME was created that can be mapped to either H5O_VISIT_BY_NAME1 or H5O_VISIT_BY_NAME2. 
For HDF5-1.10 and earlier releases, H5O_VISIT_BY_NAME is mapped to H5O_VISIT_BY_NAME1 by default. +The function H5O\_VISIT\_BY\_NAME was moved to H5O\_VISIT\_BY\_NAME1, and the macro H5O\_VISIT\_BY\_NAME was created that can be mapped to either H5O\_VISIT\_BY\_NAME1 or H5O\_VISIT\_BY\_NAME2. For HDF5-1.10 and earlier releases, H5O\_VISIT\_BY\_NAME is mapped to H5O\_VISIT\_BY\_NAME1 by default. In the C High Level Interface The following C functions were deprecated in this release: -H5DO_READ_CHUNK Deprecated, moved to H5D_READ_CHUNK -H5DO_WRITE_CHUNK Deprecated, moved to H5D_WRITE_CHUNK +H5DO\_READ\_CHUNK Deprecated, moved to H5D\_READ\_CHUNK +H5DO\_WRITE\_CHUNK Deprecated, moved to H5D\_WRITE\_CHUNK -In the C++ Wrapper +In the C++ API Several C++ wrappers were added or modified to provide additional support. See the API Compatibility Report for details. -Compatibility Notes and Report -See the API Compatibility Report for information regarding compatibility with previous releases. +### Compatibility Notes and Report +See the [API compatibility report between 1.10.4 and 1.10.3]() for details. + +
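To show what the new 'fields' argument of the versioned H5Oget_info* routines looks like in application code, here is a small sketch written against the 1.10.x headers; the file and object names are placeholders, and only the basic fields are requested.

```c
#include <stdio.h>
#include "hdf5.h"

int main(void)
{
    hid_t      file = H5Fopen("example.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    H5O_info_t oinfo;

    /* The 'fields' bitmask (here H5O_INFO_BASIC) is what distinguishes the
       version-2 routines from the original, version-1 routines. */
    H5Oget_info_by_name2(file, "/group1", &oinfo, H5O_INFO_BASIC, H5P_DEFAULT);

    printf("object type = %d, reference count = %u\n",
           (int)oinfo.type, oinfo.rc);

    H5Fclose(file);
    return 0;
}
```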

    Release 1.10.2 versus 1.10.1

    -## Release 1.10.2 versus Release 1.10.1 This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.10.1 to Release 1.10.2. New and Changed Functions, Classes, Subroutines, Wrappers, and Macros @@ -312,40 +320,40 @@ In the C Interface (main library) The following are new C functions in this release: -H5D_GET_CHUNK_STORAGE_SIZE Returns storage amount allocated within a file for a raw data chunk in a dataset -H5F_GET_EOA Retrieves the file's EOA -H5F_INCREMENT_FILESIZE +H5D\_GET\_CHUNK\_STORAGE\_SIZE Returns storage amount allocated within a file for a raw data chunk in a dataset +H5F\_GET\_EOA Retrieves the file's EOA +H5F\_INCREMENT\_FILESIZE Sets the file's EOA to the maximum of (EOA, EOF) + increment -H5F_SET_LIBVER_BOUNDS Enables the switch of version bounds setting for a file -H5FDdriver_query Queries a VFL driver for its feature flags when a file is not available (not documented in Reference Manual) -H5P_GET_VIRTUAL_PREFIX Retrieves prefix applied to VDS source file paths -H5P_SET_VIRTUAL_PREFIX Sets prefix to be applied to VDS source file paths +H5F\_SET\_LIBVER\_BOUNDS Enables the switch of version bounds setting for a file +H5FDdriver\_query Queries a VFL driver for its feature flags when a file is not available (not documented in Reference Manual) +H5P\_GET\_VIRTUAL\_PREFIX Retrieves prefix applied to VDS source file paths +H5P\_SET\_VIRTUAL\_PREFIX Sets prefix to be applied to VDS source file paths The following C functions changed in this release: -H5P_SET_LIBVER_BOUNDS HDF5-1.10 was added to the range of versions -H5P_SET_VIRTUAL A change was made to the method of searching for VDS source files -H5PL* The parameters for many of the H5PL APIs were renamed +H5P\_SET\_LIBVER\_BOUNDS HDF5-1.10 was added to the range of versions +H5P\_SET\_VIRTUAL A change was made to the method of searching for VDS source files +H5PL\* The parameters for many of the H5PL APIs were renamed In the C High Level Interface The following new C function was added to this release: -H5DO_READ_CHUNK Reads a raw data chunk directly from a dataset in a file +H5DO\_READ\_CHUNK Reads a raw data chunk directly from a dataset in a file -In the C++ Wrapper +In the C++ API The following C++ wrappers were added: -H5Lcreate_soft -Creates a soft link from link_name to target_name +H5Lcreate\_soft +Creates a soft link from link\_name to target\_name -H5Lcreate_hard -Creates a hard link from new_name to curr_name +H5Lcreate\_hard +Creates a hard link from new\_name to curr\_name H5Lcopy Copy an object from a group of file @@ -362,7 +370,7 @@ Creates a binary object description of this datatype H5Tdecode Returns the decoded type from the binary object description -H5Lget_info +H5Lget\_info Returns the information of the named link @@ -377,29 +385,29 @@ See the API Compatibility report for complete details. 
-In the Java Wrapper +In the Java API The following Java wrappers were added: -H5Pset_evict_on_close +H5Pset\_evict\_on\_close Controls the library's behavior of evicting metadata associated with a closed object -H5Pget_evict_on_close +H5Pget\_evict\_on\_close Retrieves the file access property list setting that determines whether an HDF5 object will be evicted from the library's metadata cache when closed -H5Pset_chunk_opts +H5Pset\_chunk\_opts Sets the edge chunk option in a dataset creation property list -H5Pget_chunk_opts +H5Pget\_chunk\_opts Retrieves the edge chunk option setting from a dataset creation property list -H5Pset_efile_prefix +H5Pset\_efile\_prefix Sets the external dataset storage file prefix in the dataset access property list -H5Pget_efile_prefix +H5Pget\_efile\_prefix Retrieves the prefix for external raw data storage files as set in the dataset access property list -H5Pset_virtual_prefix +H5Pset\_virtual\_prefix Sets prefix to be applied to VDS source file paths -H5Pget_virtual_prefix +H5Pget\_virtual\_prefix Retrieves prefix applied to VDS source file paths See the Release.txt file for details. @@ -418,10 +426,13 @@ C is >= 0; C is optional and will default to 1M when not set A new option was added to h5diff: --enable-error-stack Enable the error stack -Compatibility Notes and Report -See API Compatibility Reports for 1.10 for information regarding compatibility with previous releases. -## Release 1.10.1 versus Release 1.10.0 (and 1.10.0-patch1) +### Compatibility Notes and Report + +See the [API compatibility report between 1.10.4 and 1.10.3]() for details. + +
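The change to H5Pset_libver_bounds noted above (HDF5-1.10 added to the range of versions) is used as shown in this minimal sketch; the file name is invented for the example.

```c
#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Restrict object creation to formats readable by HDF5 1.8.x through
       1.10.x; H5F_LIBVER_V110 is the bound added in 1.10.2. */
    H5Pset_libver_bounds(fapl, H5F_LIBVER_V18, H5F_LIBVER_V110);

    hid_t file = H5Fcreate("bounded.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}
```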

    Release 1.10.1 versus 1.10.0 (and 1.10.0-patch1)

    + This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.10.0 (and HDF5-1.10.0-patch1) to Release 1.10.1. New Features @@ -441,64 +452,64 @@ The following features are described and documented in New Features in HDF5 Rele Metadata Cache Image: -H5Pget_mdc_image_config +H5Pget\_mdc\_image\_config Retrieves the metadata cache image configuration values for a file access property list. -H5Pset_mdc_image_config +H5Pset\_mdc\_image\_config Sets the metadata cache image option for a file access property list. -H5Fget_mdc_image_info +H5Fget\_mdc\_image\_info Gets information about a metadata cache image if it exists. Metadata Cache Evict on Close: -H5Pget_evict_on_close +H5Pget\_evict\_on\_close Retrieves the property list setting that determines whether an HDF5 object will be evicted from the library's metadata cache when it is closed. -H5Pset_evict_on_close +H5Pset\_evict\_on\_close Controls the library's behavior of evicting metadata associated with a closed object. Paged Aggregation: -H5Pget_file_space_page_size +H5Pget\_file\_space\_page\_size Retrieves the file space page size for a file creation property list. -H5Pset_file_space_page_size +H5Pset\_file\_space\_page\_size Sets the file space page size (used with paged aggregation) for a file creation property list. -H5Pget_file_space_strategy +H5Pget\_file\_space\_strategy Retrieves the file space handling strategy for a file creation property list. -H5Pset_file_space_strategy +H5Pset\_file\_space\_strategy Sets the file space allocation strategy for a file creation property list. Page Buffering: -H5Pget_page_buffer_size +H5Pget\_page\_buffer\_size Retrieves the maximum size for the page buffer and the minimum percentage for metadata and raw data pages. -H5Pset_page_buffer_size +H5Pset\_page\_buffer\_size Sets the maximum size for the page buffer and the minimum percentage for metadata and raw data pages. -H5Fget_page_buffering_stats +H5Fget\_page\_buffering\_stats Retrieves statistics about page access when it is enabled. -H5Freset_page_buffering_stats +H5Freset\_page\_buffering\_stats Resets the page buffer statistics. @@ -522,49 +533,45 @@ H5PLreplace H5PLsize -In the C++ Wrapper +In the C++ API New member functions were added to provide const versions. For example, these methods, -ArrayType::getArrayDims ( hsize_t* dims ) const +ArrayType::getArrayDims ( hsize\_t\* dims ) const ArrayType::getArrayNDims ( ) const - replace these: -ArrayType::getArrayDims ( hsize_t* dims ) +ArrayType::getArrayDims(hsize\_t\* dims) -ArrayType::getArrayNDims ( ) +ArrayType::getArrayNDims() Several functions were moved to other classes. For example, this method, -H5Location::openDataSet ( char const* name ) const +H5Location::openDataSet (char const\* name) const replaces: -CommonFG::openDataSet ( char const* name ) const - - +CommonFG::openDataSet (char const\* name) const PLEASE review the Compatibility report below for complete information on the C++ changes in this release. -Compatibility Report -Compatibility report for Release 1.10.1 versus Release 1.10.0-patch1 +### Compatibility Report -See API Compatibility Reports for 1.10 for information regarding compatibility with previous releases. +See the [API compatibility report between 1.10.1 and 1.10.0-patch1]() for details. -## Release 1.10.0 of March 2016 versus Release 1.8.16 +
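A short sketch tying together the paged aggregation and page buffering calls listed above; the page size, buffer size, and percentages are arbitrary example values rather than recommendations, and the file name is invented.

```c
#include "hdf5.h"

int main(void)
{
    hid_t fcpl = H5Pcreate(H5P_FILE_CREATE);
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* File creation: use the paged-aggregation strategy with a 4 KiB page. */
    H5Pset_file_space_strategy(fcpl, H5F_FSPACE_STRATEGY_PAGE, 0, (hsize_t)1);
    H5Pset_file_space_page_size(fcpl, 4096);

    /* File access: enable a 1 MiB page buffer, reserving at least 20% of it
       for metadata pages and 20% for raw data pages. */
    H5Pset_page_buffer_size(fapl, 1024 * 1024, 20, 20);

    hid_t file = H5Fcreate("paged.h5", H5F_ACC_TRUNC, fcpl, fapl);

    H5Fclose(file);
    H5Pclose(fapl);
    H5Pclose(fcpl);
    return 0;
}
```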

    Release 1.10.0 versus Release 1.8.16

    This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.8.16 to Release 1.10.0. Changed Type -hid_t +hid\_t Changed from a 32-bit to a 64-bit value. -hid_t is the type is used for all HDF5 identifiers. This change, which is necessary to accomodate the capacities of modern computing systems, therefore affects all HDF5 applications. If an application has been using HDF5's hid_t the type, recompilation will normally be sufficient to take advantage of HDF5 Release 1.10.0. If an application uses an integer type instead of HDF5's hid_t type, those identifiers must be changed to a 64-bit type when the application is ported to the 1.10.x series. +hid\_t is the type is used for all HDF5 identifiers. This change, which is necessary to accomodate the capacities of modern computing systems, therefore affects all HDF5 applications. If an application has been using HDF5's hid\_t the type, recompilation will normally be sufficient to take advantage of HDF5 Release 1.10.0. If an application uses an integer type instead of HDF5's hid\_t type, those identifiers must be changed to a 64-bit type when the application is ported to the 1.10.x series. New Features and Feature Sets Several new features are introduced in HDF5 Release 1.10.0. @@ -597,7 +604,7 @@ The following features are described and documented in New Features in HDF5 Rele Single-writer / Multiple-reader, commonly called SWMR: -H5Fstart_swmr_write +H5Fstart\_swmr\_write Enables SWMR writing mode for a file. @@ -606,51 +613,51 @@ H5DOappend Appends data to a dataset along a specified dimension. (This is a high-level API.) -H5Pget_append_flush +H5Pget\_append\_flush Retrieves the values of the append property that is set up in the dataset access property list. -H5Pset_append_flush +H5Pset\_append\_flush Sets two actions to perform when the size of a dataset's dimension being appended reaches a specified boundary. -H5Pget_object_flush_cb +H5Pget\_object\_flush\_cb Retrieves the object flush property values from the file access property list. -H5Pset_object_flush_cb +H5Pset\_object\_flush\_cb Sets a callback function to invoke when an object flush occurs in the file. -H5Odisable_mdc_flushes +H5Odisable\_mdc\_flushes Prevents metadata entries for an HDF5 object from being flushed from the metadata cache to storage. -H5Oenable_mdc_flushes +H5Oenable\_mdc\_flushes Returns the cache entries associated with an HDF5 object to the default metadata flush and eviction algorithm. -H5Oare_mdc_flushes_disabled +H5Oare\_mdc\_flushes\_disabled Determines if an HDF5 object (dataset, group, committed datatype) has had flushes of metadata entries disabled. -H5Fdisable_mdc_flushes +H5Fdisable\_mdc\_flushes Globally prevents dirty metadata entries from being flushed from the metadata cache to storage. -H5Fenable_mdc_flushes +H5Fenable\_mdc\_flushes Returns a file's metadata cache to the standard eviction and flushing algorithm. -H5Fare_mdc_flushes_disabled +H5Fare\_mdc\_flushes\_disabled Determines if flushes have been globally disabled for a file's metadata cache. -H5Fget_mdc_flush_disabled_obj_ids +H5Fget\_mdc\_flush\_disabled\_obj\_ids @@ -663,30 +670,30 @@ h5watch Allows users to output new records appended to a dataset under SWMR access as it grows. The functionality is similar to the Unix user command tail with the follow option, which outputs appended data as the file grows. 
-h5format_convert +h5format\_convert This tool allows users to convert the indexing type of a chunked dataset made with a 1.10.x version of the HDF5 Library when the latest file format is used to the 1.8.x version 1 B-tree indexing type. For example, datasets created using SWMR access, can be converted to be accessed by the HDF5 1.18 library and tools. The tool does not rewrite raw data, but it does rewrite HDF5 metadata. Collective Metadata I/O: -H5Pset_coll_metadata_write -h5pset_coll_metadata_write_f +H5Pset\_coll\_metadata\_write +h5pset\_coll\_metadata\_write\_f Establishes I/O mode property setting, collective or independent, for metadata writes. -H5Pget_coll_metadata_write -h5pget_coll_metadata_write_f +H5Pget\_coll\_metadata\_write +h5pget\_coll\_metadata\_write\_f Retrieves I/O mode property setting for metadata writes. -H5Pset_all_coll_metadata_ops -h5pset_all_coll_metadata_ops_f +H5Pset\_all\_coll\_metadata\_ops +h5pset\_all\_coll\_metadata\_ops\_f Establishes I/O mode, collective or independent, for metadata read operations. -H5Pget_all_coll_metadata_ops -h5pget_all_coll_metadata_ops_f +H5Pget\_all\_coll\_metadata\_ops +h5pget\_all\_coll\_metadata\_ops\_f Retrieves I/O mode for metadata read operations. @@ -695,19 +702,19 @@ Retrieves I/O mode for metadata read operations. Fine-tuning the Metadata Cache: -H5Fget_metadata_read_retries_info +H5Fget\_metadata\_read\_retries\_info Retrieves the collection of read retries for metadata items with checksum. -H5Pget_metadata_read_attempts +H5Pget\_metadata\_read\_attempts Retrieves the number of read attempts from a file access property list. -H5Pset_metadata_read_attempts +H5Pset\_metadata\_read\_attempts @@ -745,30 +752,30 @@ H5Trefresh Causes all buffers associated with a committed datatype to be cleared and immediately re-loaded with updated contents from disk storage. -H5Fget_intent +H5Fget\_intent Determines the read/write or read-only status of a file. Logging APIs: -H5Pset_mdc_log_options +H5Pset\_mdc\_log\_options Sets metadata cache logging options. -H5Pget_mdc_log_options +H5Pget\_mdc\_log\_options Gets metadata cache logging options. -H5Fstart_mdc_logging +H5Fstart\_mdc\_logging Starts logging metadata cache events if logging was previously enabled. -H5Fstop_mdc_logging +H5Fstop\_mdc\_logging Stops logging metadata cache events if logging was previously enabled and is currently ongoing. -H5Pget_mdc_logging_status +H5Pget\_mdc\_logging\_status @@ -777,31 +784,31 @@ Gets the current metadata cache logging status. File Space Management: -H5Fget_free_sections +H5Fget\_free\_sections Retrieves free-space section information for a file. -H5Fget_freespace +H5Fget\_freespace Returns the amount of free space in a file. -H5Fget_info2 +H5Fget\_info2 Returns global information for a file. -H5Pset_file_space +H5Pset\_file\_space Sets the file space management strategy and/or the free-space section threshold for an HDF5 file. -H5Pget_file_space +H5Pget\_file\_space @@ -819,66 +826,66 @@ Repacks HDF5 files with various options, including the ability to change the app Virtual Dataset or VDS: -H5Pset_virtual -h5pset_virtual_f +H5Pset\_virtual +h5pset\_virtual\_f Sets the mapping between virtual and source datasets. -H5Pget_virtual_count -h5pget_virtual_count_f +H5Pget\_virtual\_count +h5pget\_virtual\_count\_f Retrieves the number of mappings for the virtual dataset. 
-H5Pget_virtual_vspace -h5pget_virtual_vspace_f +H5Pget\_virtual\_vspace +h5pget\_virtual\_vspace\_f Retrieves a dataspace identifier for the selection within the virtual dataset used in the mapping. -H5Pget_virtual_srcspace -h5pget_virtual_srcspace_f +H5Pget\_virtual\_srcspace +h5pget\_virtual\_srcspace\_f Retrieves a dataspace identifier for the selection within the source dataset used in the mapping. -H5Pget_virtual_dsetname -h5pget_virtual_dsetname_f +H5Pget\_virtual\_dsetname +h5pget\_virtual\_dsetname\_f Retrieves the name of a source dataset used in the mapping. -H5Pget_virtual_filename -h5pget_virtual_filename_f +H5Pget\_virtual\_filename +h5pget\_virtual\_filename\_f Retrieves the filename of a source dataset used in the mapping. -H5Pset_virtual_printf_gap -h5pset_virtual_printf_gap_f +H5Pset\_virtual\_printf\_gap +h5pset\_virtual\_printf\_gap\_f Sets maximum number of missing source files and/or datasets with printf-style names when getting the extent of an unlimited virtual dataset. -H5Pget_virtual_printf_gap -h5pget_virtual_printf_gap_f +H5Pget\_virtual\_printf\_gap +h5pget\_virtual\_printf\_gap\_f Returns maximum number of missing source files and/or datasets with printf-style names when getting the extent for an unlimited virtual dataset. -H5Pset_virtual_view -h5pset_virtual_view_f +H5Pset\_virtual\_view +h5pset\_virtual\_view\_f Sets the view of the virtual dataset to include or exclude missing mapped elements. -H5Pget_virtual_view -h5pget_virtual_view_f +H5Pget\_virtual\_view +h5pget\_virtual\_view\_f Retrieves the view of a virtual dataset. Supporting Functions: -H5Sis_regular_hyperslab -h5sis_regular_hyperslab_f +H5Sis\_regular\_hyperslab +h5sis\_regular\_hyperslab\_f Determines whether a hyperslab selection is regular. -H5Sget_regular_hyperslab -h5sget_regular_hyperslab_f +H5Sget\_regular\_hyperslab +h5sget\_regular\_hyperslab\_f Retrieves a regular hyperslab selection. @@ -886,38 +893,38 @@ Retrieves a regular hyperslab selection. Modified Functions: The following pre-exising functions have been modified to understand virtual datasets. -H5Pset_layout -h5pset_layout_f +H5Pset\_layout +h5pset\_layout\_f Specifies the layout to be used for a dataset. -Virtual dataset, H5D_VIRTUAL, has been added to the list of layouts available through this function. +Virtual dataset, H5D\_VIRTUAL, has been added to the list of layouts available through this function. -H5Pget_layout -h5pget_layout_f +H5Pget\_layout +h5pget\_layout\_f Retrieves the layout in use for a dataset. -Virtual dataset, H5D_VIRTUAL, has been added to the list of layouts. +Virtual dataset, H5D\_VIRTUAL, has been added to the list of layouts. Partial Edge Chunks: -H5Pset_chunk_opts +H5Pset\_chunk\_opts Sets a partial edge chunk option in a dataset access property list. -H5Pget_chunk_opts +H5Pget\_chunk\_opts Retrieves partial edge chunk option setting from a dataset access property list. Relative Pathnames for External Links: -H5Pset_elink_prefix +H5Pset\_elink\_prefix These functions enable the user to specify the locations of external files. (These functions are not yet documented.) -H5Pget_elink_prefix +H5Pget\_elink\_prefix Property List Encoding and Decoding: @@ -952,25 +959,25 @@ H5PTcreate Takes a property list identifier to provide flexibility on creation properties. -H5PTcreate_fl has been removed. +H5PTcreate\_fl has been removed. -H5PTfree_vlen_buff +H5PTfree\_vlen\_buff -Replaces H5PTfree_vlen_readbuff. +Replaces H5PTfree\_vlen\_readbuff. New functions: Two accessor functions have been added. 
-H5PTget_dataset +H5PTget\_dataset Returns the identifier of the dataset associated a packet table. -H5PTget_type +H5PTget\_type Returns the identifier of the datatype used by a packet table. -H5PTis_varlen +H5PTis\_varlen Determines whether a type is variable-length. @@ -980,13 +987,13 @@ Overloaded constructor An overloaded constructor has been added. -FL_PacketTable +FL\_PacketTable -Takes a property list identifier to provide flexibility on creation properties.>/dd> +Takes a property list identifier to provide flexibility on creation properties. -H5PTfree_vlen_buff +H5PTfree\_vlen\_buff -Replaces H5PTfree_vlen_readbuff. +Replaces H5PTfree\_vlen\_readbuff. Accessor wrappers @@ -1004,11 +1011,11 @@ Other wrappers PacketTable::FreeBuff() -Replaces VL_PacketTable::FreeReadBuff(). +Replaces VL\_PacketTable::FreeReadBuff(). PacketTable::IsVariableLength() -Replaces VL_PacketTable::IsVariableLength(). +Replaces VL\_PacketTable::IsVariableLength(). Overloaded functions: @@ -1027,7 +1034,7 @@ Configure option: CMake option: -HDF5_BUILD_JAVA:BOOL=ON +HDF5\_BUILD\_JAVA:BOOL=ON Prior to the 1.10.x series, the HDF5 Java tools were built from an independent repository and were not as fully integrated with HDF5. were built from an independent repository and were not as fully integrated with HDF5. @@ -1058,23 +1065,23 @@ New versioned functions and associated compatibility macros Two functions and a struct have been converted to a versioned form in this release. Compatibility macros have been created for each. -H5Fget_info +H5Fget\_info -The original function is renamed to H5Fget_info1 and deprecated. +The original function is renamed to H5Fget\_info1 and deprecated. -A new version of the function, H5Fget_info2, is introduced. +A new version of the function, H5Fget\_info2, is introduced. -The compatiblity macro H5Fget_info is introduced. +The compatiblity macro H5Fget\_info is introduced. -H5F_info_t +H5F\_info\_t -This is the struct used by the H5Fget_info functions and macro. +This is the struct used by the H5Fget\_info functions and macro. -The original struct is renamed to H5F_info1_t and deprecated. +The original struct is renamed to H5F\_info1\_t and deprecated. -A new version of the struct, H5F_info2_t, is introduced. +A new version of the struct, H5F\_info2\_t, is introduced. -The compatiblity macro H5F_info_t is introduced. +The compatiblity macro H5F\_info\_t is introduced. H5Rdereference @@ -1084,26 +1091,26 @@ A new version of the function, H5Rdereference2, is introduced. The compatiblity macro H5Rdereference is introduced. -Autotools Configuration and Large File Support +### Autotools Configuration and Large File Support Autotools configuration has been extensively reworked and autotool's handling of large file support has been overhauled in this release. See the following sections in RELEASE.txt: -[Autotools Configuration Has Been Extensively Reworked](Autotools Configuration Has Been Extensively Reworked) -[LFS Changes](LFS Changes) -RELEASE.txt can be found in the release_docs/ subdirectory at the root level of the HDF5 code distribution. +* Autotools Configuration Has Been Extensively Reworked +* LFS Changes + +RELEASE.txt can be found in the release\_docs/ subdirectory at the root level of the HDF5 code distribution. -Compatibility Report and Comments -Compatibility report for Release 1.10.0 versus Release 1.8.16 +### Compatibility Report and Comments - See API Compatibility Reports for 1.10 for information regarding compatibility with previous releases. 
+[Compatibility report for Release 1.10.0 versus Release 1.8.16]() Comments regarding the report -In the C interface, the hid_t change from 32-bit to 64-bit was made in order to address a performance problem that arose when the library "ran out" of valid object identifiers to issue and thus needed to employ an expensive algorithm to find previously issued identifiers that could be re-issued. This problem is avoided by switching the size of the hid_t type to 64-bit integers instead of 32-bit integers in order to make the pool of available integers significantly larger. (H5E_major_t and H5E_minor_t are aliased to hid_t which is why they changed size as well). (An alternate solution to this problem was applied in release HDF5 1.8.5 but this is the cleaner/preferred solution and had to wait until 1.10.0 to be included). +In the C interface, the hid\_t change from 32-bit to 64-bit was made in order to address a performance problem that arose when the library "ran out" of valid object identifiers to issue and thus needed to employ an expensive algorithm to find previously issued identifiers that could be re-issued. This problem is avoided by switching the size of the hid\_t type to 64-bit integers instead of 32-bit integers in order to make the pool of available integers significantly larger. (H5E\_major\_t and H5E\_minor\_t are aliased to hid\_t which is why they changed size as well). (An alternate solution to this problem was applied in release HDF5 1.8.5 but this is the cleaner/preferred solution and had to wait until 1.10.0 to be included). -hbool_t will now be defined as a _Bool type when configure determines that it's available. +hbool\_t will now be defined as a \_Bool type when configure determines that it's available. -Public structs that have members of type hid_t or hbool_t are affected by the above changes accordingly. +Public structs that have members of type hid\_t or hbool\_t are affected by the above changes accordingly. -The H5Fget_info function was renamed due to the introduction of a newer version of the function which returns additional information. The H5Rdereference function was renamed due to the introduction of a newer version of the function which allows a data access property list to be passed in. Both changes are accompanied with compatibility macros, so while existing code will need to be recompiled in order to use the newer library version, these functions do not need to be changed in application code using them provided that the HDF5 API compatibility macros are configured appropriately. +The H5Fget\_info function was renamed due to the introduction of a newer version of the function which returns additional information. The H5Rdereference function was renamed due to the introduction of a newer version of the function which allows a data access property list to be passed in. Both changes are accompanied with compatibility macros, so while existing code will need to be recompiled in order to use the newer library version, these functions do not need to be changed in application code using them provided that the HDF5 API compatibility macros are configured appropriately. 
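Returning to the SWMR feature introduced in 1.10.0 above, a minimal writer-side sketch looks like the following; the file name is a placeholder, dataset creation is elided, and error checking is omitted.

```c
#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* SWMR requires the 1.10 file format, so request the latest format. */
    H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

    hid_t file = H5Fcreate("swmr.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... create the datasets that will be appended to ... */

    /* Switch the writer into SWMR mode; from this point on, concurrent
       readers can open the file with H5F_ACC_RDONLY | H5F_ACC_SWMR_READ. */
    H5Fstart_swmr_write(file);

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}
```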
diff --git a/documentation/hdf5-docs/release_specifics/sw_changes_1.12.md b/documentation/hdf5-docs/release_specifics/sw_changes_1.12.md index f2584281..e1cba4a0 100644 --- a/documentation/hdf5-docs/release_specifics/sw_changes_1.12.md +++ b/documentation/hdf5-docs/release_specifics/sw_changes_1.12.md @@ -18,23 +18,24 @@ Note that bug fixes and performance enhancements in the C library are automatica The following information is included below. -* Release 1.12.2 versus Release 1.12.1 -* [Release 1.12.1 versus Release 1.12.0](#Release-1.12.1-versus-Release-1.12.0) -* [Release 1.12.0 versus Release 1.10.6](#Release-1.12.0-versus-Release-1.10.6) +* Release 1.12.2 versus Release 1.12.1 +* Release 1.12.1 versus Release 1.12.0 +* Release 1.12.0 versus Release 1.10.6 The release notes also list changes made to the library, but these notes tend to be more at a more detail-oriented level. The release notes may include new features, bugs fixed, supported configuration features, platforms on which the library has been tested, and known problems. The release note files are listed below and can be found at the top level of the HDF5 source code tree in the release_docs directory. -| | | -| ------------------------ | ------------------------------------------------------------ | -| Release Notes | Technical notes regarding the current release of the HDF5 library (RELEASE.txt in the source code) | -| HISTORY-1_12.txt | Release information for HDF5-1.12.0 through 1.12.1 | -| HISTORY-1_10_0-1_12_0.txt | Development history between the HDF5-1.10.0 and HDF5-1.12.0 releases | -| HISTORY-1_10.txt | Release information for all HDF5-1.10 releases | -| HISTORY-1_8_0-1_10_0.txt | Development history between the HDF5-1.8.0 and HDF5-1.10.0 releases | -| HISTORY-1_8.txt | Release information for HDF5-1.8.0 through 1.8.21 | -| HISTORY-1_0-1_8_0_rc3.txt | Technical notes starting with HDF5-1.0.0 and ending with HDF5-1.8.0-rc3 (the state of the code prior to the HDF5-1.8.0 release) | - -## Release 1.12.2 versus Release 1.12.1 +| | | +| ----------------------------- | ------------------------------------------------------------ | +| **Release Notes** | Technical notes regarding the current release of the HDF5 library (RELEASE.txt in the source code) | +| **HISTORY-1_12.txt** | Release information for HDF5-1.12.0 through 1.12.1 | +| **HISTORY-1_10_0-1_12_0.txt** | Development history between the HDF5-1.10.0 and HDF5-1.12.0 releases | +| **HISTORY-1_10.txt** | Release information for all HDF5-1.10 releases | +| **HISTORY-1_8_0-1_10_0.txt** | Development history between the HDF5-1.8.0 and HDF5-1.10.0 releases | +| **HISTORY-1_8.txt** | Release information for HDF5-1.8.0 through 1.8.21 | +| **HISTORY-1_0-1_8_0_rc3.txt** | Technical notes starting with HDF5-1.0.0 and ending with HDF5-1.8.0-rc3 (the state of the code prior to the HDF5-1.8.0 release) | + +

    Release 1.12.2 versus Release 1.12.1

    + This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.12.1 to Release 1.12.2. ### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros @@ -47,7 +48,8 @@ H5LTset_attribute_ullong Create an unsigned long long attribute H5VLobject_is_native Determines whether an object ID represents a native VOL connector object See API Compatibility Reports for 1.12.2 for complete details. -## Release 1.12.1 versus Release 1.12.0 +
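One of the 1.12.2 additions above, H5LTset_attribute_ullong, follows the usual H5LT attribute-setter pattern. A hedged sketch, assuming the high-level library is linked in; the file, group, and attribute names are invented:

```c
#include "hdf5.h"
#include "hdf5_hl.h"

int main(void)
{
    hid_t file = H5Fcreate("attrs.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t grp  = H5Gcreate2(file, "/results", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    unsigned long long counters[2] = {1234567890123ULL, 42ULL};

    /* Attach a 2-element unsigned long long attribute to /results. */
    H5LTset_attribute_ullong(file, "/results", "counters", counters, 2);

    H5Gclose(grp);
    H5Fclose(file);
    return 0;
}
```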

    Release 1.12.1 versus Release 1.12.0

    + This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.12.0 to Release 1.12.1. The following are new C functions in this release: @@ -82,11 +84,12 @@ See the API Compatibility report for complete details. Compatibility Notes and Report See API Compatibility Reports for 1.12 for information regarding compatibility with previous releases. -## Release 1.12.0 versus Release 1.10.6 +

    Release 1.12.0 versus Release 1.10.6

    + This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.10.6 to Release 1.12.0. ### New Features -For a description of the major new features that were introduced, please see New Features in HDF5 Release 1.12. +For a description of the major new features that were introduced, please see [New Features in HDF5 Release 1.12](new_features_1_12.md). ### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros diff --git a/documentation/hdf5-docs/release_specifics/sw_changes_1.14.md b/documentation/hdf5-docs/release_specifics/sw_changes_1.14.md index 7d48f806..b3796bd8 100644 --- a/documentation/hdf5-docs/release_specifics/sw_changes_1.14.md +++ b/documentation/hdf5-docs/release_specifics/sw_changes_1.14.md @@ -10,12 +10,212 @@ redirect_from: See [API Compatibility Macros](api_comp_macros.html) in HDF5 for details on using HDF5 version 1.14 with previous releases. + +* [Compatibility report for Release 1.14.1 versus Release 1.14.0](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.14.0-vs-hdf5-1.14.1-interface_compatibility_report.html) + +* [Compatibility report for Release 1.14.0 versus Release 1.12.2](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.12.2-vs-hdf5-1.14.0-interface_compatibility_report.html) + + +This page provides information on the changes that a maintenance developer needs to be aware of between successive releases of HDF5, such as: + +New or changed features or tools +Syntax and behavioral changes in the existing application programming interface (the API) +Certain types of changes in configuration or build processes +Note that bug fixes and performance enhancements in the C library are automatically picked up by the C++, Fortran, and Java libraries. + +The following information is included below. + +* Release 1.14.4 versus Release 1.14.3 +* Release 1.14.3 versus Release 1.14.2 +* Release 1.14.2 versus Release 1.14.1 +* Release 1.14.1 versus Release 1.14.0 +* Release 1.14.1 versus Release 1.12.2 + +The release notes also list changes made to the library, but these notes tend to be more at a more detail-oriented level. The release notes may include new features, bugs fixed, supported configuration features, platforms on which the library has been tested, and known problems. The release note files are listed in each release section and can be found at the top level of the HDF5 source code tree in the release\_docs directory. + +

    Release 1.14.4 versus Release 1.14.3

    + +

    Release 1.14.3 versus Release 1.14.2

    + +This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.14.2 to Release 1.14.3. + +### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros +Following are the new APIs introduced in HDF5-1.14.3. + +#### In the C Interface (main library) + +| Function | Description | +| ---------------------------------- | ----------------------------------- | +| H5Pget_actual_selection_io_mode() | Retrieves actual selection I/O mode | + +#### In the Fortran Interface + +| Function | Description | +| ---------------------------------- | ----------------------------------- | +| h5get_free_list_sizes_f | Retrieves actual selection I/O mode | +| h5dwrite_chunk_f | Retrieves actual selection I/O mode | +| h5dread_chunk_f | Retrieves actual selection I/O mode | +| h5fget_info_f | Retrieves actual selection I/O mode | +| h5lvisit_f | Retrieves actual selection I/O mode | +| h5lvisit_by_name_f | Retrieves actual selection I/O mode | +| h5pget_no_selection_io_cause_f | Retrieves actual selection I/O mode | +| h5pget_mpio_no_collective_cause_f | Retrieves actual selection I/O mode | +| h5sselect_shape_same_f | Retrieves actual selection I/O mode | +| h5sselect_intersect_block_f | Retrieves actual selection I/O mode | +| h5pget_file_space_page_size_f | Retrieves actual selection I/O mode | +| h5pset_file_space_page_size_f | Retrieves actual selection I/O mode | +| h5pget_file_space_strategy_f | Retrieves actual selection I/O mode | +| h5pset_file_space_strategy_f | Retrieves actual selection I/O mode | + +In addition, there are other new Fortran functions including the Fortran async APIs +and the Fortran VOL capability definitions. + * [Compatibility report for Release 1.14.3 versus Release 1.14.2](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.14.2-vs-hdf5-1.14.3-interface_compatibility_report.html) +
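A sketch of how H5Pget_actual_selection_io_mode() might be used after a raw data transfer; it assumes the 1.14.3 form of the call, which returns a bitmask through a uint32_t on a data transfer property list, and all file and dataset names are placeholders.

```c
#include <stdio.h>
#include "hdf5.h"

int main(void)
{
    hsize_t dims[1]  = {16};
    int     data[16] = {0};

    hid_t file  = H5Fcreate("selio.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dset  = H5Dcreate2(file, "data", H5T_NATIVE_INT, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    hid_t dxpl  = H5Pcreate(H5P_DATASET_XFER);

    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, dxpl, data);

    /* Ask which kind of I/O the library actually performed for that call. */
    uint32_t mode = 0;
    H5Pget_actual_selection_io_mode(dxpl, &mode);
    printf("actual selection I/O mode bitmask: 0x%x\n", (unsigned)mode);

    H5Pclose(dxpl);
    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
```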

## Release 1.14.2 versus Release 1.14.1

+This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.14.1 to Release 1.14.2.
+
+### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros
+
+Following are the new APIs introduced in HDF5-1.14.2.
+
+#### In the C Interface (main library)
+
+| Function | Description |
+| ------------------------------- | ------------------------------------------------------------ |
+| H5FDread_from_selection | Performs a series of scalar reads. |
+| H5FDread_vector_from_selection | Performs a vector read if vector reads are supported, or a series of scalar reads otherwise. |
+| H5FDwrite_from_selection | Performs a series of scalar writes. |
+| H5FDwrite_vector_from_selection | Performs a vector write if vector writes are supported, or a series of scalar writes otherwise. |
+| H5Pget_fapl_ros3_token | Returns the session/security token of the ros3 file access property list |
+| H5Pset_fapl_ros3_token | Modifies the file access property list to use the H5FD_ROS3 driver with a session/security token |
+
+* [Compatibility report for Release 1.14.2 versus Release 1.14.1](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.14.1-2-vs-hdf5-1.14.2-interface_compatibility_report.html)
-* [Compatibility report for Release 1.14.1 versus Release 1.14.0](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.14.0-vs-hdf5-1.14.1-interface_compatibility_report.html)
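+
+The ros3 token calls are aimed at read-only access to HDF5 files in S3 with temporary credentials. A hedged C sketch follows; the H5FD_ros3_fapl_t field layout, the bucket URL, and the credential/token strings are assumptions made for illustration and should be checked against H5FDros3.h for the installed release:
+
+```c
+#include "hdf5.h"
+#include <stdio.h>
+
+int main(void)
+{
+#ifdef H5_HAVE_ROS3_VFD
+    /* Assumed field order: version, authenticate, aws_region, secret_id,
+     * secret_key. The version value 1 stands in for the version macro
+     * defined in H5FDros3.h. Credentials here are placeholders. */
+    H5FD_ros3_fapl_t cfg = {1, 1, "us-east-1", "AKIA-EXAMPLE-ID", "example-secret-key"};
+
+    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+    H5Pset_fapl_ros3(fapl, &cfg);
+
+    /* New in 1.14.2: attach a session/security token (for example, an AWS
+     * STS temporary-credentials token) to the ros3 file access property list. */
+    H5Pset_fapl_ros3_token(fapl, "example-session-token");
+
+    /* Illustrative object URL; error checking omitted for brevity */
+    hid_t file = H5Fopen("https://example-bucket.s3.us-east-1.amazonaws.com/example.h5",
+                         H5F_ACC_RDONLY, fapl);
+    if (file >= 0)
+        H5Fclose(file);
+    H5Pclose(fapl);
+#else
+    printf("HDF5 was built without the ros3 VFD\n");
+#endif
+    return 0;
+}
+```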

## Release 1.14.1 versus Release 1.14.0

-* [Compatibility report for Release 1.14.0 versus Release 1.12.2](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.12.2-vs-hdf5-1.14.0-interface_compatibility_report.html)
+This section lists interface-level changes and other user-visible changes in behavior in the transition from HDF5 Release 1.14.0 to Release 1.14.1.
+
+### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros
+
+Following are the new APIs introduced in HDF5-1.14.1.
+
+#### In the C Interface (main library)
+
+| Function | Description |
+| ---------------------------- | ------------------------------------------------------------ |
+| H5PGET_MODIFY_WRITE_BUF | Retrieves the "modify write buffer" property |
+| H5PGET_NO_SELECTION_IO_CAUSE | Retrieves the cause for not performing selection or vector I/O on the last parallel I/O call |
+| H5PGET_SELECTION_IO | Retrieves the selection I/O mode |
+| H5PSET_MODIFY_WRITE_BUF | Allows the library to modify the contents of the write buffer |
+| H5PSET_SELECTION_IO | Sets the selection I/O mode |
+
+#### In the Fortran Interface
+
+| Function | Description |
+| ---------------------------- | ------------------------------------------------------------ |
+| h5pget_modify_write_buf_f | Retrieves the "modify write buffer" property |
+| h5pget_selection_io_f | Retrieves the selection I/O mode |
+| h5pset_modify_write_buf_f | Allows the library to modify the contents of the write buffer |
+| h5pset_selection_io_f | Sets the selection I/O mode |
+
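+A hedged C sketch (names are illustrative, serial build assumed) of how the new 1.14.1 transfer properties might be combined: H5Pset_selection_io() requests selection I/O, H5Pset_modify_write_buf() permits the library to modify the write buffer, and H5Pget_no_selection_io_cause() reports why selection I/O was not used, if it was not:
+
+```c
+#include "hdf5.h"
+#include <stdio.h>
+
+int main(void)
+{
+    hsize_t  dims[1] = {256};
+    double   buf[256] = {0};
+    uint32_t cause = 0;
+
+    /* Error checking omitted for brevity */
+    hid_t file  = H5Fcreate("dxpl_selection_io.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
+    hid_t space = H5Screate_simple(1, dims, NULL);
+    hid_t dset  = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, space,
+                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
+
+    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
+
+    /* Request selection I/O for transfers using this property list and
+     * allow the library to modify the write buffer (e.g., in-place type
+     * conversion). */
+    H5Pset_selection_io(dxpl, H5D_SELECTION_IO_MODE_ON);
+    H5Pset_modify_write_buf(dxpl, 1);
+
+    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, dxpl, buf);
+
+    /* If selection I/O was not performed, this bit field records why */
+    if (H5Pget_no_selection_io_cause(dxpl, &cause) >= 0)
+        printf("no-selection-I/O cause bits: 0x%x\n", (unsigned)cause);
+
+    H5Pclose(dxpl);
+    H5Dclose(dset);
+    H5Sclose(space);
+    H5Fclose(file);
+    return 0;
+}
+```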

## Release 1.14.0 versus Release 1.12.2

+HDF5 version 1.14.0 introduces the following new features:
+
+* Asynchronous I/O operations
+* New tools h5dwalk and h5delete
+* Subfiling VFD
+* Onion VFD
+* Multi Dataset I/O
+
+These names may already be familiar to users because the features were introduced in the experimental 1.13 series.
+
+In addition, this version provides many new C APIs and other user-visible changes in behavior in the transition from HDF5 Release 1.12.2 to Release 1.14.0. HDF5 1.14.0 adds no new API calls that require use of the API Compatibility Macros for the main library. However, some calls have been removed or have had their signatures changed.
+
+### New and Changed Functions, Classes, Subroutines, Wrappers, and Macros
+
+#### In the C Interface (main library)
+
+Following are the new APIs introduced in HDF5-1.14.0.
+
+| Function | Fortran | Description |
+| ---------------------------------- | ------- | --------------------------------------------------- |
+| H5A_CLOSE_ASYNC | N | Asynchronous version of H5Aclose |
+| H5A_CREATE_ASYNC | N | Asynchronous version of H5Acreate |
+| H5A_CREATE_BY_NAME_ASYNC | N | Asynchronous version of H5Acreate_by_name |
+| H5A_EXISTS_ASYNC | N | Asynchronous version of H5Aexists |
+| H5A_EXISTS_BY_NAME_ASYNC | N | Asynchronous version of H5Aexists_by_name |
+| H5A_OPEN_ASYNC | N | Asynchronous version of H5Aopen |
+| H5A_OPEN_BY_IDX_ASYNC | N | Asynchronous version of H5Aopen_by_idx |
+| H5A_OPEN_BY_NAME_ASYNC | N | Asynchronous version of H5Aopen_by_name |
+| H5A_READ_ASYNC | N | Asynchronous version of H5Aread |
+| H5A_RENAME_ASYNC | N | Asynchronous version of H5Arename |
+| H5A_RENAME_BY_NAME_ASYNC | N | Asynchronous version of H5Arename_by_name |
+| H5A_WRITE_ASYNC | N | Asynchronous version of H5Awrite |
+| H5D_CHUNK_ITER | N | Iterates over all chunks of a chunked dataset |
+| H5D_CLOSE_ASYNC | N | Asynchronous version of H5Dclose |
+| H5D_CREATE_ASYNC | N | Asynchronous version of H5Dcreate |
+| H5D_GET_SPACE_ASYNC | N | Asynchronous version of H5Dget_space |
+| H5D_OPEN_ASYNC | N | Asynchronous version of H5Dopen |
+| H5D_READ_ASYNC | N | Asynchronous version of H5Dread |
+| H5D_READ_MULTI | N | Reads raw data from a set of datasets into the provided buffers |
+| H5D_READ_MULTI_ASYNC | N | Asynchronous version of H5Dread_multi |
+| H5D_SET_EXTENT_ASYNC | N | Asynchronous version of H5Dset_extent |
+| H5D_WRITE_ASYNC | N | Asynchronous version of H5Dwrite |
+| H5D_WRITE_MULTI | N | Writes raw data from a set of buffers to a set of datasets |
+| H5D_WRITE_MULTI_ASYNC | N | Asynchronous version of H5Dwrite_multi |
+| H5E_APPEND_STACK | N | Appends one error stack to another, optionally closing the source stack |
+| H5ES_CANCEL | N | Attempts to cancel operations in an event set |
+| H5ES_CLOSE | N | Terminates access to an event set |
+| H5ES_CREATE | N | Creates an event set |
+| H5ES_FREE_ERR_INFO | N | Convenience routine to free an array of H5ES_err_info_t structs |
+| H5ES_GET_COUNT | N | Retrieves the number of events in an event set |
+| H5ES_GET_ERR_COUNT | N | Retrieves the number of failed operations |
+| H5ES_GET_ERR_INFO | N | Retrieves information about failed operations |
+| H5ES_GET_ERR_STATUS | N | Checks for failed operations |
+| H5ES_GET_OP_COUNTER | N | Retrieves the accumulative operation counter for an event set |
+| H5ES_REGISTER_COMPLETE_FUNC | N | Registers a callback to invoke when an operation completes within an event set |
+| H5ES_REGISTER_INSERT_FUNC | N | Registers a callback to invoke when a new operation is inserted into an event set |
+| H5ES_WAIT | N | Waits for operations in an event set to complete |
+| H5FD_ONION_GET_REVISION_COUNT | N | Gets the number of revisions |
+| H5P_GET_FAPL_ONION | N | Gets the onion info from the file access property list |
+| H5P_SET_FAPL_ONION | N | Sets the onion info for the file access property list |
+| H5F_CLOSE_ASYNC | N | Asynchronous version of H5Fclose |
+| H5F_CREATE_ASYNC | N | Asynchronous version of H5Fcreate |
+| H5F_FLUSH_ASYNC | N | Asynchronous version of H5Fflush |
+| H5F_OPEN_ASYNC | N | Asynchronous version of H5Fopen |
+| H5F_REOPEN_ASYNC | N | Asynchronous version of H5Freopen |
+| H5G_CLOSE_ASYNC | N | Asynchronous version of H5Gclose |
+| H5G_CREATE_ASYNC | N | Asynchronous version of H5Gcreate |
+| H5G_GET_INFO_ASYNC | N | Asynchronous version of H5Gget_info |
+| H5G_GET_INFO_BY_IDX_ASYNC | N | Asynchronous version of H5Gget_info_by_idx |
+| H5G_GET_INFO_BY_NAME_ASYNC | N | Asynchronous version of H5Gget_info_by_name |
+| H5G_OPEN_ASYNC | N | Asynchronous version of H5Gopen |
+| H5I_REGISTER_FUTURE | N | Registers a "future" object under a type and returns an ID for it |
+| H5L_CREATE_HARD_ASYNC | N | Asynchronous version of H5Lcreate_hard |
+| H5L_CREATE_SOFT_ASYNC | N | Asynchronous version of H5Lcreate_soft |
+| H5L_DELETE_ASYNC | N | Asynchronous version of H5Ldelete |
+| H5L_DELETE_BY_IDX_ASYNC | N | Asynchronous version of H5Ldelete_by_idx |
+| H5L_EXISTS_ASYNC | N | Asynchronous version of H5Lexists |
+| H5L_ITERATE_ASYNC | N | Asynchronous version of H5Literate |
+| H5O_CLOSE_ASYNC | N | Asynchronous version of H5Oclose |
+| H5O_COPY_ASYNC | N | Asynchronous version of H5Ocopy |
+| H5O_FLUSH_ASYNC | N | Asynchronous version of H5Oflush |
+| H5O_GET_INFO_BY_NAME_ASYNC | N | Asynchronous version of H5Oget_info_by_name |
+| H5O_OPEN_ASYNC | N | Asynchronous version of H5Oopen |
+| H5O_OPEN_BY_IDX_ASYNC | N | Asynchronous version of H5Oopen_by_idx |
+| H5O_REFRESH_ASYNC | N | Asynchronous version of H5Orefresh |
+| H5P_GET_DRIVER_CONFIG_STR | N | Retrieves a string representation of the configuration for the driver set on the given FAPL |
+| H5P_GET_VOL_CAP_FLAGS | N | Queries the capability flags for the VOL connector that will be used with this file access property list (FAPL) |
+| H5P_SET_DATASET_IO_HYPERSLAB_SELECTION | N | Sets a hyperslab file selection for a dataset I/O operation |
+| H5P_SET_DRIVER_BY_NAME | N | Sets a file driver according to a given driver name |
+| H5P_SET_DRIVER_BY_VALUE | N | Sets a file driver according to a given driver value (ID) |
+| H5_ATCLOSE | N | Registers a callback for the library to invoke when it's closing |
+| H5_IS_LIBRARY_TERMINATING | N | Checks whether the HDF5 library is closing |
+| H5R_OPEN_ATTR_ASYNC | N | Asynchronous version of H5Ropen_attr |
+| H5R_OPEN_OBJECT_ASYNC | N | Asynchronous version of H5Ropen_object |
+| H5R_OPEN_REGION_ASYNC | N | Asynchronous version of H5Ropen_region |
+| H5T_CLOSE_ASYNC | N | Asynchronous version of H5Tclose |
+| H5T_COMMIT_ASYNC | N | Asynchronous version of H5Tcommit2 |
+| H5T_OPEN_ASYNC | N | Asynchronous version of H5Topen2 |
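+
+As an illustration of the new multi-dataset I/O calls listed above, the following hedged C sketch (the file and dataset names are invented for the example; serial build assumed) writes two datasets with a single H5Dwrite_multi() call:
+
+```c
+#include "hdf5.h"
+
+#define NDSETS 2
+#define NELMTS 128
+
+int main(void)
+{
+    hsize_t dims[1] = {NELMTS};
+    int     buf0[NELMTS], buf1[NELMTS];
+
+    for (size_t i = 0; i < NELMTS; i++) {
+        buf0[i] = (int)i;
+        buf1[i] = (int)(i * 2);
+    }
+
+    /* Error checking omitted for brevity */
+    hid_t file  = H5Fcreate("multi_write.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
+    hid_t space = H5Screate_simple(1, dims, NULL);
+
+    hid_t dsets[NDSETS];
+    dsets[0] = H5Dcreate2(file, "A", H5T_NATIVE_INT, space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
+    dsets[1] = H5Dcreate2(file, "B", H5T_NATIVE_INT, space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
+
+    /* One call writes to both datasets; with parallel HDF5 this allows the
+     * transfers to be combined into a single collective operation. */
+    hid_t       mem_types[NDSETS]   = {H5T_NATIVE_INT, H5T_NATIVE_INT};
+    hid_t       mem_spaces[NDSETS]  = {H5S_ALL, H5S_ALL};
+    hid_t       file_spaces[NDSETS] = {H5S_ALL, H5S_ALL};
+    const void *bufs[NDSETS]        = {buf0, buf1};
+
+    H5Dwrite_multi(NDSETS, dsets, mem_types, mem_spaces, file_spaces, H5P_DEFAULT, bufs);
+
+    for (int i = 0; i < NDSETS; i++)
+        H5Dclose(dsets[i]);
+    H5Sclose(space);
+    H5Fclose(file);
+    return 0;
+}
+```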
diff --git a/documentation/index.md b/documentation/index.md
index d4f2bf26..171af9da 100644
--- a/documentation/index.md
+++ b/documentation/index.md
@@ -14,13 +14,18 @@ Look for more content here soon.

* [Documentation - 1.12.x](https://docs.hdfgroup.org/hdf5/v1_12/index.html)
* [Documentation - 1.10.x](https://docs.hdfgroup.org/hdf5/v1_10/index.html)
* [Documentation - 1.8.x](https://docs.hdfgroup.org/hdf5/v1_8/index.html)
-* [Registered Filter Plugins](/documentation/hdf5-docs/registered_filter_plugins.html)
+* [Registered Filter Plugins](https://github.com/HDFGroup/hdf5_plugins/blob/master/docs/RegisteredFilterPlugins.md)
* [Registered Virtual File Drivers (VFDs)](/documentation/hdf5-docs/registered_virtual_file_drivers_vfds.html)
* [Registered VOL Connectors](/documentation/hdf5-docs/registered_vol_connectors.html)
+* [Chunking in HDF5](hdf5-docs/chunking_in_hdf5.md)
* [Release Specific Information](hdf5-docs/release_specific_info.md) - \*\*\* UNDER CONSTRUCTION \*\*\*
+* [Advanced Topics](hdf5-docs/advanced_topics_list.md) - \*\*\* UNDER CONSTRUCTION \*\*\*

Can't find what you're looking for? Visit [docs.hdfgroup.org](https://docs.hdfgroup.org/hdf5/v1_14/index.html) for many other documents that previously resided on the portal.

+## HDF5 Plugins
+* [Registered Filter Plugins](https://github.com/HDFGroup/hdf5_plugins/blob/master/docs/RegisteredFilterPlugins.md)
+
## HDF4
* [Reference Manual](/documentation/hdf4-docs/HDF4_Reference_Manual.pdf)
* [User Guide](/documentation/hdf4-docs/HDF4_Users_Guide.pdf)

diff --git a/downloads/hdf5/hdf5_1_14_4.md b/downloads/hdf5/hdf5_1_14_4.md
index 7af8b99e..3dff022e 100644
--- a/downloads/hdf5/hdf5_1_14_4.md
+++ b/downloads/hdf5/hdf5_1_14_4.md
@@ -5,32 +5,28 @@ title: HDF5 Library and Tools 1.14.4

HDF5 Logo

# HDF5 Library and Tools 1.14.4
-### *** UNDER CONSTRUCTION ***

## Release Information

-| Version | HDF5 1.14.4 |
-| Release Date | 03/31/24 |
-| Additional Release Information | [Documentation](https://docs.hdfgroup.org/hdf5/v1_14/index.html) |
-|| [Release Notes](https://github.com/HDFGroup/hdf5/blob/hdf5_1_14_4/release_docs/RELEASE.txt) |
-|| Software Changes From Release to Release for HDF5-1.14 |
-|| New Features in HDF5 Release 1.14 |
-|| [Newsletter Announcement](https://www.hdfgroup.org/2023/10/release-of-hdf5-1-14-3-library-and-tools-newsletter-199/) |
+| Version | HDF5 1.14.4.2 |
+| Release Date | 04/15/24 |
+| Additional Release Information | [Release Notes](https://github.com/HDFGroup/hdf5/blob/hdf5_1_14_4/release_docs/RELEASE.txt) |
+|| [Software Changes from Release to Release](../../documentation/hdf5-docs/release_specifics/sw_changes_1.14.md) |
+|| [New Features in HDF5 Release 1.14](https://portal.hdfgroup.org/documentation/hdf5-docs/release_specifics/new_features_1_14.html) |
+|| [Newsletter Announcement](https://www.hdfgroup.org/2024/04/release-of-hdf5-1-14-4-newsletter-202/) |
|| [Doxygen generated Reference Manual](https://docs.hdfgroup.org/hdf5/v1_14/index.html) |
-|| [API Compatibility Report between 1.14.3 and 1.14.4](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.14.3-vs-hdf5-1.14.4-interface_compatibility_report.html) |
-|| [Java Interface Compatibility Report between 1.14.3 and 1.14.4](https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDFGroup/hdf5doc/master/html/ADGuide/Compatibility_Report/hdf5-1.14.3-vs-hdf5-1.14.4-java-interface_compatibility_report.html) |
+|| [API Compatibility Report between 1.14.3 and 1.14.4](https://github.com/HDFGroup/hdf5/releases/download/hdf5_1.14.4.2/hdf5-1.14.4-2.html.abi.reports.tar.gz) |

-## Files
+## Download

-| File | Type | Install Instructions |
-| ---- | ---- | ---- |
-| [hdf5-1.14.4.tar.gz](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/hdf5-1.14.3.tar.gz) ([sha256](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/hdf5-1.14.3.tar.gz.sha256)) | Source Release | Unix Gzipped source tar file. See Methods to obtain (below). See warning below about autotools builds. release_docs/ directory in source |
-| [hdf5-1.14.4.tar.bz2](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/hdf5-1.14.3.tar.bz2.sha256) ([sha256](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/hdf5-1.14.3.tar.bz2.sha256)) | Source Release | Unix Gzipped source tar file. See warning below about autotools builds. release_docs/ directory in source |
-| [hdf5-1.14.4.zip](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/hdf5-1.14.3.zip) ([sha256](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/hdf5-1.14.3.zip.sha256)) | Source Release | Windows zip file. release_docs/ directory in source |
-| [CMake-hdf5-1.14.4.tar.gz](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/CMake-hdf5-1.14.3.tar.gz) ([sha256](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/CMake-hdf5-1.14.3.tar.gz.sha256)) | CMake Source Release | [Building HDF5 with CMake](https://raw.githubusercontent.com/HDFGroup/hdf5/hdf5_1_14_3/release_docs/INSTALL_CMake.txt) |
-| [CMake-hdf5-1.14.4.zip](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/CMake-hdf5-1.14.3.zip) ([sha256](https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_14_3/src/CMake-hdf5-1.14.3.zip.sha256)) | CMake Source Release | [Building HDF5 with CMake](https://raw.githubusercontent.com/HDFGroup/hdf5/hdf5_1_14_3/release_docs/INSTALL_CMake.txt) |
-| [Ready to use Binaries](https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.14/hdf5-1.14.3/bin/) | Pre-built binary distributions for Unix and Windows ||
+
+| File | Type |
+| ---- | ---- |
+| [hdf5-1.14.4-2.tar.gz](https://github.com/HDFGroup/hdf5/releases/download/hdf5_1.14.4.2/hdf5-1.14.4-2.tar.gz) | Source Release for Unix |
+| [hdf5-1.14.4-2.zip](https://github.com/HDFGroup/hdf5/releases/download/hdf5_1.14.4.2/hdf5-1.14.4-2.zip) | Source Release for Windows |
+| [Ready to use Binaries](https://github.com/HDFGroup/hdf5/releases/tag/hdf5_1.14.4.2) | Pre-built binary distributions for Unix and Windows |
+
+Please refer to [Build instructions](https://github.com/HDFGroup/hdf5/blob/hdf5_1.14.4.2/release_docs/INSTALL) for building with either CMake or Autotools.

**Methods to obtain (gz file)**

diff --git a/downloads/index.md b/downloads/index.md
index 990ea02c..95b7de1b 100644
--- a/downloads/index.md
+++ b/downloads/index.md
@@ -13,7 +13,7 @@ This page is a temporary replacement for our download archive. We will be workin

| Version | Usage |
| ---- | ----|
-| [HDF5 1.14.3](/hdf5/hdf5_1_14_3.md) | This should be the default choice for new users. |
+| [HDF5 1.14.4](/hdf5/hdf5_1_14_4.md) | This should be the default choice for new users. |
| [HDF5 1.12.3](/hdf5/hdf5_1_12_3.md) | This is the last release for HDF5 1.12. Users should move to HDF5 1.14. |

## HDFView

diff --git a/ftp/hdf5/index.md b/ftp/hdf5/index.md
new file mode 100644
index 00000000..81052bd9
--- /dev/null
+++ b/ftp/hdf5/index.md
@@ -0,0 +1 @@
+Destination of FTP contents?