Add vector list datatype & add configurable SPACE_DIRECTIONS_TYPE option (#157)

## Changes

* Add `int vector list` and `double vector list` datatypes that are lists of NumPy arrays or `None`.
  * These new types are similar to their `int matrix` and `double matrix` counterparts except they are **not** 2D NumPy matrices.
* Add `nrrd.SPACE_DIRECTIONS_TYPE` to enable switching the datatype for the `space directions` field.
  * Valid options are `double matrix` or `double vector list`. The current default is `double matrix` for backwards compatibility, but it will be switched to `double vector list` in the next major release.
  * `double vector list` is preferable to `double matrix` because it avoids the confusing row-of-NaN representation and does not imply an affine transform by being a matrix.
* Support row-of-None in addition to row-of-NaN for `parse_optional_matrix` & `format_optional_matrix`, alongside the new vector list parsing/formatting functions.

Fixes #148
Revises #149
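
To make the difference between the two representations concrete, here is a minimal, standard-library-only sketch. It does not use pynrrd or NumPy: the helper names (`parse_optional_vector_list_sketch`, `format_optional_vector_list_sketch`, `to_nan_matrix_sketch`) are hypothetical, and plain tuples stand in for NumPy arrays. It only illustrates the idea of `none` rows becoming `None` (vector list) versus a row of NaN (matrix), and should not be read as the library's implementation.

```python
import math
from typing import List, Optional, Tuple

Vector = Tuple[float, ...]


def parse_optional_vector_list_sketch(text: str) -> List[Optional[Vector]]:
    """Hypothetical stand-in: parse '(1,0,0) none (0,0,1)' into tuples or None."""
    rows: List[Optional[Vector]] = []
    for token in text.split():
        if token == 'none':
            # A 'none' row is kept as None instead of a row of NaN
            rows.append(None)
            continue
        if not (token.startswith('(') and token.endswith(')')):
            raise ValueError(f'Vector should be enclosed by parentheses: {token!r}')
        rows.append(tuple(float(v) for v in token[1:-1].split(',')))

    # All non-None rows must have the same number of elements
    sizes = {len(row) for row in rows if row is not None}
    if len(sizes) > 1:
        raise ValueError('Vector list should have same number of elements in each row')
    return rows


def format_optional_vector_list_sketch(rows: List[Optional[Vector]]) -> str:
    """Hypothetical stand-in: inverse of the parser above."""
    return ' '.join(
        'none' if row is None else '(' + ','.join(repr(v) for v in row) + ')'
        for row in rows
    )


def to_nan_matrix_sketch(rows: List[Optional[Vector]]) -> List[List[float]]:
    """Convert the vector list into the row-of-NaN 'matrix' style representation."""
    width = next((len(row) for row in rows if row is not None), 0)
    return [list(row) if row is not None else [math.nan] * width for row in rows]


rows = parse_optional_vector_list_sketch('(1.5,0.0,0.0) (0.0,1.5,0.0) (0.0,0.0,1.0) none')
print(rows)                                      # last entry is None
print(to_nan_matrix_sketch(rows))                # last row is all NaN
print(format_optional_vector_list_sketch(rows))  # round-trips to the input string
```

With the vector list form, a non-spatial axis can be checked with `row is None`, whereas the matrix form requires testing a whole row for NaN.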
mhe · Nov 5, 2024 · 62b75f8
1 parent bb38ec2
Showing 14 changed files with 410 additions and 14 deletions.
diff --git a/docs/source/background/datatypes.rst b/docs/source/background/datatypes.rst
@@ -88,4 +88,22 @@ double matrix
 :Python Datatype: (M,N) :class:`numpy.ndarray` of :class:`float`
 :Python Example: np.array([[2.54, 1.3, 0.0], [3.14, 0.3, 3.3], [np.nan, np.nan, np.nan], [0.0, -12.3, -3.3]])
 
-This datatype has the added feature where rows can be defined as empty by setting the vector as :code:`none`. In the NRRD specification, instead of the row, the :code:`none` keyword is used in it's place. This is represented in the Python NumPy array as a row of all NaN's. An example use case for this optional row matrix is for the 'space directions' field where one row may be empty because it is not a domain type.
+This datatype has the added feature where rows can be defined as empty by setting the vector as :code:`none`. In the NRRD specification, instead of the row, the :code:`none` keyword is used in its place. This is represented in the Python NumPy array as a row of all NaNs. An example use case for this optional row matrix is for the 'space directions' field where one row may be empty because it is not a domain type.
+
+int vector list
+~~~~~~~~~~~~~~~~~~
+:NRRD Syntax: (<i>,<i>,...,<i>) (<i>,<i>,...,<i>) ... (<i>,<i>,...,<i>)
+:NRRD Example: (1,0,0) (0,1,0) none (0,0,1)
+:Python Datatype: (M,N) :class:`list` of (N,) :class:`numpy.ndarray` of :class:`int`
+:Python Example: [np.array([1, 0, 0]), np.array([0, 1, 0]), None, np.array([0, 0, 1])]
+
+This datatype is similar to `int matrix`_ except instead of returning a (M,N) :class:`numpy.ndarray`, it returns a list of (N,) :class:`numpy.ndarray`. Each row is optional and designated by :code:`none` in the NRRD specification and represented as :obj:`None` in this library.
+
+double vector list
+~~~~~~~~~~~~~~~~~~
+:NRRD Syntax: (<d>,<d>,...,<d>) (<d>,<d>,...,<d>) ... (<d>,<d>,...,<d>)
+:NRRD Example: (2.54,1.3,0.0) (3.14,0.3,3.3) none (0.05,-12.3,-3.3)
+:Python Datatype: (M,N) :class:`list` of (N,) :class:`numpy.ndarray` of :class:`float`
+:Python Example: [np.array([2.54, 1.3, 0.0]), np.array([3.14, 0.3, 3.3]), None, np.array([0.05, -12.3, -3.3])]
+
+This datatype is similar to `double matrix`_ except instead of returning a (M,N) :class:`numpy.ndarray`, it returns a list of (N,) :class:`numpy.ndarray`. Each row is optional and designated by :code:`none` in the NRRD specification and represented as :obj:`None` in this library.
diff --git a/docs/source/background/fields.rst b/docs/source/background/fields.rst
@@ -33,7 +33,7 @@ centerings_               :ref:`background/datatypes:string list`
 space_                    :ref:`background/datatypes:string`
 `space dimension`_        :ref:`background/datatypes:int`
 `space units`_            :ref:`background/datatypes:quoted string list`
-`space directions`_       :ref:`background/datatypes:double matrix`
+`space directions`_       :ref:`background/datatypes:double matrix` or :ref:`background/datatypes:double vector list` depending on :data:`nrrd.SPACE_DIRECTIONS_TYPE`
 `space origin`_           :ref:`background/datatypes:double vector`
 `measurement frame`_      :ref:`background/datatypes:int matrix`
 ========================  ==============================================

diff --git a/docs/source/reference/formatting.rst b/docs/source/reference/formatting.rst
@@ -9,8 +9,10 @@ Formatting NRRD fields
     nrrd.format_optional_vector
     nrrd.format_matrix
     nrrd.format_optional_matrix
+    nrrd.format_vector_list
+    nrrd.format_optional_vector_list
 
 .. automodule:: nrrd
-    :members: format_number, format_number_list, format_vector, format_optional_vector, format_matrix, format_optional_matrix
+    :members: format_number, format_number_list, format_vector, format_optional_vector, format_matrix, format_optional_matrix, format_vector_list, format_optional_vector_list
     :undoc-members:
     :show-inheritance:
diff --git a/docs/source/reference/parsing.rst b/docs/source/reference/parsing.rst
@@ -9,8 +9,10 @@ Parsing NRRD fields
     nrrd.parse_optional_vector
     nrrd.parse_matrix
     nrrd.parse_optional_matrix
+    nrrd.parse_vector_list
+    nrrd.parse_optional_vector_list
 
 .. automodule:: nrrd
-    :members: parse_number_auto_dtype, parse_number_list, parse_vector, parse_optional_vector, parse_matrix, parse_optional_matrix
+    :members: parse_number_auto_dtype, parse_number_list, parse_vector, parse_optional_vector, parse_matrix, parse_optional_matrix, parse_vector_list, parse_optional_vector_list
     :undoc-members:
     :show-inheritance:
diff --git a/docs/source/reference/reading.rst b/docs/source/reference/reading.rst
@@ -7,10 +7,12 @@ Reading NRRD files
     nrrd.read_header
     nrrd.read_data
     nrrd.reader.ALLOW_DUPLICATE_FIELD
+    nrrd.SPACE_DIRECTIONS_TYPE
 
 .. automodule:: nrrd
     :members: read, read_header, read_data
     :undoc-members:
     :show-inheritance:
 
 .. autodata:: nrrd.reader.ALLOW_DUPLICATE_FIELD
+.. autodata:: nrrd.SPACE_DIRECTIONS_TYPE
diff --git a/nrrd/__init__.py b/nrrd/__init__.py
@@ -1,11 +1,46 @@
+from typing_extensions import Literal
+
 from nrrd._version import __version__
 from nrrd.formatters import *
 from nrrd.parsers import *
 from nrrd.reader import read, read_data, read_header
 from nrrd.types import NRRDFieldMap, NRRDFieldType, NRRDHeader
 from nrrd.writer import write
 
+# TODO Change to 'double vector list' in next major release
+SPACE_DIRECTIONS_TYPE: Literal['double matrix', 'double vector list'] = 'double matrix'
+"""Datatype to use for 'space directions' field when reading/writing NRRD files
+
+The 'space directions' field can be represented in two different ways: as a matrix or as a list of vectors. Per the
+NRRD specification, the 'space directions' field is a per-axis definition that represents the direction and spacing of
+each axis. Non-spatial axes are represented as 'none'.
+
+The current default is to return a matrix, where each non-spatial axis is represented as a row of `NaN` in the matrix.
+In the next major release, this default option will change to return a list of optional vectors, where each
+non-spatial axis is represented as `None`.
+
+Example:
+    Reading a NRRD file with space directions type set to 'double matrix' (the default).
+
+    >>> nrrd.SPACE_DIRECTIONS_TYPE = 'double matrix'
+    >>> data, header = nrrd.read('file.nrrd')
+    >>> print(header['space directions'])
+    [[1.5 0.  0. ]
+     [0.  1.5 0. ]
+     [0.  0.  1. ]
+     [nan nan nan]]
+
+    Reading a NRRD file with space directions type set to 'double vector list'.
+
+    >>> nrrd.SPACE_DIRECTIONS_TYPE = 'double vector list'
+    >>> data, header = nrrd.read('file.nrrd')
+    >>> print(header['space directions'])
+    [array([1.5, 0. , 0. ]), array([0. , 1.5, 0. ]), array([0., 0., 1.]), None]
+"""
+
 __all__ = ['read', 'read_data', 'read_header', 'write', 'format_number_list', 'format_number', 'format_matrix',
-           'format_optional_matrix', 'format_optional_vector', 'format_vector', 'parse_matrix',
-           'parse_number_auto_dtype', 'parse_number_list', 'parse_optional_matrix', 'parse_optional_vector',
-           'parse_vector', 'NRRDFieldType', 'NRRDFieldMap', 'NRRDHeader', '__version__']
+           'format_optional_matrix', 'format_optional_vector', 'format_vector', 'format_vector_list',
+           'format_optional_vector_list', 'parse_matrix', 'parse_number_auto_dtype', 'parse_number_list',
+           'parse_optional_matrix',
+           'parse_optional_vector', 'parse_vector', 'parse_vector_list', 'parse_optional_vector_list', 'NRRDFieldType',
+           'NRRDFieldMap', 'NRRDHeader', 'SPACE_DIRECTIONS_TYPE', '__version__']
diff --git a/nrrd/formatters.py b/nrrd/formatters.py
@@ -1,4 +1,4 @@
-from typing import Optional, Union
+from typing import List, Optional, Union
 
 import numpy as np
 import numpy.typing as npt
@@ -57,6 +57,7 @@ def format_vector(x: npt.NDArray) -> str:
     vector : :class:`str`
         String containing NRRD vector
     """
+    x = np.asarray(x)
 
     return '(' + ','.join([format_number(y) for y in x]) + ')'
 
@@ -80,10 +81,15 @@ def format_optional_vector(x: Optional[npt.NDArray]) -> str:
     vector : :class:`str`
         String containing NRRD vector
     """
+    # If vector is None, return none
+    if x is None:
+        return 'none'
+
+    x = np.asarray(x)
 
-    # If vector is None or all elements are NaN, then return none
+    # If all elements are None or NaN, then return none
     # Otherwise format the vector as normal
-    if x is None or np.all(np.isnan(x)):
+    if np.all(x == None) or np.all(np.isnan(x)):  # noqa: E711
         return 'none'
     else:
         return format_vector(x)
@@ -131,6 +137,8 @@ def format_optional_matrix(x: Optional[npt.NDArray]) -> str:
     matrix : :class:`str`
         String containing NRRD matrix
     """
+    # Convert to float dtype to convert None to NaN
+    x = np.asarray(x, dtype=float)
 
     return ' '.join([format_optional_vector(y) for y in x])
 
@@ -151,5 +159,49 @@ def format_number_list(x: npt.NDArray) -> str:
     list : :class:`str`
         String containing NRRD list
     """
+    x = np.asarray(x)
 
     return ' '.join([format_number(y) for y in x])
+
+
+def format_vector_list(x: List[npt.NDArray]) -> str:
+    """Format a :class:`list` of (N,) :class:`numpy.ndarray` into a NRRD vector list string
+
+    See :ref:`background/datatypes:int vector list` and :ref:`background/datatypes:double vector list` for more
+    information on the format.
+
+    Parameters
+    ----------
+    x : :class:`list` of (N,) :class:`numpy.ndarray`
+        Vector list to convert to NRRD vector list string
+
+    Returns
+    -------
+    vector_list : :class:`str`
+        String containing NRRD vector list
+    """
+
+    return ' '.join([format_vector(y) for y in x])
+
+
+def format_optional_vector_list(x: List[Optional[npt.NDArray]]) -> str:
+    """Format a :class:`list` of (N,) :class:`numpy.ndarray` or :obj:`None` into a NRRD optional vector list string
+
+    Function converts a :class:`list` of (N,) :class:`numpy.ndarray` or :obj:`None` into a string using
+    the NRRD vector list format.
+
+    See :ref:`background/datatypes:int vector list` and :ref:`background/datatypes:double vector list` for more
+    information on the format.
+
+    Parameters
+    ----------
+    x : :class:`list` of (N,) :class:`numpy.ndarray` or :obj:`None`
+        Vector list to convert to NRRD vector list string
+
+    Returns
+    -------
+    vector_list : :class:`str`
+        String containing NRRD vector list
+    """
+
+    return ' '.join([format_optional_vector(y) for y in x])
diff --git a/nrrd/parsers.py b/nrrd/parsers.py
@@ -1,4 +1,4 @@
-from typing import Optional, Type, Union
+from typing import List, Optional, Type, Union
 
 import numpy as np
 import numpy.typing as npt
@@ -212,6 +212,103 @@ def parse_number_list(x: str, dtype: Optional[Type[Union[int, float]]] = None) -
     return number_list
 
 
+def parse_vector_list(x: str, dtype: Optional[Type[Union[int, float]]] = None) -> List[npt.NDArray]:
+    """Parse NRRD vector list from string into a :class:`list` of (N,) :class:`numpy.ndarray`.
+
+    Parses input string to convert it into a list of Numpy arrays using the NRRD vector list format.
+
+    See :ref:`background/datatypes:int vector list` and :ref:`background/datatypes:double vector list` for more
+    information on the format.
+
+    Parameters
+    ----------
+    x : :class:`str`
+        String containing NRRD vector list
+    dtype : data-type, optional
+        Datatype to use for the resulting Numpy arrays. Datatype can be :class:`float`, :class:`int` or :obj:`None`. If
+        :obj:`dtype` is :obj:`None`, it will be automatically determined by checking any of the vector elements
+        for fractional numbers. If found, the vectors will be converted to :class:`float`, otherwise :class:`int`.
+        Default is to automatically determine datatype.
+
+    Returns
+    -------
+    vector_list : :class:`list` of (N,) :class:`numpy.ndarray`
+        List of vectors that are parsed from the :obj:`x` string
+    """
+
+    # Split input by spaces, convert each row into a vector
+    vector_list = [parse_vector(row, dtype=float) for row in x.split()]
+
+    # Get the size of each row vector and then remove duplicate sizes
+    # There should be exactly one value in the matrix because all row sizes need to be the same
+    if len(np.unique([len(x) for x in vector_list])) != 1:
+        raise NRRDError('Vector list should have same number of elements in each row')
+
+    # Automatic detection: vectors are parsed as float above and converted to int only if every value is whole
+    # If dtype is int, the vectors are truncated to integers
+    if dtype is None:
+        vector_list_trunc = [x.astype(int) for x in vector_list]
+        if np.all([np.array_equal(x, y) for x, y in zip(vector_list, vector_list_trunc)]):
+            vector_list = vector_list_trunc
+    elif dtype == int:
+        vector_list = [x.astype(int) for x in vector_list]
+    elif dtype != float:
+        raise NRRDError('dtype should be None for automatic type detection, float or int')
+
+    return vector_list
+
+
+def parse_optional_vector_list(x: str, dtype: Optional[Type[Union[int, float]]] = None) -> List[Optional[npt.NDArray]]:
+    """Parse optional NRRD vector list from string into :class:`list` of (N,) :class:`numpy.ndarray` of :class:`float`.
+
+    Function parses optional NRRD vector list from string into a list of (N,) :class:`numpy.ndarray` or :obj:`None`.
+    This function works the same as :meth:`parse_vector_list` except if a row vector in the list is none, the resulting
+    row in the returned list will be :obj:`None`.
+
+    See :ref:`background/datatypes:int vector list` and :ref:`background/datatypes:double vector list` for more
+    information on the format.
+
+    Parameters
+    ----------
+    x : :class:`str`
+        String containing NRRD vector list
+
+    Returns
+    -------
+    vector_list : :class:`list` of (N,) :class:`numpy.ndarray` or :obj:`None`
+        List of vectors that is parsed from the :obj:`x` string
+    """
+
+    # Split input by spaces to get each row and convert into a vector. The row can be 'none', in which case it will
+    # return None
+    vector_list = [parse_optional_vector(row, dtype=float) for row in x.split()]
+
+    # Get the size of each row vector, 0 if None
+    sizes = np.array([0 if x is None else len(x) for x in vector_list])
+
+    # Get sizes of each row vector removing duplicate sizes
+    # Since each row vector should be same size, the unique sizes should return one value for the row size or it may
+    # return a second one (0) if there are None vectors
+    unique_sizes = np.unique(sizes)
+
+    if len(unique_sizes) != 1 and (len(unique_sizes) != 2 or unique_sizes.min() != 0):
+        raise NRRDError('Vector list should have same number of elements in each row')
+
+    # Automatic detection: vectors are parsed as float above and converted to int only if every value is whole
+    # If dtype is int, the vectors are truncated to integers (None rows are left as None)
+    if dtype is None:
+        vector_list_trunc = [x.astype(int) if x is not None else None for x in vector_list]
+
+        if np.all([np.array_equal(x, y) for x, y in zip(vector_list, vector_list_trunc)]):
+            vector_list = vector_list_trunc
+    elif dtype == int:
+        vector_list = [x.astype(int) if x is not None else None for x in vector_list]
+    elif dtype != float:
+        raise NRRDError('dtype should be None for automatic type detection, float or int')
+
+    return vector_list
+
+
 def parse_number_auto_dtype(x: str) -> Union[int, float]:
     """Parse number from string with automatic type detection.
 

diff --git a/nrrd/reader.py b/nrrd/reader.py
@@ -8,6 +8,7 @@
 from collections import OrderedDict
 from typing import IO, Any, AnyStr, Iterable, Tuple
 
+import nrrd
 from nrrd.parsers import *
 from nrrd.types import IndexOrder, NRRDFieldMap, NRRDFieldType, NRRDHeader
 
@@ -19,7 +20,7 @@
 
 _NRRD_REQUIRED_FIELDS = ['dimension', 'type', 'encoding', 'sizes']
 
-ALLOW_DUPLICATE_FIELD = False
+ALLOW_DUPLICATE_FIELD: bool = False
 """Allow duplicate header fields when reading NRRD files
 
 When there are duplicated fields in a NRRD file header, pynrrd throws an error by default. Setting this field as
@@ -109,7 +110,7 @@ def _get_field_type(field: str, custom_field_map: Optional[NRRDFieldMap]) -> NRR
     elif field in ['measurement frame']:
         return 'double matrix'
     elif field in ['space directions']:
-        return 'double matrix'
+        return nrrd.SPACE_DIRECTIONS_TYPE
     else:
         if custom_field_map and field in custom_field_map:
             return custom_field_map[field]
@@ -144,6 +145,10 @@ def _parse_field_value(value: str, field_type: NRRDFieldType) -> Any:
         # This is only valid for double matrices because the matrix is represented with NaN in the entire row
         # for none rows. NaN is only valid for floating point numbers
         return parse_optional_matrix(value)
+    elif field_type == 'int vector list':
+        return parse_optional_vector_list(value, dtype=int)
+    elif field_type == 'double vector list':
+        return parse_optional_vector_list(value, dtype=float)
     else:
         raise NRRDError(f'Invalid field type given: {field_type}')