Add entities objects+rules from bids v1.6.0-480-ge47283a2

ATM we will need only these two files to establish consistent order of the entities in the filenames. Later we might want to carry a more complete copy of the schema, or may be eventually some proper bidsschema python package would appear to provide desired functionality anyways
nipy · Feb 7, 2022 · 38032b5 · 38032b5
1 parent d54a928
commit 38032b5
Show file tree

Hide file tree

Showing 3 changed files with 389 additions and 0 deletions.
diff --git a/heudiconv/bids/data/schema/objects/entities.yaml b/heudiconv/bids/data/schema/objects/entities.yaml
@@ -0,0 +1,356 @@
+---
+# This file describes entities present in BIDS filenames.
+# WARNING: The entities are presented here in alphabetical order!
+# The appropriate order of entities in filenames is defined in `rules/entities.yaml`,
+# rather than this file (`objects/entities.yaml`).
+acquisition:
+  name: Acquisition
+  entity: acq
+  description: |
+    The `acq-<label>` key/value pair corresponds to a custom label the
+    user MAY use to distinguish a different set of parameters used for
+    acquiring the same modality.
+    For example this should be used when a study includes two T1w images - one
+    full brain low resolution and one restricted field of view but high
+    resolution.
+    In such case two files could have the following names:
+    `sub-01_acq-highres_T1w.nii.gz` and `sub-01_acq-lowres_T1w.nii.gz`, however
+    the user is free to choose any other label than `highres` and `lowres` as long
+    as they are consistent across subjects and sessions.
+    In case different sequences are used to record the same modality
+    (for example, `RARE` and `FLASH` for T1w)
+    this field can also be used to make that distinction.
+    At what level of detail to make the distinction (for example,
+    just between `RARE` and `FLASH`, or between `RARE`, `FLASH`, and `FLASHsubsampled`)
+    remains at the discretion of the researcher.
+  type: string
+  format: label
+ceagent:
+  name: Contrast Enhancing Agent
+  entity: ce
+  description: |
+    The `ce-<label>` key/value can be used to distinguish
+    sequences using different contrast enhanced images.
+    The label is the name of the contrast agent.
+    The key `"ContrastBolusIngredient"` MAY also be added in the JSON file,
+    with the same label.
+  type: string
+  format: label
+chunk:
+  name: Chunk
+  entity: chunk
+  description: |
+    The `chunk-<index>` key/value pair is used to distinguish between different
+    regions, 2D images or 3D volumes files, of the same physical sample with
+    different fields of view acquired in the same imaging experiment.
+  type: string
+  format: index
+density:
+  name: Density
+  entity: den
+  description: |
+    Density of non-parametric surfaces.
+    MUST have a corresponding `Density` metadata field to provide
+    interpretation.
+
+    This entity is only applicable to derivative data.
+  type: string
+  format: label
+description:
+  name: Description
+  entity: desc
+  description: |
+    When necessary to distinguish two files that do not otherwise have a
+    distinguishing entity, the `_desc-<label>` keyword-value SHOULD be used.
+
+    This entity is only applicable to derivative data.
+  type: string
+  format: label
+direction:
+  name: Phase-Encoding Direction
+  entity: dir
+  description: |
+    The `dir-<label>` key/value can be set to an arbitrary alphanumeric label
+    (for example, `dir-LR` or `dir-AP`) to distinguish different phase-encoding
+    directions.
+  type: string
+  format: label
+echo:
+  name: Echo
+  entity: echo
+  description: |
+    If files belonging to an entity-linked file collection are acquired at different
+    echo times, the `_echo-<index>` key/value pair MUST be used to distinguish
+    individual files.
+    This entity represents the `"EchoTime"` metadata field. Please note that the `<index>`
+    denotes the number/index (in the form of a nonnegative integer), not the
+    `"EchoTime"` value which needs to be stored in the field `"EchoTime"` of the separate
+    JSON file.
+  type: string
+  format: index
+flip:
+  name: Flip Angle
+  entity: flip
+  description: |
+    If files belonging to an entity-linked file collection are acquired at different
+    flip angles, the `_flip-<index>` key/value pair MUST be used to distinguish
+    individual files.
+    This entity represents the `"FlipAngle"` metadata field. Please note that the `<index>`
+    denotes the number/index (in the form of a nonnegative integer), not the `"FlipAngle"`
+    value which needs to be stored in the field `"FlipAngle"` of the separate JSON file.
+  type: string
+  format: index
+hemisphere:
+  name: Hemisphere
+  entity: hemi
+  description: |
+    The `hemi-<label>` entity indicates which hemibrain is described by the file.
+    Allowed label values for this entity are `L` and `R`, for the left and right
+    hemibrains, respectively.
+  type: string
+  format: label
+  enum:
+  - "L"
+  - "R"
+inversion:
+  name: Inversion Time
+  entity: inv
+  description: |
+    If files belonging to an entity-linked file collection are acquired at different
+    inversion times, the `_inv-<index>` key/value pair MUST be used to distinguish
+    individual files.
+    This entity represents the `"InversionTime` metadata field. Please note that the `<index>`
+    denotes the number/index (in the form of a nonnegative integer), not the `"InversionTime"`
+    value which needs to be stored in the field `"InversionTime"` of the separate JSON file.
+  type: string
+  format: index
+label:
+  name: Label
+  entity: label
+  description: |
+    Tissue-type label, following a prescribed vocabulary.
+    Applies to binary masks and probabilistic/partial volume segmentations
+    that describe a single tissue type.
+
+    This entity is only applicable to derivative data.
+  type: string
+  format: label
+modality:
+  name: Corresponding Modality
+  entity: mod
+  description: |
+    The `mod-<label>` key/value pair corresponds to modality label for defacing
+    masks, for example, T1w, inplaneT1, referenced by a defacemask image.
+    For example, `sub-01_mod-T1w_defacemask.nii.gz`.
+  type: string
+  format: label
+mtransfer:
+  name: Magnetization Transfer
+  entity: mt
+  description: |
+    If files belonging to an entity-linked file collection are acquired at different
+    magnetization transfer (MT) states, the `_mt-<label>` key/value pair MUST be used to
+    distinguish individual files.
+    This entity represents the `"MTState"` metadata field. Allowed label values for this
+    entity are `on` and `off`, for images acquired in presence and absence of an MT pulse,
+    respectively.
+  type: string
+  enum:
+  - "on"
+  - "off"
+part:
+  name: Part
+  entity: part
+  description: |
+    This entity is used to indicate which component of the complex
+    representation of the MRI signal is represented in voxel data.
+    The `part-<label>` key/value pair is associated with the DICOM Tag
+    `0008, 9208`.
+    Allowed label values for this entity are `phase`, `mag`, `real` and `imag`,
+    which are typically used in `part-mag`/`part-phase` or
+    `part-real`/`part-imag` pairs of files.
+
+    Phase images MAY be in radians or in arbitrary units.
+    The sidecar JSON file MUST include the units of the `phase` image.
+    The possible options are `"rad"` or `"arbitrary"`.
+
+    When there is only a magnitude image of a given type, the `part` key MAY be
+    omitted.
+  type: string
+  enum:
+  - mag
+  - phase
+  - real
+  - imag
+processing:
+  name: Processed (on device)
+  entity: proc
+  description: |
+    The proc label is analogous to rec for MR and denotes a variant of
+    a file that was a result of particular processing performed on the device.
+
+    This is useful for files produced in particular by Elekta's MaxFilter
+    (for example, `sss`, `tsss`, `trans`, `quat` or `mc`),
+    which some installations impose to be run on raw data because of active
+    shielding software corrections before the MEG data can actually be
+    exploited.
+  type: string
+  format: label
+reconstruction:
+  name: Reconstruction
+  entity: rec
+  description: |
+    The `rec-<label>` key/value can be used to distinguish
+    different reconstruction algorithms (for example `MoCo` for the ones using motion
+    correction).
+  type: string
+  format: label
+recording:
+  name: Recording
+  entity: recording
+  description: |
+    More than one continuous recording file can be included (with different
+    sampling frequencies).
+    In such case use different labels.
+    For example: `_recording-contrast`, `_recording-saturation`.
+  type: string
+  format: label
+resolution:
+  name: Resolution
+  entity: res
+  description: |
+    Resolution of regularly sampled N-dimensional data.
+    MUST have a corresponding `"Resolution"` metadata field to provide
+    interpretation.
+
+    This entity is only applicable to derivative data.
+  type: string
+  format: label
+run:
+  name: Run
+  entity: run
+  description: |
+    If several scans with the same acquisition parameters are acquired in the same session,
+    they MUST be indexed with the [`run-<index>`](../99-appendices/09-entities.md#run) entity:
+    `_run-1`, `_run-2`, `_run-3`, and so on (only nonnegative integers are allowed as
+    run labels).
+
+    If different entities apply,
+    such as a different session indicated by [`ses-<label>`](../99-appendices/09-entities.md#ses),
+    or different acquisition parameters indicated by
+    [`acq-<label>`](../99-appendices/09-entities.md#acq),
+    then `run` is not needed to distinguish the scans and MAY be omitted.
+  type: string
+  format: index
+sample:
+  name: Sample
+  entity: sample
+  description: |
+    A sample pertaining to a subject such as tissue, primary cell
+    or cell-free sample.
+    The `sample-<label>` key/value pair is used to distinguish between different
+    samples from the same subject.
+    The label MUST be unique per subject and is RECOMMENDED to be unique
+    throughout the dataset.
+  type: string
+  format: label
+session:
+  name: Session
+  entity: ses
+  description: |
+    A logical grouping of neuroimaging and behavioral data consistent across
+    subjects.
+    Session can (but doesn't have to) be synonymous to a visit in a
+    longitudinal study.
+    In general, subjects will stay in the scanner during one session.
+    However, for example, if a subject has to leave the scanner room and then
+    be re-positioned on the scanner bed, the set of MRI acquisitions will still
+    be considered as a session and match sessions acquired in other subjects.
+    Similarly, in situations where different data types are obtained over
+    several visits (for example fMRI on one day followed by DWI the day after)
+    those can be grouped in one session.
+    Defining multiple sessions is appropriate when several identical or similar
+    data acquisitions are planned and performed on all -or most- subjects,
+    often in the case of some intervention between sessions
+    (for example, training).
+  type: string
+  format: label
+space:
+  name: Space
+  entity: space
+  description: |
+    The space entity can be used to indicate
+    the way in which electrode positions are interpreted
+    (for EEG/MEG/iEEG data) or
+    the spatial reference to which a file has been aligned (for MRI data).
+    The space `<label>` MUST be taken from one of the modality specific lists in
+    [Appendix VIII](../99-appendices/08-coordinate-systems.md).
+    For example for iEEG data, the restricted keywords listed under
+    [iEEG Specific Coordinate Systems](../99-appendices/08-coordinate-systems.md#ieeg-specific-coordinate-systems)
+    are acceptable for `<label>`.
+
+    For EEG/MEG/iEEG data, this entity can be applied to raw data, but
+    for other data types, it is restricted to derivative data.
+  type: string
+  format: label
+split:
+  name: Split
+  entity: split
+  description: |
+    In the case of long data recordings that exceed a file size of 2Gb, the
+    .fif files are conventionally split into multiple parts.
+    Each of these files has an internal pointer to the next file.
+    This is important when renaming these split recordings to the BIDS
+    convention.
+
+    Instead of a simple renaming, files should be read in and saved under their
+    new names with dedicated tools like [MNE-Python](https://mne.tools/),
+    which will ensure that not only the file names, but also the internal file
+    pointers will be updated.
+    It is RECOMMENDED that .fif files with multiple parts use the
+    `split-<index>` entity to indicate each part.
+    If there are multiple parts of a recording and the optional `scans.tsv` is provided,
+    remember to list all files separately in `scans.tsv` and that the entries for the
+    `acq_time` column in `scans.tsv` MUST all be identical, as described in
+    [Scans file](../03-modality-agnostic-files.md#scans-file).
+  type: string
+  format: index
+stain:
+  name: Stain
+  entity: stain
+  description: |
+    The `stain-<label>` key/pair values can be used to distinguish image files
+    from the same sample using different stains or antibodies for contrast enhancement.
+    Stains SHOULD be indicated in the `"SampleStaining"` key in the sidecar JSON file,
+    although the label may be different.
+    Description of antibodies SHOULD also be indicated in `"SamplePrimaryAntibodies"`
+    and/or `"SampleSecondaryAntobodies"` as appropriate.
+  type: string
+  format: label
+subject:
+  name: Subject
+  entity: sub
+  description: |
+    A person or animal participating in the study.
+  type: string
+  format: label
+task:
+  name: Task
+  entity: task
+  description: |
+    Each task has a unique label that MUST only consist of letters and/or
+    numbers (other characters, including spaces and underscores, are not
+    allowed).
+    Those labels MUST be consistent across subjects and sessions.
+  type: string
+  format: label
+tracer:
+  name: Tracer
+  entity: trc
+  description: |
+    The `trc-<label>` key/value can be used to distinguish
+    sequences using different tracers.
+    The key `"TracerName"` MUST also be included in the associated JSON file,
+    although the label may be different.
+  type: string
+  format: label
diff --git a/heudiconv/bids/data/schema/rules/entities.yaml b/heudiconv/bids/data/schema/rules/entities.yaml
@@ -0,0 +1,30 @@
+---
+# This file simply defines the order in which entities should appear within filenames.
+# Entity definitions appear in the `objects/entities.yaml` file.
+- subject
+- session
+- sample
+- task
+- acquisition
+- ceagent
+- tracer
+- stain
+- reconstruction
+- direction
+- run
+- modality
+- echo
+- flip
+- inversion
+- mtransfer
+- part
+- processing
+- hemisphere
+- space
+- split
+- recording
+- chunk
+- resolution
+- density
+- label
+- description
diff --git a/setup.py b/setup.py
@@ -59,6 +59,9 @@ def findsome(subdir, extensions):
                         op.join('data', '*.dcm'),
                         op.join('data', '*', '*.dcm')
             ],
+            'heudiconv.bids.data.schema': [
+                '*/*.yaml'
+            ],
         }
     )