[BUG] Deno validator #111

Open
12 of 16 tasks
jungheejung opened this issue Sep 11, 2024 · 5 comments
@jungheejung
Collaborator

jungheejung commented Sep 11, 2024

Which module is this from?

datalad

What is the issue?

Deno validator warning and errors

What was your expected behavior?

Full pass with no errors

How can we reproduce this?

Code: https://github.com/bids-standard/bids-validator/issues/2129

Any additional context?

  • 1 TODO ISSUE: JSON sidecar
	[WARNING] JSON_KEY_RECOMMENDED A JSON file is missing a key listed as recommended.
		DatasetType
		/dataset_description.json

		GeneratedBy
		/dataset_description.json

		SourceDatasets
		/dataset_description.json

	Please visit https://neurostars.org/search?q=JSON_KEY_RECOMMENDED for existing conversations about this issue.
	[WARNING] SIDECAR_KEY_RECOMMENDED A data file's JSON sidecar is missing a key listed as recommended.
	Please visit https://neurostars.org/search?q=SIDECAR_KEY_RECOMMENDED for existing conversations about this issue.
	[WARNING] HED_WARNING The validation on this HED string returned a warning.
		/sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv - WARNING: [UNITS_MISSING] No unit specified. Using "m" as the default - "X-position/45.62". TSV line: 11. (For more information on this HED warning, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#units-missing.)
		/sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv - WARNING: [UNITS_MISSING] No unit specified. Using "m" as the default - "X-position/35.0". TSV line: 12. (For more information on this HED warning, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#units-missing.)

		75994 more files with the same issue

	Please visit https://neurostars.org/search?q=HED_WARNING for existing conversations about this issue.
  • 4 DONE Add column descriptors to the metadata .json to match events.tsv → double-check by running the Deno validator
    - [x] task-shortvideo
    - [x] task-fractional
	[WARNING] TSV_ADDITIONAL_COLUMNS_UNDEFINED A TSV file has extra columns which are not defined in its associated JSON sidecar
		response_label
		/sub-0001/ses-03/func/sub-0001_ses-03_task-shortvideo_acq-mb8_run-01_events.tsv
		/sub-0133/ses-03/func/sub-0133_ses-03_task-shortvideo_acq-mb8_run-01_events.tsv

		97 more files with the same issue

		subtask_type
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		199 more files with the same issue

		event_type
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		196 more files with the same issue

		value
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractionaltomsaxe_acq-mb8_run-01_events.tsv

		101 more files with the same issue

		response_accuracy
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		196 more files with the same issue

		question
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv
		/sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		47 more files with the same issue

		participant_response
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv
		/sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		46 more files with the same issue

		normative_response
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv
		/sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		46 more files with the same issue

		button_press
		/sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv

		98 more files with the same issue

		cue_location
		/sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0009/ses-04/func/sub-0009_ses-04_task-fractional_acq-mb8_run-01_events.tsv

		46 more files with the same issue

		target_location
		/sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0009/ses-04/func/sub-0009_ses-04_task-fractional_acq-mb8_run-01_events.tsv

		46 more files with the same issue

		trial_index
		/sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0009/ses-04/func/sub-0009_ses-04_task-fractional_acq-mb8_run-01_events.tsv

		46 more files with the same issue

	Please visit https://neurostars.org/search?q=TSV_ADDITIONAL_COLUMNS_UNDEFINED for existing conversations about this issue.
  • 5 DONE task-fractional onset order sort → updated with commit 085cf88

	[WARNING] EVENT_ONSET_ORDER The onset column in events.tsv files should be sorted.

		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
		/sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

		97 more files with the same issue

	Please visit https://neurostars.org/search?q=EVENT_ONSET_ORDER for existing conversations about this issue.
	[WARNING] SUSPICIOUSLY_SHORT_EVENT_DESIGN The onset of the last event is less than half the total duration of the corresponding scan.
This design is suspiciously short.

		/sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-01_bold.nii.gz
		/sub-0003/ses-01/func/sub-0003_ses-01_task-social_acq-mb8_run-01_bold.nii.gz

		66 more files with the same issue

	Please visit https://neurostars.org/search?q=SUSPICIOUSLY_SHORT_EVENT_DESIGN for existing conversations about this issue.

To address the warning above, I wrote a script that prints the runs whose length deviates from the standard number of TRs: https://github.com/spatialtopology/spacetop-prep/blob/88a560c28dd8109df4c75f61a671f9c724f038bf/spacetop_prep/datalad/identify_shorterTR.py
Some runs are indeed shorter than expected because of partial data collection (e.g. the participant had an issue with the trackball, scanner failure, etc.).
Q. What's the best way to move forward? Add this information to scans.tsv?
@yarikoptic

sub-0001_ses-02_task-narratives_acq-mb8_run-01_bold.json has a dcmmeta_shape shorter than the standard. Value: 937
sub-0001_ses-02_task-narratives_acq-mb8_run-02_bold.json has a dcmmeta_shape shorter than the standard. Value: 1059
sub-0005_ses-04_task-fractional_acq-mb8_run-01_bold.json has a dcmmeta_shape shorter than the standard. Value: 5
sub-0013_ses-04_task-fractional_acq-mb8_run-01_bold.json has a dcmmeta_shape shorter than the standard. Value: 1234
sub-0055_ses-02_task-narratives_acq-mb8_run-04_bold.json has a dcmmeta_shape shorter than the standard. Value: 1126
sub-0069_ses-02_task-narratives_acq-mb8_run-03_bold.json has a dcmmeta_shape shorter than the standard. Value: 652
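
A minimal sketch of this kind of check, not the linked identify_shorterTR.py itself. It assumes each `_bold.json` sidecar has a `dcmmeta_shape` list whose last entry is the number of volumes, and that the expected volume count per task is supplied by hand. The function name `find_short_runs` and the `expected_volumes` mapping are hypothetical.

```python
import json
from pathlib import Path


def find_short_runs(bids_root, expected_volumes):
    """Return (filename, n_volumes) for BOLD sidecars shorter than expected.

    expected_volumes maps a task label (e.g. "narratives") to the volume
    count of a complete run; these numbers are supplied by the caller.
    """
    short = []
    for sidecar in sorted(Path(bids_root).rglob("*_bold.json")):
        # Pull the task label out of the BIDS filename (task-<label>).
        task = next(
            (part[len("task-"):] for part in sidecar.name.split("_")
             if part.startswith("task-")),
            None,
        )
        if task not in expected_volumes:
            continue
        meta = json.loads(sidecar.read_text())
        shape = meta.get("dcmmeta_shape")
        if not shape:
            continue
        n_volumes = shape[-1]  # assumption: the last dimension is time
        if n_volumes < expected_volumes[task]:
            short.append((sidecar.name, n_volumes))
    return short
```

The returned pairs could be written into a column of scans.tsv or kept as a list of known partial runs, whichever is decided on.
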
	[WARNING] SUSPICIOUSLY_LONG_EVENT_DESIGN The onset of the last event is after the total duration of the corresponding scan.
This design is suspiciously long.

		/sub-0004/ses-02/func/sub-0004_ses-02_task-faces_acq-mb8_run-01_bold.nii.gz
		/sub-0005/ses-04/func/sub-0005_ses-04_task-fractional_acq-mb8_run-01_bold.nii.gz

		12 more files with the same issue

	Please visit https://neurostars.org/search?q=SUSPICIOUSLY_LONG_EVENT_DESIGN for existing conversations about this issue.
  • 8 TODO sub-0009 task-narratives has no events data, only func. How to fix?
	[WARNING] TSV_VALUE_INCORRECT_TYPE_NONREQUIRED A value in a column did match the acceptable type for that column headers specified format.
		response_time
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-01_events.tsv - '"n/a"'
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-02_events.tsv - '"n/a"'

		2 more files with the same issue

		stim_file
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-01_events.tsv - '"n/a"'
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-02_events.tsv - '"n/a"'

		2 more files with the same issue

	Please visit https://neurostars.org/search?q=TSV_VALUE_INCORRECT_TYPE_NONREQUIRED for existing conversations about this issue.
  • 9 TODO task-fractional. Either name the task differently or temporarily remove the files
	[ERROR] HED_ERROR The validation on this HED string returned an error.
		/task-fractionalmemory_events.json - TypeError: Cannot convert undefined or null to object
		/task-fractionaltomsaxe_events.json - ERROR: [PLACEHOLDER_INVALID] HED value string "Property/Data-property/Data-marker/Temporal-marker/Onset" is missing a required placeholder. Sidecar key: "onset". (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#placeholder-invalid.)

		11251 more files with the same issue

	Please visit https://neurostars.org/search?q=HED_ERROR for existing conversations about this issue.
  • 10 DONE
	[ERROR] NOT_INCLUDED Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder.
		/code/spacetop-prep/.github/ISSUE_TEMPLATE/
		/code/spacetop-prep/spacetop_prep/

		1 more files with the same issue

	Please visit https://neurostars.org/search?q=NOT_INCLUDED for existing conversations about this issue.
  • 11 TODO update IntendedFor. Need to run populate_intended_for
	[ERROR] INTENDED_FOR 'IntendedFor' field needs to point to an existing file.
Files must be subject-relative paths or BIDS URIs.

		/sub-0001/ses-01/fmap/sub-0001_ses-01_acq-mb8_dir-ap_run-01_epi.nii.gz
		/sub-0001/ses-01/fmap/sub-0001_ses-01_acq-mb8_dir-pa_run-01_epi.nii.gz

		192 more files with the same issue

	Please visit https://neurostars.org/search?q=INTENDED_FOR for existing conversations about this issue.
  • NOT this issue. TODO: only the first trial has the task-alignvideo folder declared. Fix.
  • [ ]
	[ERROR] STIMULUS_FILE_MISSING A stimulus file was declared but not found in the dataset.

		/sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv
		/sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-02_events.tsv

		175 more files with the same issue

	Please visit https://neurostars.org/search?q=STIMULUS_FILE_MISSING for existing conversations about this issue.
  • 13 Not sure. Is it because it has 219 trials?
	[ERROR] HED_INTERNAL_ERROR An internal error occurred during HED validation.
		/sub-0001/ses-02/func/sub-0001_ses-02_task-faces_acq-mb8_run-01_events.tsv - ERROR: [GENERIC_ERROR] Internal error - message: "Attempting to access the onset of a TSV row without one.". (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#generic-error.)
		/sub-0001/ses-02/func/sub-0001_ses-02_task-faces_acq-mb8_run-02_events.tsv - ERROR: [GENERIC_ERROR] Internal error - message: "Attempting to access the onset of a TSV row without one.". (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#generic-error.)

		314 more files with the same issue

	Please visit https://neurostars.org/search?q=HED_INTERNAL_ERROR for existing conversations about this issue.
  • 14 IGNORE
	[ERROR] TSV_VALUE_INCORRECT_TYPE A value in a column did match the acceptable type for that column headers specified format.
		onset
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-01_events.tsv - '"n/a"'
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-02_events.tsv - '"n/a"'

		2 more files with the same issue

		duration
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-01_events.tsv - '"n/a"'
		/sub-0009/ses-02/func/sub-0009_ses-02_task-narratives_acq-mb8_run-02_events.tsv - '"n/a"'

		2 more files with the same issue

	Please visit https://neurostars.org/search?q=TSV_VALUE_INCORRECT_TYPE for existing conversations about this issue.
  • 15 IGNORE
	[ERROR] EMPTY_FILE Empty files not allowed.
		/sourcedata/d_beh/

	Please visit https://neurostars.org/search?q=EMPTY_FILE for existing conversations about this issue.
  • 16 delete
	[ERROR] SIDECAR_WITHOUT_DATAFILE A json sidecar file was found without a corresponding data file
		/task-fractionalmemory_events.json

	Please visit https://neurostars.org/search?q=SIDECAR_WITHOUT_DATAFILE for existing conversations about this issue.
      Summary:                           Available Tasks:        Available Modalities:
      26363 Files, 2.22 TB               alignvideo              MRI                  
      117 - Subjects 4 - Sessions        faces                                        
                                         fractional                                   
                                         narratives                                   
                                         shortvideo                                   
                                         social                                       

If you have any questions, please post on https://neurostars.org/tags/bids.
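
Items 4 and 5 above (TSV_ADDITIONAL_COLUMNS_UNDEFINED and EVENT_ONSET_ORDER) can be checked per file without rerunning the full validator. The following is a rough sketch under assumptions: `SPEC_COLUMNS` is a hand-written guess at the columns the specification already defines for events files, and both function names are hypothetical. It is not how the validator itself implements these checks.

```python
import csv
import io

# Assumption: columns the BIDS specification already defines for events
# files, so they need no sidecar entry. Adjust for your validator version.
SPEC_COLUMNS = {"onset", "duration", "trial_type", "response_time", "HED", "stim_file"}


def undefined_columns(tsv_text, sidecar):
    """Return TSV header columns that the sidecar dict does not describe."""
    lines = tsv_text.splitlines()
    if not lines:
        return []
    header = lines[0].split("\t")
    return [c for c in header if c not in SPEC_COLUMNS and c not in sidecar]


def onsets_sorted(tsv_text):
    """Return True if the numeric onsets are in non-decreasing order."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Skip n/a onsets; they are a separate problem (see item 14).
    onsets = [float(r["onset"]) for r in rows if r["onset"] != "n/a"]
    return all(a <= b for a, b in zip(onsets, onsets[1:]))
```

Both functions take the file contents as text and the sidecar as an already-loaded dict, so they can be run on a single events.tsv and its task-level .json.
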
@jungheejung
Collaborator Author

jungheejung commented Sep 11, 2024

@Zizhuang-Miao Could I get your help on resolving this HED warning? It's the 3rd issue in the output above

	[WARNING] HED_WARNING The validation on this HED string returned a warning.
		/sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv - WARNING: [UNITS_MISSING] No unit specified. Using "m" as the default - "X-position/45.62". TSV line: 11. (For more information on this HED warning, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#units-missing.)
		/sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv - WARNING: [UNITS_MISSING] No unit specified. Using "m" as the default - "X-position/35.0". TSV line: 12. (For more information on this HED warning, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#units-missing.)

		75994 more files with the same issue

	Please visit https://neurostars.org/search?q=HED_WARNING for existing conversations about this issue.

Currently, this is the key-value pair in the task-alignvideo_events.json file:

    "response_value": {
        "LongName": "The value of the rating",
        "Description": "This value ranges from 0 ('Barely at all') to 100 ('Strongest imaginable'). Note that if the 'duration' of one rating event was 'n/a', the response value would also be 'n/a'.",
        "HED": "(X-position/#, Agent-action, (Press, Mouse-button, Scroll-wheel))"
    }

Moving forward, it would be nice to validate the HED tags with a standalone HED validator, since running bids-validator on the entire dataset is inefficient for debugging.
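
As a first step toward checking HED tags outside bids-validator, here is a minimal sketch that only pulls the HED strings out of a loaded sidecar and flags value columns whose string lacks the `#` placeholder (the PLACEHOLDER_INVALID case reported above). It does not call any HED validator. The function names are hypothetical, and it assumes a HED entry is either one string (value column) or a dict of category to string (categorical column).

```python
def collect_hed_strings(sidecar):
    """Return (column, level, hed_string) for every HED entry in a sidecar.

    level is None for value columns (HED given as one string) and the
    category name for categorical columns (HED given as a dict).
    """
    found = []
    for column, spec in sidecar.items():
        hed = spec.get("HED") if isinstance(spec, dict) else None
        if isinstance(hed, str):
            found.append((column, None, hed))
        elif isinstance(hed, dict):
            for level, text in hed.items():
                found.append((column, level, text))
    return found


def missing_placeholder(sidecar):
    """Return value columns whose HED string lacks the '#' placeholder."""
    return [col for col, level, text in collect_hed_strings(sidecar)
            if level is None and "#" not in text]
```

The collected strings can then be pasted into whichever standalone HED validator we settle on, one sidecar at a time.
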

@Zizhuang-Miao
Contributor

@jungheejung I looked into this issue, and I don't think this warning can be elegantly avoided. HED expects a tag that takes a value to be followed by a unit (https://hed-specification.readthedocs.io/en/latest/03_HED_formats.html#tags-that-take-values; please also see the examples in section 3.2.2 right above it). In our experiments the rating values are either unitless (a relative number between 0 and 100) or in pixels, and pixel is not in the list of allowed HED units (https://hed-specification.readthedocs.io/en/latest/Appendix_A.html#a-1-1-unit-classes-and-units). I suggest that we ignore this warning for now.

@jungheejung
Collaborator Author

Awesome, appreciate you taking a look into this HED warning, @Zizhuang-Miao. In that case, we'll ignore it for now.

@yarikoptic
Collaborator

re 6 -- you say

Some runs are indeed shorter than expected, due to partial data collection (e.g. participant had issue with trackball, scanner failure etc)

but it does not look like a "scanner failure": the events file is shorter than the data file, so data collection was fine.
Overall, after you handle these (what about the other 60?), the warning could be added to the ignored list, I guess.
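
To tell the two cases apart per run (events shorter than the scan versus events running past the scan), the rule stated in the validator messages can be sketched as below. This is a hedged illustration based only on the wording of SUSPICIOUSLY_SHORT_EVENT_DESIGN and SUSPICIOUSLY_LONG_EVENT_DESIGN. The function name is hypothetical, and the volume count and repetition time are assumed to come from the BOLD file and its sidecar.

```python
def classify_event_design(last_onset, n_volumes, repetition_time):
    """Compare the last event onset (seconds) with the scan duration (seconds)."""
    scan_duration = n_volumes * repetition_time
    if last_onset > scan_duration:
        return "long"   # events continue after the scan ended
    if last_onset < scan_duration / 2:
        return "short"  # events stop before the midpoint of the scan
    return "ok"
```

A "short" result with a full-length scan points at the events file, while a "long" result points at a truncated scan.
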

@jungheejung
Collaborator Author

jungheejung commented Oct 11, 2024

  • Error 1
        [ERROR] HED_ERROR The validation on this HED string returned an error.
                /task-social_events.json - ERROR: [TAG_EXTENSION_INVALID] "Data-property" appears as "Property/Data-property" and cannot be used as an extension. Indices ([object Object], ). (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#tag-extension-invalid.)
                /task-social_events.json - ERROR: [TAG_EXTENSION_INVALID] "Data-property" appears as "Property/Data-property" and cannot be used as an extension. Indices ([object Object], ). (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#tag-extension-invalid.)

                9269 more files with the same issue

        Please visit https://neurostars.org/search?q=HED_ERROR for existing conversations about this issue.
  • ERROR 2
        [ERROR] NOT_INCLUDED Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder.
                /code/spacetop-prep/.github/ISSUE_TEMPLATE/
                /code/spacetop-prep/spacetop_prep/

                1 more files with the same issue

        Please visit https://neurostars.org/search?q=NOT_INCLUDED for existing conversations about this issue.
  • ERROR 3
        [ERROR] INTENDED_FOR 'IntendedFor' field needs to point to an existing file.
Files must be subject-relative paths or BIDS URIs.

                /sub-0001/ses-01/fmap/sub-0001_ses-01_acq-mb8_dir-ap_run-01_epi.nii.gz
                /sub-0001/ses-01/fmap/sub-0001_ses-01_acq-mb8_dir-pa_run-01_epi.nii.gz

                188 more files with the same issue
        Please visit https://neurostars.org/search?q=INTENDED_FOR for existing conversations about this issue.
  • ERROR 4
        [ERROR] HED_INTERNAL_ERROR An internal error occurred during HED validation.
                /sub-0001/ses-02/func/sub-0001_ses-02_task-faces_acq-mb8_run-01_events.tsv - ERROR: [GENERIC_ERROR] Internal error - message: "Attempting to access the onset of a TSV row without one.". (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#generic-error.)
                /sub-0001/ses-02/func/sub-0001_ses-02_task-faces_acq-mb8_run-02_events.tsv - ERROR: [GENERIC_ERROR] Internal error - message: "Attempting to access the onset of a TSV row without one.". (For more information on this HED error, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#generic-error.)

                318 more files with the same issue

        Please visit https://neurostars.org/search?q=HED_INTERNAL_ERROR for existing conversations about this issue.
  • ERROR 5
        [ERROR] STIMULUS_FILE_MISSING A stimulus file was declared but not found in the dataset.

                /sub-0001/ses-03/func/sub-0001_ses-03_task-shortvideo_acq-mb8_run-01_events.tsv
                /sub-0133/ses-03/func/sub-0133_ses-03_task-shortvideo_acq-mb8_run-01_events.tsv

                121 more files with the same issue

        Please visit https://neurostars.org/search?q=STIMULUS_FILE_MISSING for existing conversations about this issue.

SOLUTION: git grep -l task-shortvideos/ | xargs sed -i -e 's,task-shortvideos/,task-shortvideo/,g'

  • ERROR 6
        [ERROR] EMPTY_FILE Empty files not allowed.
                /sourcedata/d_beh/

        Please visit https://neurostars.org/search?q=EMPTY_FILE for existing conversations about this issue.
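
To confirm the STIMULUS_FILE_MISSING fix (ERROR 5) on a single file before rerunning the validator, a sketch like the following could list the `stim_file` values that do not resolve. It assumes `stim_file` paths are relative to the dataset's `stimuli/` directory, and it treats a dangling symlink as present, since annexed files without fetched content look that way in a datalad dataset. The function name is hypothetical.

```python
import csv
from pathlib import Path


def missing_stim_files(events_tsv, bids_root):
    """Return stim_file values in one events.tsv that are absent from stimuli/."""
    stimuli_dir = Path(bids_root) / "stimuli"
    missing = []
    with open(events_tsv, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            stim = row.get("stim_file")
            if not stim or stim == "n/a":
                continue
            target = stimuli_dir / stim
            # An annexed file without content is a dangling symlink: count
            # it as present, because the path itself is declared correctly.
            present = target.exists() or target.is_symlink()
            if not present and stim not in missing:
                missing.append(stim)
    return missing
```
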


3 participants