-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bash *
wildcard for extensions — Regex compliance for YAML
#1055
Comments
Just an initial thought but we now have |
@tsalo thank you for the rapid response :)
Another suggestion I've heard floating around (can't remember where) is that perhaps all YAML schema entries should have dedicated regex representations which can then be referenced for regex matching. |
But we do have a variable here, so the name and the search pattern can be dissociated (preferably unlike the rest of the suffixes).
I would argue that literal matching should be assumed unless there is a .*:
name: Any Extension
description: |
Any extension is allowed.
pattern: '\.[0-9a-zA-Z]+'
.ave:
name: AVE # not sure what ave stands for
description: |
File containing data averaged by segments of interest.
Used by KIT, Yokogawa, and Ricoh MEG systems. I'd like to know what others think of this though. |
Use regex, but only good regex. |
I think we are already doing this now. |
I recently ran into this https://github.com/bids-standard/bids-specification/blob/master/src/schema/rules/datatypes/meg.yaml#L63 which messes with regex (
re.error: nothing to repeat at position 172
). Well, the position part being irrelevant.The issue arises from the fact that
*
means different things in Bash and Regex. In Regex it's a modifier for the preceding character meaning zero or more occurrences.Interestingly this also potentially compounds the confusion arising from the leading period in extensions (discussed here). This is because in Bash,
.*
means period followed by anything, whereas in Regex it means any character occurring zero or more times (so basically anything).Luckily I had already implemented a period-safe parsing for the extensions so instead of false negatives the validator gives an outright error. Honestly this was a pretty solid dodging of a bullet since this is the sort of thing that could have gone unnoticed for a very long time.
I ofc understand that bash syntax makes the YAML more legible for casual reference, but perhaps its main utility lies more in programmatic usability than manual lookup.
So I guess what I'm asking is whether you agree that it would make sense to make YAML enries regex-comliant. Given that we do not generally use special characters, that would basically just be this one entry, and also involve a small change to the
tools/schemacode/schemacode/render.py
file which I had on the roadmap anyway.It would also be a great opportunity to address the leading periods, though I'm unsure whether it could be said that we agree on that.
@yarikoptic @tsalo
The text was updated successfully, but these errors were encountered: