[DISCUSSION] A more concise CLI that is well behaved #113

Open
fabianegli opened this issue Apr 5, 2022 · 0 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@fabianegli
Collaborator

Looking at the sdrf-pipelines package, I see many different terms with a range of meanings and uses.

The string sdrf-pipelines is only used for the installation and the import of the package. I think that is fine.

Now for the CLI. It introduces the command parse_sdrf. I think this is not inherently a bad name if all the tool does is parse one or more SDRF files. But the tool actually does more: it validates SDRF files when called with parse_sdrf validate-sdrf ... and converts them to input files for other tools when called with parse_sdrf convert-openms ... Both validation and conversion require parsing, so the parse in parse_sdrf seems redundant and even somewhat misleading, because the name advertises parsing while the tool goes much further than that. That's why I think the name of the command line tool is not ideal.

Also note that the "conversion" is not actually a pure conversion of the information in the SDRF. In the case of the MaxQuant output it is also an enrichment. parse_sdrf as a command name doesn't do that justice.

Because of all the above, I propose adopting new CLI naming and behaviour as follows:

  • a command called sdrf which can be used to validate and write SDRF files.
  • sdrf validate only validates SDRF files. There might be something like a --strict flag that makes it accept only byte-perfect SDRF files and complain about trailing whitespace and other errors ignored by the permissive parser.
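To make the --strict idea a bit more concrete, here is a minimal sketch of one such check, assuming the SDRF file is tab-separated text. The function name strict_violations and the message format are made up for illustration and are not part of sdrf-pipelines; a real strict mode would presumably check more than whitespace.

```python
from __future__ import annotations


def strict_violations(text: str) -> list[str]:
    """Report cells with leading or trailing whitespace (hypothetical strict check)."""
    problems = []
    lines = text.split("\n")
    # A final newline leaves one empty trailing element; ignore it.
    if lines and lines[-1] == "":
        lines.pop()
    for number, line in enumerate(lines, start=1):
        for column, cell in enumerate(line.split("\t"), start=1):
            # str.strip() also removes a stray "\r" left over from CRLF line endings.
            if cell != cell.strip():
                problems.append(
                    f"line {number}, column {column}: leading or trailing whitespace"
                )
    return problems
```

A permissive parser could simply strip these cells and carry on, while the strict mode would report them and exit with a non-zero status.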

The CLI should be well behaved. One specific property that comes to mind is support for pipes: it would be great if input and output could be piped.
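One common way to support pipes is to treat "-" as standard input or standard output. The helper below is only a sketch of that convention; open_or_std is a hypothetical name and nothing like it exists in the package as far as this issue states.

```python
import sys
from contextlib import contextmanager


@contextmanager
def open_or_std(path, mode="r"):
    """Yield a text stream; "-" means stdin for reading and stdout for writing."""
    if path == "-":
        # Do not close the standard streams; the interpreter owns them.
        yield sys.stdout if "w" in mode else sys.stdin
    else:
        with open(path, mode, encoding="utf-8", newline="") as handle:
            yield handle
```

With something like this in place, diagnostics could go to stderr and converted output to stdout, so that the proposed sdrf command can sit in the middle of a shell pipeline.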

The name sdrf is currently NOT a

  • debian/ubuntu/redhat package name
  • Python package name
  • MacPorts package
  • bioconda package
  • biocontainer

I know there is already sdrf-pipelines/parse_sdrf/convert-openms/convert-maxquant/... But I think the tool is still relatively new and would profit from a change in the long run. Implementing the proposals above would also not require breaking the current syntax immediately.
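As a rough sketch of how the old syntax could keep working for a while, the legacy subcommands could be rewritten to the new ones with a deprecation warning. The mapping below is an assumption based on the convert syntax proposed further down in this issue; translate_legacy_args is a hypothetical helper, and it only rewrites the subcommand, not any legacy option names.

```python
import warnings

# Hypothetical mapping from existing parse_sdrf subcommands to the proposed sdrf syntax.
LEGACY_SUBCOMMANDS = {
    "validate-sdrf": ["validate"],
    "convert-openms": ["convert", "--to-format", "openms"],
    "convert-maxquant": ["convert", "--to-format", "maxquant"],
}


def translate_legacy_args(argv):
    """Rewrite a legacy argument list to the proposed syntax and warn about it."""
    if not argv or argv[0] not in LEGACY_SUBCOMMANDS:
        return list(argv)
    new_prefix = LEGACY_SUBCOMMANDS[argv[0]]
    replacement = " ".join(new_prefix)
    warnings.warn(
        f"'parse_sdrf {argv[0]}' is deprecated; use 'sdrf {replacement}' instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_prefix + list(argv[1:])
```

The parse_sdrf entry point could call such a function before handing the arguments to the new parser, and be removed after a deprecation period.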

For the conversion, there could be an analogous sdrf command:

  • sdrf convert --from-format [input-format] --in [file] --to-format [output-format] --out [file] [additional configurations] This command should also handle the --strict flag mentioned above and convert both from and to the SDRF file format. The format only has to be specified for the non-SDRF file.
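The proposed syntax could be sketched with the standard library's argparse roughly as follows. This is only an illustration of the option layout, not an implementation: the defaults ("sdrf" for both formats, "-" for stdin/stdout) are my assumptions, chosen so that the format only has to be given for the non-SDRF side and so that piping works without extra flags.

```python
import argparse


def build_parser():
    """Build a parser for the proposed (hypothetical) sdrf command."""
    parser = argparse.ArgumentParser(prog="sdrf")
    subcommands = parser.add_subparsers(dest="command", required=True)

    validate = subcommands.add_parser("validate", help="validate an SDRF file")
    # "in" is a Python keyword, so the value is stored under "input" instead.
    validate.add_argument("--in", dest="input", default="-")
    validate.add_argument("--strict", action="store_true")

    convert = subcommands.add_parser("convert", help="convert from or to SDRF")
    # Both formats default to "sdrf": only the non-SDRF side must be named.
    convert.add_argument("--from-format", default="sdrf")
    convert.add_argument("--to-format", default="sdrf")
    convert.add_argument("--in", dest="input", default="-")
    convert.add_argument("--out", dest="output", default="-")
    convert.add_argument("--strict", action="store_true")
    return parser
```

For example, build_parser().parse_args(["convert", "--in", "a.tsv", "--to-format", "openms"]) would leave from_format at "sdrf" and output at "-", that is, write the converted file to stdout.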

Thoughts and discussions are welcome.

@fabianegli fabianegli added help wanted Extra attention is needed question Further information is requested labels Apr 5, 2022