Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #18: "schema_input.json" does not accept uncompressed contigs #21

Merged

Conversation

sgsutcliffe
Copy link
Collaborator

@sgsutcliffe sgsutcliffe commented Jul 25, 2024

The input file for running staramrnf had a restrictive pattern for contigs column in the schema_input.json such that only compressed files with .gz file extension could be included. To allow for uncompressed files I modified the regex pattern.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.

@sgsutcliffe
Copy link
Collaborator Author

To test the change I ran the nextflow run . -profile test,docker --outdir results --input samplesheet.csv using uncompressed versions of the test/genomes with extensions .fna, .fasta and .fa using a modified samplesheet.csv

sample,contigs,species
GCA_000008105,tests/genomes/salmonella/GCA_000008105.1_ASM810v1_genomic.fasta,Salmonella
GCA_000947975,tests/genomes/ecoli/GCA_000947975.1_ASM94797v1_genomic.fa,Escherichia coli
GCF_000196035,tests/genomes/listeria/GCF_000196035.1_ASM19603v1_genomic.fna,Listeria monocytogenes

@sgsutcliffe
Copy link
Collaborator Author

I will include one of the uncompressed genomes to use as test for nf-test for CI

@sgsutcliffe
Copy link
Collaborator Author

sgsutcliffe commented Aug 6, 2024

The genome file paths in the tests/assets/test_samplesheet.csv and assets/samplesheet.csv used by nf-test and nextflow run phac-nml/staramrnf -profile docker,test, respectively, point to genomes via URLs on dev branch. I updated the files to uncompressed versions on the dev branch. See commits to dev to allow for CI to pass for this PR 52a476a 05b4a3c df69014

Copy link

@kylacochrane kylacochrane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great Steven 😄

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to include this uncompressed fasta file anymore since it's already been pushed to the dev branch. However, it shouldn't cause any issues, and keeping it might actually make merging easier, as the branches will align more closely... up to you 😆

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd made the commit prior to including the uncompressed genome on dev and just decided to do less work i.e. roll back the commit, but it does appear like redundant work 😆

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as it all works (e.g., no merge conflicts when merging back into dev) then you can leave as-is.

Copy link
Member

@apetkau apetkau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great Steven. Thanks so much. Just one small change on the error message being printed out (given in-line below).

assets/schema_input.json Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as it all works (e.g., no merge conflicts when merging back into dev) then you can leave as-is.

@sgsutcliffe sgsutcliffe changed the title Solves Issue #18: "schema_input.json" does not accept uncompressed contigs Fixes #18: "schema_input.json" does not accept uncompressed contigs Aug 7, 2024
Copy link
Member

@apetkau apetkau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks great to me. Thanks so much 😄

assets/schema_input.json Outdated Show resolved Hide resolved
Copy link
Member

@emarinier emarinier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks Steven.

@sgsutcliffe sgsutcliffe merged commit aa9f55f into dev Aug 8, 2024
4 checks passed
@sgsutcliffe sgsutcliffe deleted the 18-schema_inputjson-does-not-accept-uncompressed-contigs branch August 8, 2024 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants