adding fastq_screen module #19

FranBonath · 2024-08-21T11:18:41Z

PR checklist

github-actions · 2024-08-21T11:20:12Z

`nf-core lint` overall result: Failed ❌

Posted for pipeline commit 2e58a3c

+| ✅ 173 tests passed       |+
!| ❗  21 tests had warnings |!
-| ❌   2 tests failed       |-

❌ Test failures:

template_strings - Found a Jinja template string in /home/runner/work/seqinspector/seqinspector/modules/nf-core/fastqscreen/references/genome_ecoli/genome.rev.1.bt2 L20005: �{{©_Ö·ªªÿ�]Êû�XmU}}
merge_markers - Merge marker '<<<<<<<' in /home/runner/work/seqinspector/seqinspector/modules/nf-core/fastqscreen/references/genome_cerevisiae/genome.4.bt2: Ïó°Ïü0ÃÌÌ÷�üÌ0��ÃÃ<ÏÀ0Ï�

❗ Test warnings:

readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
pipeline_todos - TODO string in main.nf: Remove this line if you don't need a FASTA file
pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
pipeline_todos - TODO string in README.md: TODO nf-core:
pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
pipeline_todos - TODO string in base.config: Check the defaults for all processes
pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-seqinspector_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-seqinspector_logo_light.png
files_exist - File found: docs/images/nf-core-seqinspector_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: modules.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-seqinspector_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/NfcoreTemplate.groovy
files_exist - File not found check: lib/Utils.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: lib/WorkflowMain.groovy
files_exist - File not found check: lib/WorkflowSeqinspector.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 1.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.custom_config_version= master
nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Config default value correct: params.max_cpus= 16
nextflow_config - Config default value correct: params.max_memory= 128.GB
nextflow_config - Config default value correct: params.max_time= 240.h
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-seqinspector_logo_light.png matches the template
files_unchanged - docs/images/nf-core-seqinspector_logo_light.png matches the template
files_unchanged - docs/images/nf-core-seqinspector_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
pipeline_name_conventions - Name adheres to nf-core convention
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: download_pipeline.yml
actions_schema_validation - Workflow validation passed: release-announcements.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: ci.yml
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
modules_config - conf/modules.config found and not ignored.
modules_config - FASTQC found in conf/modules.config and Nextflow scripts.
modules_config - MULTIQC_GLOBAL found in conf/modules.config and Nextflow scripts.
modules_config - MULTIQC_PER_TAG found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline
nfcore_yml - nf-core version in .nf-core.yml is set to the latest version: 2.14.1

Run details

nf-core/tools version 2.14.1
Run at 2024-09-18 09:20:34

FranBonath · 2024-08-21T11:20:29Z

@nf-core-bot fix linting, please :)

MatthiasZepper · 2024-08-21T16:49:42Z

I only had a quick glance (so no formal review yet), but I would prefer that we start using git lfs on this repo for managing the references and other large files. I am sure that we will have more modules that require large reference data:

```shell
git lfs install
git lfs track "*.bt2"
git lfs track "*.fa"
git add .gitattributes
git commit
```
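To double-check what the `git lfs track` calls end up recording, here is a minimal sketch that needs only plain git. It assumes `git lfs track` writes the usual `filter=lfs diff=lfs merge=lfs -text` attribute lines into `.gitattributes`, and it runs in a throwaway repository; the path `references/genome.1.bt2` is a hypothetical example, not a file from this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway repository so nothing real is touched.
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init --quiet

# Assumption: these are the lines that `git lfs track "*.bt2"` and
# `git lfs track "*.fa"` append to .gitattributes.
printf '%s\n' \
  '*.bt2 filter=lfs diff=lfs merge=lfs -text' \
  '*.fa filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ask git which filter applies to a (hypothetical) index file and to a script.
bt2_filter="$(git check-attr filter -- references/genome.1.bt2)"
nf_filter="$(git check-attr filter -- main.nf)"
echo "$bt2_filter"
echo "$nf_filter"
```

Only paths matching the tracked patterns should report the `lfs` filter; everything else stays a normal git object.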

Ideally, you would in the process also edit the history on your branch, because the previous commands would only apply to future files, but not the ones that you already committed. For this, it is necessary to rewrite the history and that will lead to diverging branches with your origin:

```text
On branch dev
Your branch and 'origin/dev' have diverged,
and have 1 and 1 different commits each, respectively.
(use "git pull" if you want to integrate the remote branch with yours)
```

Instead of pulling in the origin, however, (which would bring back the data we want to prune from history), we force push the rewritten history to the remote origin on GitHub. Then your origin dev branch is clean again and can be safely merged to the upstream repo in nf-core after approval of this PR:

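```shell
# keep a backup of the current state, then return to dev
git checkout -b "backup_dev"
git checkout dev
git lfs install
# optional history-rewriting tool; not used by the commands below
pip install git-filter-repo
# rewrite existing commits so matching files become LFS pointers
git lfs migrate import --include="*.bt2" --include="*.fa"
git lfs track "*.bt2"
git lfs track "*.fa"
git add .
git commit --amend
# drop the old objects locally, then overwrite the remote branch
git reflog expire --expire=now --single-worktree
git gc --prune=now --aggressive
git push --force-with-lease
```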
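One way to see whether the rewrite and `git gc` actually shrank the local repository is to compare `git count-objects -vH` before and after. The following is only a sketch: `report_repo_size` is a hypothetical helper name, and the demonstration runs on an empty scratch repository rather than on this pipeline.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print object-database statistics for the repository at the given path.
# "size-pack" is the figure expected to shrink once large blobs leave history.
report_repo_size() {
    git -C "$1" count-objects -vH
}

# Demonstration on a throwaway repository; point it at your own clone instead.
scratch_repo="$(mktemp -d)"
git init --quiet "$scratch_repo"
size_report="$(report_repo_size "$scratch_repo")"
echo "$size_report"
```

Running the helper on the real clone once before the migration and once after the `git gc` step gives two reports to compare.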

alneberg · 2024-08-22T06:37:36Z

The suggestion by @MatthiasZepper seems to be a bit of work but I think I agree. Probably worth getting git lfs up and running right away.

MatthiasZepper · 2024-08-22T09:27:43Z

> The suggestion by @MatthiasZepper seems to be a bit of work but I think I agree. Probably worth getting git lfs up and running right away.

The downside of git lfs is that Nextflow does not support it out of the box :-/ ... sounds like a good feature for a plugin, but alas.

FranBonath · 2024-09-09T09:01:46Z

As per our discussion in the dev meeting, we bench the git lfs implementation for now, right? And I will instead pivot to using iGenomes references for the tests, correct? @alneberg @MatthiasZepper

alneberg · 2024-09-09T09:05:35Z

Yes, git lfs is not an option at the moment I think. If a suitable genome is already present in iGenomes that would be perfect I think.

FranBonath · 2024-09-18T15:29:30Z

I tried to provide the references for the fastq screen test profile via iGenomes, which sits in an S3 bucket. The problem is that fastqscreen cannot read it. We have a few options, none of which I really like:

1) ship the pipeline with bowtie2build and build our own bowtie index. This is what the module test uses. I don't like it because we add a tool to our pipeline just to get the tests to run: a tool that can break and that has no impact on actually running the pipeline for real
2) what I am currently doing: providing the reference as part of the module. We can choose to include only one very small reference, for example PhiX. I hate hard-coding anything.
3) don't have tests :P

FranBonath · 2024-09-18T15:30:29Z

I tried to provide the references for the fastq screen test profile via iGenomes, which sits in an S3 bucket. The problem is that fastqscreen cannot read it. We have a few options, none of which I really like:

1) ship the pipeline with bowtie2build and build our own bowtie index. This is what the module test uses. I don't like it because we add a tool to our pipeline just to get the tests to run: a tool that can break and that has no impact on actually running the pipeline for real

2) what I am currently doing: providing the reference as part of the module. We can choose to include only one very small reference, for example PhiX. I hate hard-coding anything.

3) don't have tests :P

I had a lengthy discussion with @maxulysse about this, but we couldn't really agree.

Aratz · 2024-09-18T15:38:31Z

I'd say 2, if you choose a tiny reference.

I think in general bad tests are better than no tests. I don't think it's such a big issue that the reference is unrealistically small; the tests will still catch, e.g., fastq_screen crashing because of some config error.
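For illustration, option 2 could look roughly like the sketch below: a FastQ Screen config with a single `DATABASE` line pointing at a tiny bundled bowtie2 index. The index prefix `references/phix/genome` and the input `sample.fastq.gz` are hypothetical placeholders, not paths from this PR, and the tool is only invoked if it and the input happen to be present.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical location of the bundled bowtie2 index (basename without .N.bt2).
index_prefix="references/phix/genome"
conf_file="$(mktemp -d)/fastq_screen.conf"

# FastQ Screen config: one tab-separated DATABASE line per reference
# (keyword, display name, index basename).
printf 'DATABASE\t%s\t%s\n' "PhiX" "$index_prefix" > "$conf_file"
cat "$conf_file"

# Run the screen only when the tool and a (hypothetical) input file exist.
if command -v fastq_screen >/dev/null && [ -f "sample.fastq.gz" ]; then
    fastq_screen --conf "$conf_file" --aligner bowtie2 sample.fastq.gz
fi
```

Even with a single tiny reference, a broken config or a crashing fastq_screen call would surface in the test run.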

maxulysse · 2024-09-19T09:11:16Z

I agree with @Aratz, it's better to have bad tests than no tests.
I think option 2 works well for profile test, but for profile test_full we're going to need more than that, so why not go all the way already?

FranBonath added 2 commits August 20, 2024 17:39

troubleshooting fastqscreen WIP

3007c48

adding fastq screen to the pipeline

a099c04

FranBonath requested review from alneberg and MatthiasZepper August 21, 2024 11:18

Merge branch 'nf-core:dev' into dev

2e58a3c
