feat: add consensus peaks subworkflow #37

Merged 53 commits on Nov 29, 2023
4f98105
feat: add consensus peaks process [wip]
kelly-sovacool Nov 21, 2023
fec6ce4
chore: Merge branch 'main' into consensus-peaks
kelly-sovacool Nov 22, 2023
f0360dd
fix: escape backslashes for nextflow template script
kelly-sovacool Nov 22, 2023
a66b209
chore: Merge branch 'main' into consensus-peaks
kelly-sovacool Nov 22, 2023
18ff6af
fix: escape dollar signs for nf template
kelly-sovacool Nov 22, 2023
5e6f65d
test: use macs2 peaks as test data
kelly-sovacool Nov 22, 2023
d7b2f07
fix: metadata for consensuspeaks
kelly-sovacool Nov 22, 2023
1a5554b
test: add real md5sums for custom/consensuspeaks
kelly-sovacool Nov 22, 2023
3aca9c5
fix: meta syntax
kelly-sovacool Nov 22, 2023
c538767
fix: write versions
kelly-sovacool Nov 22, 2023
a19ae45
chore: Merge branch 'main' into consensus-peaks
kelly-sovacool Nov 22, 2023
721a740
chore: Merge branch 'main' into consensus-peaks
kelly-sovacool Nov 22, 2023
861460f
feat: adapt bedtools modules from nf-core
kelly-sovacool Nov 22, 2023
4f0e5a2
refactor: minor cosmetic changes
kelly-sovacool Nov 22, 2023
01855f7
chore: update changelog
kelly-sovacool Nov 22, 2023
514ebfd
fix: import platform
kelly-sovacool Nov 27, 2023
2d0ddb3
fix: switch nf-core -> ccbr
kelly-sovacool Nov 27, 2023
73a74a6
fix: use ccbr base docker
kelly-sovacool Nov 27, 2023
707b093
test: don't check md5sum of versions file
kelly-sovacool Nov 27, 2023
73705db
test: stub for custom/consensuspeaks module
kelly-sovacool Nov 27, 2023
b716bca
docs: add bedtools modules
kelly-sovacool Nov 27, 2023
cbf2d59
feat: add cat modules
kelly-sovacool Nov 27, 2023
075450e
feat: add bedops/bedmap module
kelly-sovacool Nov 28, 2023
624f56c
style: align includes
kelly-sovacool Nov 28, 2023
dff08c9
test: add test for bedops/bedmap
kelly-sovacool Nov 28, 2023
b45f529
feat: set default prefixes for bedtools processes
kelly-sovacool Nov 28, 2023
35157e0
fix: propagate metadata correctly
kelly-sovacool Nov 28, 2023
6967a9f
chore: Merge branch 'main' into consensus-peaks
kelly-sovacool Nov 28, 2023
8664362
test: update test data config paths
kelly-sovacool Nov 28, 2023
d46e529
refactor: fix ref bed name
kelly-sovacool Nov 28, 2023
8e31f34
refactor: don't delete intermediate files
kelly-sovacool Nov 28, 2023
dbc2219
feat: keep ref meta.id in bedmap outfiles
kelly-sovacool Nov 29, 2023
e11988d
feat: create custom/combinepeaks module for consensus subwf
kelly-sovacool Nov 29, 2023
ed45eb8
fix: template is nxf command, not bash
kelly-sovacool Nov 29, 2023
09f7898
fix: define vars for Rscript template
kelly-sovacool Nov 29, 2023
9f5040a
feat: sort bed file to match original custom/consensuspeaks.py
kelly-sovacool Nov 29, 2023
9639bb7
test: create test yml for custom/combinepeaks
kelly-sovacool Nov 29, 2023
1d65a57
refactor: rename 'combinepeaks' -> 'combinepeakcounts'
kelly-sovacool Nov 29, 2023
2b5f3f7
feat: create module to normalize consensus peaks
kelly-sovacool Nov 29, 2023
097db63
refactor: set nxf variables as defaults
kelly-sovacool Nov 29, 2023
01f7c4d
docs: update metadata for peak-related modules
kelly-sovacool Nov 29, 2023
3e97ce2
docs: fix typo
kelly-sovacool Nov 29, 2023
80f9ac2
feat: create consensus_peaks subwf
kelly-sovacool Nov 29, 2023
76110b2
docs: add consensus_peaks subwf to changelog
kelly-sovacool Nov 29, 2023
8692e33
docs: point users to consensus_peaks subwf from legacy module
kelly-sovacool Nov 29, 2023
02bbaca
chore: Merge branch 'main' into consensus-peaks
kelly-sovacool Nov 29, 2023
fe9a7da
docs: move new modules & subwfs for next dev version
kelly-sovacool Nov 29, 2023
f81e205
docs: license must be an array
kelly-sovacool Nov 29, 2023
7cbeb70
test: create test yml for consensus_peaks subwf
kelly-sovacool Nov 29, 2023
0773762
fix: typo in param name
kelly-sovacool Nov 29, 2023
5dea69a
test: add test for normalized & mixed consensus groups
kelly-sovacool Nov 29, 2023
74d899d
fix: mix process versions channel
kelly-sovacool Nov 29, 2023
8917fd9
test: fix workflow entry name to mirror new module name
kelly-sovacool Nov 29, 2023
14 changes: 13 additions & 1 deletion CHANGELOG.md
@@ -2,8 +2,20 @@

### New modules

- bedops/bedmap (#37)
- bedtools/map (#37)
- bedtools/merge (#37)
- bedtools/sort (#37)
- cat/cat (#37)
- cat/fastq (#37)
- custom/combinepeakcounts (#37)
- custom/consensuspeaks (#37)
- custom/normalizepeaks (#37)

### New subworkflows

- consensus_peaks (#37)

## nf-modules 0.1.0

Our documentation website is now live: <https://ccbr.github.io/nf-modules/> (#16)
@@ -24,4 +36,4 @@ Our documentation website is now live: <https://ccbr.github.io/nf-modules/> (#16)

### New subworkflows

-- custom/filter_blacklist (#17,#27)
+- filter_blacklist (#17,#27)
2 changes: 2 additions & 0 deletions README.md
@@ -46,7 +46,9 @@ Want to **contribute** to this project? Check out the [contributing guidelines](
Many of the modules and subworkflows in this project reuse and adapt code from [nf-core/modules](https://github.com/nf-core/modules).
In those cases, credit is noted in the `meta.yml` file of the module/subworkflow and also listed here:

- [bedtools](modules/CCBR/bedtools) adapts the [nf-core bedtools module](https://github.com/nf-core/modules/tree/fff2c3fc7cdcb81a2a37c3263b8ace9b353af407/modules/nf-core/bedtools)
- [bwa](modules/CCBR/bwa) adapts the [nf-core bwa module](https://github.com/nf-core/chipseq/tree/51eba00b32885c4d0bec60db3cb0a45eb61e34c5/modules/nf-core/modules/bwa)
- [cat](modules/cat) adapts the [nf-core cat module](https://github.com/nf-core/modules/tree/9326d73af3fbe2ee90d9ce0a737461a727c5118e/modules/nf-core/cat)
- [cutadapt](modules/CCBR/cutadapt) adapts the [nf-core cutadapt module](https://github.com/nf-core/modules/tree/ef007b1ce5316506b8c27c3e7a62482409c6153c/modules/nf-core/cutadapt)
- [khmer](modules/CCBR/khmer) adapts the [nf-core khmer module](https://github.com/nf-core/modules/tree/b48a1efc8e067502e1a9bafbac788c1e0abdfc6a/modules/nf-core/khmer)
- [picard/samtofastq](modules/picard/samtofastq) adapts the [nf-core gatk4 samtofastq module](https://github.com/nf-core/modules/tree/ef007b1ce5316506b8c27c3e7a62482409c6153c/modules/nf-core/gatk4/samtofastq)
41 changes: 41 additions & 0 deletions modules/CCBR/bedops/bedmap/main.nf
@@ -0,0 +1,41 @@
process BEDOPS_BEDMAP {
    tag "${meta.id}.${refmeta.id}"
    label 'process_single'
    container 'nciccbr/ccbr_ubuntu_base_20.04:v6.1'

    input:
    tuple val(meta), path(mapbed), val(refmeta), path(refbed)

    output:
    tuple val(meta), path("*.mapped.bed"), emit: bed
    path "versions.yml"                  , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    """
    bedmap \\
        --delim '\t' \\
        --echo-ref-name \\
        --count \\
        ${refbed} \\
        ${mapbed} \\
        > ${meta.id}.${refmeta.id}.mapped.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedops: \$(echo \$(bedops --version 2>&1 | grep version | sed 's/version: //'))
    END_VERSIONS
    """

    stub:
    """
    touch ${meta.id}.${refmeta.id}.mapped.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedops: \$(echo \$(bedops --version 2>&1 | grep version | sed 's/version: //'))
    END_VERSIONS
    """
}
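The nested `echo $(...)` in the `versions.yml` heredoc above is easy to misread. The following is a minimal sketch of what it does, assuming `bedops --version` prints a line resembling `  version:  2.4.41 (typical)`. The sample text and the `sample_version_output` helper are assumptions for illustration, not captured output.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `bedops --version 2>&1`; the exact text is an assumption.
sample_version_output() {
    printf 'bedops\n  citation: (elided)\n  version:  2.4.41 (typical)\n  authors:  (elided)\n'
}

# Same pipeline as in the module: keep the version line, drop the "version: " label.
# Leading whitespace from the original line is still present at this point.
raw=$(sample_version_output | grep version | sed 's/version: //')

# Wrapping the pipeline in an unquoted command substitution passed to echo
# lets the shell split the words and rejoin them with single spaces.
clean=$(echo $(sample_version_output | grep version | sed 's/version: //'))

printf 'raw:   [%s]\n' "$raw"
printf 'clean: [%s]\n' "$clean"
```

Without the outer `echo`, the value written to `versions.yml` would carry the leading padding from the tool's output.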
55 changes: 55 additions & 0 deletions modules/CCBR/bedops/bedmap/meta.yml
@@ -0,0 +1,55 @@
---
name: "bedops_bedmap"
description: The bedmap program is used to retrieve and process signal or other features over regions of interest in BED files
keywords:
  - bedops
  - bedmap
  - bed
  - intervals
tools:
  - bedops:
      description: |
        Fast, highly scalable and easily parallelizable genome analysis toolkit
      documentation: https://bedops.readthedocs.io/
      tool_dev_url: https://github.com/bedops/bedops
      licence: ["GPLv2"]
      doi: 10.1093/bioinformatics/bts277

input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. `[ id:'test', single_end:false ]`

  - refbed:
      type: file
      description: Reference BED file; one output line is written per reference interval
      pattern: "*.bed"
  - mapbed:
      type: file
      description: BED file whose elements are counted over each reference interval
      pattern: "*.bed"

output:
  #Only when we have meta
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. `[ id:'test', single_end:false ]`

  - bed:
      type: file
      description: BED file with the reference interval name and the count of overlapping map elements
      pattern: "*.mapped.bed"

  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"

authors:
  - "@kelly-sovacool"
maintainers:
  - "@kelly-sovacool"
55 changes: 55 additions & 0 deletions modules/CCBR/bedtools/map/main.nf
@@ -0,0 +1,55 @@
process BEDTOOLS_MAP {
    tag { meta.id }
    label 'process_single'

    container 'nciccbr/ccbr_ubuntu_base_20.04:v6.1'

    input:
    tuple val(meta), path(intervals1), path(intervals2)
    tuple val(meta2), path(chrom_sizes)

    output:
    tuple val(meta), path("*.${extension}"), emit: map
    path "versions.yml"                    , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}.mapped"
    extension = intervals1.getExtension()
    def sizes = chrom_sizes ? "-g ${chrom_sizes}" : ''
    if ("$intervals1" == "${prefix}.${extension}" ||
        "$intervals2" == "${prefix}.${extension}")
        error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!"
    """
    bedtools \\
        map \\
        -a ${intervals1} \\
        -b ${intervals2} \\
        ${args} \\
        ${sizes} \\
        > ${prefix}.${extension}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    END_VERSIONS
    """

    stub:
    def prefix = task.ext.prefix ?: "${meta.id}.mapped"
    extension = intervals1.getExtension()
    if ("${intervals1}" == "${prefix}.${extension}" ||
        "${intervals2}" == "${prefix}.${extension}")
        error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!"
    """
    touch ${prefix}.${extension}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    END_VERSIONS
    """
}
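To make the effect of `BEDTOOLS_MAP` concrete, here is a naive sketch of what `bedtools map` computes when `task.ext.args` is empty, assuming the tool's documented defaults (sum column 5 of the overlapping B intervals, print `.` when nothing overlaps). `map_sum_scores` is a hypothetical helper written for this illustration; it is not how bedtools is implemented, and it scans every B interval for every A interval, so it is only suitable for toy inputs.

```shell
#!/usr/bin/env bash
# Naive illustration of `bedtools map` defaults: for every interval in A,
# sum column 5 of the B intervals that overlap it by at least 1 bp.
# map_sum_scores is a hypothetical helper, not part of bedtools.
map_sum_scores() {
    # $1 = A intervals (BED), $2 = B intervals (BED with a score in column 5)
    awk 'BEGIN { FS = OFS = "\t" }
         NR == FNR {                      # first file on the command line: B
             chrom[NR] = $1; start[NR] = $2; end[NR] = $3; score[NR] = $5
             nb = NR
             next
         }
         {                                # second file: A
             sum = 0; hits = 0
             for (i = 1; i <= nb; i++)
                 if (chrom[i] == $1 && start[i] < $3 && end[i] > $2) {
                     sum += score[i]; hits++
                 }
             print $0, (hits ? sum : ".")
         }' "$2" "$1"
}

# Toy inputs: two A intervals, three B peaks with scores in column 5.
a=$(mktemp); b=$(mktemp)
printf 'chr1\t0\t100\nchr1\t200\t300\n' > "$a"
printf 'chr1\t10\t20\tp1\t5\nchr1\t50\t60\tp2\t7\nchr2\t0\t10\tp3\t1\n' > "$b"
map_sum_scores "$a" "$b"
rm -f "$a" "$b"
```

The first A interval overlaps two peaks and gets their summed score appended; the second overlaps none and gets `.`.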
55 changes: 55 additions & 0 deletions modules/CCBR/bedtools/map/meta.yml
@@ -0,0 +1,55 @@
name: bedtools_map
description: Allows one to screen for overlaps between two sets of genomic features. Adapted from https://github.com/nf-core/modules/tree/fff2c3fc7cdcb81a2a37c3263b8ace9b353af407/modules/nf-core/bedtools
keywords:
  - bed
  - vcf
  - gff
  - map
  - bedtools
tools:
  - bedtools:
      description: |
        A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
      documentation: https://bedtools.readthedocs.io/en/latest/content/tools/map.html
      licence: ["MIT"]
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - intervals1:
      type: file
      description: BAM/BED/GFF/VCF
      pattern: "*.{bed,gff,vcf}"
  - intervals2:
      type: file
      description: BAM/BED/GFF/VCF
      pattern: "*.{bed,gff,vcf}"
  - meta2:
      type: map
      description: |
        Groovy Map containing reference chromosome sizes
        e.g. [ id:'test' ]
  - chrom_sizes:
      type: file
      description: Chromosome sizes file
      pattern: "*{.sizes,.txt}"
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - map:
      type: file
      description: File containing the description of overlaps found between the features in A and the features in B, with statistics
      pattern: "*.${extension}"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
authors:
  - "@kelly-sovacool"
maintainers:
  - "@kelly-sovacool"
44 changes: 44 additions & 0 deletions modules/CCBR/bedtools/merge/main.nf
@@ -0,0 +1,44 @@
process BEDTOOLS_MERGE {
    tag { meta.id }
    label 'process_single'

    container 'nciccbr/ccbr_ubuntu_base_20.04:v6.1'

    input:
    tuple val(meta), path(bed)

    output:
    tuple val(meta), path('*.bed'), emit: bed
    path "versions.yml"           , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}.merged"
    if ("$bed" == "${prefix}.bed") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!"
    """
    bedtools \\
        merge \\
        -i ${bed} \\
        ${args} \\
        > ${prefix}.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    END_VERSIONS
    """

    stub:
    def prefix = task.ext.prefix ?: "${meta.id}.merged"
    """
    touch ${prefix}.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    END_VERSIONS
    """
}
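The merge step above can be pictured with a small sketch, assuming input that is already sorted by chromosome and start (which `bedtools merge` also requires). `merge_sorted_bed` is a hypothetical awk helper written for illustration only; it reproduces the default behaviour of joining overlapping and book-ended intervals, and ignores any extra options passed through `task.ext.args`.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for `bedtools merge` with default options.
# Reads sorted 3-column BED on stdin, writes merged intervals to stdout.
merge_sorted_bed() {
    awk 'BEGIN { OFS = "\t" }
         # First record: open the first merged interval.
         NR == 1 { chrom = $1; start = $2; end = $3; next }
         # Same chromosome and start <= current end: overlapping or book-ended, so extend.
         $1 == chrom && $2 <= end { if ($3 > end) end = $3; next }
         # Otherwise emit the finished interval and open a new one.
         { print chrom, start, end; chrom = $1; start = $2; end = $3 }
         END { if (NR > 0) print chrom, start, end }'
}

# chr1:100-200 overlaps chr1:150-300, which is book-ended with chr1:300-400.
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t300\t400\nchr2\t10\t20\n' | merge_sorted_bed
```

The three `chr1` intervals collapse into a single interval spanning 100 to 400, while the lone `chr2` interval passes through unchanged.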
41 changes: 41 additions & 0 deletions modules/CCBR/bedtools/merge/meta.yml
@@ -0,0 +1,41 @@
name: bedtools_merge
description: Combines overlapping or “book-ended” features in an interval file into a single feature which spans all of the combined features. Adapted from https://github.com/nf-core/modules/tree/fff2c3fc7cdcb81a2a37c3263b8ace9b353af407/modules/nf-core/bedtools
keywords:
  - bed
  - merge
  - bedtools
  - overlapped bed
tools:
  - bedtools:
      description: |
        A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
      documentation: https://bedtools.readthedocs.io/en/latest/content/tools/merge.html
      licence: ["MIT"]
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - bed:
      type: file
      description: Input BED file
      pattern: "*.{bed}"
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - bed:
      type: file
      description: BED file in which overlapping features have been merged
      pattern: "*.{bed}"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
authors:
  - "@kelly-sovacool"
maintainers:
  - "@kelly-sovacool"
51 changes: 51 additions & 0 deletions modules/CCBR/bedtools/sort/main.nf
@@ -0,0 +1,51 @@
process BEDTOOLS_SORT {
    tag { meta.id }
    label 'process_single'

    container 'nciccbr/ccbr_ubuntu_base_20.04:v6.1'

    input:
    tuple val(meta), path(intervals)
    path genome_file

    output:
    tuple val(meta), path("*.${extension}"), emit: sorted
    path "versions.yml"                    , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}.sorted"
    def genome_cmd = genome_file ? "-g $genome_file" : ""
    extension = task.ext.suffix ?: intervals.extension
    if ("$intervals" == "${prefix}.${extension}") {
        error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!"
    }
    """
    bedtools \\
        sort \\
        -i ${intervals} \\
        ${genome_cmd} \\
        ${args} \\
        > ${prefix}.${extension}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    END_VERSIONS
    """

    stub:
    def prefix = task.ext.prefix ?: "${meta.id}.sorted"
    extension = task.ext.suffix ?: intervals.extension
    """
    touch ${prefix}.${extension}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    END_VERSIONS
    """
}
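When no `genome_file` and no extra `task.ext.args` are supplied, the sort above orders intervals by chromosome and then by start position. A rough equivalent using only coreutils is sketched below; `sort_bed` is a hypothetical helper for illustration, and it does not cover the `-g` genome-file ordering or any other sort flags.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for `bedtools sort` with default options:
# order by chromosome name (bytewise), then numerically by start coordinate.
sort_bed() {
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}

printf 'chr2\t5\t10\nchr1\t300\t400\nchr1\t100\t200\n' | sort_bed
```

Sorting this way produces input that a downstream merge step can consume, since merging relies on intervals arriving in chromosome and start order.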