Commit

chore: Merge branch 'main' of https://github.com/CCBR/XAVIER
kelly-sovacool committed Aug 16, 2024
2 parents fc8fe10 + 315ef23 commit f87ca6d
Showing 37 changed files with 144 additions and 590 deletions.
48 changes: 20 additions & 28 deletions .github/workflows/main.yaml
@@ -23,71 +23,63 @@ jobs:
run: |
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.R1.fastq.gz /opt2/.tests/Sample10_ARK1_S37.R2.fastq.gz \
/opt2/.tests/Sample11_ACI_158_S38.R1.fastq.gz /opt2/.tests/Sample11_ACI_158_S38.R2.fastq.gz \
/opt2/.tests/Sample4_CRL1622_S31.R1.fastq.gz /opt2/.tests/Sample4_CRL1622_S31.R2.fastq.gz \
/opt2/tests/data/WES_NC_N_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_N_1_sub.R2.fastq.gz \
/opt2/tests/data/WES_NC_T_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_T_1_sub.R2.fastq.gz \
--output /opt2/output_tn_fqs --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--pairs /opt2/.tests/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode init
--pairs /opt2/tests/data/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode init
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.R1.fastq.gz /opt2/.tests/Sample10_ARK1_S37.R2.fastq.gz \
/opt2/.tests/Sample11_ACI_158_S38.R1.fastq.gz /opt2/.tests/Sample11_ACI_158_S38.R2.fastq.gz \
/opt2/.tests/Sample4_CRL1622_S31.R1.fastq.gz /opt2/.tests/Sample4_CRL1622_S31.R2.fastq.gz \
/opt2/tests/data/WES_NC_N_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_N_1_sub.R2.fastq.gz \
/opt2/tests/data/WES_NC_T_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_T_1_sub.R2.fastq.gz \
--output /opt2/output_tn_fqs --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--pairs /opt2/.tests/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode dryrun
--pairs /opt2/tests/data/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode dryrun
- name: Tumor-only FastQ Dry Run
run: |
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.R1.fastq.gz /opt2/.tests/Sample10_ARK1_S37.R2.fastq.gz \
/opt2/.tests/Sample11_ACI_158_S38.R1.fastq.gz /opt2/.tests/Sample11_ACI_158_S38.R2.fastq.gz \
/opt2/.tests/Sample4_CRL1622_S31.R1.fastq.gz /opt2/.tests/Sample4_CRL1622_S31.R2.fastq.gz \
/opt2/tests/data/WES_NC_N_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_N_1_sub.R2.fastq.gz \
/opt2/tests/data/WES_NC_T_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_T_1_sub.R2.fastq.gz \
--output /opt2/output_tonly_fqs --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--genome hg38 --mode local --ffpe --runmode init
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.R1.fastq.gz /opt2/.tests/Sample10_ARK1_S37.R2.fastq.gz \
/opt2/.tests/Sample11_ACI_158_S38.R1.fastq.gz /opt2/.tests/Sample11_ACI_158_S38.R2.fastq.gz \
/opt2/.tests/Sample4_CRL1622_S31.R1.fastq.gz /opt2/.tests/Sample4_CRL1622_S31.R2.fastq.gz \
/opt2/tests/data/WES_NC_N_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_N_1_sub.R2.fastq.gz \
/opt2/tests/data/WES_NC_T_1_sub.R1.fastq.gz /opt2/tests/data/WES_NC_T_1_sub.R2.fastq.gz \
--output /opt2/output_tonly_fqs --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--genome hg38 --mode local --ffpe --runmode dryrun
- name: Tumor-normal BAM Dry Run
run: |
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.recal.bam \
/opt2/.tests/Sample11_ACI_158_S38.recal.bam \
/opt2/.tests/Sample4_CRL1622_S31.recal.bam \
/opt2/tests/data/WES_NC_N_1_sub.bam \
/opt2/tests/data/WES_NC_T_1_sub.bam \
--output /opt2/output_tn_bams --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--pairs /opt2/.tests/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode init
--pairs /opt2/tests/data/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode init
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.recal.bam \
/opt2/.tests/Sample11_ACI_158_S38.recal.bam \
/opt2/.tests/Sample4_CRL1622_S31.recal.bam \
/opt2/tests/data/WES_NC_N_1_sub.bam \
/opt2/tests/data/WES_NC_T_1_sub.bam \
--output /opt2/output_tn_bams --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--pairs /opt2/.tests/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode dryrun
--pairs /opt2/tests/data/pairs.tsv --genome hg38 --mode local --ffpe --cnv --runmode dryrun
- name: Tumor-only BAM Dry Run
run: |
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.recal.bam \
/opt2/.tests/Sample11_ACI_158_S38.recal.bam \
/opt2/.tests/Sample4_CRL1622_S31.recal.bam \
/opt2/tests/data/WES_NC_N_1_sub.bam \
/opt2/tests/data/WES_NC_T_1_sub.bam \
--output /opt2/output_tonly_bams --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--genome hg38 --mode local --ffpe --runmode init
docker run -v $PWD:/opt2 snakemake/snakemake:v7.32.4 \
/opt2/bin/xavier run --input \
/opt2/.tests/Sample10_ARK1_S37.recal.bam \
/opt2/.tests/Sample11_ACI_158_S38.recal.bam \
/opt2/.tests/Sample4_CRL1622_S31.recal.bam \
/opt2/tests/data/WES_NC_N_1_sub.bam \
/opt2/tests/data/WES_NC_T_1_sub.bam \
--output /opt2/output_tonly_bams --targets /opt2/resources/Agilent_SSv7_allExons_hg38.bed \
--genome hg38 --mode local --ffpe --runmode dryrun
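Each CI step above issues the same `xavier run` command twice, first with `--runmode init` and then with `--runmode dryrun`. A minimal Python sketch of that pattern follows. The helper names `xavier_cmd` and `init_then_dryrun` are hypothetical and not part of XAVIER; only flags that appear in the workflow are used.

```python
import shlex

TARGETS = "/opt2/resources/Agilent_SSv7_allExons_hg38.bed"


def xavier_cmd(inputs, output, runmode, pairs=None, cnv=False):
    """Build the argument list for one `xavier run` call (hypothetical helper)."""
    cmd = ["/opt2/bin/xavier", "run", "--input", *inputs,
           "--output", output, "--targets", TARGETS]
    if pairs is not None:
        # Tumor-normal steps pass a pairs file; tumor-only steps omit it.
        cmd += ["--pairs", pairs]
    cmd += ["--genome", "hg38", "--mode", "local", "--ffpe"]
    if cnv:
        cmd.append("--cnv")
    cmd += ["--runmode", runmode]
    return cmd


def init_then_dryrun(inputs, output, pairs=None, cnv=False):
    """Return the two commands a CI step runs: init first, then dryrun."""
    return [xavier_cmd(inputs, output, mode, pairs, cnv)
            for mode in ("init", "dryrun")]


if __name__ == "__main__":
    bams = ["/opt2/tests/data/WES_NC_N_1_sub.bam",
            "/opt2/tests/data/WES_NC_T_1_sub.bam"]
    for cmd in init_then_dryrun(bams, "/opt2/output_tn_bams",
                                pairs="/opt2/tests/data/pairs.tsv", cnv=True):
        print(shlex.join(cmd))
```

Building both commands from one function keeps the init and dryrun invocations identical apart from the run mode, which is the property the duplicated workflow lines rely on.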
5 changes: 0 additions & 5 deletions .tests/README.md

This file was deleted.

Empty file.
Empty file.
Empty file removed .tests/Sample10_ARK1_S37.recal.bam
Empty file.
Empty file.
Empty file.
Empty file.
Empty file.
Empty file.
Empty file.
3 changes: 0 additions & 3 deletions .tests/pairs.tsv

This file was deleted.

2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,8 @@
- Previously, `xavier_gui` (with an underscore) was a command in the `ccbrpipeliner` module.
- Provide default exome targets for hg38 and mm10, which can be overridden by the optional `--targets` argument. (#102, @kelly-sovacool)
- Previously, the `--targets` argument was required with no defaults.
- Increased memory for the rules `bwa_mem`, `qualimap_bamqc`, and `kraken`. `gatk_contamination` is no longer a localrule. (#89, @samarth8392)
- Added a new human test dataset for the GitHub workflow. (#27, @samarth8392)
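
The changelog entry on default exome targets describes an optional `--targets` argument that falls back to a per-genome default. A minimal sketch of that behaviour, assuming a simple lookup table: `DEFAULT_TARGETS`, `resolve_targets`, and both BED paths are illustrative assumptions, not the values XAVIER actually ships.

```python
import argparse

# Hypothetical defaults for illustration; the real default BED files may differ.
DEFAULT_TARGETS = {
    "hg38": "resources/Agilent_SSv7_allExons_hg38.bed",
    "mm10": "resources/example_mm10_targets.bed",  # placeholder file name
}


def resolve_targets(genome, targets=None):
    """Return the user-supplied targets file, else the default for the genome."""
    if targets:
        return targets
    if genome not in DEFAULT_TARGETS:
        raise ValueError(f"no default targets for {genome!r}; pass --targets")
    return DEFAULT_TARGETS[genome]


def parse(argv):
    """Parse a reduced argument set and fill in the targets default."""
    parser = argparse.ArgumentParser(prog="xavier-sketch")
    parser.add_argument("--genome", required=True)
    parser.add_argument("--targets", default=None)  # optional, as in the changelog
    args = parser.parse_args(argv)
    args.targets = resolve_targets(args.genome, args.targets)
    return args


if __name__ == "__main__":
    print(parse(["--genome", "hg38"]).targets)
    print(parse(["--genome", "mm10", "--targets", "custom.bed"]).targets)
```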

## XAVIER 3.0.3

4 changes: 1 addition & 3 deletions bin/redirect
@@ -56,13 +56,11 @@ fi
# - snakemake
# are in PATH
if [[ $ISBIOWULF == true ]];then
# module purge
load_module_if_needed singularity
load_module_if_needed snakemake
load_module_if_needed snakemake/7
elif [[ $ISFRCE == true ]];then
# snakemake module on FRCE does not work as expected
# use the conda installed version of snakemake instead
# module purge
load_module_if_needed load singularity
export PATH="/mnt/projects/CCBR-Pipelines/bin:$PATH"
fi
12 changes: 8 additions & 4 deletions config/cluster.biowulf.json
@@ -22,13 +22,17 @@
"threads": "2",
"time": "4:00:00"
},

"kraken": {
"mem": "64G"
},
"strelka": {
"threads": "16",
"time": "16:00:00",
"mem": "32G"
},

"qualimap_bamqc": {
"mem": "32G"
},
"strelka_filter": {
"threads": "4",
"time": "8:00:00",
@@ -57,7 +61,7 @@
"mem": "32G"
},

"merge_somatic_callers": {
"somatic_merge_callers": {
"threads": "16",
"time": "18:00:00",
"mem": "32G"
@@ -116,7 +120,7 @@
},
"bwa_mem": {
"threads": "24",
"mem": "32G"
"mem": "64G"
},
"picard_headers": {
"threads": "2",
12 changes: 8 additions & 4 deletions config/cluster.frce.json
@@ -21,13 +21,17 @@
"threads": "2",
"time": "4:00:00"
},

"kraken": {
"mem": "64G"
},
"strelka": {
"threads": "16",
"time": "16:00:00",
"mem": "32G"
},

"qualimap_bamqc": {
"mem": "32G"
},
"strelka_filter": {
"threads": "4",
"time": "8:00:00",
@@ -56,7 +60,7 @@
"mem": "32G"
},

"merge_somatic_callers": {
"somatic_merge_callers": {
"threads": "16",
"time": "18:00:00",
"mem": "32G"
@@ -115,7 +119,7 @@
},
"bwa_mem": {
"threads": "24",
"mem": "32G"
"mem": "64G"
},
"picard_headers": {
"threads": "2",
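
Both cluster files key resource settings by rule name, and most entries set only some of `threads`, `mem`, and `time`. The sketch below shows one way such a file can be resolved into per-rule resources, assuming a Snakemake-style `__default__` entry supplies any missing values. The `__default__` values here are invented for illustration, and `resources_for` is a hypothetical helper, not XAVIER code.

```python
import json

# Rule entries copied from the diff; the "__default__" values are made up.
CLUSTER_JSON = """
{
  "__default__": {"threads": "2", "mem": "8G", "time": "4:00:00"},
  "kraken": {"mem": "64G"},
  "qualimap_bamqc": {"mem": "32G"},
  "bwa_mem": {"threads": "24", "mem": "64G"},
  "somatic_merge_callers": {"threads": "16", "time": "18:00:00", "mem": "32G"}
}
"""


def resources_for(rule, cluster):
    """Merge a rule's overrides on top of the defaults."""
    merged = dict(cluster.get("__default__", {}))
    merged.update(cluster.get(rule, {}))
    return merged


if __name__ == "__main__":
    cluster = json.loads(CLUSTER_JSON)
    for rule in ("kraken", "bwa_mem", "merge_somatic_callers"):
        print(rule, resources_for(rule, cluster))
```

Under this lookup scheme, a key that does not match the rule name (for example the old `merge_somatic_callers` spelling) falls back to the defaults without any error, which is a plausible reason for keeping key names in step with rule names.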
17 changes: 13 additions & 4 deletions docs/usage/run.md
@@ -46,7 +46,9 @@ Each of the following arguments are required. Failure to provide a required argu
>
> One or more FastQ files can be provided. The pipeline does NOT support single-end WES data. Please provide either a set of FastQ files or a set of BAM files. The pipeline does NOT support processing a mixture of FastQ files and BAM files. From the command-line, each input file should be separated by a space. Globbing is supported! This makes selecting FastQ files easy. Input FastQ files should be gzipped.
>
> **_Example:_** `--input .tests/*.R?.fastq.gz`
> **_Example:_** `--input tests/data/*.R?.fastq.gz`
>
> **_Example:_** `--input /data/CCBR_Pipeliner/testdata/XAVIER/human_subset/*.R?.fastq.gz`
---

@@ -251,15 +253,15 @@ module purge
module load ccbrpipeliner

# Step 2A.) Initialize all resources in the output folder
xavier run --input .tests/*.R?.fastq.gz \
xavier run --input tests/data/*.R?.fastq.gz \
--output /data/$USER/xavier_hg38 \
--genome hg38 \
--targets Agilent_SSv7_allExons_hg38.bed \
--mode slurm \
--runmode init

# Step 2B.) Dry-run the pipeline
xavier run --input .tests/*.R?.fastq.gz \
xavier run --input tests/data/*.R?.fastq.gz \
--output /data/$USER/xavier_hg38 \
--genome hg38 \
--targets Agilent_SSv7_allExons_hg38.bed \
@@ -269,11 +271,18 @@ xavier run --input .tests/*.R?.fastq.gz \
# Step 2C.) Run the XAVIER pipeline
# The slurm mode will submit jobs to the cluster.
# Running xavier in this mode is recommended.
xavier run --input .tests/*.R?.fastq.gz \
xavier run --input tests/data/*.R?.fastq.gz \
--output /data/$USER/xavier_hg38 \
--genome hg38 \
--targets Agilent_SSv7_allExons_hg38.bed \
--mode slurm \
--runmode run

```

The example dataset in `tests/data` in this repository is a very small
subsampled dataset, and some steps of the pipeline fail due to the small size
(CNV calling, somalier, etc.).
We have a larger subsample (25% of a full human dataset) available on Biowulf if
you would like to test the full functionality of the pipeline:
`/data/CCBR_Pipeliner/testdata/XAVIER/human_subset/*.R?.fastq.gz`
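
The `--input` examples above rely on the glob `*.R?.fastq.gz`, where `?` matches exactly one character (so both `R1` and `R2` mates are selected) and BAM files are excluded. The sketch below reproduces that matching on bare file names with Python's `fnmatch`, which follows the same wildcard rules as the shell for names without directory parts. Grouping mates by sample name in `pair_by_sample` is an illustrative assumption, not a description of how XAVIER pairs reads internally.

```python
import re
from fnmatch import fnmatchcase

PATTERN = "*.R?.fastq.gz"


def select_fastqs(names, pattern=PATTERN):
    """Return the file names the glob pattern would select, sorted."""
    return sorted(n for n in names if fnmatchcase(n, pattern))


def pair_by_sample(fastqs):
    """Group R1/R2 mates under their shared sample prefix (illustrative only)."""
    pairs = {}
    for name in fastqs:
        match = re.fullmatch(r"(.+)\.R([12])\.fastq\.gz", name)
        if match:
            pairs.setdefault(match.group(1), {})["R" + match.group(2)] = name
    return pairs


if __name__ == "__main__":
    listing = [
        "WES_NC_T_1_sub.R2.fastq.gz",
        "WES_NC_N_1_sub.bam",
        "WES_NC_N_1_sub.R1.fastq.gz",
        "WES_NC_T_1_sub.R1.fastq.gz",
        "WES_NC_N_1_sub.R2.fastq.gz",
        "pairs.tsv",
    ]
    fastqs = select_fastqs(listing)
    print(fastqs)
    print(pair_by_sample(fastqs))
```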
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -38,6 +38,7 @@ classifiers = [
requires-python = ">=3.11"
dependencies = [
"argparse",
"ccbr_tools@git+https://github.com/CCBR/Tools",
"Click >= 8.1.3",
"PySimpleGui < 5",
"snakemake >= 7, < 8",
@@ -63,7 +64,7 @@ Repository = "https://github.com/CCBR/XAVIER"
xavier = "."

[tool.setuptools.package-data]
"*" = ["CITATION.cff", "LICENSE", "VERSION", "docker/**", "resources/**", "bin/**", "config/**", "resources/**", "workflow/**", "tests/**", ".tests/**"]
"*" = ["CITATION.cff", "LICENSE", "VERSION", "docker/**", "resources/**", "bin/**", "config/**", "resources/**", "workflow/**", "tests/**"]

[tool.setuptools.dynamic]
version = {file = "VERSION"}
22 changes: 7 additions & 15 deletions src/xavier/__main__.py
@@ -36,28 +36,20 @@
"""

# Python standard library
from __future__ import print_function
import sys, os, subprocess, re, json, textwrap


# 3rd party imports from pypi
import argparse  # part of the standard library since Python 3.2
from ccbr_tools.pipeline.util import err, exists, fatal, permissions, require
from ccbr_tools.pipeline.cache import check_cache

# Local imports
from .run import init, setup, bind, dryrun, runner, run
from .shells import bash
from .options import genome_options
from .util import (
err,
exists,
fatal,
permissions,
check_cache,
require,
get_version,
get_genomes_list,
)
from .gui import launch_gui
from .util import xavier_base, get_version

__version__ = get_version()
__email__ = "[email protected]"
@@ -228,7 +220,7 @@ def parsed_arguments():
FastQ files or a set of BAM files. The pipeline does
NOT support processing a mixture of FastQ files and
BAM files.
Example: --input .tests/*.R?.fastq.gz
Example: --input tests/data/*.R?.fastq.gz
--output OUTPUT
Path to an output directory. This location is where
the pipeline will create all of its output files, also
@@ -264,15 +256,15 @@ def parsed_arguments():
# Step 2A.) Initialize the pipeline
xavier run \\
--runmode init \\
--input .tests/*.R?.fastq.gz \\
--input tests/data/*.R?.fastq.gz \\
--output /data/$USER/xavier_hg38 \\
--genome hg38 \\
--targets resources/Agilent_SSv7_allExons_hg38.bed
# Step 2B.) Dry-run the pipeline
xavier run \\
--runmode dryrun \\
--input .tests/*.R?.fastq.gz \\
--input tests/data/*.R?.fastq.gz \\
--output /data/$USER/xavier_hg38 \\
--genome hg38 \\
--targets resources/Agilent_SSv7_allExons_hg38.bed \\
@@ -283,7 +275,7 @@ def parsed_arguments():
# Running xavier in this mode is recommended.
xavier run \\
--runmode run \\
--input .tests/*.R?.fastq.gz \\
--input tests/data/*.R?.fastq.gz \\
--output /data/$USER/xavier_hg38 \\
--genome hg38 \\
--targets resources/Agilent_SSv7_allExons_hg38.bed \\