add suggestions from code review #153

Merged
merged 2 commits into from
Jun 20, 2024
1 change: 0 additions & 1 deletion .nf-core.yml
@@ -5,6 +5,5 @@ lint:
- conf/test.config
- conf/test_full.config
files_unchanged:
- lib/NfcoreTemplate.groovy # Introduced a change ahead of the nf-core/tools release
- .github/PULL_REQUEST_TEMPLATE.md
nf_core_version: "2.14.1"
14 changes: 7 additions & 7 deletions docs/usage/screening.md
@@ -48,23 +48,23 @@ MAGeCK count which is the main alignment software used is normally able to autom

### bowtie2

The MAGeCK count module supports bam files, which allows you to align with bowtie2 first. If you wish to do so (for instance to allow library with mismatches or to set the aligner with specific flags) you can provide a fasta file with `--fasta`. Currently, you also still need to provide the library file.
The MAGeCK count module supports bam files, which allows you to align with bowtie2 first. If you wish to do so (for instance to allow mapping reads to the library with mismatches or to set the aligner with specific flags) you can provide a fasta file with `--fasta` encoding the library. Currently, you also still need to provide the tab-separated library file with `--library`.

### library

If you are running the pipeline with fastq files and wish to obtain a count table, the library parameter is needed. The library table has three mandatory columns : id, target transcript (or gRNA sequence) and gene symbol.
An [example](https://github.com/nf-core/test-datasets/blob/crisprseq/testdata/brunello_target_sequence.txt) has been provided with the pipeline. Many libraries can be found on [addgene](https://www.addgene.org/).
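The three-column requirement described above can be checked before launching a run. The following is a minimal sketch, not part of the pipeline: the function name `check_library_rows` and the sample rows are hypothetical, and it assumes only that the library file is tab-separated with the id, sequence and gene symbol in the first three fields.

```python
import csv
import io


def check_library_rows(text):
    """Return (line_number, row) pairs for rows lacking three non-empty leading fields.

    Hypothetical helper: assumes a tab-separated library table whose first
    three fields are the id, the target sequence and the gene symbol.
    """
    bad = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_number, row in enumerate(reader, start=1):
        if not row:  # blank lines are ignored
            continue
        if len(row) < 3 or any(not field.strip() for field in row[:3]):
            bad.append((line_number, row))
    return bad


# Made-up rows for illustration; the second row is missing the gene symbol.
sample = "s_1\tACGTACGTACGTACGTACGT\tGENE1\ns_2\tTTGCATGCATGCATGCATGC\n"
print(check_library_rows(sample))
```

An empty list means every non-blank row carries the three mandatory fields; the check says nothing about whether the sequences or gene symbols themselves are valid.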

After the alignment step, if you are performing KO (Knock-Out) screens, you can choose to correction of gene independent cell responses to CRISPR-cas9 targeting using crisprcleanr. If you are performing a CRISPR interference or activation screen, this step is not needed.
After the alignment step, if you are performing KO (Knock-Out) screens, you can choose to correct gene-independent cell responses to CRISPR-Cas9 targeting using CRISPRcleanR. If you are performing a CRISPR interference or activation screen, this step is not needed.

The pipeline currently supports 3 algorithms to detect gene essentiality, MAGeCK rra, MAGeCK mle and BAGEL2. MAGeCK MLE (Maximum Likelihood Estimation) and MAGeCK RRA (Robust Ranking Aggregation) are two different methods provided by the MAGeCK software package to analyze CRISPR-Cas9 screens. BAGEL2 identifies gene essentiality through Bayesian Analysis.
The pipeline currently supports 3 algorithms to detect gene essentiality, MAGeCK RRA, MAGeCK MLE and BAGEL2. MAGeCK MLE (Maximum Likelihood Estimation) and MAGeCK RRA (Robust Ranking Aggregation) are two different methods provided by the MAGeCK software package to analyze CRISPR-Cas9 screens. BAGEL2 identifies gene essentiality through Bayesian Analysis.
We recommend to run MAGeCK MLE and BAGEL2 as these are the most used and most recent algorithms to determine gene essentiality.

### Running CRISPRcleanR

CRISPRcleanR is used for gene count normalization and the removal of biases for genomic segments for which copy numbers are amplified. Currently, the pipeline supports annotation libraries already present in the R package or a annotation file the user can provide.
[CRISPRcleanR](https://github.com/francescojm/CRISPRcleanR) is used for gene count normalization and the removal of biases for genomic segments for which copy numbers are amplified. Currently, the pipeline supports annotation libraries already present in the R package or user-provided annotation files.
Most used library already have an annotation dataset which you can find [here](https://github.com/francescojm/CRISPRcleanR/blob/master/Reference_Manual.pdf). To use CRISPRcleanR normalization, use `--crisprcleanr library`, `library` being the exact name as the library in the CRISPRcleanR documentation (e.g: "AVANA_Library").
Otherwise, if you wish to provide your own file, please provide it in csv form, and make sure it follows the following format, with the comma in front of "CODE" included :
Otherwise, if you wish to provide your own file, please provide it in CSV format, and make sure it follows the following format (with the comma in front of "CODE" included):

| ,CODE | GENES | EXONE | CHRM | STRAND | STARTpos | ENDpos |
| -------------------- | ----------- | ------------- | ---- | ------ | -------- | -------- |
@@ -89,7 +89,7 @@ Running MAGeCK MLE and BAGEL2 with a contrast file will also output a Venn diagr

### Running MAGeCK RRA only

MAGeCK RRA performs robust ranking aggregation to identify genes that are consistently ranked highly across multiple replicate screens. To run MAGeCK rra, you can define the contrasts as previously stated in the last section (with a `.txt` extension) and also specify `--rra` .
MAGeCK RRA performs robust ranking aggregation to identify genes that are consistently ranked highly across multiple replicate screens. To run MAGeCK RRA, you can define the contrasts as previously stated in the last section (with a `.txt` extension) and also specify `--rra`.

### Running MAGeCK MLE only

@@ -112,7 +112,7 @@ BAGEL2 uses the same contrasts from `--contrasts`.

### MAGECKFlute

The downstream analysis involves distinguishing essential, non-essential, and target-associated genes. Additionally, it encompasses conducting biological functional category analysis and pathway enrichment analysis for these genes. Furthermore, the function provides visualization of genes within pathways, enhancing user exploration of screening data. MAGECKFlute is run automatically after MAGeCK MLE and for each MLE design matrice. If you have used the `--day0_label`, MAGeCKFlute will be ran on all the other conditions. Please note that the DepMap data is used for these plots.
The downstream analysis involves distinguishing essential, non-essential, and target-associated genes. Additionally, it encompasses conducting biological functional category analysis and pathway enrichment analysis for these genes. Furthermore, it provides visualization of genes within pathways, enhancing user exploration of screening data. MAGECKFlute is run automatically after MAGeCK MLE and for each MLE design matrice. If you have used the `--day0_label`, MAGeCKFlute will be ran on all the other conditions. Please note that the DepMap data is used for these plots.

Note that the pipeline will create the following files in your working directory:

8 changes: 4 additions & 4 deletions main.nf
@@ -44,11 +44,11 @@ include { CRISPRSEQ_SCREENING } from './workflows/crisprseq_screening'
workflow NFCORE_CRISPRSEQ {

take:
reads_targeted // channel: fastqc files read in from --input
reads_targeted // channel: fastqc files read in from --input
reads_screening // channel: fastqc files read in from --input
reference // channel: reference sequence read from --input
protospacer // channel: protospacer sequence read from --input
template // channel: template sequence read from --input
reference // channel: reference sequence read from --input
protospacer // channel: protospacer sequence read from --input
template // channel: template sequence read from --input

main:
//
2 changes: 1 addition & 1 deletion nextflow_schema.json
@@ -171,7 +171,7 @@
},
"fasta": {
"type": "string",
"description": "Fasta file in case you want to map with bowtie2 and then MAGeCK count"
"description": "Library in fasta file format in case you want to map with bowtie2 and then MAGeCK count"
},
"day0_label": {
"type": "string",
8 changes: 4 additions & 4 deletions subworkflows/local/utils_nfcore_crisprseq_pipeline/main.nf
@@ -98,11 +98,11 @@ workflow PIPELINE_INITIALISATION {
} else {
files = [ fastq_1 ]
}
reads_targeted: [ meta.id, meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], files ]
reads_targeted: [ meta.id, meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], files ]
reads_screening:[ meta + [ single_end:fastq_2?false:true ], files ]
reference: [meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], reference]
protospacer: [meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], protospacer]
template: [meta - meta.subMap('condition') + [ single_end:fastq_2?false:true, self_reference:reference?false:true, template:template?true:false ], template]
reference: [meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], reference]
protospacer: [meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], protospacer]
template: [meta - meta.subMap('condition') + [ single_end : fastq_2 ? false : true, self_reference : reference ? false : true, template : template ? true : false ], template]
}
.set { ch_input }
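
For readers less familiar with Groovy ternaries, the flags attached to `meta` in this hunk can be sketched in Python. This is an illustrative translation only, not code from the pipeline: `build_meta` is a hypothetical name, and it assumes that Python truthiness (`None`, empty string and empty list are falsy) is a close enough stand-in for Groovy truth on these inputs.

```python
def build_meta(meta, fastq_2=None, reference=None, template=None):
    """Mirror the flag logic of the reads_targeted/reference/protospacer/template entries.

    Hypothetical helper: drops the 'condition' key, then adds three booleans.
    """
    # meta - meta.subMap('condition'): keep everything except the condition entry
    base = {key: value for key, value in meta.items() if key != "condition"}
    base.update(
        single_end=not fastq_2,         # fastq_2 ? false : true
        self_reference=not reference,   # reference ? false : true
        template=bool(template),        # template ? true : false
    )
    return base


# Paired-end sample with no reference and no template supplied.
print(build_meta({"id": "s1", "condition": "ctrl"}, fastq_2="s1_R2.fastq.gz"))
```

The sketch also makes the asymmetry visible: `single_end` and `self_reference` are true when their input is absent, whereas `template` is true when its input is present.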
