Modify input parameters to integrate with IRIDA-Next UI #15

Merged Jul 25, 2024 (28 commits)
Changes from 18 commits
7f4ddab
Changed the name of assembled genomes from 'contigs' to 'fastq_1' to …
sgsutcliffe Jun 28, 2024
80ffe12
Revert "Changed the name of assembled genomes from 'contigs' to 'fast…
sgsutcliffe Jun 28, 2024
4c8f4f4
Contig input set to file-path to trigger autopopulating sample contig…
sgsutcliffe Jul 2, 2024
d7bd653
Merge branch 'dev' into modify-parameters-iridanext-UI
sgsutcliffe Jul 3, 2024
abcc9d2
pointfinder_database added to IRIDA-Next to use on all samples
sgsutcliffe Jul 4, 2024
a816aa8
Fix linting issues
sgsutcliffe Jul 4, 2024
5c41d29
Adding a parameter to pipeline that seems to be breaking nextflow.con…
sgsutcliffe Jul 5, 2024
8781a5d
Fixed the syntax issues in previous commit
sgsutcliffe Jul 5, 2024
56b5cfc
Added additional CLI arguments
sgsutcliffe Jul 9, 2024
2abef26
prettier fix
sgsutcliffe Jul 9, 2024
8b8ec64
Fixed boolean CLI arguments
sgsutcliffe Jul 10, 2024
7595839
Added default parameters of staramr as pipeline parameters defaults
sgsutcliffe Jul 16, 2024
5cf60c3
Added nf-test to check that all commandline parameters are run
sgsutcliffe Jul 17, 2024
e11ea6d
Changed description of --pointfinder_database parameter
sgsutcliffe Jul 17, 2024
753835d
Added min/max thresholds to parameters
sgsutcliffe Jul 19, 2024
2ac5509
Removed duplicate minimum_contig_length
sgsutcliffe Jul 19, 2024
bdecbc7
Limit memory usage for nf-test
sgsutcliffe Jul 19, 2024
6b41683
Change StAMR Database options
sgsutcliffe Jul 19, 2024
5b34fb4
Change MLST scheme default from None to Automatic
sgsutcliffe Jul 19, 2024
49ca184
Prettier fix
sgsutcliffe Jul 19, 2024
95f3665
Missed a change of None to Automatic
sgsutcliffe Jul 19, 2024
e4114fd
Fix linting issue and limits on max genome size
sgsutcliffe Jul 19, 2024
f48e4a8
Fixed typos
sgsutcliffe Jul 19, 2024
2292e35
Modified parameter descriptions
sgsutcliffe Jul 19, 2024
b4d1e5d
Deleted: Template iGenomes parameter from nf-core
sgsutcliffe Jul 22, 2024
2879bf7
Modifcations (as seen in iridanextexample PR#14) to allow the templat…
sgsutcliffe Jul 22, 2024
247defc
Remove genome params
sgsutcliffe Jul 22, 2024
23278eb
Added all the MLST schemeas available on https://github.com/tseemann/…
sgsutcliffe Jul 22, 2024
1 change: 1 addition & 0 deletions assets/schema_input.json
@@ -15,6 +15,7 @@
},
"contigs": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.f(ast|n)?a\\.gz$",
"errorMessage": "FASTA file containing assembled contigs; the path cannot contain spaces and must have extension '.fa.gz', '.fna.gz' or '.fasta.gz'"
},
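
The "pattern" above can be exercised outside the pipeline. The following is a minimal Python sketch, assuming the JSON-escaped pattern unescapes to the regular expression shown; the helper name is_valid_contigs_path is hypothetical and not part of the pipeline. It shows that the pattern accepts '.fa.gz', '.fna.gz' and '.fasta.gz', and rejects paths containing whitespace or uncompressed files.

```python
import re

# The "pattern" value from assets/schema_input.json after JSON unescaping.
CONTIGS_PATTERN = re.compile(r"^\S+\.f(ast|n)?a\.gz$")


def is_valid_contigs_path(path: str) -> bool:
    """Hypothetical helper: True when the whole path satisfies the schema pattern."""
    return CONTIGS_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    for candidate in ("sample.fa.gz", "sample.fna.gz", "sample.fasta.gz",
                      "my sample.fa.gz", "sample.fasta"):
        print(candidate, is_valid_contigs_path(candidate))
```

Because "\S+" cannot match whitespace and the pattern is anchored at both ends, a single space anywhere in the path is enough to fail validation.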
60 changes: 56 additions & 4 deletions conf/modules.config
@@ -50,6 +50,18 @@ process {
def point_db_arg = {String database -> " --pointfinder-organism ${database} " }
def plasmid_db_arg = {String database -> " --plasmidfinder-database-type ${database} " }
def mlst_arg = {String scheme -> " --mlst-scheme ${scheme} " }
def minimum_contig_length_arg = {String min_length -> " --minimum-contig-length ${min_length} "}
def genome_size_lower_bound_arg = {String min_genome -> " --genome-size-lower-bound ${min_genome} "}
def genome_size_upper_bound_arg = {String max_genome -> " --genome-size-upper-bound ${max_genome} "}
def minimum_N50_value_arg = {String min_n50 -> " --minimum-N50-value ${min_n50} "}
def unacceptable_number_contigs_arg = {String max_contigs -> " --unacceptable-number-contigs ${max_contigs} "}
def pid_threshold_arg = {String min_pid -> " --pid-threshold ${min_pid} "}
def percent_length_overlap_plasmidfinder_arg = {String min_overlap -> " --percent-length-overlap-plasmidfinder ${min_overlap} "}
def percent_length_overlap_resfinder_arg = {String min_overlap -> " --percent-length-overlap-resfinder ${min_overlap} "}
def percent_length_overlap_pointfinder_arg = {String min_overlap -> " --percent-length-overlap-pointfinder ${min_overlap} "}
def no_exclude_genes_arg = " --no-exclude-genes"
def exclude_negatives_arg = " --exclude-negatives"
def exclude_resistance_phenotypes_arg = " --exclude-resistance-phenotypes"

// Check to see if the database name is valid:
def valid_point_db = {String database -> pointfinder_databases.contains(database)}
@@ -58,8 +70,8 @@ process {
ext.args = {
[
// Pointfinder database:
params.pointfinder_database && valid_point_db(params.pointfinder_database) ?
point_db_arg(params.pointfinder_database) :
params.pointfinder_database && valid_point_db(convert(params.pointfinder_database)) ?
point_db_arg(convert(params.pointfinder_database)) :
meta.species && valid_point_db(convert(meta.species)) ?
point_db_arg(convert(meta.species)) : "",

@@ -68,8 +80,48 @@ process {
? plasmid_db_arg(params.plasmidfinder_database) : "",

// MLST scheme:
params.mlst_scheme
? mlst_arg(params.mlst_scheme) : ""
params.mlst_scheme && (params.mlst_scheme != "None")
? mlst_arg(params.mlst_scheme) : "",

// Additional parameters
params.minimum_contig_length
? minimum_contig_length_arg(params.minimum_contig_length.toString()) : "",

params.genome_size_lower_bound
? genome_size_lower_bound_arg(params.genome_size_lower_bound.toString()) : "",

params.genome_size_upper_bound
? genome_size_upper_bound_arg(params.genome_size_upper_bound.toString()) : "",

params.minimum_N50_value
? minimum_N50_value_arg(params.minimum_N50_value.toString()) : "",

params.minimum_contig_length
? minimum_contig_length_arg(params.minimum_contig_length.toString()) : "",

params.unacceptable_number_contigs
? unacceptable_number_contigs_arg(params.unacceptable_number_contigs.toString()) : "",

params.pid_threshold
? pid_threshold_arg(params.pid_threshold.toString()) : "",

params.percent_length_overlap_plasmidfinder
? percent_length_overlap_plasmidfinder_arg(params.percent_length_overlap_plasmidfinder.toString()) : "",

params.percent_length_overlap_resfinder
? percent_length_overlap_resfinder_arg(params.percent_length_overlap_resfinder.toString()) : "",

params.percent_length_overlap_pointfinder
? percent_length_overlap_pointfinder_arg(params.percent_length_overlap_pointfinder.toString()) : "",

params.no_exclude_genes
? no_exclude_genes_arg : "",

params.exclude_negatives
? exclude_negatives_arg : "",

params.exclude_resistance_phenotypes
? exclude_resistance_phenotypes_arg : ""
].join(" ")
}
}
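
The PointFinder selection in ext.args follows a simple precedence: a valid pipeline-wide parameter wins, otherwise the per-sample species is tried, otherwise no flag is passed. The Python sketch below mirrors that precedence under stated assumptions: the body of convert() and the contents of pointfinder_databases are not shown in this diff, so both are hypothetical stand-ins (lower-casing with underscores, and the organisms from the schema enum).

```python
from typing import Optional

# Assumption: the valid organism identifiers, derived from the schema enum
# (without "Automatic Selection"). The real list lives in conf/modules.config.
POINTFINDER_DATABASES = {
    "enterococcus_faecium",
    "enterococcus_faecalis",
    "helicobacter_pylori",
    "salmonella",
    "campylobacter",
    "escherichia_coli",
}


def convert(name: str) -> str:
    """Hypothetical stand-in for the pipeline's convert(): lower-case, underscores."""
    return name.strip().lower().replace(" ", "_")


def pointfinder_arg(param_value: Optional[str], meta_species: Optional[str]) -> str:
    """Return the --pointfinder-organism flag, or "" when nothing valid is given.

    The pipeline-wide parameter is checked first; the sample's species is only
    used when the parameter is missing or not a valid database name.
    """
    for candidate in (param_value, meta_species):
        if candidate:
            organism = convert(candidate)
            if organism in POINTFINDER_DATABASES:
                return f"--pointfinder-organism {organism}"
    return ""


if __name__ == "__main__":
    print(pointfinder_arg("Automatic Selection", "Salmonella"))
    print(pointfinder_arg("Escherichia coli", "Salmonella"))
    print(repr(pointfinder_arg("Automatic Selection", "Listeria monocytogenes")))
```

Under these assumptions "Automatic Selection" is not a valid database name, so it falls through to the sample metadata without needing a special case.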
24 changes: 20 additions & 4 deletions nextflow.config
@@ -50,10 +50,26 @@ params {
validationShowHiddenParams = false
validate_params = true

//StarAMR options
pointfinder_database = null
plasmidfinder_database = null
mlst_scheme = null
// StarAMR options

// Databases
pointfinder_database = "Automatic Selection"
plasmidfinder_database = "All"
mlst_scheme = "None"

// Additional CLI arguments
genome_size_lower_bound = 4000000
genome_size_upper_bound = 6000000
minimum_N50_value = 10000
minimum_contig_length = 300
unacceptable_number_contigs = 1000
pid_threshold = 98
percent_length_overlap_plasmidfinder = 60
percent_length_overlap_resfinder = 60
percent_length_overlap_pointfinder = 95
no_exclude_genes = false
exclude_negatives = false
exclude_resistance_phenotypes = false

}
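
How these defaults turn into staramr flags can be sketched in a few lines of Python. This is an illustration only, not the pipeline's Groovy code: the names DEFAULTS, VALUE_FLAGS, BOOLEAN_FLAGS and build_args are hypothetical, each flag is emitted once, and the PointFinder and PlasmidFinder options are left out. The truthiness test imitates the Groovy ternaries in conf/modules.config, where a null, 0 or false value drops the flag.

```python
# Defaults copied from the params block in nextflow.config.
DEFAULTS = {
    "mlst_scheme": "None",
    "genome_size_lower_bound": 4000000,
    "genome_size_upper_bound": 6000000,
    "minimum_N50_value": 10000,
    "minimum_contig_length": 300,
    "unacceptable_number_contigs": 1000,
    "pid_threshold": 98,
    "percent_length_overlap_plasmidfinder": 60,
    "percent_length_overlap_resfinder": 60,
    "percent_length_overlap_pointfinder": 95,
    "no_exclude_genes": False,
    "exclude_negatives": False,
    "exclude_resistance_phenotypes": False,
}

# (parameter name, staramr flag), in the order used by ext.args.
VALUE_FLAGS = [
    ("minimum_contig_length", "--minimum-contig-length"),
    ("genome_size_lower_bound", "--genome-size-lower-bound"),
    ("genome_size_upper_bound", "--genome-size-upper-bound"),
    ("minimum_N50_value", "--minimum-N50-value"),
    ("unacceptable_number_contigs", "--unacceptable-number-contigs"),
    ("pid_threshold", "--pid-threshold"),
    ("percent_length_overlap_plasmidfinder", "--percent-length-overlap-plasmidfinder"),
    ("percent_length_overlap_resfinder", "--percent-length-overlap-resfinder"),
    ("percent_length_overlap_pointfinder", "--percent-length-overlap-pointfinder"),
]

BOOLEAN_FLAGS = [
    ("no_exclude_genes", "--no-exclude-genes"),
    ("exclude_negatives", "--exclude-negatives"),
    ("exclude_resistance_phenotypes", "--exclude-resistance-phenotypes"),
]


def build_args(params: dict) -> str:
    """Assemble the optional staramr flags from a parameter dictionary."""
    parts = []
    scheme = params.get("mlst_scheme")
    if scheme and scheme != "None":  # "None" is a sentinel meaning "no scheme"
        parts += ["--mlst-scheme", str(scheme)]
    for name, flag in VALUE_FLAGS:
        value = params.get(name)
        if value:  # like the Groovy ternary: missing or 0 skips the flag
            parts += [flag, str(value)]
    for name, flag in BOOLEAN_FLAGS:
        if params.get(name):
            parts.append(flag)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_args(DEFAULTS))
```

One consequence of the truthiness test is that a value of 0 cannot be forwarded to staramr; the schema's "minimum": 1 keeps that case from arising for validated runs.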

132 changes: 120 additions & 12 deletions nextflow_schema.json
@@ -38,6 +38,119 @@
}
}
},
"database": {
"title": "Databases",
"type": "object",
"description": "Select databases to be run on all samples.",
"fa_icon": "fas fa-terminal",
"properties": {
"pointfinder_database": {
"enum": [
"Automatic Selection",
"Enterococcus faecium",
"Enterococcus faecalis",
"Helicobacter pylori",
"Salmonella",
"Campylobacter",
"Escherichia coli"
],
"description": "Select a single Pointfinder database to use on all samples (overriding metadata option). Validated Organisms: Enterococcus faecium, Enterococcus faecalis, Helicobacter pylori, Salmonella, Campylobacter, Escherichia coli"
},
"plasmidfinder_database": {
"enum": ["All", "gram_positive", "enterobacteriales"],
"description": "The database type to use for plasmidfinder {gram_positive, enterobacteriales}. Defaults to using all available database types to search for plasmids. [All]."
},
"mlst_scheme": {
"type": "string",
"description": "Specify the MLST scheme name. See https://github.com/tseemann/mlst/tree/master/db/pubmlst for the supported schemes. [None]",
"default": "None"
}
}
},
"additional_settings": {
"title": "Additional Settings",
"type": "object",
"description": "For advanced changes to staramr",
"properties": {
"genome_size_lower_bound": {
"type": "integer",
"description": "The lower bound for our genome size for the quality metrics [Default 4000000]",
"default": 4000000,
"minimum": 1,
"maximum": 14000000
},
"genome_size_upper_bound": {
"type": "integer",
"description": "The upper bound for our genome size for the quality metrics [Default 6000000].",
"default": 6000000,
"minimum": 1,
"maximum": 14000000
},
"minimum_N50_value": {
"type": "integer",
"description": "The minimum N50 value for the quality metrics [Default 10000]",
"default": 10000,
"minimum": 1,
"maximum": 14000000
},
"minimum_contig_length": {
"type": "integer",
"description": "The minimum contig length for the quality metrics [Default 300 bp]",
"default": 300,
"minimum": 1,
"maximum": 14000000
},
"unacceptable_number_contigs": {
"type": "integer",
"description": "The minimum, unacceptable number of contigs which are equal to or above the minimum contig length for our quality metrics [Default 1000]",
"default": 1000,
"minimum": 1,
"maximum": 500000
},
"pid_threshold": {
"type": "integer",
"description": "BLAST percent identity threshold [Default 98]",
"default": 98,
"minimum": 1,
"maximum": 100
},
"percent_length_overlap_plasmidfinder": {
"type": "integer",
"description": "The percent length overlap for plasmidfinder results [Default 60.0]",
"default": 60,
"minimum": 1,
"maximum": 100
},
"percent_length_overlap_resfinder": {
"type": "integer",
"description": "The percent length overlap for resfinder results [Default 60.0]",
"default": 60,
"minimum": 1,
"maximum": 100
},
"percent_length_overlap_pointfinder": {
"type": "integer",
"description": "The percent length overlap for pointfinder results [Default 95.0]",
"default": 95,
"minimum": 1,
"maximum": 100
},
"no_exclude_genes": {
"type": "boolean",
"description": "Disable the default exclusion of some genes from ResFinder/PointFinder/PlasmidFinder [Default False]"
},
"exclude_negatives": {
"type": "boolean",
"description": "Exclude negative results (those susceptible to antimicrobials) [Default False]"
},
"exclude_resistance_phenotypes": {
"type": "boolean",
"description": "Exclude predicted antimicrobial resistances [Default False]."
}
},
"fa_icon": "fas fa-terminal"
},
"reference_genome_options": {
"title": "Reference genome options",
"type": "object",
@@ -232,6 +345,12 @@
{
"$ref": "#/definitions/input_output_options"
},
{
"$ref": "#/definitions/database"
},
{
"$ref": "#/definitions/additional_settings"
},
{
"$ref": "#/definitions/reference_genome_options"
},
@@ -244,16 +363,5 @@
{
"$ref": "#/definitions/generic_options"
}
],
"properties": {
"pointfinder_database": {
"type": "string"
},
"plasmidfinder_database": {
"type": "string"
},
"mlst_scheme": {
"type": "string"
}
}
]
}
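
The "enum", "minimum" and "maximum" keywords added to nextflow_schema.json constrain each parameter on its own. The Python sketch below, using only the standard library, shows roughly what those keywords check for a handful of parameters; it is not how the pipeline validates its inputs. The names SCHEMA and validate are hypothetical, and the final lower-versus-upper comparison is an extra check that the schema in this diff does not express.

```python
# A small excerpt of the constraints defined in nextflow_schema.json.
SCHEMA = {
    "plasmidfinder_database": {"enum": ["All", "gram_positive", "enterobacteriales"]},
    "genome_size_lower_bound": {"type": "integer", "minimum": 1, "maximum": 14000000},
    "genome_size_upper_bound": {"type": "integer", "minimum": 1, "maximum": 14000000},
    "pid_threshold": {"type": "integer", "minimum": 1, "maximum": 100},
}


def validate(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, value in params.items():
        rule = SCHEMA.get(name)
        if rule is None:
            continue  # parameters outside this excerpt are not checked
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{name}: {value!r} is not one of {rule['enum']}")
        if rule.get("type") == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{name}: expected an integer")
                continue
            if value < rule["minimum"]:
                errors.append(f"{name}: {value} is below the minimum {rule['minimum']}")
            if value > rule["maximum"]:
                errors.append(f"{name}: {value} is above the maximum {rule['maximum']}")

    # Extra cross-field check (an assumption, not part of the schema).
    lower = params.get("genome_size_lower_bound")
    upper = params.get("genome_size_upper_bound")
    if isinstance(lower, int) and isinstance(upper, int) and lower > upper:
        errors.append("genome_size_lower_bound is greater than genome_size_upper_bound")
    return errors


if __name__ == "__main__":
    print(validate({"pid_threshold": 101, "plasmidfinder_database": "gram_negative"}))
```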
20 changes: 20 additions & 0 deletions tests/main.nf.test
@@ -9,6 +9,7 @@ nextflow_pipeline {
params {
input = "$baseDir/tests/assets/test_samplesheet.csv"
outdir = "$baseDir/tests/results"
max_memory = "4.GB"
}
}

@@ -51,6 +52,25 @@ nextflow_pipeline {
assert ecoli_metadata."Scheme" == "ecoli_achtman_4"
assert ecoli_metadata."Sequence Type" == "678"

// Check the commandline parameters
// Salmonella
assert path("$baseDir/tests/results/staramr/GCA_000008105_results/GCA_000008105_settings.txt").exists()
def salmonella_settings = new File("$baseDir/tests/results/staramr/GCA_000008105_results/GCA_000008105_settings.txt")
def salmonella_cmd = salmonella_settings.readLines().get(0)
assert salmonella_cmd == "command_line = /usr/local/bin/staramr search --pointfinder-organism salmonella --minimum-contig-length 300 --genome-size-lower-bound 4000000 --genome-size-upper-bound 6000000 --minimum-N50-value 10000 --minimum-contig-length 300 --unacceptable-number-contigs 1000 --pid-threshold 98 --percent-length-overlap-plasmidfinder 60 --percent-length-overlap-resfinder 60 --percent-length-overlap-pointfinder 95 --nprocs 1 -o GCA_000008105_results GCA_000008105.fasta"

// Ecoli
assert path("$baseDir/tests/results/staramr/GCA_000947975_results/GCA_000947975_settings.txt").exists()
def ecoli_settings = new File("$baseDir/tests/results/staramr/GCA_000947975_results/GCA_000947975_settings.txt")
def ecoli_cmd = ecoli_settings.readLines().get(0)
assert ecoli_cmd == "command_line = /usr/local/bin/staramr search --pointfinder-organism escherichia_coli --minimum-contig-length 300 --genome-size-lower-bound 4000000 --genome-size-upper-bound 6000000 --minimum-N50-value 10000 --minimum-contig-length 300 --unacceptable-number-contigs 1000 --pid-threshold 98 --percent-length-overlap-plasmidfinder 60 --percent-length-overlap-resfinder 60 --percent-length-overlap-pointfinder 95 --nprocs 1 -o GCA_000947975_results GCA_000947975.fasta"

// Listeria
assert path("$baseDir/tests/results/staramr/GCF_000196035_results/GCF_000196035_settings.txt").exists()
def listeria_settings = new File("$baseDir/tests/results/staramr/GCF_000196035_results/GCF_000196035_settings.txt")
def listeria_cmd = listeria_settings.readLines().get(0)
assert listeria_cmd == "command_line = /usr/local/bin/staramr search --minimum-contig-length 300 --genome-size-lower-bound 4000000 --genome-size-upper-bound 6000000 --minimum-N50-value 10000 --minimum-contig-length 300 --unacceptable-number-contigs 1000 --pid-threshold 98 --percent-length-overlap-plasmidfinder 60 --percent-length-overlap-resfinder 60 --percent-length-overlap-pointfinder 95 --nprocs 1 -o GCF_000196035_results GCF_000196035.fasta"

// Check CSVTK_concat output (merged_*) files

// merged_detailed_summary.tsv
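
The assertions above compare the first line of each settings file against one exact string, so any change in flag order breaks them. As a possible alternative, the Python sketch below parses a "command_line = ..." line into options and positional arguments so that individual values can be checked. The function name parse_command_line is hypothetical, and the sample line is an abbreviated form of the Salmonella command from the test.

```python
import shlex

PREFIX = "command_line = "


def parse_command_line(line: str):
    """Split a settings-file command line into (options, positionals).

    A token starting with "-" takes the next token as its value unless that
    token also starts with "-" (or there is none), in which case it is stored
    as a boolean flag. Limitation: a boolean flag placed directly before a
    positional argument would be read as taking that argument as its value.
    """
    if not line.startswith(PREFIX):
        raise ValueError("line does not start with 'command_line = '")
    tokens = shlex.split(line[len(PREFIX):])
    options, positionals = {}, []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-"):
            following = tokens[i + 1] if i + 1 < len(tokens) else None
            if following is not None and not following.startswith("-"):
                options[token] = following
                i += 2
            else:
                options[token] = True
                i += 1
        else:
            positionals.append(token)
            i += 1
    return options, positionals


if __name__ == "__main__":
    sample = (
        "command_line = /usr/local/bin/staramr search "
        "--pointfinder-organism salmonella --pid-threshold 98 "
        "--nprocs 1 -o GCA_000008105_results GCA_000008105.fasta"
    )
    options, positionals = parse_command_line(sample)
    print(options)
    print(positionals)
```

Checking parsed values one by one trades the strictness of an exact string match for tolerance to reordering; which is preferable depends on whether flag order is meant to be part of the test.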