https://docs.google.com/document/d/1VQN_7j0omnzKNc566jfe39stZVhXsfG2feNEkaJ4cFc/edit
- Optimize GFF format definition and usability
- Assess how tools and individual experimental steps in the protocols affect the accuracy of the methodology.
https://www.ncbi.nlm.nih.gov/pubmed/30010675
http://www.biorxiv.org/content/biorxiv/early/2017/05/17/113050.full.pdf
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=whipakmajrwprcv&acc=GSE94586
https://academic.oup.com/nar/article/47/5/2630/5271499
https://www.biorxiv.org/content/10.1101/445437v3
Another study comparing protocols with the miRXplore sample is still to be published.
Evaluation of methodologies for microRNA biomarker detection by next generation sequencing https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6161688/ Issue: the ID number given for downloading the data is wrong. The author has been contacted to obtain the data.
Systematic assessment of commercially available low-input miRNA library preparation kits https://www.biorxiv.org/content/10.1101/702456v1.full No data available yet.
Trimming was done with the smadann Nextflow pipeline.
The following command was used for each study and type of data:
nextflow run mirtop/smadann --csv totrim.csv -c ../../om-profile.config --outdir trimmed -qs 10
Analysis was done with [bowtie] + [mirtop], driven by a [snakemake] file located in the folder of each study and data type.
snakemake -p -s run.snakefile
The miRXplore reference was parsed to keep only synthetic sequences with an edit distance of 4 or more, and the alignments were filtered to keep only reads that mapped to those unique synthetic sequences with a maximum of 4 changes. The code used for this is in the analysis folder.
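The reference selection and read filtering described above can be sketched as follows. This is a minimal illustration, not the code in the analysis folder: the function names (`edit_distance`, `select_unique`, `keep_read`) and the toy sequences are hypothetical, and it assumes that "edit distance of 4 or more" means the pairwise Levenshtein distance to every other synthetic sequence.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_unique(reference: dict, min_dist: int = 4) -> dict:
    """Keep sequences at least `min_dist` edits away from every other one."""
    too_close = set()
    for (name_a, seq_a), (name_b, seq_b) in combinations(reference.items(), 2):
        if edit_distance(seq_a, seq_b) < min_dist:
            too_close.update((name_a, name_b))
    return {name: seq for name, seq in reference.items() if name not in too_close}


def keep_read(target: str, n_changes: int, unique: dict, max_changes: int = 4) -> bool:
    """Keep a read only if it hit a unique sequence with few enough changes."""
    return target in unique and n_changes <= max_changes


# Toy reference (made-up sequences): syn1 and syn2 differ by one base,
# so both are dropped; syn3 is far from both and is kept.
reference = {
    "syn1": "ACGTACGTACGTACGTACGTAC",
    "syn2": "ACGTACGTACGTACGTACGTAA",
    "syn3": "TTTTGGGGCCCCAAAATTTTGG",
}
unique = select_unique(reference)
```

The pairwise comparison is quadratic in the number of reference sequences, which is acceptable for a reference of about a thousand short sequences; the number of changes per read would come from the alignment output.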
The data are publicly available in the mirtop AWS space.
It currently contains the tewari, wright, kim and dsrg data.
For human data we used miRBase 22 to map all sequences. The same filtering step was applied here.
The data are publicly available in the mirtop AWS space.
- bcbio smallRNA-seq pipeline + isomiRs - in charge: Lorena Pantano
- isomiR-SEA - in charge: Gianvito Urgese
- ChimiRa, miRge - in charge: Marc Halushka
- sRNAbench - in charge: Michael Hackenberg
- Prost - Thomas Desvignes
- miRge - Marc K. Halushka
- (Add your tool here and the person who will run it)
- Reproducibility of replicates
- Reproducibility of protocols
- Reproducibility of tools
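One possible way to put a number on these three kinds of reproducibility is the overlap of the isomiRs detected in two runs (two replicates, two protocols, or two tools). The sketch below uses the Jaccard index; the metric choice, the function names (`jaccard`, `pairwise_overlap`), and the isomiR labels are assumptions for illustration, not the measure used in the report.

```python
from itertools import combinations


def jaccard(a, b) -> float:
    """Shared items divided by all items seen in either set."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(detected: dict) -> dict:
    """Jaccard index for every pair of runs, keyed by (run_x, run_y)."""
    return {
        (x, y): jaccard(detected[x], detected[y])
        for x, y in combinations(sorted(detected), 2)
    }


# Hypothetical isomiR labels detected in two replicates.
detected = {
    "rep1": {"hsa-let-7a-5p|ref", "hsa-let-7a-5p|iso_3p:+1", "hsa-miR-21-5p|ref"},
    "rep2": {"hsa-let-7a-5p|ref", "hsa-miR-21-5p|ref", "hsa-miR-21-5p|iso_5p:-1"},
}
overlap = pairwise_overlap(detected)
```

The same function applies unchanged whether the keys of `detected` are replicates, protocols, or tools; a count-aware measure such as a correlation of expression values would be a natural complement.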
The updated report can be found here
- Select random public data
- Run with all the tools listed above
- Put data in common space
- Adapt the tool outputs to the GFF format
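The last step, converting a tool's output into GFF, could look roughly like the sketch below. It writes a generic nine-column, tab-separated GFF3 line with `key=value` attributes; the function name `gff3_line`, the input record, the coordinates, the count, and the attribute keys are all illustrative assumptions and should be checked against the GFF format definition adopted by the project.

```python
def gff3_line(seqid, source, feature, start, end, strand, attributes,
              score=".", phase="."):
    """Format one record as a 9-column, tab-separated GFF3 line."""
    attr_text = ";".join(f"{key}={value}" for key, value in attributes.items())
    fields = [seqid, source, feature, str(start), str(end),
              str(score), strand, phase, attr_text]
    return "\t".join(fields)


# Hypothetical record as a tool might report it (placeholder values).
record = {
    "precursor": "hsa-let-7a-1",
    "mirna": "hsa-let-7a-5p",
    "sequence": "TGAGGTAGTAGGTTGTATAGTT",
    "start": 6,
    "end": 27,
    "strand": "+",
    "count": 120,
}

line = gff3_line(
    seqid=record["precursor"],
    source="example_tool",      # hypothetical tool name
    feature="ref_miRNA",        # assumed feature type label
    start=record["start"],
    end=record["end"],
    strand=record["strand"],
    attributes={
        "Read": record["sequence"],
        "Name": record["mirna"],
        "Parent": record["precursor"],
        "Expression": record["count"],
    },
)
```

Keeping the conversion in one small function per tool makes it easier to compare outputs once they are all in the common space.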
Sample SRR5756178 is a whole blood small RNA-seq run from this manuscript https://academic.oup.com/nar/article/4080663 and is part of project PRJNA391912. It has ~2.8 million reads, of which ~2.6 million are miRNA reads.
The benchmark was done with synthetic isomiRs for one human miRNA; see results.