I think we need to revisit our trimming parameters for ThruPlex NGS16S libraries (more an issue for the clampi pipeline than this one). If it's straightforward to do, it might be helpful to preserve the reads filtered out at filterAndTrim for troubleshooting. I thought there was a flag in dada2 that would do this, but I didn't see one on re-reading the docs, so this would not be nearly as simple as I was thinking, and depending upon the effort involved it might not be worth it (i.e., watching what remains may serve just as well). I talked with Chris and Noah about this last week and said I'd follow up in GitHub, but from my perspective there's no need to jump on this right away.
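In the meantime, one workaround is to diff read IDs between the raw and filtered fastqs after filterAndTrim and write out whatever was dropped. A rough sketch only, assuming ShortRead is available; file names and parameter values below are placeholders, not our pipeline settings:

```r
# Sketch: recover the reads that filterAndTrim() discards by diffing
# read IDs between the raw and filtered fastq files.
library(dada2)
library(ShortRead)

raw  <- "sample_R1.fastq.gz"                 # hypothetical raw input
filt <- "filtered/sample_R1_filt.fastq.gz"   # hypothetical filtered output

filterAndTrim(raw, filt, truncLen = 240, trimLeft = 20, minLen = 100)

raw_fq   <- readFastq(raw)
kept_ids <- as.character(id(readFastq(filt)))
dropped  <- raw_fq[!as.character(id(raw_fq)) %in% kept_ids]
writeFastq(dropped, "filtered/sample_R1_dropped.fastq.gz")
```

This only post-processes the existing outputs, so it wouldn't require any change to the filtering step itself; the cost is a second pass over the raw fastq.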
Hi @marykstewart, I have a PR to include the filter_and_trim filtered and dropped sequences in the dada2-nf output. I will coordinate with @dhoogest to make sure they are available for you to analyze in the 16S pipeline.
Chris, the lines I'd commented out to preserve the work/ directory are 742-746 in https://gitlab.labmed.uw.edu/molmicro/clampi-ngs/-/blob/dev/SConstruct. Doing so still preserves the directory, but it now looks to be empty.
Background:
We're observing significant read loss at the dada2 filtering step with ThruPlex NGS16S library preps; compare this ThruPlex/standard pair: https://share.labmed.uw.edu/molmicro/markergene/24N0316_SRS16S/report/
https://share.labmed.uw.edu/molmicro/markergene/24N0300_NGS16S/report/
This is likely due to minimum length requirements we've set in the dada2 parameters. When I spoke with Chris and Noah last week, I was somewhat mystified because I'd done an analysis where I reduced minLen from 100 to 20, but the filtering numbers didn't budge: https://gitlab.labmed.uw.edu/molmicro/ops/-/issues/1912#note_127852.
I read the dada2 docs again and realized that while we do have a minimum length of 100 set in the pipeline, it's not actually doing anything because it only operates after truncLen. We have truncLen set to R1=240 and R2=220 (https://gitlab.labmed.uw.edu/molmicro/clampi-ngs/-/blob/dev/data/dada_params_16S.json), and reads shorter than truncLen - trimLeft (= 20) will be discarded (https://www.bioconductor.org/packages/release/bioc/manuals/dada2/man/dada2.pdf). So heavy losses at filterAndTrim are not surprising given the adapter positions in reads of ThruPlex NGS16S libraries; see the adapter-content examples below (and the parameter sketch after the links):
https://share.labmed.uw.edu/molmicro/markergene/24N0316_SRS16S/24R215_fastqc/multiqc_report.html#fastqc_adapter_content
https://share.labmed.uw.edu/molmicro/markergene/24N0315_SRS16S/fastqc_24R213/multiqc_report.html#fastqc_adapter_content
https://share.labmed.uw.edu/molmicro/markergene/24N0329_SRS16S/fastqc_24R224/multiqc_report.html#fastqc_adapter_content
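To illustrate the interaction, here's a minimal sketch (made-up file names; only the parameters discussed above are shown) of why lowering minLen can't rescue these reads: anything too short to be truncated to the full truncLen is discarded before minLen is ever consulted, so the reads.out counts for minLen=100 and minLen=20 come out essentially identical.

```r
# Sketch: minLen has little effect here because the truncLen filter acts first.
# File names are placeholders; truncLen/trimLeft mirror dada_params_16S.json.
library(dada2)

r1 <- "sample_R1.fastq.gz"   # hypothetical inputs
r2 <- "sample_R2.fastq.gz"

run_filter <- function(min_len, outdir) {
  filterAndTrim(
    r1, file.path(outdir, basename(r1)),
    r2, file.path(outdir, basename(r2)),
    truncLen = c(240, 220),  # reads too short to reach this are discarded
    trimLeft = 20,
    minLen   = min_len       # enforced after trimming/truncation, so it rarely bites
  )
}

run_filter(100, "filt_minlen100")  # returns a reads.in / reads.out matrix
run_filter(20,  "filt_minlen20")   # expect essentially the same reads.out
```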