coverage-based instead of counter-based normalisation #71

MarkusHaak · 2021-01-29T11:32:23Z

This pull request addresses normalisation problems we encountered while experimenting with sequencing SARS-CoV-2 using long amplicons (https://www.biorxiv.org/content/10.1101/2020.05.28.122648v3) and rapid sequencing kits. In these cases, the amplicon coverage essentially follows a normal distribution, and counter-based normalisation often leads to low-coverage terminal regions close to the overlap of two amplicons.

Instead of simply counting the number of reads for each primer pair, the coverage of both strands is tracked in terms of the start and end points of alignments. A read is dropped only if the strand-specific coverage at every position in its aligned region is already equal to or above the requested normalisation threshold. In most cases, this should only marginally influence the behaviour of the align_trim script: the normalisation threshold becomes a lower bound instead of an upper bound.
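To make the drop rule concrete, here is a minimal sketch of the idea in plain Python. It is not the code from this pull request: the names `make_tracker` and `keep_read` are hypothetical, coordinates are assumed to be 0-based and half-open, and coverage is stored per position for simplicity rather than as alignment start and end points.

```python
def make_tracker(ref_len):
    """Hypothetical helper: one coverage counter per reference position and strand."""
    return {"+": [0] * ref_len, "-": [0] * ref_len}


def keep_read(coverage, start, end, strand, threshold):
    """Return True if the read should be kept, and record its coverage.

    The read spans the half-open interval [start, end) on `strand`
    (an assumption of this sketch). It is dropped only if every
    position it covers has already reached `threshold` on that strand.
    """
    if start >= end:
        # Empty alignment: nothing to contribute, so drop it.
        return False

    strand_cov = coverage[strand]
    if all(strand_cov[pos] >= threshold for pos in range(start, end)):
        return False  # region is already saturated on this strand

    # At least one position is still below the threshold: keep the read.
    for pos in range(start, end):
        strand_cov[pos] += 1
    return True


cov = make_tracker(10)
print(keep_read(cov, 0, 6, "+", 2))   # kept, region was empty
print(keep_read(cov, 0, 6, "+", 2))   # kept, region was at 1x
print(keep_read(cov, 0, 6, "+", 2))   # dropped, region is saturated at 2x
print(keep_read(cov, 4, 10, "+", 2))  # kept, positions 6-9 are still uncovered
```

Because a read is kept whenever any position it covers is still below the threshold, positions near the middle of a region can end up above the threshold, which is why the threshold acts as a lower bound here.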

While the coverage is tracked for each strand individually, it is currently not tracked individually for each amplicon in overlap regions. I cannot think of a scenario where this would be problematic, but I mention it in case it matters for any use case.

…ormalisation Instead of simply counting the number of reads for each primer pair, the coverage of both strands is tracked in terms of start and end points of alignments. A read is dropped only if the strand-specific coverage of every position in the aligned region is already equal to or above the requested normalisation threshold. In most cases, this should only marginally influence the behaviour of the align_trim script in that it makes the normalisation threshold a lower boundary instead of an upper boundary. But this is of importance for sequencing experiments with long amplicons and a rapid sequencing kit, where amplicon coverage essentially follows a normal distribution.

Markus Haak and others added 2 commits January 29, 2021 11:35

Merge branch '1.3.0-dev' into feature/mappingBasedNormalisation

d5bc599
