rebase hetzy mod to 1.2.1 upstream #77

Open
wants to merge 10 commits into
base: master

@macieksk macieksk commented Apr 1, 2021

This is an adjusted, rebased version of pull request #58.

The original `artic_vcf_filter --medaka` (used in the ARTIC Nanopore Medaka pipeline) filters out heterozygotic variants completely. As a result, otherwise good mosaic variants present in sequenced virus samples are omitted. For example, a genuine variant present in only 70% of reads was filtered out. This patch adds options for finer control over heterozygotic variant filtering, with moderately permissive defaults that should still filter out nanopore homopolymer false positives.
The old behavior can be restored with `--hetmf Inf`.

usage: artic_vcf_filter [-h] [--nanopolish] [--medaka]
                        [--no-frameshifts]
                        [--heterozygotic-min-fraction HETMF]
                        [--heterozygotic-min-reads HETMR]
                        inputvcf output_pass_vcf output_fail_vcf

positional arguments:
  inputvcf
  output_pass_vcf
  output_fail_vcf

optional arguments:
  -h, --help            show this help message and exit
  --nanopolish
  --medaka
  --no-frameshifts
  --heterozygotic-min-fraction HETMF, --hetmf HETMF
                        minimal fraction of alternate allele reads for a
                        heterozygotic variant to be accepted (for medaka filter) (default: 0.5)
  --heterozygotic-min-reads HETMR, --hetmr HETMR
                        minimal number of alternate allele reads for a
                        heterozygotic variant to be accepted (for medaka filter) (default: 12)
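
For readers who want to experiment with the options, the help text above can be approximated with a small `argparse` parser. This is only a sketch reconstructed from the usage message, not the code in this patch: the `build_parser` helper and the `dest` names `hetmf` and `hetmr` are assumptions. It mainly illustrates why `--hetmf Inf` works: `float("Inf")` parses to infinity, a threshold that no fraction can reach.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage message (not the patch's code)."""
    parser = argparse.ArgumentParser(
        prog="artic_vcf_filter",
        # Appends "(default: ...)" to every option that has a help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--nanopolish", action="store_true")
    parser.add_argument("--medaka", action="store_true")
    parser.add_argument("--no-frameshifts", action="store_true")
    parser.add_argument(
        "--heterozygotic-min-fraction", "--hetmf",
        dest="hetmf", type=float, default=0.5,
        help="minimal fraction of alternate allele reads for a "
             "heterozygotic variant to be accepted (for medaka filter)",
    )
    parser.add_argument(
        "--heterozygotic-min-reads", "--hetmr",
        dest="hetmr", type=int, default=12,
        help="minimal number of alternate allele reads for a "
             "heterozygotic variant to be accepted (for medaka filter)",
    )
    parser.add_argument("inputvcf")
    parser.add_argument("output_pass_vcf")
    parser.add_argument("output_fail_vcf")
    return parser


if __name__ == "__main__":
    # File names here are placeholders; nothing is opened.
    args = build_parser().parse_args(
        ["--medaka", "--hetmf", "Inf", "in.vcf", "pass.vcf", "fail.vcf"]
    )
    print(args.hetmf, args.hetmr)
```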

An example of a heterozygotic variant accepted with the default parameters:
MN908947.3 24872 . G T 500.0 PASS DP=400;AC=120,227;AM=53;MC=0;MF=0.0;MB=0.0;AQ=11.48;GM=1;PH=6.02,6.02,6.02,6.02;SC=None; GT:GQ:PS:UG:UQ 0/1:147.24:.:0/1:147.24

An example of a homopolymer false positive that is filtered out:
MN908947.3 10527 . C CT 96.06 PASS DP=398;AC=130,59;AM=209;MC=0;MF=0.0;MB=0.0;AQ=7.4;GM=1;PH=6.02,6.02,6.02,6.02;SC=None; GT:GQ:PS:UG:UQ 0/1:96.06:.:0/1:96.06
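
To make the two outcomes concrete, here is a minimal sketch of how the two thresholds could be applied to the INFO fields of the records above. It is not the implementation in this patch. It assumes that `AC` lists reference and alternate read counts in that order, and that the fraction is taken against `DP` (in both records `DP` equals the two `AC` counts plus `AM`). The helper names `parse_info` and `accept_het` are hypothetical.

```python
def parse_info(info):
    """Parse a VCF INFO string such as 'DP=400;AC=120,227' into a dict."""
    fields = {}
    for item in info.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def accept_het(info, min_fraction=0.5, min_reads=12):
    """Return True if a heterozygotic call passes both thresholds.

    Assumption: AC = "<ref reads>,<alt reads>" and the alternate
    fraction is alt reads divided by DP.
    """
    fields = parse_info(info)
    depth = int(fields["DP"])
    _ref_reads, alt_reads = (int(x) for x in fields["AC"].split(","))
    fraction = alt_reads / depth if depth else 0.0
    return alt_reads >= min_reads and fraction >= min_fraction


if __name__ == "__main__":
    accepted = "DP=400;AC=120,227;AM=53"   # position 24872, G>T
    rejected = "DP=398;AC=130,59;AM=209"   # position 10527, C>CT
    print(accept_het(accepted))                             # 227/400 is above 0.5
    print(accept_het(rejected))                             # 59/398 is below 0.5
    print(accept_het(accepted, min_fraction=float("inf")))  # --hetmf Inf
```

Under these assumptions the first record passes (227 of 400 reads support the alternate allele) and the second fails (59 of 398). Setting the fraction threshold to infinity rejects every heterozygotic call, which matches the old behavior described above.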
