There appears to be a circular issue arising from the interplay between the definition of the named .fasta set (which has duplicate records within a genome dropped, see https://github.com/nhoffman/ya16sdb/blob/master/SConstruct#L496) and the logic that adds all type strain records back to the 'trusted' .fasta output (and BLAST db). As a result, the trusted BLAST db contains the dropped duplicate alleles for some seqs within is_type genomes, and these records lack info about the nearest type strain, since the named fa is used as the target in https://github.com/nhoffman/ya16sdb/blob/master/SConstruct#L737
Possible solutions:
- build 'named_type_hits' from the trusted fa, either instead of or in addition to the current vsearch output
- add logic to prevent the drop of duplicate alleles from type records
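The idea of protecting type records from the duplicate drop could look roughly like the sketch below. It is not the code in the SConstruct; the record layout (`seqname`, `genome`, `seq`, `is_type` keys), the function name `drop_duplicate_alleles`, and the toy sequences are all assumptions made for illustration. Only the NZ_CP056776 accession comes from the example in this issue.

```python
def drop_duplicate_alleles(records, protect_types=True):
    """Drop repeated (genome, seq) pairs, optionally sparing type strain records.

    `records` is a list of dicts with hypothetical keys:
    seqname, genome, seq, is_type.
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec["genome"], rec["seq"])
        is_duplicate = key in seen
        seen.add(key)
        # A duplicate allele is dropped unless it belongs to a type strain
        # record and protection is switched on.
        if is_duplicate and not (protect_types and rec["is_type"]):
            continue
        kept.append(rec)
    return kept


# Toy data: two identical alleles in a type genome, two in a non-type genome.
records = [
    {"seqname": "a1", "genome": "NZ_CP056776", "seq": "ACGT", "is_type": True},
    {"seqname": "a2", "genome": "NZ_CP056776", "seq": "ACGT", "is_type": True},
    {"seqname": "b1", "genome": "G2", "seq": "TTGA", "is_type": False},
    {"seqname": "b2", "genome": "G2", "seq": "TTGA", "is_type": False},
]

kept_names = [r["seqname"] for r in drop_duplicate_alleles(records)]
```

With protection on, both alleles of the type genome stay in the named set, so every type record later added to the trusted set would also have been a vsearch target. The cost is that the named set is no longer fully deduplicated for type genomes.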
No records for this allele in the 'named' set; it's a duplicate of another allele for the NZ_CP056776 genome.
Examining the 'trusted' set confirms the presence of the record in seqs.fasta (which would also be the target for the BLAST db used in the pipeline output where the bug was detected).
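A quick way to list every affected record, rather than checking one allele by hand, is to compare the sequence ids in the two fasta sets. The sketch below is only an illustration: the helper names `fasta_ids` and `trusted_without_named` and the toy ids are made up, and it assumes the id is the first whitespace-delimited token on each header line.

```python
def fasta_ids(lines):
    """Collect sequence ids (first token after '>') from fasta-formatted lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def trusted_without_named(trusted_lines, named_lines):
    """Ids present in the trusted set but absent from the named set."""
    return sorted(fasta_ids(trusted_lines) - fasta_ids(named_lines))


# Toy input standing in for the two fasta files (ids are hypothetical).
trusted = [">seq1 some description", "ACGT", ">seq2", "ACGT", ">seq3", "TTGA"]
named = [">seq1 some description", "ACGT", ">seq3", "TTGA"]

missing = trusted_without_named(trusted, named)
```

Because file objects iterate line by line, the same functions can be given open handles to the trusted and named fasta files. Any id reported this way is in the trusted set but was never a vsearch target, which is the situation described above.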
The third option seems easiest implementation-wise.