Out of curiosity, I was running Camilo's small mixed dataset (mixed.csv) and ran into an issue with the fungal sample (see issue 65). I then replaced the Fusarium short-read accession with a long-read accession, just to see what would happen.
Not surprisingly, it treated the raw long-read data as a short-read sample, since I didn't include the nanopore or pacbio columns containing file paths in the metadata sheet.
Is there a way to include a detection step so that, if the reads pulled from SRA happen to be long reads, the pipeline runs the necessary 150 bp seqkit read trim step even if the user didn't explicitly include nanopore or pacbio columns?
FastQC was also run on the long-read sample, which may be OK given recent updates to that program, but may require more research.
This isn't so much a bug as something that users might encounter and ask about, unless we are explicit in the documentation that the SRA option is best used for pulling short-read data only.
In short, it would be nice to have a feature to detect what types of reads are downloaded from SRA.
This should be working now in the dev branch. When the user specifies an NCBI accession but not a sequence type, the sequence type will be looked up. The sequence type is still required when specifying a local file as input.
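For anyone curious what a lookup like this could look like, here is a minimal sketch. It is not the code in the dev branch. It assumes the SRA run metadata is already available as CSV text with `Run` and `Platform` columns (as in an SRA RunInfo export), and the function name, the mapping to `illumina`/`nanopore`/`pacbio`, and the accessions in the example are all hypothetical placeholders.

```python
import csv
import io

# Assumed mapping from SRA "Platform" values to the read types discussed
# in this issue. Anything not listed here is reported as "unknown".
PLATFORM_TO_READ_TYPE = {
    "ILLUMINA": "illumina",
    "OXFORD_NANOPORE": "nanopore",
    "PACBIO_SMRT": "pacbio",
}


def read_types_from_runinfo(runinfo_csv):
    """Map each run accession in RunInfo-style CSV text to a read type."""
    reader = csv.DictReader(io.StringIO(runinfo_csv))
    result = {}
    for row in reader:
        platform = (row.get("Platform") or "").strip().upper()
        result[row["Run"]] = PLATFORM_TO_READ_TYPE.get(platform, "unknown")
    return result


# Placeholder accessions, not real runs.
example = """Run,Platform,LibraryLayout
SRR0000001,ILLUMINA,PAIRED
SRR0000002,OXFORD_NANOPORE,SINGLE
SRR0000003,PACBIO_SMRT,SINGLE
"""

print(read_types_from_runinfo(example))
```

With something like this, a sample that comes back as `nanopore` or `pacbio` could be routed to the long-read steps without the user filling in those columns, and an `unknown` result could fall back to asking the user for the sequence type.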