If I want to include a long read accession from NCBI, is there a way to detect that reads are long and not short? #68

masudermann · 2024-05-02T00:17:38Z

Description of feature

Out of curiosity, I ran Camilo's small mixed dataset (mixed.csv) and ran into an issue with the fungal sample (see issue 65). I then replaced the Fusarium short-read accession with a long-read accession, just to see what would happen.

Not surprisingly, the pipeline treats the raw long-read data as a short-read sample, since I didn't include the nanopore or pacbio columns containing file paths in the metadata sheet.

Is there a way to add a detection step so that, if the reads pulled from SRA happen to be long reads, the pipeline runs the necessary 150 bp seqkit read trim step even if the user didn't explicitly include nanopore or pacbio columns?
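One possible shape for such a detection step, as a rough sketch only: sample the first reads from a downloaded FASTQ file and look at their median length. The function names and the 1000 bp cutoff are assumptions made for illustration and are not part of the pipeline. The parser also assumes plain 4-line FASTQ records with unwrapped sequences.

```python
import io
from statistics import median


def read_lengths(fastq_handle, max_reads=1000):
    """Return the lengths of up to max_reads sequences from a FASTQ handle.

    Assumes strict 4-line records (header, sequence, '+', quality).
    """
    lengths = []
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record holds the sequence
            lengths.append(len(line.strip()))
            if len(lengths) >= max_reads:
                break
    return lengths


def guess_read_type(lengths, long_read_cutoff=1000):
    """Hypothetical heuristic: 'long' if the median length reaches the cutoff."""
    if not lengths:
        raise ValueError("no reads found, cannot guess read type")
    return "long" if median(lengths) >= long_read_cutoff else "short"


# Small in-memory example standing in for a downloaded FASTQ file.
example = io.StringIO("@read1\n" + "A" * 150 + "\n+\n" + "I" * 150 + "\n")
print(guess_read_type(read_lengths(example)))
```

A length cutoff is only a heuristic. Heavily fragmented long-read runs could fall below it, so metadata reported by SRA is likely a more reliable signal where it is available.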

FastQC was also run on the long-read sample. That may be OK given recent updates to that program, but it may require more research.

This isn't so much a bug as something users might encounter and ask about, unless we are explicit in the documentation that the SRA option is best used for pulling short-read data only.

In short, it would be nice to have a feature to detect what types of reads are downloaded from SRA.

zachary-foster · 2024-07-02T16:29:30Z

This should be working now in the dev branch. When the user specifies an NCBI accession but not a sequence type, the sequence type will be looked up. The sequence type is still required when specifying a local file as input.
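For readers wondering what such a lookup might involve, here is a minimal sketch, not the dev branch's actual code. It assumes the platform string for a run has already been retrieved from SRA metadata, and it maps that string to a sequence type. The mapping table, the function name, and the lowercase type labels are hypothetical. The platform values themselves follow SRA's platform vocabulary.

```python
# Hypothetical mapping from SRA platform values to pipeline sequence types.
PLATFORM_TO_TYPE = {
    "ILLUMINA": "illumina",
    "OXFORD_NANOPORE": "nanopore",
    "PACBIO_SMRT": "pacbio",
}


def sequence_type_from_platform(platform):
    """Translate an SRA platform string into a sequence type label.

    Raises ValueError for platforms this sketch does not know about, so that
    an unexpected value is reported rather than silently treated as short reads.
    """
    key = platform.strip().upper()
    if key not in PLATFORM_TO_TYPE:
        raise ValueError(f"unrecognized sequencing platform: {platform!r}")
    return PLATFORM_TO_TYPE[key]


print(sequence_type_from_platform("OXFORD_NANOPORE"))
```

Failing loudly on an unknown platform avoids the situation described in this issue, where long reads are processed as if they were short reads.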

masudermann added the enhancement New feature or request label May 2, 2024

zachary-foster closed this as completed Jul 2, 2024
