bioawk does not stop parsing a file on nextfile #45

Open
RamRS opened this issue Feb 27, 2023 · 1 comment

RamRS commented Feb 27, 2023

Scenario: I have 4 FASTQ files in my current directory, and I want to print the first 10 sequence names from each file. Ideally, the following command should finish almost instantly:

bioawk -c fastx 'FNR<11 {print $name} FNR==11{nextfile}' *.fastq

For comparison, with regular awk, the following (which prints only the first line of each file) runs extremely fast:

awk 'FNR==1{print} FNR>1{nextfile}' *.fastq

But bioawk does not stop processing the file when it encounters nextfile. Instead, it seems to continue parsing the whole file, which is painfully slow when dealing with gigantic files. It also seems to skip a file after running into the nextfile command, which is unexpected.
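
Until nextfile behaves as expected, one possible workaround is to run one process per file and stop with exit instead of nextfile. The sketch below uses plain awk and assumes strict four-line FASTQ records (no wrapped sequence lines); the function name first_names is made up for illustration. The bioawk variant mentioned in the comment is an untested assumption.

```shell
#!/bin/sh
# first_names N FILE...  (hypothetical helper, not part of bioawk)
# Print the first N read names of each FASTQ file, assuming every
# record is exactly four lines, so header lines satisfy NR % 4 == 1.
first_names() {
    n=$1
    shift
    for f in "$@"; do
        # One awk process per file: "exit" stops reading immediately,
        # so nextfile is not needed at all.
        awk -v n="$n" '
            NR % 4 == 1 {
                print substr($1, 2)    # drop the leading "@", keep the first word
                if (++count >= n) exit
            }
        ' "$f"
    done
}

# Usage:
#   first_names 10 *.fastq
#
# Untested assumption: the same per-file loop should work with bioawk, e.g.
#   bioawk -c fastx -v n=10 '{ print $name } NR >= n { exit }' "$f"
```

This trades a single invocation for one process per file, which is negligible for a handful of large files.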

malcook commented Oct 11, 2023

Agreed. Calling nextfile as in the example above causes only every other .fastq file to be processed (i.e. files in even positions in ARGV are skipped).

This is rather troublesome.

Might this be a consequence of the stated "Potential limitations", @lh3?

bioawk --version
awk version 20110810
