bioawk does not stop parsing a file on `nextfile` #45

RamRS · 2023-02-27T19:49:14Z

Scenario: I have 4 FASTQ files in my current directory, and I want to print the first 10 seq names of each file. Ideally, the following command should work very quickly:

bioawk -c fastx 'FNR<11 {print $name} FNR==11{nextfile}' *.fastq

For example, with regular awk, the following works extremely fast:

awk 'FNR==1{print} FNR>1{nextfile}' *.fastq

But bioawk does not stop processing the file when it encounters `nextfile`. Instead, it seems to continue parsing the whole file, which is crazy when dealing with gigantic files. Plus, it seems to skip a file whenever it runs into the `nextfile` command, which is weird.
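Until this is sorted out, one possible workaround is to skip `nextfile` altogether and run plain awk once per file, calling `exit` after the last header that is needed. This is only a sketch under a few assumptions: the files are uncompressed, strict 4-line-per-record FASTQ (no wrapped sequences), and the name is everything after `@` up to the first whitespace. The function name `first_names` is made up for illustration and is not part of bioawk.

```shell
# Hypothetical helper: print the first N read names of each FASTQ file.
# Assumes uncompressed, strict 4-line FASTQ records.
first_names() {
    n=$1
    shift
    for f in "$@"; do
        # One awk process per file, so `exit` ends reading of that file only.
        # Header lines are lines 1, 5, 9, ...; the Nth header is line 4*N - 3.
        awk -v n="$n" '
            FNR % 4 == 1     { sub(/^@/, ""); print $1 }  # strip "@", keep name
            FNR >= 4 * n - 3 { exit }                     # stop after Nth header
        ' "$f"
    done
}
```

With this defined, `first_names 10 *.fastq` should give the intended output without reading past the 37th line of any file.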

malcook · 2023-10-11T16:27:40Z

Agreed. Calling `nextfile` as in the example above causes only every other .fastq file to be processed (i.e. files in even positions in ARGV are skipped).

This is rather troublesome.

Might this be a consequence of the stated "Potential limitations", @lh3?

bioawk --version
awk version 20110810
