Repeating error message: [W::hts_idx_load3] The index file is older than the data file: #244

Open
zli-lilly opened this issue Sep 23, 2024 · 10 comments
Labels: input data (Issue is caused by input data), question (Further information is requested)

Comments

@zli-lilly

Thank you for developing such a great tool. I just got an error message that I hope you can help with.
The line below is printed repeatedly, and the run does not seem to progress to the next step. Could you please recommend the best practice for handling this error? Thank you.
[W::hts_idx_load3] The index file is older than the data file:

@andrewprzh
Collaborator

andrewprzh commented Sep 23, 2024

@zli-lilly

Thanks for the feedback!

This message itself is not a problem. It simply means your .bai index has an older modification date than the BAM file itself.
It can happen if the files were copied from another location, and the index file was copied first.
If you want to get rid of this message, simply rebuild the index files with samtools index.

Best
Andrey
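
For readers who hit the same warning, here is a minimal sketch of the timestamp check and rebuild described above. It assumes the index sits next to the BAM as `<file>.bam.bai` and that `samtools` is on the `PATH`; the function names `index_is_stale` and `reindex_if_stale` are illustrative and not part of samtools or IsoQuant.

```shell
# Succeeds (exit status 0) when the index is missing or older than the BAM.
# Assumption: the index is named <file>.bam.bai and lives next to the BAM.
index_is_stale() {
    local bam="$1"
    local idx="${bam}.bai"
    [ ! -e "$idx" ] || [ "$idx" -ot "$bam" ]
}

# Rebuilds the index with `samtools index` only for BAM files that need it.
reindex_if_stale() {
    local bam
    for bam in "$@"; do
        if index_is_stale "$bam"; then
            echo "re-indexing $bam"
            samtools index "$bam"
        fi
    done
}
```

After sourcing the functions, a call such as `reindex_if_stale sample1.bam sample2.bam` (hypothetical file names) would leave up-to-date indexes untouched.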

@zli-lilly
Author

Thank you, Andrey. Much appreciated.

@andrewprzh andrewprzh added question Further information is requested input data Issue is caused by input data labels Sep 23, 2024
@zli-lilly
Author

Hey Andrey. I ignored the warning messages as you suggested. The pipeline was terminated abruptly after several hours. Attached is the log file. Could you please help me take a look? Your help is greatly appreciated.
isoquant.log

@andrewprzh
Collaborator

@zli-lilly

The cause of the error is unknown. It looks like one of the threads was killed, possibly due to RAM consumption or CPU quotas on the server. There is no failure in IsoQuant itself.

On another topic, you have another warning about your annotation:
Gene LOC102142360 has no exons / transcripts, check your input annotation
This suggests something is wrong with your GTF. Could you send me a few examples, e.g. for this particular gene?
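
One way to pull such an example out of an annotation is to select every record whose attribute column carries the gene's `gene_id`. This is a sketch only: it assumes standard GTF attribute syntax (`gene_id "…";`), and `gtf_records_for_gene` is a hypothetical helper name.

```shell
# Prints every GTF line whose attributes contain gene_id "<gene_id>".
# The closing quote is part of the pattern, so LOC1 does not match LOC10.
gtf_records_for_gene() {
    local gtf="$1"
    local gene_id="$2"
    grep -F "gene_id \"${gene_id}\"" "$gtf"
}
```

For example, `gtf_records_for_gene annotation.gtf LOC102142360 > LOC102142360.txt`, where `annotation.gtf` stands in for the actual annotation file.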

@zli-lilly
Author

Hey Andrey,
Below are the CPU core and memory settings from the batch header of the run. Would you suggest other settings?
#SBATCH -c 4
#SBATCH --mem 100G # Total size of memory
I also attached the LOC102142360 example, which is a pseudogene from a Cyno monkey assembly. Any input would be super helpful.
LOC102142360.txt

@andrewprzh
Collaborator

Typically 100G should be enough. The error message does not tell us anything meaningful. Could you re-run IsoQuant to see whether it reproduces?

Regarding the GTF file, are there any transcripts/exons belonging to this gene?

@zli-lilly
Author

I had already resubmitted the job yesterday, before commenting on the error, and the new log showed the same error message.
About LOC102142360, there are no transcripts/exons available for this gene.

@andrewprzh
Collaborator

Could you show me how you submit the job? Could you send the second log as well?

About LOC102142360, there are no transcripts/exons available for this gene.

That's a bit odd. IsoQuant expects genes to have transcripts and exons; otherwise it cannot process them.
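
To see how many genes in an annotation are in this situation, the GTF can be scanned for `gene` records that have no `exon` record with the same `gene_id`. The sketch below assumes a tab-separated GTF with `gene_id "…"` in the ninth column; `genes_without_exons` is a hypothetical helper, and it does not reproduce IsoQuant's own annotation check.

```shell
# Lists gene_ids that have a "gene" record but no "exon" record in the GTF.
genes_without_exons() {
    local gtf="$1"
    awk -F'\t' '
        /^#/ { next }                      # skip header and comment lines
        {
            # extract the value of gene_id "..." from the attribute column
            if (match($9, /gene_id "[^"]+"/)) {
                id = substr($9, RSTART + 9, RLENGTH - 10)
                if ($3 == "gene") genes[id] = 1
                if ($3 == "exon") has_exon[id] = 1
            }
        }
        END {
            for (id in genes) if (!(id in has_exon)) print id
        }
    ' "$gtf" | sort
}
```

Running `genes_without_exons annotation.gtf | wc -l` (with `annotation.gtf` as a placeholder name) gives a quick count of such genes.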

@zli-lilly
Author

I've attached the sh script (as txt to meet the upload requirement) and the log file here. I simply "sbatch" the sh job to the HPC.
Nanopore_isoquant.txt
isoquant.log
The assembly is a Cyno species with lots of pseudogenes. Would IsoQuant simply ignore those, or should I manually remove them from the GTF?

@andrewprzh
Collaborator

I see that the error message now occurred at a different moment, so something is killing one of the IsoQuant processes. The problem doesn't seem to be on the IsoQuant side. You may try requesting more RAM or contacting your system administrator; the system logs might have some information.
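
Since the job was submitted with `sbatch`, the scheduler's accounting records are one place to look before contacting an administrator. The sketch below assumes a SLURM cluster with accounting enabled and `sacct` available; `flag_abnormal_steps` is a hypothetical helper that filters `sacct` output and is not part of SLURM or IsoQuant.

```shell
# Reads pipe-delimited sacct output on stdin, in the column order
# JobID|State|ExitCode|MaxRSS|ReqMem, and prints every job step whose
# state is something other than COMPLETED, RUNNING or PENDING.
flag_abnormal_steps() {
    awk -F'|' '
        $2 != "COMPLETED" && $2 != "RUNNING" && $2 != "PENDING" {
            rss = ($4 == "" ? "n/a" : $4)
            printf "%s: %s (exit %s, MaxRSS %s, ReqMem %s)\n", $1, $2, $3, rss, $5
        }
    '
}

# Intended usage on the cluster (replace <jobid> with the real job ID):
#   sacct -j <jobid> --parsable2 --noheader \
#         --format=JobID,State,ExitCode,MaxRSS,ReqMem | flag_abnormal_steps
```

A state such as `OUT_OF_MEMORY` on the batch step would point to the memory limit, while other states would suggest looking elsewhere.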

The assembly is a Cyno species with lots of pseudogenes. Would IsoQuant simply ignore those, or should I manually remove them from the GTF?

Not a problem, they will simply be ignored.
