IndexError when running spladder #157

Open
swu13 opened this issue Mar 8, 2022 · 7 comments

@swu13

swu13 commented Mar 8, 2022

  • spladder version: 3.0.1
  • Python version: 3.6
  • Operating System: CentOS

Description

I use spladder to detect splicing events in a single sample from a BAM file. The command is: spladder build --parallel 20 -b *bam -a gencode.v32.annotation.gtf -o .

The run fails with the IndexError shown below, and one of the output files (merge_graphs_mult_exon_skip_C3.confirmed.txt.gz) is incomplete. Its last line is only "chr2 + mult_exon_skip.4684 3 ENSG00000064012.21" and contains no alternative splicing events.

What I Did

command(s) 
1. STAR --genomeDir ./STAR_index/  --readFilesIn XX.R1.fastq XX.R2.fastq --runThreadN 20 --outFilterMultimapScoreRange 1 --outFilterMultimapNmax 20 --outFilterMismatchNmax 10 --alignIntronMax 500000 --alignMatesGapMax 1000000 --sjdbScore 2 --alignSJDBoverhangMin 1 --genomeLoad NoSharedMemory --limitBAMsortRAM 70000000000 --readFilesCommand cat --outFilterMatchNminOverLread 0.33 --outFilterScoreMinOverLread 0.33 --sjdbOverhang 100 --outSAMstrandField intronMotif --outSAMattributes NH HI NM MD AS XS --sjdbGTFfile gencode.v32.annotation.gtf --limitSjdbInsertNsj 2000000 --outSAMunmapped None --outSAMtype BAM SortedByCoordinate --outSAMheaderHD @HD VN:1.4  --twopassMode Basic --outSAMmultNmax 1   --quantMode TranscriptomeSAM   --outFileNamePrefix .
2. spladder build --parallel 20 -b *bam -a gencode.v32.annotation.gtf -o .



If there was a crash, please include the traceback here.
Reporting confirmed mult_exon_skip events:
writing mult_exon_skip events in gff3 format to .//merge_graphs_mult_exon_skip_C3.confirmed.gff3
writing mult_exon_skip events in flat txt format to .//merge_graphs_mult_exon_skip_C3.confirmed.txt.gz
Traceback (most recent call last):
  File "/data/myang9/software/miniconda3/bin/spladder", line 8, in <module>
    sys.exit(main())
  File "/data/myang9/software/miniconda3/lib/python3.9/site-packages/spladder/spladder.py", line 229, in main
    options.func(options)
  File "/data/myang9/software/miniconda3/lib/python3.9/site-packages/spladder/spladder_build.py", line 163, in spladder
    analyze_events(event_type, options.bam_fnames, options)
  File "/data/myang9/software/miniconda3/lib/python3.9/site-packages/spladder/alt_splice/analyze.py", line 191, in analyze_events
    write_events_txt(fn_out_conf_txt, options.samples[sample_idx], events_all, fn_out_count, event_idx=confirmed_idx)
  File "/data/myang9/software/miniconda3/lib/python3.9/site-packages/spladder/alt_splice/write.py", line 89, in write_events_txt
    counts = event_counts_chunk[:, :, i - chunk_idx_event[0]]
IndexError: index 1344 is out of bounds for axis 2 with size 1113
@lunazhaoxxx

I hit the same error with the command: spladder build -b xxx.bam -a gencode.v39.annotation.gtf -o xxx
which prints:
Reporting confirmed exon_skip events:
writing exon_skip events in gff3 format to xxx/merge_graphs_exon_skip_C3.confirmed.gff3
writing exon_skip events in flat txt format to xxx/merge_graphs_exon_skip_C3.confirmed.txt.gz
Traceback (most recent call last):
File "/usr/local/bin/spladder", line 11, in <module>
load_entry_point('spladder==3.0.2', 'console_scripts', 'spladder')()
File "/usr/local/lib/python3.8/dist-packages/spladder-3.0.2-py3.8.egg/spladder/spladder.py", line 229, in main
options.func(options)
File "/usr/local/lib/python3.8/dist-packages/spladder-3.0.2-py3.8.egg/spladder/spladder_build.py", line 163, in spladder
analyze_events(event_type, options.bam_fnames, options)
File "/usr/local/lib/python3.8/dist-packages/spladder-3.0.2-py3.8.egg/spladder/alt_splice/analyze.py", line 191, in analyze_events
write_events_txt(fn_out_conf_txt, options.samples[sample_idx], events_all, fn_out_count, event_idx=confirmed_idx)
File "/usr/local/lib/python3.8/dist-packages/spladder-3.0.2-py3.8.egg/spladder/alt_splice/write.py", line 90, in write_events_txt
psi = psi_chunk[:, i - chunk_idx_psi[0]]
IndexError: index 2354 is out of bounds for axis 1 with size 1015

@huguesfontenelle

I'm running into this as well, unfortunately.
I placed a breakpoint() around the

counts = event_counts_chunk[:, :, i - chunk_idx_event[0]]

line, but can't really figure out how it works.
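
Reading only the traceback, that line appears to turn a global event index `i` into an offset within the chunk of counts currently held in memory. Below is a toy sketch of that pattern in plain Python. The name `offset_in_chunk` and the numbers are hypothetical and this is not SplAdder's actual code; it only shows when such an offset falls outside the chunk.

```python
def offset_in_chunk(i, chunk_start, chunk_size):
    """Translate a global event index into a position inside one chunk.

    Raises IndexError when the event does not belong to the chunk, which
    mirrors the "index ... is out of bounds ... with size ..." message.
    """
    offset = i - chunk_start
    if not 0 <= offset < chunk_size:
        raise IndexError(
            f"index {offset} is out of bounds for chunk of size {chunk_size}"
        )
    return offset


# Hypothetical numbers: a chunk holding events 1000..1999.
chunk_start, chunk_size = 1000, 1000
print(offset_in_chunk(1500, chunk_start, chunk_size))  # 500

try:
    # Event 2344 lies beyond the chunk, so the offset (1344) is too large.
    offset_in_chunk(2344, chunk_start, chunk_size)
except IndexError as err:
    print(err)
```

If the event indices being written refer to events that the loaded counts do not cover, the offset would exceed the chunk width in the same way. That is one possible reading of the errors in this thread, not a confirmed diagnosis.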

@akahles @warrenmcg @izcram @ratsch
What would you need to reproduce this? Would the following files be sufficient:

  • merge_graphs_exon_skip_C3.confirmed.pickle
  • merge_graphs_exon_skip_C3.counts.hdf5
  • merge_graphs_exon_skip_C3.pickle

Thanks!

@dlsoltero

I got the same problem while testing a subsample of a BAM file, but it worked fine when I ran the complete BAM. I mention it only because it may help narrow down the problem.

@riasc

riasc commented Jul 20, 2023

Has anyone resolved this? I have it both on small datasets and the whole BAM file. I discovered that the problem occurs mainly when aligning with STAR, but not when using BWA. Is there something that needs to be considered?

It occurs on exon_skip events:

Reporting confirmed exon_skip events:
writing exon_skip events in gff3 format to testspladderSTAR_exonskip/merge_graphs_exon_skip_C3.confirmed.gff3
writing exon_skip events in flat txt format to testspladderSTAR_exonskip/merge_graphs_exon_skip_C3.confirmed.txt.gz
Traceback (most recent call last):
File "/home/sej9799/mambaforge/envs/spladder/bin/spladder", line 8, in <module>
sys.exit(main())
File "/home/sej9799/mambaforge/envs/spladder/lib/python3.10/site-packages/spladder/spladder.py", line 229, in main
options.func(options)
File "/home/sej9799/mambaforge/envs/spladder/lib/python3.10/site-packages/spladder/spladder_build.py", line 163, in spladder
analyze_events(event_type, options.bam_fnames, options)
File "/home/sej9799/mambaforge/envs/spladder/lib/python3.10/site-packages/spladder/alt_splice/analyze.py", line 191, in analyze_events
write_events_txt(fn_out_conf_txt, options.samples[sample_idx], events_all, fn_out_count, event_idx=confirmed_idx)
File "/home/sej9799/mambaforge/envs/spladder/lib/python3.10/site-packages/spladder/alt_splice/write.py", line 89, in write_events_txt
counts = event_counts_chunk[:, :, i - chunk_idx_event[0]]
IndexError: index 7879 is out of bounds for axis 2 with size 2545

@huguesfontenelle

I solved that for "my" case.

Context: I use a BAM file with a single chromosome for testing.
Possible cause: The GTF annotation file refers to chromosomes / contigs that are not present in the BAM file.
Solution: Slice the GTF to include only the chromosomes / contigs present in the BAM.
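
A minimal sketch of that workaround, assuming the BAM header has been dumped to text (for example with `samtools view -H`) and that the GTF is uncompressed. The function names `contigs_from_sam_header` and `filter_gtf_lines` are made up for this illustration, and the records are toy data.

```python
def contigs_from_sam_header(header_text):
    """Collect contig names from the @SQ lines of a SAM/BAM header."""
    contigs = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                contigs.add(field[3:])
    return contigs


def filter_gtf_lines(gtf_lines, keep_contigs):
    """Yield comment lines plus records whose first column is in keep_contigs."""
    for line in gtf_lines:
        if line.startswith("#") or line.split("\t", 1)[0] in keep_contigs:
            yield line


# Tiny in-memory example: the BAM only contains chr2.
header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr2\tLN:1000\n"
gtf = [
    "##description: toy annotation\n",
    'chr1\ttoy\tgene\t1\t100\t.\t+\t.\tgene_id "toy_gene_1";\n',
    'chr2\ttoy\tgene\t1\t100\t.\t-\t.\tgene_id "toy_gene_2";\n',
]
kept = list(filter_gtf_lines(gtf, contigs_from_sam_header(header)))
print(len(kept))  # 2: the comment line and the chr2 record
```

With real files, an open GTF file handle can be passed as `gtf_lines` and the yielded lines written to a new annotation file.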

@akahles
Member

akahles commented Jul 22, 2023

Thanks a lot for reporting this. I have difficulty reproducing the issue. Would one of you be able to provide a minimal failing example? This would greatly speed up the debugging on my end. Usually, it should not be a problem if the chromosome/contig sets in alignment and annotation files are not the same. As long as the intersection is not empty, SplAdder should generate output.
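
To check that condition on a given dataset, the contig names from the alignment header and the first column of the annotation can be compared directly. The helper below is a hypothetical sketch for illustration (`contig_overlap` is not part of SplAdder); it takes two collections of contig names however they were obtained.

```python
def contig_overlap(bam_contigs, gtf_contigs):
    """Return contigs shared by both inputs and those unique to each."""
    bam_contigs, gtf_contigs = set(bam_contigs), set(gtf_contigs)
    return {
        "shared": sorted(bam_contigs & gtf_contigs),
        "bam_only": sorted(bam_contigs - gtf_contigs),
        "gtf_only": sorted(gtf_contigs - bam_contigs),
    }


# Toy example: only chr2 appears in both files.
report = contig_overlap(["chr1", "chr2"], ["chr2", "chr3", "chrM"])
print(report["shared"])    # ['chr2']
print(report["gtf_only"])  # ['chr3', 'chrM']
```

An empty `shared` list would also result from a naming mismatch such as "2" versus "chr2", which is worth ruling out before looking further.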

Best,
Andre

@riasc

riasc commented May 19, 2024

So this basically means that no output can be generated? Is there a way to generate normal output even when no output is going to be generated?
