core dumped during sorting KMC database #25

Closed
shenwei356 opened this issue Apr 13, 2023 · 4 comments

@shenwei356

Hi @jnalanko, I tried to test Themisto with 101 microbial genomes, but it crashed during the "Sorting KMC database" step. The disk has enough free space, and the same error occurred on a second attempt.

$ fd fna.gz$ 101genomes/ > t.txt

$ themisto build -t 16 --temp-dir t -f -i t.txt -k 21 -o themisto 

42.4310 Thu Apr 13 09:16:31 2023 Themisto-3.0.0-13-gfea4f59
42.5140 Thu Apr 13 09:16:31 2023 Maximum k-mer length (size of the de Bruijn graph node labels): 31
43.0990 Thu Apr 13 09:16:31 2023 Build configuration:
Sequence file = t.txt
Index de Bruijn graph output file = themisto.tdbg
Index coloring output file = themisto.tcolors
Temporary directory = t
k = 21
Reverse complements = true
Number of threads = 16
Memory gigabytes = 2
Manual colors = false
Sequence colors = false
File colors = true
Load DBG = false
Handling of non-ACGT characters = delete
Coloring structure type: sdsl-hybrid
Verbosity = normal
43.1500 Thu Apr 13 09:16:31 2023 Starting
43.1670 Thu Apr 13 09:16:31 2023 Running GGCAT
Allocator initialized: mem: 2 GiB chunks: 8192 log2: 18
Started phase: reads bucketing prev stats: 
Temp buckets files size: 2.01 MiB
Finished phase: reads bucketing. phase duration: 2.87s gtime: 2.87s
Started phase: kmers merge prev stats: 
Processing bucket 295 of [1024[R:9599]]  ptime: 10.52s gtime: 13.39s phase eta: 27s est. tot: 37s
Processing bucket 576 of [1024[R:18999]]  ptime: 20.72s gtime: 23.60s phase eta: 16s est. tot: 37s
Processing bucket 847 of [1024[R:28284]]  ptime: 30.93s gtime: 33.80s phase eta: 6s est. tot: 37s
Total color subsets: 124451
Finished phase: kmers merge. phase duration: 38.44s gtime: 41.31s
Started phase: hashes sorting prev stats: 
Finished phase: hashes sorting. phase duration: 4.78s gtime: 46.09s
Started phase: links compaction prev stats: 
Iteration: 2
Remaining: 68628492  ptime: 14.27s gtime: 60.35s
Iteration: 6
Remaining: 15103529  ptime: 22.37s gtime: 68.45s
Completed compaction with 23 iters
Finished phase: links compaction. phase duration: 27.39s gtime: 73.48s
Started phase: reads reorganization prev stats: 
Finished phase: reads reorganization. phase duration: 6.92s gtime: 80.40s
Started phase: unitigs building prev stats: 
Finished phase: unitigs building. phase duration: 11.54s gtime: 91.94s
Started phase: maximal unitigs links building [step 1] prev stats: 
Finished phase: maximal unitigs links building [step 1]. phase duration: 657.82ms gtime: 92.60s
Started phase: maximal unitigs links building [step 2] prev stats: 
Finished phase: maximal unitigs links building [step 2]. phase duration: 1.05s gtime: 93.65s
Started phase: maximal unitigs links building [step 3] prev stats: 
Finished phase: maximal unitigs links building [step 3]. phase duration: 3.97s gtime: 97.62s
Compacted De Bruijn graph construction completed.
TOTAL TIME: 97.62s
Final stats:
        phase: reads bucketing  => 2.87s
        phase: kmers merge      => 38.44s
        phase: hashes sorting   => 4.78s
        phase: links compaction         => 27.39s
        phase: reads reorganization     => 6.92s
        phase: unitigs building         => 11.54s
        phase: maximal unitigs links building [step 1]  => 657.82ms
        phase: maximal unitigs links building [step 2]  => 1.05s
        phase: maximal unitigs links building [step 3]  => 3.97s
113923.5230 Thu Apr 13 09:18:25 2023 Building SBWT
113923.5740 Thu Apr 13 09:18:25 2023 Running KMC counter
**********************************************************************************************************************************
Stage 1: 100%
Stage 2: 100%


135340.1850 Thu Apr 13 09:18:46 2023 Sorting KMC database
in1: 0% Illegal instruction (core dumped)

$ ll t/
total 15G
-rw-r--r-- 1 shenwei shenwei 601K Apr 13 09:33 f26cCPUbWf.colors.dat
-rw-r--r-- 1 shenwei shenwei 3.2G Apr 13 09:34 f26cCPUbWf.fa
-rw-r--r-- 1 shenwei shenwei 1.5G Apr 13 09:34 iYDXFd2eZn.fa
-rw-r--r-- 1 shenwei shenwei 5.1M Apr 13 09:34 kmers1WiOuZI6Ad.kmc_pre
-rw-r--r-- 1 shenwei shenwei 9.5G Apr 13 09:34 kmers1WiOuZI6Ad.kmc_suf

$ ll  themisto.t*
-rw-r--r-- 1 shenwei shenwei 0 Apr 13 09:32 themisto.tcolors
-rw-r--r-- 1 shenwei shenwei 0 Apr 13 09:32 themisto.tdbg
@jnalanko
Collaborator

Hello!

This is probably related to this issue: #24

I assume you are using the pre-compiled binaries? There is a known issue with those binaries that explains this error message. I am putting out new official binaries today, which should fix the issue.
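
An "Illegal instruction" crash generally means the binary contains CPU instructions that the host processor does not support, which can happen when a pre-compiled binary was built for a newer instruction set. Below is a minimal sketch for checking this on an x86 Linux host. The helper name `missing_cpu_flags` is hypothetical, and the flag names used with it are illustrative guesses: the thread does not say which instruction triggered the crash.

```shell
# Hypothetical helper: print the requested CPU feature flags that are
# absent from a space-separated flags string.
# Usage: missing_cpu_flags "<flags string>" flag1 flag2 ...
missing_cpu_flags() {
    have=" $1 "          # pad with spaces so only whole words match
    shift
    missing=""
    for flag in "$@"; do
        case "$have" in
            *" $flag "*) ;;                                  # flag is present
            *) missing="$missing${missing:+ }$flag" ;;       # record missing flag
        esac
    done
    printf '%s\n' "$missing"
}
```

On an x86 Linux machine, the host's flags can be read from `/proc/cpuinfo`, for example `missing_cpu_flags "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)" avx2 bmi2`. An empty result means all of the listed flags are available; any names printed are features the CPU lacks.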

@jnalanko
Collaborator

Version 3.1 binaries are now out: https://github.com/algbio/themisto/releases/tag/v3.1.0

We have made changes to the build that fix the illegal instruction issue on our machines. Could you try the binaries in the new release?

@shenwei356
Author

It works, and the query is fast!

@jnalanko
Collaborator

Glad to hear! Issue closed :).
