Test on memory map approach using mmappickle.mmapdict on /dev/shm instead of PFS on Clusters
#5
Comments
#4 runs on the parallel file system (PFS) of the Noctua clusters, which uses Lustre. After a discussion with them, it turns out that Lustre handles memory-mapped files very poorly. Therefore, storing the memory mapped files in |
Update: writing to the memory-mapped pickle dictionary is way faster than on Lustre, but still very slow overall.
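For reference, a minimal sketch of the idea of keeping a memory-mapped file on a RAM-backed directory rather than on the PFS. It uses the standard-library `mmap` and `pickle` modules, not `mmappickle.mmapdict`, so it only illustrates the storage placement and not the library used in this issue. The helper names (`write_mapped`, `read_mapped`), the 8-byte length header, and the fallback to the system temp directory are assumptions made for illustration.

```python
import mmap
import os
import pickle
import tempfile

# Assumption: use the RAM-backed /dev/shm when it is writable,
# otherwise fall back to the system temp directory.
shm = "/dev/shm"
base_dir = shm if os.path.isdir(shm) and os.access(shm, os.W_OK) else tempfile.gettempdir()


def write_mapped(path, obj, size=1 << 20):
    """Pickle obj into a fixed-size memory-mapped file (hypothetical helper)."""
    payload = pickle.dumps(obj)
    if len(payload) + 8 > size:
        raise ValueError("object too large for the mapping")
    # Pre-size the file so it can be mapped.
    with open(path, "wb") as f:
        f.truncate(size)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as mm:
        mm[:8] = len(payload).to_bytes(8, "little")  # length header
        mm[8:8 + len(payload)] = payload
        mm.flush()


def read_mapped(path):
    """Read back the object stored by write_mapped (hypothetical helper)."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n = int.from_bytes(mm[:8], "little")
        return pickle.loads(mm[8:8 + n])


# Round trip with a toy index dictionary.
fd, demo_path = tempfile.mkstemp(dir=base_dir, suffix=".mmap")
os.close(fd)
index = {"entity_a": [0, 3], "entity_b": [1]}
write_mapped(demo_path, index)
restored = read_mapped(demo_path)
os.remove(demo_path)
```

Files under `/dev/shm` live in RAM, so they count against the node's memory and disappear on reboot.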
Alternative solution: create a B+ tree implementation in C++.
Update: this approach might not be needed if we use domain-specific datasets (see issue #9).
The memory map approach takes a very long time to process the index dictionary in the memory-mapped file. It took 3 days to process 41,602 entities out of 5,037,674 in a chunk of 10 million triples.
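To put those numbers in perspective, here is a rough extrapolation. It assumes the processing rate stays constant over the whole chunk, which may not hold in practice; the variable names are only illustrative.

```python
# Figures reported above.
processed_entities = 41_602
total_entities = 5_037_674
elapsed_days = 3

# Assumption: throughput stays constant for the rest of the chunk.
rate_per_day = processed_entities / elapsed_days
fraction_done = processed_entities / total_entities
projected_days = total_entities / rate_per_day

print(f"rate: {rate_per_day:,.0f} entities/day")
print(f"done: {fraction_done:.2%}")
print(f"projected total: {projected_days:,.0f} days")
```

At this rate, less than 1% of the entities are done after 3 days, and a single chunk would need roughly 363 days.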