Race condition when reading speclib from two instances #180

Open
mschwoer opened this issue Jun 12, 2024 · 7 comments
@mschwoer
Contributor

Describe the bug
When two alphaDIA instances access the same speclib file (for reading) at the same instant, alphabase throws an error (see below).

Expected behavior
No error, as the file is only opened for reading in this case.
I think the problem is that files are opened for 'a' here (hdf.py)

class HDF_File(HDF_Group):
    def __init__():
...
        mode = "w" if delete_existing else "a"
        with h5py.File(file_name, mode):  # , swmr=True):
            pass

Logs

[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO - 0:00:00.022445 INFO: Running DynamicLoader
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO - 0:00:00.027390 INFO: Loading .hdf library from /fs/hela_hybrid.small.hdf
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO - 0:00:00.031234 INFO: Traceback (most recent call last):
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/xx/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphadia/cli.py", line 333, in run
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     plan = Plan(
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -            ^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphadia/planning.py", line 126, in __init__
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     self.load_library()
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphadia/planning.py", line 205, in load_library
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     spectral_library = dynamic_loader(self.library_path)
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphadia/libtransform.py", line 40, in __call__
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     return self.forward(*args)
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -            ^^^^^^^^^^^^^^^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphadia/libtransform.py", line 121, in forward
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     library.load_hdf(input_path, load_mod_seq=True)
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphabase/spectral_library/base.py", line 681, in load_hdf
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     _hdf = HDF_File(
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -            ^^^^^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/alphabase/io/hdf.py", line 533, in __init__
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     with h5py.File(file_name, mode):#, swmr=True):
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -          ^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/h5py/_hl/files.py", line 562, in __init__
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "/fs/home/kraken/conda-envs/alphadia-1.6.2/lib/python3.11/site-packages/h5py/_hl/files.py", line 247, in make_fid
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -     fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO -   File "h5py/h5f.pyx", line 102, in h5py.h5f.open
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO - BlockingIOError: [Errno 11] Unable to synchronously open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
[2024-06-12, 15:19:11 UTC] {ssh.py:526} INFO - 
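
A possible way to see why the traceback ends in `BlockingIOError`: as far as I understand (this is an assumption about HDF5's POSIX driver, not something confirmed in this thread), a read-write open such as mode "a" requests an exclusive, non-blocking advisory lock, while a read-only open requests a shared one. The sketch below does not use h5py at all; it imitates that behaviour with plain `fcntl.flock` on a temporary file. The helper names `try_lock` and `demo` are made up for illustration, and the code is Unix-only.

```python
import fcntl
import os
import tempfile


def try_lock(fd, exclusive):
    """Attempt a non-blocking flock; return True on success, False if it would block."""
    flag = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fd, flag | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Same exception type as in the traceback: the lock is held elsewhere.
        return False


def demo():
    """Compare two 'instances' taking exclusive locks versus shared locks."""
    with tempfile.NamedTemporaryFile() as tmp:
        # Two independent opens stand in for two alphaDIA processes.
        fd_a = os.open(tmp.name, os.O_RDWR)
        fd_b = os.open(tmp.name, os.O_RDWR)
        try:
            results = {
                # Assumed analogue of two opens with mode "a".
                "first_exclusive": try_lock(fd_a, exclusive=True),
                "second_exclusive": try_lock(fd_b, exclusive=True),
            }
            fcntl.flock(fd_a, fcntl.LOCK_UN)  # release before the second round
            # Assumed analogue of two opens with mode "r".
            results["first_shared"] = try_lock(fd_a, exclusive=False)
            results["second_shared"] = try_lock(fd_b, exclusive=False)
            return results
        finally:
            os.close(fd_a)
            os.close(fd_b)


if __name__ == "__main__":
    print(demo())
```

If the assumption holds, only the second exclusive request fails, which would match the observation that two readers collide only because the file is opened with "a".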
@jalew188
Collaborator

Try `export HDF5_USE_FILE_LOCKING='FALSE'` before running multiple tasks
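
If the workaround is tried, one option is to scope the variable to a single child process instead of exporting it for the whole shell. A minimal sketch, assuming the task is launched from Python; `run_without_hdf5_locking` is a hypothetical helper, and the command passed to it is a stand-in rather than a real alphaDIA invocation.

```python
import os
import subprocess
import sys


def run_without_hdf5_locking(cmd):
    """Run cmd with HDF5_USE_FILE_LOCKING=FALSE set only for that child process."""
    env = os.environ.copy()  # leave the parent environment untouched
    env["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in command that just echoes the variable back.
    probe = [sys.executable, "-c",
             "import os; print(os.environ['HDF5_USE_FILE_LOCKING'])"]
    print(run_without_hdf5_locking(probe).stdout.strip())
```

This keeps locking enabled for everything else started from the same shell.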

@jalew188
Collaborator

To be honest, I also don't know why it is "a" instead of "r" here...

@jalew188
Collaborator

> Try `export HDF5_USE_FILE_LOCKING='FALSE'` before running multiple tasks

@mschwoer Is the issue solved by this command?

@mschwoer
Contributor Author

I didn't check yet,
but in general I feel that file locking has its benefits: it prevents corruption from simultaneous writes.
So disabling it would make things less robust. But could we not just change the "a" into an "r" in the piece of code mentioned above?

@jalew188
Collaborator

I think Sander used "a" instead of "r" for a reason ... I think we should add a read-only kwarg to the HDF reader.

@mschwoer
Contributor Author

mschwoer commented Jul 1, 2024

There is already a read_only parameter. Can't we leverage that, like this:

        if delete_existing:
            mode = "w"
        elif read_only:
            mode = "r"
        else:
            mode = "a"
        with h5py.File(file_name, mode):  # , swmr=True):
            pass
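
The same branching can be pulled into a small standalone function, which makes the precedence easy to check in isolation. This is only a sketch of the proposal above, not the alphabase implementation; `select_mode` is a hypothetical name, and the arguments are keyword-only so that no default values are implied.

```python
def select_mode(*, delete_existing: bool, read_only: bool) -> str:
    """Return the h5py file mode implied by the two flags.

    Precedence follows the snippet above: delete_existing wins over read_only.
    """
    if delete_existing:
        return "w"  # truncate or create the file
    if read_only:
        return "r"  # plain read access, file must already exist
    return "a"      # read-write, create if missing


if __name__ == "__main__":
    print(select_mode(delete_existing=False, read_only=True))
```

One thing to decide is whether `delete_existing=True` together with `read_only=True` should silently mean "w", as here, or be rejected as contradictory.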

@jalew188
Collaborator

I agree with this solution.

