Add process lock for remote filesystem scraper #562

Merged: 3 commits into main from ml-evs/remotes, Jun 6, 2024

Conversation

ml-evs (Member) commented Feb 4, 2024

Closes #539.

This PR adds a (manual) database lock so that multiple processes do not repeatedly try to scrape the same filesystems, as currently happens in production, where multiple worker processes run the API.

Still need to add unit tests for this; i.e., manually updating the lock and making sure that stale locks (older than the min cache age) are correctly swept away.
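
For illustration, here is a minimal sketch of the locking idea described above, not the code in this PR. It assumes a single lock row per remote filesystem and uses the standard-library `sqlite3` module purely as a stand-in database; the table name `scrape_locks` and the functions `init_lock_table`, `try_acquire` and `release` are hypothetical. A lock older than `max_age_s` (standing in for the min cache age) is treated as stale and swept before a new acquisition is attempted.

```python
import sqlite3
import time


def init_lock_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) lock table: one row per locked filesystem."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scrape_locks ("
        "name TEXT PRIMARY KEY, acquired_at REAL NOT NULL)"
    )
    conn.commit()


def try_acquire(conn: sqlite3.Connection, name: str, max_age_s: float, now=None) -> bool:
    """Return True if this caller now holds the lock for `name`."""
    now = time.time() if now is None else now
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Sweep a stale lock left behind by a crashed or hung worker.
            conn.execute(
                "DELETE FROM scrape_locks WHERE name = ? AND acquired_at < ?",
                (name, now - max_age_s),
            )
            # The primary key makes this insert fail if a fresh lock exists.
            conn.execute(
                "INSERT INTO scrape_locks (name, acquired_at) VALUES (?, ?)",
                (name, now),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def release(conn: sqlite3.Connection, name: str) -> None:
    """Drop the lock once the scrape has finished."""
    with conn:
        conn.execute("DELETE FROM scrape_locks WHERE name = ?", (name,))
```

The unit test mentioned above could then pass explicit `now` values to simulate a lock that has aged past the threshold, rather than sleeping.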


cypress bot commented Feb 4, 2024

Passing run #1903

0 40 0 0 Flakiness 0

Details:

Merge a5f0781 into 7067d01...
Project: datalab
Commit: b8dd3daa6c
Status: Passed
Duration: 01:59
Started: Jun 5, 2024 9:14 PM
Ended: Jun 5, 2024 9:16 PM


ml-evs (Member, Author) commented Jun 5, 2024

I'm convinced this works in the single-threaded case, but at some point we need to add tests for the multi-threaded case. This will do for now...
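
One possible shape for such a multi-threaded test is sketched below. It does not use pydatalab's code: `InMemoryLockStore` is a hypothetical stand-in for the database-backed lock, and `count_winners` is an illustrative helper. Several threads are released at the same moment by a barrier and race to acquire the same lock; with a long max age, exactly one should succeed.

```python
import threading
import time


class InMemoryLockStore:
    """Hypothetical stand-in for the database-backed lock (not pydatalab's API)."""

    def __init__(self, max_age_s: float) -> None:
        self._max_age_s = max_age_s
        self._held = {}  # lock name -> time it was acquired
        self._mutex = threading.Lock()

    def try_acquire(self, name: str) -> bool:
        now = time.monotonic()
        with self._mutex:  # makes the check-and-set below atomic
            acquired_at = self._held.get(name)
            if acquired_at is not None and now - acquired_at < self._max_age_s:
                return False  # a fresh lock is already held
            self._held[name] = now  # no lock, or a stale one: take it
            return True


def count_winners(n_workers: int) -> int:
    """Race n_workers threads for one lock and count how many acquire it."""
    store = InMemoryLockStore(max_age_s=60.0)
    barrier = threading.Barrier(n_workers)
    results = []

    def worker() -> None:
        barrier.wait()  # hold every thread until all are ready
        results.append(store.try_acquire("remote-fs"))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

A real test would need to run the same race against the actual database lock, ideally from separate processes, since production uses multiple worker processes rather than threads.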


codecov bot commented Jun 5, 2024

Codecov Report

Attention: Patch coverage is 72.00000% with 14 lines in your changes missing coverage. Please review.

Project coverage is 67.16%. Comparing base (7067d01) to head (a5f0781).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #562   +/-   ##
=======================================
  Coverage   67.15%   67.16%           
=======================================
  Files          62       62           
  Lines        3785     3828   +43     
=======================================
+ Hits         2542     2571   +29     
- Misses       1243     1257   +14     

| Files | Coverage Δ |
|---|---|
| pydatalab/pydatalab/config.py | 79.67% <100.00%> (ø) |
| pydatalab/pydatalab/remote_filesystems.py | 79.22% <70.21%> (-4.57%) ⬇️ |

@ml-evs ml-evs merged commit 325b554 into main Jun 6, 2024
11 checks passed
@ml-evs ml-evs deleted the ml-evs/remotes branch June 6, 2024 11:07
Labels: bug, priority/high, server
Linked issues: Race condition in remote filesystem sync