Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MANUAL file expiry deletion script #1

Merged
merged 40 commits into from
Jun 16, 2024
Merged

MANUAL file expiry deletion script #1

merged 40 commits into from
Jun 16, 2024

Conversation

cehune
Copy link
Collaborator

@cehune cehune commented Feb 17, 2024

base functions for file expiry tool + the necessary python script to run it

  • expiry check based on the time since last modification
  • sorting through an entire directory and deleting files which have expired

Motivation

https://github.com/WATonomous/infra-config/issues/1143

Description

As to avoid race conditions, this works by copying files into a separate directory with 700 root conditions and then deleting this separate directory. The code currently handles the following edge cases:

  • File naming conflicts when moved to the temp directory
  • Folder by temp name already exists
  • Deletes empty folders if all files inside were expired

This has been tested as follows:

  • Unit tests for file expiry and file creation (need to produce for temp folder generation and file deletion)
  • Manual Testing on local machine (only 10 ish files)

Unsure about how it will work when handling the entire storage folder.

Next steps

  • cron automatically run script?
  • better notification system (currently just prints the user whos file is deleted)
  • Better decision making for what files should be deleted (ie technically expired, but a necessary part of an active folder)

@cehune cehune added the enhancement New feature or request label Feb 17, 2024
@cehune cehune changed the title Base folder generation and expiry date checker MANUAL file expiry deletion script Feb 17, 2024
@cehune cehune self-assigned this Feb 17, 2024
@Jimmyj30
Copy link

Jimmyj30 commented Feb 17, 2024

weird thing I found that might be useful for this project? find . -ctime -30 seems to find the files in a directory that haven't been modified for 30 days, so if the output of this is empty for a certain directory, that means we can maybe delete that directory as nothing in it has been touched for 30+ days. But I do think we should only delete directories at the very "top" level of the wato-drive so we are not deleting any inactive files that are part of an ongoing project.

infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I think we can test this now. Feel free to change your role to ADMIN in your profile so that you have sudo access. Here's the profile editor: https://cloud.watonomous.ca/docs/utilities/profile-editor

Let's start with a machine that's not super heavily used (wato2-ubuntu1, tr-ubuntu1, delta-ubuntu2 are good choices) and a folder on that machine that is smaller like /var/lib/cluster/<uid>.

infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for waiting hehe. Gave this a closer look. We can start testing!

infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! Do we need the empty init.py files? Can we add instructions to README on how to run this tool?

infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structure looks good! There are some details that we should iron out, like using consistent timestamps and being efficient with filesystem accesses.

infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/main.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good! Can we also run the tests in CI?

README.md Show resolved Hide resolved
infra_file_auto_expiry/source/main.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/tests/test_utils.py Outdated Show resolved Hide resolved
@Jimmyj30
Copy link

Jimmyj30 commented Apr 11, 2024

Looking pretty good! Can we also run the tests in CI?

Just another small comment: did we figure out how to deal with the "undefined" file info, like ctimes and stuff?

Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good in general! Pretty much all just nitpicks.

I think you are looking into issues when running this on a larger folder right? Please ping me again once that's ready.

requirements.txt Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
infra_file_auto_expiry/source/utils.py Outdated Show resolved Hide resolved
file_expiry.sh Outdated Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just some nits

README.md Outdated Show resolved Hide resolved
source/data/expiry_constants.py Outdated Show resolved Hide resolved
Copy link
Member

@ben-z ben-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@cehune cehune merged commit 45070c9 into main Jun 16, 2024
3 checks passed
@cehune cehune deleted the mvp branch June 16, 2024 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants