Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update queue-and-scheduling-policy.md #495

Merged
merged 1 commit into from
Oct 22, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions docs/policies/queue-scheduling/queue-and-scheduling-policy.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,15 @@ Some work will require use of Polaris that requires deviation from regular polic
## Big Run Mondays
As part of our regular maintenance procedures on Mondays, we will promote to the highest priority any jobs in the queued state requesting 802 nodes or more. Promotion is subject to operational discretion.

We may also, at our discretion, take the opportunity to promote the priority of capability jobs if the system has been drained of jobs for any other reason.
We may also, at our discretion, take the opportunity to promote the priority of capability jobs (20% of the machine) if the system has been drained of jobs for any other reason.

## Monday Maintenance
On Mondays where the ALCF is on a regular business schedule the system may be expected to undergo maintenance from 9:00 am until 5:00 pm US Central Time. The showres command may be used to view pending and active maintenance reservations.

## INCITE/ALCC Overburn Policy
If an INCITE or ALCC project has exhausted its allocation in the first 11 months of its allocation year, it is eligible for overburn running. At this point, capability jobs submitted by INCITE and ALCC projects will run in the default queue (instead of backfill) for the first 11 months of the allocation year until 125% of the project allocation has been consumed.
If an INCITE or ALCC project has exhausted its allocation in the first 11 months of its allocation year, it is eligible for overburn running. At this point, capability jobs ((20% of the machine) submitted by INCITE and ALCC projects will run in the default queue (instead of backfill) for the first 11 months of the allocation year until 125% of the project allocation has been consumed.

INCITE and ALCC projects needing additional overburn hours should e-mail [[email protected]](mailto:[email protected]) with a short description of what they plan to do with the additional hours, highlighting specific goals or milestones and the time expected to accomplish them. This will be reviewed by the scheduling committee, allocations committee, and ALCF management. Requests should be submitted 15 days before the start of the next quarter of the allocation year for full consideration. Non-capability jobs from projects that have exhausted their allocation will continue to run in backfill.
INCITE and ALCC projects needing additional overburn hours should e-mail [[email protected]](mailto:[email protected]) with a short description of what they plan to do with the additional hours, highlighting specific goals or milestones and the time expected to accomplish them. This will be reviewed by the scheduling committee, allocations committee, and ALCF management. Requests should be submitted 15 days before the start of the next quarter of the allocation year for full consideration. Non-capability jobs (less than 20% of the machine) from projects that have exhausted their allocation will continue to run in backfill.

To be clear, this policy does not constitute a guarantee of extra time, and we reserve the right to prioritize the scheduling of jobs submitted by projects that have not yet used 100% of their allocations, so the earlier that an INCITE or ALCC project exhausts its allocation, the more likely it is to be able to take full advantage of this policy.

Loading