modules: Move feasibility/satisfiability checking into a new module #1285
base: master
Commits on Sep 28, 2024
Split feasibility from resource via CLI option.
Problem: sched.feasibility RPCs take up too much of sched-fluxion-resource's single-threaded time. Add CLI option to create a 'feasibility version' of sched-fluxion-resource called sched-fluxion-satisfiability that can run on multiple ranks.
Commit f4fa865
Commits on Sep 29, 2024
Make s-f-satisfiability acquire from s-f-resource
Problem: Only one resource.acquire RPC can be active at a time, but both s-f-resource and s-f-satisfiability try to open one. Make s-f-satisfiability call s-f-resource.notify to populate its resources (much as s-f-qmanager does). Make s-f-resource send its resources instead of null on notify RPCs to accommodate s-f-satisfiability. Finally, force the FIRST matching policy for s-f-satisfiability.
Commit a5ab60d
Split feasibility into its own full module
Problem: Having a 'satisfiability version' of the s-f-resource module makes resource_match.cpp hard to read and less maintainable. Split resource_match.cpp into a resource.cpp and feasibility.cpp that contain their respective modules. Leave the code common to both in resource_match.cpp with a new header, resource_match.hpp. This simplifies adding new modules that acquire and match resources.
Commit 372a805
Resolve CMakeLists.txt linking error
Problem: Tests segfault when unloading s-f-resource under flux-broker. Create a sched-fluxion-resource-module library that loads the 'resource' target only once between both s-f-resource and s-f-feasibility. The previous CMakeLists.txt loaded the 'resource' target twice, which caused a segfault on the static variable ANY_RESOURCE_TYPE{"*"} in matcher.cpp during unload when running under 'flux broker'.
Commit 9276dfc
Cancel feasibility module on s-f-resource error
Problem: The feasibility module ignores s-f-resource after it gets resources from s-f-resource.notify, but it should exit when s-f-resource does to prevent odd behavior after an s-f-resource reload, especially with different resources. Make s-f-feasibility listen for errors on the .notify stream.
Send graph expiration time to feasibility module
Problem: s-f-resource.notify does not currently send graph expiration info to s-f-feasibility. Send it.
Fix several errors
Problem: Improper use of git, including forgetting to add a file in the previous commit and losing changes during a merge. Fix various things, including the broken type of m_acquired_resources and unnecessary and broken code in resource_match_opts.
Commit e37f43b
Load feasibility on rank 0 only by default
Problem: s-f-feasibility launches on all ranks by default. This is probably not a good default behavior. Change rc1.d/01-s-f to launch s-f-feasibility on only rank 0. Change rc3.d/01-s-f to remove s-f-feasibility from all ranks. This allows for any layout of s-f-feasibility instances while guaranteeing at least one.
Commit 80776f6
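The rank-0 default described in the commit above can be sketched as a guard in an rc1-style script. This is a minimal sketch, not the contents of rc1.d/01-s-f: the function name `load_feasibility_if_rank0` and the `FLUX_CMD` dry-run override are hypothetical, and the module name `sched-fluxion-feasibility` is assumed from the abbreviation s-f-feasibility. In a real script the rank would come from `flux getattr rank`.

```shell
#!/bin/sh
# Dry-run by default: FLUX_CMD (hypothetical) prints the command instead of
# running it. Set FLUX_CMD=flux inside a Flux instance to run it for real.
FLUX_CMD=${FLUX_CMD:-echo flux}

# Load the feasibility module only when the given broker rank is 0.
# The module name is an assumption based on the "s-f-feasibility" abbreviation.
load_feasibility_if_rank0() {
    rank=$1
    if [ "$rank" -eq 0 ]; then
        $FLUX_CMD module load sched-fluxion-feasibility
    fi
}

# In a real rc1 script the rank would come from: flux getattr rank
rank0_action=$(load_feasibility_if_rank0 0)
rank1_action=$(load_feasibility_if_rank0 1)
echo "rank 0: ${rank0_action:-nothing to do}"
echo "rank 1: ${rank1_action:-nothing to do}"
```

Teardown, as the commit describes it, goes the other way and removes the module from all ranks, which is what lets any layout of additional instances coexist with the guaranteed rank-0 one.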
Remove unnecessary check in notify_request_cb
Problem: notify_request_cb waits for resource.acquire if it has not yet received resources. However, it is guaranteed to have resources, since init_resource_graph must return before flux_reactor_run starts, and notify_request_cb can only run after that. Remove the check.
Commit 2ba4a06
Handle 'feasibility.check' RPC in s-f-feasibility
Problem: s-f-feasibility performs feasibility checking through the sched.feasibility RPC. However, RFC 27 requires feasibility checking to be performed in feasibility.check. Make s-f-feasibility register the 'feasibility' service and 'feasibility.check' RPC instead of the 'sched.feasibility' RPC. Remove 'sched.feasibility' forwarder cb in qmanager.cpp.
Commit 1346e5a
Remove 'sched.feasibility' calls from tests
Problem: Some tests call 'sched.feasibility', which no longer exists. Swap 'sched.feasibility' for 'feasibility.check'.
Add feasibility module to sched-sharness and t1020
Problem: Some tests that need satisfiability information do not load and unload s-f-feasibility. Add load_feasibility, reload_feasibility, and remove_feasibility functions to sched-sharness.sh and the relevant tests.
Commit 1e5a6eb
Problem: All calls to feasibility.check are in tests for s-f-resource, not tests for s-f-feasibility. Move them to a separate test for the feasibility module.
Disallow load-file behavior in feasibility test
Problem: t4014 expects feasibility to load resources from a resource module that was passed a load-file, which is not desired behavior. Update t4014 to expect failure on such a load.
Commit b0b5801
Problem: The formatting did not pass the CI code formatting check. Apply the required changes.
Commit 44d5bb5
Remove resource marking from feasibility init
Problem: As a holdover from s-f-resource, s-f-feasibility marks its acquired resources as DOWN, which is unnecessary. Remove this resource marking from init_resource_graph.
Commit 0d43cb2
Make s-f-resource propagate flux_respond_pack err
Problem: If flux_respond_pack fails in notify_request_cb, the callback sends no response, leaving feasibility active but without resources. Make notify_request_cb call flux_respond_error when flux_respond_pack fails.
Commit e57c11a
Problem: fedora40 CI fails on t4014 with the error:
error: bug in the test script: broken &&-chain: load_feasibility flux dmesg -c | grep -q "File exists"
Add && between successive statements in t4014.
Commit 5c573ad
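The failure mode behind the &&-chain fix above can be reproduced outside the test suite. The sketch below uses two hypothetical stand-in functions, `load_step` and `check_step`, rather than the real t4014 commands. Without `&&`, a sequence's exit status is that of its last command, so an earlier failure goes unnoticed.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real test steps.
load_step() { return 1; }    # simulates a setup command that fails
check_step() { return 0; }   # simulates a final check that passes

# No && between statements: the status of load_step is discarded and the
# body reports the status of check_step alone.
unchained() {
    load_step
    check_step
}

# With &&: the first failing command short-circuits the chain and its
# status becomes the status of the whole body.
chained() {
    load_step &&
    check_step
}

unchained_status=0; unchained || unchained_status=$?
chained_status=0; chained || chained_status=$?
echo "unchained=$unchained_status chained=$chained_status"
```

The unchained body reports success even though its first step failed, which is the kind of silently passing test the chain check exists to catch.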