Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add first nodex policy #1055

Merged
merged 5 commits into from
Sep 6, 2023
Merged

Add first nodex policy #1055

merged 5 commits into from
Sep 6, 2023

Conversation

trws
Copy link
Member

@trws trws commented Aug 2, 2023

First cut to add a node exclusive first policy. This seems to work, refactored one of the nodex tests to not be quite so duplicative and test this too. In cases with no overlapping jobs, firstnodex is the same as hinodex, but it will only look for one match rather than scoring matches.

Need to test the performance, but this should be significantly faster than the other nodex policies.

@trws
Copy link
Member Author

trws commented Aug 3, 2023

Looking at the code for this, I'm thinking it would be really useful to add a new "policy" that lets us build one of these up more dynamically. It would let us factor out duplication here and let us play with the number of matches to look for and things like that. I kinda want to get this in so we have it, then build that up.

@trws
Copy link
Member Author

trws commented Aug 3, 2023

Rebased.

@trws
Copy link
Member Author

trws commented Aug 4, 2023

Quick resource-bench.sh numbers:

firstnodex
Min. Match Time: 1.90735e-06
Max. Match Time: 0.000230789
Avg. Match Time: 0.000187913
first
Min. Match Time: 6.60419e-05
Max. Match Time: 0.000261068
Avg. Match Time: 9.12583e-05
hinodex
Min. Match Time: 1.90735e-06
Max. Match Time: 0.0184488
Avg. Match Time: 0.00863134

So, firstnodex is about 45 times faster than hinodex, but still about 10x slower than first, at least for this little microbenchmark.

@milroy milroy mentioned this pull request Aug 8, 2023
@milroy milroy linked an issue Aug 8, 2023 that may be closed by this pull request
Copy link
Member

@milroy milroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall and is an elegant implementation. I did find a few things that need to be changed, which I indicated in the line-specific comments.

Two other higher-level changes:

  1. Please add firstnodex and description here:
    " -P, --match-policy=<low|high|lonode|hinode|lonodex|hinodex|first|locality|variation>\n"
  2. Please break up the commit such that there is one for "policies" (changes to dfu_match_policy_factory.*pp, "testsuite" (changes to t3033-resoure-nodex.t), and "resource-query" (changes to resource-query.cpp).

@@ -38,11 +38,13 @@ std::shared_ptr<dfu_match_cb_t> create_match_cb (const std::string &policy)
std::shared_ptr<dfu_match_cb_t> matcher = nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a description to the body of the commit following the Flux "Problem:, Add:" convention.

Comment on lines +41 to +47
if (policy == FIRST_MATCH || policy == FIRST_NODEX_MATCH) {
std::shared_ptr<high_first_t> ptr
= std::make_shared<high_first_t> ();
ptr->add_score_factor (std::string ("node"), 1, 10000);
ptr->set_stop_on_k_matches (1);
if (policy == FIRST_NODEX_MATCH)
ptr->add_exclusive_resource_type ("node");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really impressed that this is all it takes to generate the correct behavior!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much as the current traverser is really complex, this is why. Dong designed it to be amazingly flexible and generic. Hopefully we can take a bit better advantage of that if we expose it a bit more.

t/t3033-resource-nodex.t Show resolved Hide resolved
t/t3033-resource-nodex.t Show resolved Hide resolved
Problem: node exclusive policies are currently slow
Solution: Add a policy that accepts the first match but allocates nodes
exclusively.  Thanks to the flexible design of the matcher, this is
entirely implemented by setting parameters on the existing
implementation.
In addition to testing firstnodex, this refactors the test for hinodex
and lonodex to use a common function that runs several tests
parametrized on their expected output files.
Since I have to touch the string to add first-nodex, it's time to get
rid of all the extra quotes and escaped newlines.
What it says on the tin, adds the new policy to the usage message.
@trws
Copy link
Member Author

trws commented Sep 1, 2023

It ended up being four commits, because I decided to convert the usage message string into a C++11 raw string and made that a separate commit. This should be good to go unless I botched something in rebasing it.

Copy link
Member

@milroy milroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reorganization!

There's a spacing issue that your editor introduced on one line. Please fix the alignment because it makes the block slightly confusing.

resource/utilities/resource-query.cpp Outdated Show resolved Hide resolved
Repair some odd spacing.

Co-authored-by: Daniel Milroy <[email protected]>
@codecov
Copy link

codecov bot commented Sep 6, 2023

Codecov Report

Merging #1055 (24e7d16) into master (5ede1d4) will increase coverage by 0.0%.
The diff coverage is 80.0%.

❗ Current head 24e7d16 differs from pull request most recent head 8fce829. Consider uploading reports for the commit 8fce829 to get more accurate results

@@          Coverage Diff           @@
##           master   #1055   +/-   ##
======================================
  Coverage    73.4%   73.4%           
======================================
  Files          98      98           
  Lines       11888   11889    +1     
======================================
+ Hits         8735    8737    +2     
+ Misses       3153    3152    -1     
Files Changed Coverage Δ
resource/utilities/resource-query.cpp 68.7% <50.0%> (+0.1%) ⬆️
resource/policies/dfu_match_policy_factory.cpp 92.8% <100.0%> (+0.3%) ⬆️

@trws trws added the merge-when-passing mergify.io - merge PR automatically once CI passes label Sep 6, 2023
@mergify mergify bot merged commit c619fbe into flux-framework:master Sep 6, 2023
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-passing mergify.io - merge PR automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need firstnode policy
2 participants