Factory often has one extra glidein job running #397
Comments
I don't think this is related to the reconfigure, the limits are written correctly in the
Could it be because the limits are applied per frontend group?
From the factory's point of view, each frontend group is effectively a separate frontend. In the case above, the factory submitted one glidein for the main group and one for the main-canary group.
@rynge for my education, what is the difference between main and main-canary? We need to be careful here. As confusing as this sounds, it might be the correct behavior. Groups can be different VOs, each submitting 1 test glidein, so in the end you get 2 glideins. On the other hand, if we set a limit of 100 in the factory, I would not expect the factory to submit 200 glideins. I need to double-check what the factory does in this case.
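To make the two readings in these comments concrete, here is a minimal Python sketch of the arithmetic. It is not GlideinWMS code: the function `max_glideins_per_factory` and its parameters are hypothetical, and whether the factory really applies the limit per frontend group is exactly the open question.

```python
# Illustrative arithmetic only; this does not reproduce GlideinWMS internals.
# `max_glideins_per_factory` is a hypothetical name used for this sketch.

def max_glideins_per_factory(per_factory_limit, groups, limit_is_per_group):
    """Upper bound on glideins one factory could run on an entry.

    If the limit is applied separately to each frontend group, every
    group can reach the limit on its own, so the bounds add up.
    """
    if limit_is_per_group:
        return per_factory_limit * len(groups)
    return per_factory_limit


groups = ["main", "main-canary"]  # the two groups named in this thread

# A per-factory limit of 1, as in the report:
print(max_glideins_per_factory(1, groups, limit_is_per_group=True))   # 2
print(max_glideins_per_factory(1, groups, limit_is_per_group=False))  # 1

# The limit of 100 mentioned in the comment above:
print(max_glideins_per_factory(100, groups, limit_is_per_group=True))  # 200
```

Under the per-group reading, 2 running glideins per factory is consistent with a limit of 1; under the per-entry reading it is one too many.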
Describe the bug
I have often observed that GlideinWMS exceeds its per-entry glidein maximum by one glidein job. It is especially apparent when we add a new site to the OSPool, because we always start with a cap of 2 glideins. Also, we do set num_factories = 2, because we have two production factories now.
To Reproduce
We set some glidein configuration in a YAML file which gets converted to regular GlideinWMS configuration. But here is a YAML fragment:
Expected behavior
For a case like above, I expect each factory to run at most 1 glidein job on the entry, for a total of up to 2 glidein jobs across the 2 factories.
Screenshots
Here is typical output from a Python script I use to check on a site:
PILOTS IN FACTORY ACCESS POINTS
+-------------------------------------------------+---------------------+------+-----+-------+-------+------+-------+-------+
| Schedd Name | Frontend Name | Idle | Run | Remov | Compl | Held | TxOut | Suspd |
+-------------------------------------------------+---------------------+------+-----+-------+-------+------+-------+-------+
| [email protected] | OSG_OSPool:frontend | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| [email protected] | OSG_OSPool:frontend | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
+-------------------------------------------------+---------------------+------+-----+-------+-------+------+-------+-------+
This site had exactly the YAML configuration shown above.
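For reference, the check I am doing by eye on this table can be sketched in a few lines of Python. This is an illustration, not the script that produced the output: the helper names are hypothetical, the factory names are placeholders, and it assumes the per-entry cap is split evenly across factories by integer division.

```python
# Hypothetical helpers for this report; not part of GlideinWMS or of the
# script that produced the table above.

def expected_share(entry_cap, num_factories):
    # Assumption: the per-entry cap is divided evenly among the factories.
    return entry_cap // num_factories


def over_limit(observed_running, entry_cap, num_factories):
    """Return {factory: excess} for factories running more than their share."""
    share = expected_share(entry_cap, num_factories)
    return {
        name: running - share
        for name, running in observed_running.items()
        if running > share
    }


# Run column from the table above; factory names are placeholders.
observed = {"factory-1": 2, "factory-2": 2}
print(over_limit(observed, entry_cap=2, num_factories=2))
# {'factory-1': 1, 'factory-2': 1}
```

With a cap of 2 and num_factories = 2, each factory's share is 1, so both rows above are over by one glidein.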
Additional context
Just reach out to me (Tim C.) by email or Slack for any extra details.