-
Notifications
You must be signed in to change notification settings - Fork 107
WorkQueueManager
The WorkQueueManager component performs the following tasks:
- Pull work from the global WorkQueue into the local queue
- Performs the workflow bootstrap inside the agent (sandbox and pileup list creation)
- Inject elements of work from the local queue into WMBS (and DBSBuffer)
- Update input data locations (in case data location has changed after work is pulled.)
- Kill workflows if necessary and clean up the work elements in the local queue when workflow is finished.
These 3 tasks are executed by 4 worker threads in the component:
- WorkQueueManagerWorkPoller performs task #1
- WorkQueueManagerWMBSFileFeeder performs task #2 #3
- WorkQueueManagerLocationPoller performs task #4
- WorkQueueManagerCleaner performs task #5
Before reading further on about this component, please make sure to read what the WorkQueue is.
This worker takes care of pulling the work from the global queue into the local queue and splitting that work from the local inbox into the local queue.
For pulling work, first the poller retrieves from WMBS the number of available slots for every site in Normal state, however this part of the system takes the total pending slots per site without substracting the current job counts in WMBS. Afterwards, the poller calculates the number of jobs in the available work elements in the local queue, ordered by priority. Finally, the system will poll the global queue for elements ordered by priority that are assigned to the same team as the current WMAgent and will assign them to the local queue according to the available resources minus the job counts initially retrieved from the local queue until the slots are full.
The poll process from the global queue is as follows:
- An list ordered by priority of the available work elements assigned to the team is retrieved from the global queue
- The list is traversed in the original order and each element is analyzed against the available site slots, when the system finds a site that has at least one slot and matches the location restrictions in the element (i.e. data location and site white/black lists) then the number of jobs in the elements is substracted from the site available slots and the process continues. Developer's Note: Note that if an element has 50k jobs, and a matching site has 1 slot it will acquire that work which may cause some overfilling in the queues from time to time.
- When the system is out of elements or site slots the acquiring process stops.
Assigning to the local queue, means that the work element is moved from Available to Negotiating and it is replicated to the local queue inbox.
For splitting work, the worker will retrieve all the Negotiating elements in the local inbox and run the appropriate splitters for them that will create a matching element in the local queue. The child element in the local queue will be marked as Available and the inbox element as Acquired. The local queue splitting is not really a "split" in the sense that one inbox element will not produce more than one local queue element, in fact the local queue element is almost the same as the inbox element. See the global queue for the real splitting.
This worker thread reads from the local queue Available elements and injects the work element information into WMBS to get the workflow started, or continue its processing if there is already an injected element.
Injecting a work element into WMBS means the following:
- Storing the dataset, block and all its valid files (within the block being acquired).
- All the file information from the input data, such as number of events, number of lumis, runs, checksums and all their location (MC fake files are registered against all PNNs). ACDC work units are registered with a limited number of run/lumis/events, as injected into the ACDCServer.
- A run/lumi mapping of all valid input files that needs to be used by JobCreator/splitting algorithm.
- Creating the workflow entries in WMBS, if not present already. Note that many work elements of the same workflow will be injected in a single agent and the workflow entries just have to be created once.
- Creating the necessary subscriptions according to the workload so the injected files are processed by the WMAgent.
- In addition to WMBS, this worker also register some workflow information into DBSBuffer tables.
Injecting work into WMBS is not done all at once but also regards job slots and priorities. First,the number of slots available in sites in Normal or Draining state is determined as follows (taken from the current version of ListThresholdsForCreate:
- The system calculates all the jobs that are in either created, executing, *failed or *cooloff states which have a location already assigned, ordered by workflow priority. However, in the count of executing jobs only those which are Pending in the batch system will be counted.
- The number of jobs which don't have locations assigned already (i.e. have not been submitted at least once) is calculated and are "assigned" to sites based on the location of the input files and the valid locations as defined in the subscription site white/black lists. This is also ordered by workflow priority.
- The number of pending slots from the wmbs_location table is retrieved for each site.
The number of pending slots minus the number of jobs previously calculated is passed to the local queue to retrieve Available work elements, the process is the same as described above for pulling work elements from the global queue. Afterwards, the file information is gathered, from DBS/PhEDEx in the case of data or the ACDC server in the case of ACDC, and stored in WMBS along with the workflows and subscriptions. The elements that are injected in WMBS are moved from Available to Running.
All the elements in a WorkQueue which have data (i.e. not MC elements) also have location information about that data, for example a work element from a ReDigi workflow may have 3 sets of data: Input block, parent blocks and a pileup dataset. Each element of data in a work element has locations associated to them and this information may change over time, e.g. when a subscription is made in PhEDEx. The location poller worker thread takes care of examining the Available elements in the local queue and updating the location information of its data when changes occurs. This means that work elements that are not yet injected in WMBS can have their potential locations changed and extended to include more sites when PhEDEx subscriptions are made and completed.
The WorkQueueManagerCleaner takes care of scanning the local queue and inbox for orphan elements which may have been left out from workflows that were terminated by unusual methods, and correcting replicating conflicts between the local inbox and the global parent queue. The worker thread deletes elements from the queue which are in terminal states, i.e. Failed or Done, and that have no parent elements, e.g. when the global queue has done its cleaning as well.