RFC #6: Refactoring Request Management System
Authors: K.Ciba, A.Tsaregorodtsev
Last Modified: 11.11.2012
The Request Management System (RMS) is designed to manage simple operations that are performed asynchronously on behalf of users, the owners of the tasks. The RMS is used for multiple purposes: failure recovery (the failover system), data management tasks and some others. It should be designed as an open system, easily extensible to new types of tasks.
The core of the RMS is a request, which holds information about its creator (DN and group), status, various timestamps (creation time, submission time, last update), the ID of the job the request belongs to, the DIRAC setup and the request's name. One request can be made of several sub-requests of various types (i.e. transfer, removal, registration, logupload, diset), together with the operations that have to be executed to process the request (i.e. replicateAndRegister, registerFile, removeReplica etc.), the source and destination storage elements to use, their statuses, various timestamps, error messages if any and the order of their execution. Depending on its type and operation, a sub-request can reference several sub-request files, which in turn hold all the required bits of information (i.e. the file LFN or PFN or both, its checksum and size, its GUID, status, error message etc.).
Current schema of the RequestDB.
All request information is kept in the RequestDB database, which can use two kinds of back-end: MySQL (RequestDBMySQL) or a local file system directory (RequestDBFile), through one common service (RequestManagerHandler) that talks directly to the RequestDB, allowing selection, insertion or update of a particular request. All those CRUD operations are performed using a specialised client interface (RequestClient).
The execution of requests is done by various specialised agents, one per request type, i.e. TransferAgent processing transfer sub-requests, RemovalAgent for removal, RegistrationAgent for registration sub-requests, DISETForwardingAgent for diset ones and so on. The common pattern in agent code is to select sub-requests available for execution, perform the data manipulation required by the defined operation, update statuses in RequestDB and notify the request's job when all sub-requests are done.
While on the database side a request is kept in three closely connected tables (RequestDB.Requests, RequestDB.SubRequests and RequestDB.Files), on the Python client side there is only one class available: RequestContainer. This imbalance between the SQL and Python worlds leads to an unclear, heavyweight, error-prone and hard to use API.
As the Python API will be changed, this should be considered a great opportunity for refactoring the database schema as well.
Proposed schema of the RequestDB.
The most important changes in the new schema are:
- all columns holding statuses are ENUMs with a well defined set of states (a minimal sketch of these state sets follows the list)
- the SubRequests table is renamed to Operations, which better describes the table contents
- there is no need to keep both the RequestType and Operation columns, as a single piece of information (Operation.Type in the new schema) is enough to indicate the required action
- the Files.Md5 and Files.Adler columns are dropped; instead there is a new entity holding the checksum type - Files.ChecksumType - while the checksum itself is kept in the Files.Checksum column
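As an illustration, a minimal Python sketch of such well defined state sets is given below; only 'Waiting', 'Queued' and 'Done' appear explicitly in this RFC, so the other names ('Scheduled', 'Failed') are assumptions used for illustration only:

```python
# Sketch of the well defined state sets behind the ENUM columns; 'Scheduled'
# and 'Failed' are assumptions, not part of the RFC text.

REQUEST_STATES   = ( "Waiting", "Done", "Failed" )
OPERATION_STATES = ( "Queued", "Waiting", "Scheduled", "Done", "Failed" )
FILE_STATES      = ( "Waiting", "Scheduled", "Done", "Failed" )

def checkedState( state, allowed ):
  """ Mimic the ENUM constraint on the Python side. """
  if state not in allowed:
    raise ValueError( "unknown state '%s', allowed: %s" % ( state, str( allowed ) ) )
  return state
```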
Inheritance diagram for Request zoo.
The basic ideas within the new API are:
- one Python class per SQL table
- the API should be lightweight
- all class members are named after DB column names
- class members are defined as Python properties (see http://docs.python.org/2/library/functions.html#property)
- all classes are only smart bags holding properties, with no extra functionality except serialisation to SQL and/or XML and manipulation/looping helpers for lower level classes (i.e. adding a File instance to an Operation, adding an Operation instance to a Request, looping over Operations in a Request etc.) - see the sketch after this list
- Operation execution order == index in Request::operations list
- mechanism for status 'calculation' from aggregated classes (i.e. if at least one File is 'Waiting', Operation.Status is forced to be 'Waiting' too; this could also be propagated up to the Request object)
- the same for request finalisation: it should be done automatically when all Operations are 'Done'
- a request should always be read as a whole: Request + all Operations + all Files; partial reads should be forbidden
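A minimal Python sketch of these ideas follows; it is not the final API, and the member and helper names are illustrative only:

```python
# "Smart bag" sketch: members named after DB columns, exposed as properties
# with basic checks, plus helpers for adding and looping over child objects.

class Operation( object ):
  def __init__( self ):
    self._type = None
    self._status = "Queued"

  @property
  def Type( self ):
    return self._type

  @Type.setter
  def Type( self, value ):
    # low level validation in the property setter
    if not isinstance( value, str ):
      raise TypeError( "Operation.Type has to be a string" )
    self._type = value

class Request( object ):
  def __init__( self ):
    self._operations = []

  def addOperation( self, operation ):
    self._operations.append( operation )

  def indexOf( self, operation ):
    # execution order == index in the Request's operations list
    return self._operations.index( operation )

  def __iter__( self ):
    # looping helper over child Operations
    return iter( self._operations )
```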
Statuses are somewhat special and should be treated separately from all the other properties. In some cases the user shouldn't be allowed to modify them, i.e. if an Operation has at least one 'Waiting' File, its status cannot be set to 'Done'; the same holds for a Request with at least one 'Waiting' or 'Queued' Operation.
Status propagation should be semi-automatic: on every change to File.Status its parent (Operation) should be notified and, if possible, should update its own status (i.e. by checking whether all child Files are 'Done').
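A possible shape of this semi-automatic propagation, assuming a hypothetical _parent back-reference and _notify() helper (both names are illustrative):

```python
# The File.Status setter calls back into its parent Operation, which then
# recomputes its own status from all child Files.

class File( object ):
  def __init__( self ):
    self._parent = None
    self._status = "Waiting"

  @property
  def Status( self ):
    return self._status

  @Status.setter
  def Status( self, value ):
    self._status = value
    if self._parent:
      self._parent._notify()   # semi-automatic propagation to the parent

class Operation( object ):
  def __init__( self ):
    self.Status = "Queued"
    self.files = []

  def addFile( self, fileObj ):
    fileObj._parent = self
    self.files.append( fileObj )
    self._notify()

  def _notify( self ):
    statuses = [ f.Status for f in self.files ]
    if "Waiting" in statuses:
      self.Status = "Waiting"
    elif statuses and all( s == "Done" for s in statuses ):
      self.Status = "Done"

# usage: marking the last 'Waiting' File 'Done' flips the Operation to 'Done'
op = Operation()
newFile = File()
op.addFile( newFile )     # Operation becomes 'Waiting'
newFile.Status = "Done"   # Operation recalculates itself and becomes 'Done'
assert op.Status == "Done"
```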
State machines:
- for Request (no change to the previous implementation)
State machine for Request object.
- for Operation: a new state 'Queued' is added; at any time only one Operation in a Request is marked as 'Waiting', all the others being 'Done' or 'Queued'. This will save system resources, as the only 'selectable' Operations (and hence Requests) are those really waiting to be executed (their execution order == the Request's current execution order); a small sketch of this rule follows the list
State machine for Operation object.
- for File
State machine for File object.
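A small sketch of the 'only one Waiting Operation per Request' rule mentioned above; the function and attribute names are assumptions:

```python
# Once the current 'Waiting' Operation reaches 'Done', the next 'Queued'
# Operation in the list (ordered by execution order == list index) is promoted.

def promoteNextOperation( operations ):
  """ :param operations: list of objects with a Status attribute,
      ordered by execution order """
  for op in operations:
    if op.Status == "Waiting":
      return op                # there is already one selectable Operation
    if op.Status == "Queued":
      op.Status = "Waiting"    # promote the first queued Operation
      return op
  return None                  # all Operations have reached a final state
```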
For the execution of requests a new class based on the Executors component should be implemented, with a built-in state machine (possibly using the Observer pattern) together with pluggable modules for "operation executors". This will allow checking and enforcing the correct state propagation between Operations. Hopefully it will be possible to instantiate several executors in case the current set is not able to process requests at the desired speed.
The built-in state machine should process a request as a whole, but the same code will be reused inside Operation plugins or for File-level processing: the higher level state machine (say the STM for the Request) will be observing the lower level one (the Operation in that case), triggering the execution of the next Operation in the list or a Request status change when applicable (i.e. when all Operations have reached their final states).
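A possible shape of the pluggable "operation executor" modules, assuming a hypothetical base class and registry; the integration with the Executors framework itself is not shown:

```python
# One handler class per Operation.Type, looked up by the request executor.
# The base class name, the registry and the return convention are assumptions.

class OperationHandlerBase( object ):
  """ Base class for per-Operation.Type plugins. """
  def __init__( self, operation ):
    self.operation = operation

  def __call__( self ):
    raise NotImplementedError( "to be overridden in the plugin" )

class ReplicateAndRegister( OperationHandlerBase ):
  def __call__( self ):
    # ...perform the actual replication for self.operation's Files and
    # update their statuses; the state machine propagates them upwards...
    return { "OK": True, "Value": None }

HANDLERS = { "ReplicateAndRegister": ReplicateAndRegister }

def executeOperation( operation ):
  """ Pick the plugin matching Operation.Type and run it. """
  handlerClass = HANDLERS.get( operation.Type )
  if not handlerClass:
    return { "OK": False, "Message": "no handler for %s" % operation.Type }
  return handlerClass( operation )()
```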
The basic RMS components will stay the same: a dedicated service (RequestManagerHandler) and a client (RequestClient), but their code needs to be reviewed. It would also be nice to have a validation mechanism in place:
- low level, in the particular property's fset method (type and value checking)
- high level, provided by a special tool (RequestValidator) performing overall logic checks of the request as a whole (i.e. whether a transfer Operation has at least one File attached etc.)
The RequestValidator could be used as a component inside RequestClient or RequestManagerHandler (to be decided).
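A sketch of the high level part, assuming hypothetical attribute names (RequestName, Type, files) and an arbitrary example rule; the real rule set is still to be defined:

```python
# High level RequestValidator: overall logic checks on a request as a whole.
# Attribute names and the particular check are assumptions for illustration.

class RequestValidator( object ):

  def validate( self, request ):
    if not request.RequestName:
      return { "OK": False, "Message": "RequestName is not set" }
    for operation in request:
      if operation.Type == "ReplicateAndRegister" and not operation.files:
        return { "OK": False,
                 "Message": "transfer Operation without any File attached" }
    return { "OK": True, "Value": None }
```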
There should be a set of CLI tools for monitoring a request given its name or ID. Those should provide a rich set of information (ideally everything available) in a human readable format. It would also be nice to have scripts presenting the overall status of the system: the number of requests of a given type waiting to be executed or executed within some time span (i.e. the last day's history). Some initial work on those has started recently in the context of the LHCbDIRAC project, but at the moment no such scripts exist.
One could also consider providing a set of CLI tools allowing some basic operations, i.e. request cancelling or submission.
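A sketch of the core of such a monitoring tool, limited to pretty-printing a request that has already been read as a whole; how the request is fetched via RequestClient is deliberately left out and all attribute names are assumptions matching the proposed schema:

```python
# Hypothetical pretty-printer for one request (Request + Operations + Files).

def printRequest( request ):
  print "Request '%s' Status=%s" % ( request.RequestName, request.Status )
  for i, operation in enumerate( request ):
    print "  [%02d] %s Status=%s TargetSE=%s" % ( i, operation.Type,
                                                  operation.Status,
                                                  operation.TargetSE )
    for opFile in operation.files:
      print "       %s Status=%s Error=%s" % ( opFile.LFN, opFile.Status,
                                               opFile.Error )
```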
The old web page accessible under /jobs/RequestMonitor/display needs to be rewritten to display the new structure. It would be nice if, using this page, users could also create or cancel their own requests.
The TransferDB is built on top of the RequestDB, adding several FTS specific tables holding information about scheduled files and FTS job properties.
The FTS processing was implemented in such a way that one FTS job should perform a bulk transfer and hence can be made of files from several different requests. This leads to an overcomplicated structure of the database and the Python code, while in reality a typical FTS job transfers only one file from one failover request. At the other end of the scale there are bulk FTS jobs created from replication transformations, where a typical job transfers about 100 files at once. This leads to the conclusion that the original implementation wasn't perfect and needs some simplification and refactoring too.
If we allow only the situation where each FTS job transfers files belonging to a single request (which is the case for the majority of FTS transfers anyway), then it becomes entirely possible to move the FTS specific parts out to their own database instance (say FTSDB), with a minimalistic schema like this:
Of course this will require major changes in the DMS systems as well, but as an outcome the new FTS processing will be much simpler, clearer and easier to maintain. The developer should also be aware of and explore all the new features provided by the FTS3 and gfal2 implementations.
All FTS specific changes should be described in a separate RFC.