
Refactoring of the RM

chaen edited this page Dec 13, 2013 · 3 revisions

Inheritance schema

The current inheritance schema is heavy and adds little value: it consists mostly of wrapper methods, and the checks they perform are repeated further down the call chain anyway. The idea is to merge CatalogToStorage and ReplicaManager into a single class that, instead of inheriting from all these interfaces, uses the StorageElement and the FileCatalog directly.

SingleFile parameter

This parameter is rarely used and makes the classes more complex: it should disappear from the ReplicaManager, the StorageElement and the FileCatalog. The default will be the Successful/Failed dictionary convention. We can provide a helper function that converts a Successful/Failed result into an S_OK/S_ERROR.
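A minimal sketch of what such a helper could look like. The name `singleFileResult` and the example LFNs are hypothetical, and `S_OK`/`S_ERROR` are local stand-ins that mimic the DIRAC return convention so that the snippet runs without a DIRAC installation.

```python
def S_OK(value=None):
    """Local stand-in for DIRAC's S_OK (assumption: same dictionary layout)."""
    return {"OK": True, "Value": value}


def S_ERROR(message=""):
    """Local stand-in for DIRAC's S_ERROR (assumption: same dictionary layout)."""
    return {"OK": False, "Message": message}


def singleFileResult(result, lfn):
    """Collapse a Successful/Failed bulk result into S_OK/S_ERROR for one LFN.

    Hypothetical helper: the name and the exact error messages are illustrative.
    """
    # A failure of the whole call is passed through unchanged.
    if not result["OK"]:
        return result
    value = result["Value"]
    if lfn in value.get("Failed", {}):
        return S_ERROR(str(value["Failed"][lfn]))
    if lfn in value.get("Successful", {}):
        return S_OK(value["Successful"][lfn])
    # The LFN appears in neither dictionary.
    return S_ERROR("No result returned for %s" % lfn)


# Usage with a hand-written bulk result (hypothetical LFNs).
bulk = S_OK({"Successful": {"/lhcb/a": 10}, "Failed": {"/lhcb/b": "No such file"}})
sizeA = singleFileResult(bulk, "/lhcb/a")
sizeB = singleFileResult(bulk, "/lhcb/b")
```

With a helper of this kind, callers that handle a single file keep the S_OK/S_ERROR style while the classes themselves expose only the bulk convention.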

Catalogs parameter

As with SingleFile, it should disappear. Anyone who really wants to specify manually which catalog to use can create a new FileCatalog and give the catalog name in its constructor. This concerns only very few modules.

Use of PFN

PFNs should not be used as arguments in the ReplicaManager, the FileCatalog or the StorageElement. Even though we claimed we were not using the PFN stored in the LFC, we are, and this has bad consequences, such as making it very difficult (if not impossible) to change the base path of a storage element transparently. The proposed solution is to always use the LFN. To refer to a particular replica, we should use the LFN together with the SE name. The only place where we would need the PFN stored in the LFC is when removing a replica, but this can be sorted out internally.

This should make the following methods obsolete:

  • getCatalogLFNForPFN. Used by:
    • Dirac : StorageManagementSystem/Agent/SENamespaceCatalogCheckAgent.py and DataManagementSystem/Client/DataIntegrityClient.py
  • getLfnForPfn. Used by:
    • Dirac : DataManagementSystem/Client/DataIntegrityClient.py
  • getPfnForLfn. Used by:
    • Dirac : TransformationSystem/Agent/TransformationCleaningAgent.py
    • LHCbDirac : DataManagementSystem/scripts/dirac-dms-lfn-replicas.py
  • getPfnForProtocol. Used by:
    • Dirac : StorageManagementSystem/Agent/SENamespaceCatalogCheckAgent.py, DataManagementSystem/Client/DataIntegrityClient.py, and DataManagementSystem/Client/FTSClient.py

The only places where the actual PFN should be used are inside the Catalog plugins and the Storage plugins, where it is not visible to the user.

“Forwarding” methods

Many methods simply forward the ReplicaManager call to the StorageElement or the FileCatalog classes. We should make sure that these methods are now called directly on the StorageElement/FileCatalog, and no longer through the ReplicaManager. The idea is that the ReplicaManager should be used only when both the FileCatalog and the StorageElement are involved. A detailed list of the places where these methods are used is available on request.

Method Replacement
addCatalogFile Call FC.addFile
addCatalogReplica Call FC.addReplica
getCatalogDirectoryMetadata Call FC.getDirectoryMetadata
getCatalogExists Call FC.exists
getCatalogFileMetadata Call FC.getFileMetadata
getCatalogFileSize Call FC.getFileSize
getCatalogLFNForPFN Call FC.getLFNforPFN. But do we really need it if we get rid of the PFN?
getCatalogListDirectory Call FC.listDirectory with default verbose = False
getCatalogReplicas Call FC.getReplicas with default allStatus = False
getCatalogReplicaStatus Call FC.getReplicaStatus
getLfnForPfn This method should be removed
getPfnForLfn Call SE.getPfnForLfn. But do we really need it if we get rid of the PFN?
getPfnForProtocol Call SE.getPfnForProtocol. The default protocol requested in the RM is SRM2.
getStorageFile Call SE.getFile
getStorageFileAccessUrl Call SE.getAccessUrl
getStorageFileExists Call SE.exists
getStorageFileMetadata Call SE.getFileMetadata
getStorageFileSize Call SE.getFileSize
getStorageListDirectory Call SE.listDirectory
pinStorageFile Call SE.pinFile with default lifetime = 86400
prestageStorageFile Call SE.prestageFile with default lifetime = 86400
putStorageDirectory Call SE.putDirectory
removeCatalogDirectory Call FC.removeDirectory with default recursive = False
removeCatalogFile Call FC.removeFile, but sort the LFNs from the longest to the shortest first
removeCatalogReplica Call FC.removeReplica
removeStorageDirectory Call SE.removeDirectory with default recursive = False
removeStorageFile Call SE.removeFile
setCatalogReplicaStatus Call FC.setReplicaStatus
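
Most rows in the table are one-to-one renames. The removeCatalogFile row is the exception, since callers moving to FC.removeFile have to reproduce the ordering themselves. A minimal sketch, assuming that "longest" refers to the length of the LFN string; the helper name and the LFNs are hypothetical.

```python
def sortLongestFirst(lfns):
    """Return the LFNs ordered from the longest string to the shortest.

    Assumption: "longest" means the length of the LFN string.
    """
    return sorted(lfns, key=len, reverse=True)


# Hypothetical LFNs; the result would then be passed to FC.removeFile.
ordered = sortLongestFirst(["/lhcb/a", "/lhcb/a/b/c", "/lhcb/a/b"])
```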

These replacements concern 34 files in Dirac, and 29 in LHCbDirac.

The proposed plan is the following:

  • Modification of the ReplicaManager, FileCatalog and StorageElement for v6r[asap]. There are only very few calls to gConfig (1 for the RM, 2 for the SE, 4 for the FC), so porting to v7 should be easy. The changes would be backward compatible: if a method is called via the ReplicaManager instead of directly on the StorageElement/FileCatalog, we could forward it to the proper class and issue a message. This hack would stay for one release only, to give people more time to change their code.
  • Modification of the scripts, agents, services, etc. will be done progressively. The target version for finishing the changes would be v7r0. We would then drop the helper that ensures backward compatibility.
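
The temporary forwarding layer from the first step could be sketched as follows. This is only an illustration under stated assumptions: the class name `ForwardingMixin`, the `FORWARDED` table, the `fc`/`se` attribute names and the use of Python's `warnings` module to issue the message are all hypothetical, and only a few rows of the replacement table are shown.

```python
import warnings

# Old ReplicaManager method -> (attribute holding the target object,
# method to call on it, defaults the old method used to apply).
FORWARDED = {
    "getCatalogExists": ("fc", "exists", {}),
    "getCatalogReplicas": ("fc", "getReplicas", {"allStatus": False}),
    "getStorageFileExists": ("se", "exists", {}),
    "pinStorageFile": ("se", "pinFile", {"lifetime": 86400}),
}


class ForwardingMixin(object):
    """Forward deprecated method names to self.fc / self.se with a warning."""

    def __getattr__(self, name):
        # __getattr__ is only invoked when the normal attribute lookup fails.
        try:
            target, method, defaults = FORWARDED[name]
        except KeyError:
            raise AttributeError(name)

        def forward(*args, **kwargs):
            warnings.warn(
                "%s is deprecated, call %s.%s directly" % (name, target, method),
                DeprecationWarning,
                stacklevel=2,
            )
            # Explicit keyword arguments override the old defaults.
            merged = dict(defaults)
            merged.update(kwargs)
            return getattr(getattr(self, target), method)(*args, **merged)

        return forward
```

Dropping the helper at the end of the migration then amounts to removing the mixin and its table, without touching the StorageElement or FileCatalog classes.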

Unused methods

Several methods seem to be unused. We could remove them, unless they are kept for future use or are used somewhere I could not spot.

Method Behavior
createCatalogDirectory Call FC.createDirectory
createCatalogLink Call FC.createLink
getCatalogDirectoryReplicas Call FC.getDirectoryReplicas
getCatalogDirectorySize Call FC.getDirectorySize
getCatalogIsDirectory Call FC.isDirectory
getCatalogIsFile Call FC.isFile
getCatalogIsLink Call FC.isLink
getCatalogReadLink Call FC.readLink
getPrestageStorageFileStatus Call SE.prestageFileStatus
getStorageDirectory Call SE.getDirectory
getStorageDirectoryIsDirectory Call SE.isDirectory
getStorageDirectoryMetadata Call SE.getDirectoryMetadata
getStorageDirectorySize Call SE.getDirectorySize
getStorageFileIsFile Call SE.isFile
putStorageFile Call SE.putFile
releaseStorageFile Call SE.releaseFile
removeCatalogLink Call FC.removeLink
replicateStorageFile Call SE.replicateFile
setCatalogReplicaHost Call FC.setReplicaHost

gLogger messages

With the current call chain (RM -> SE/FC -> plugins), an error occurring in a plugin is often printed three times, which makes the logs heavy and harder to read and parse. The proposal is that all messages issued by these classes be at the debug level. They are just tools, so it is up to the script/agent/service using them to report the error.
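
The intended split of responsibilities can be illustrated with a small sketch. It uses Python's standard `logging` module as a stand-in for gLogger, and the function names, logger names and LFNs are hypothetical: the tool-level function logs at debug level only and returns the error, and the caller is the single place where it is reported.

```python
import logging

toolLog = logging.getLogger("StorageElementSketch")
agentLog = logging.getLogger("AgentSketch")


def removeFileSketch(lfn, existing):
    """Tool-level call: debug message only, the error goes back to the caller."""
    if lfn not in existing:
        toolLog.debug("removeFile failed for %s: no such file", lfn)
        return {"OK": False, "Message": "No such file: %s" % lfn}
    existing.remove(lfn)
    return {"OK": True, "Value": lfn}


def agentCycle(lfns, existing):
    """Caller (script/agent/service): reports each error exactly once."""
    failed = []
    for lfn in lfns:
        res = removeFileSketch(lfn, existing)
        if not res["OK"]:
            agentLog.error(res["Message"])
            failed.append(lfn)
    return failed
```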
