This repository has been archived by the owner on Jan 30, 2024. It is now read-only.
Copy Tools
Alexey Anisenkov edited this page Sep 12, 2021 · 2 revisions
The Pilot can use externally defined copy tools or an available backend API for file transfers.
The logic that implements file transfers in `stage-in` and `stage-out` modes, using the corresponding library or external tool, is isolated in a dedicated Pilot copytool module.
Each copytool module relies on the following settings to configure and customize the top-level Staging workflow (implemented in the Data API) for file transfer operations:
parameter | type | default value | description |
---|---|---|---|
`allowed_schemas` | list | any enabled for PandaQueue | a prioritized list of schemas that the given copytool supports for transfers |
`require_replicas` | boolean | False | indicates whether the given copytool requires input replicas to be resolved from Rucio before stage-in |
`require_input_protocols` | boolean | False | indicates whether the given copytool requires input protocols and manual generation of input replicas for stage-in |
`require_protocols` | boolean | True | indicates whether the given copytool requires protocols to be resolved before stage-out |
`check_availablespace` | boolean | True | indicates whether a space check should be applied before stage-in transfers that use the given copytool |
`resolve_surl` | handler | `StagingClient.resolve_surl` | Get the final destination SURL for the file to be transferred. Can be customized at the level of a specific copytool |
`resolve_replica` | handler | `StageInClient.resolve_replica` | Resolve the input replica (matched by `domain`) according to `primary_schemas` first; if none is found, look it up within `allowed_schemas`. Can be customized at the level of a specific copytool |
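
The two-step lookup described for `resolve_replica` can be illustrated with a minimal sketch. This is not the Pilot's `StageInClient.resolve_replica`; the function name `pick_replica`, the replica dictionaries with `pfn` and `domain` keys, and the example hosts are all assumptions made for illustration.

```python
def scheme_of(url):
    """Return the scheme part of a URL such as 'root://host//path'."""
    return url.split("://", 1)[0]


def pick_replica(replicas, primary_schemas, allowed_schemas, domain="lan"):
    """Pick a replica URL for the given domain (hypothetical helper).

    Replicas are first filtered by domain. The prioritized primary_schemas
    list is tried first; only if nothing matches is allowed_schemas used.
    Returns None when no replica matches either list.
    """
    candidates = [r["pfn"] for r in replicas if r["domain"] == domain]
    for schemas in (primary_schemas, allowed_schemas):
        for schema in schemas:          # schemas are ordered by priority
            for pfn in candidates:
                if scheme_of(pfn) == schema:
                    return pfn
    return None


# Hypothetical replica list for a single input file
replicas = [
    {"pfn": "srm://se.example.org/data/f1", "domain": "wan"},
    {"pfn": "davs://se.example.org/data/f1", "domain": "lan"},
    {"pfn": "root://se.example.org//data/f1", "domain": "lan"},
]
print(pick_replica(replicas, ["root"], ["davs", "root"]))
```

In this sketch the `root` replica wins because it matches `primary_schemas`; if the primary list held only schemas with no matching replica, the first match from `allowed_schemas` would be returned instead.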
In addition to these settings, each copytool module must implement the following interface functions:
function signature | arguments | description |
---|---|---|
`is_valid_for_copy_in(files)` | `files`: list of input `FileSpec` entries | Check whether the passed files list is valid (allowed) for the stage-in operation. Typically returns `True` |
`is_valid_for_copy_out(files)` | `files`: list of output `FileSpec` entries | Check whether the passed files list is valid (allowed) for the stage-out operation. Typically returns `True` |
`copy_in(files, **kwargs)` | | Download (stage-in) the given files using the copytool-specific implementation. The copytool should update the corresponding state fields of each `FileSpec` object (`status`, `status_code`) |
`copy_out(files, **kwargs)` | | Upload (stage-out) the given files using the copytool-specific implementation. The copytool should update the corresponding state fields of each `FileSpec` object (`status`, `status_code`) |
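
To make the interface concrete, here is a minimal sketch of what a copytool module could look like, using a plain local file copy as the transfer mechanism. It is an illustration under stated assumptions, not an actual Pilot module: the `FileSpec` dataclass is a hypothetical stand-in for the Pilot's own class, the fields `lfn`, `turl`, `surl` and `workdir`, the status strings `'transferred'` and `'failed'`, and the `['file']` schema are all assumed for the example.

```python
import os
import shutil
from dataclasses import dataclass

# Module-level settings (names taken from the settings table)
require_replicas = False
require_input_protocols = False
require_protocols = True
check_availablespace = True
allowed_schemas = ['file']  # assumption: this toy tool only handles local paths


@dataclass
class FileSpec:
    """Hypothetical stand-in for the Pilot FileSpec object."""
    lfn: str
    turl: str = ''        # source path for stage-in (assumed field)
    surl: str = ''        # destination path for stage-out (assumed field)
    workdir: str = ''
    status: str = ''
    status_code: int = 0


def is_valid_for_copy_in(files):
    return True


def is_valid_for_copy_out(files):
    return True


def _copy(src, dst, fspec):
    """Copy one file and record the outcome in the FileSpec state fields."""
    try:
        shutil.copy(src, dst)
    except OSError:
        fspec.status, fspec.status_code = 'failed', 1
    else:
        fspec.status, fspec.status_code = 'transferred', 0


def copy_in(files, **kwargs):
    """Stage-in: copy each file from its turl into the work directory."""
    for fspec in files:
        _copy(fspec.turl, os.path.join(fspec.workdir, fspec.lfn), fspec)
    return files


def copy_out(files, **kwargs):
    """Stage-out: copy each file from the work directory to its surl."""
    for fspec in files:
        _copy(os.path.join(fspec.workdir, fspec.lfn), fspec.surl, fspec)
    return files
```

The point of the sketch is the shape of the module: settings as module-level names, two validity checks, and two transfer functions that report results through `status` and `status_code` rather than through their return value alone.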
The current range of supported copy tools is described below.
Copy tool | Require replicas (stage-in) | Require input protocols (stage-in) | Require protocols (stage-out) | Check space (stage-in) | Allowed schemas | description |
---|---|---|---|---|---|---|
`gfal` or `gfal-copy` | ✔️ | ❌ | ✔️ | ✔️ | `['srm', 'gsiftp', 'https', 'davs', 'root']` | GFAL2 tool (`gfal-copy` command) |
`gs` | ❌ | ✔️ | ✔️ | ✔️ | `['gs', 'srm', 'gsiftp', 'https', 'davs', 'root']` | Google Cloud Storage (`google.cloud` API) |
`lsm` | ❌ | ❌ | ✔️ | ✔️ | `['srm', 'gsiftp', 'root']` | Local site mover (`lsm-get`, `lsm-put` commands) |
`mv` | ❌ | ❌ | ✔️ | ❌ | any | Move file using filesystem commands (`ln -s` for stage-in, `mv` for stage-out) |
`objectstore` | ❌ | ✔️ | ✔️ | ✔️ | `['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio']` | Transfer files to OS storage using the Rucio CLI (`rucio download` for stage-in, `rucio upload` for stage-out) |
`rucio` | ✔️ | ❌ | ❌ | ✔️ | any | Transfer files to an RSE using the Rucio Python API (`rucio.client.downloadclient`, `rucio.client.uploadclient`) |
`s3` | ❌ | ✔️ | ✔️ | ✔️ | `['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio']` | Transfer files to Amazon Cloud Object Storage (S3 bucket) using the `boto3` Python API for AWS |
`xrdcp` | ✔️ | ❌ | ✔️ | ✔️ | `['root']` | Transfer files using the `xrdcp` command |
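
As a reading aid, the "Allowed schemas" column can be turned into a small lookup that answers which copy tools accept a given schema. This is only a sketch built from the table values; the dictionary `ALLOWED_SCHEMAS`, the helper `copytools_for_schema`, and the use of `None` to represent "any" are assumptions for illustration and are not part of the Pilot code.

```python
# Allowed schemas per copy tool, transcribed from the table.
# None stands for "any" (no restriction at the copytool level).
ALLOWED_SCHEMAS = {
    'gfal': ['srm', 'gsiftp', 'https', 'davs', 'root'],
    'gs': ['gs', 'srm', 'gsiftp', 'https', 'davs', 'root'],
    'lsm': ['srm', 'gsiftp', 'root'],
    'mv': None,
    'objectstore': ['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio'],
    'rucio': None,
    's3': ['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio'],
    'xrdcp': ['root'],
}


def copytools_for_schema(schema):
    """Return the sorted names of copy tools whose allowed schemas include `schema`."""
    return sorted(
        name for name, schemas in ALLOWED_SCHEMAS.items()
        if schemas is None or schema in schemas
    )


print(copytools_for_schema('s3'))   # ['mv', 'objectstore', 'rucio', 's3']
```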