This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Metadata

Paul Nilsson edited this page Mar 5, 2019 · 6 revisions

The pilot maintains, parses, and trims a number of metadata files and objects. The list below gives details about each of them.

Job Report

The jobReport.json file is a metadata file created by most ATLAS payloads. All production jobs create it, but not all user analysis jobs do. If the file exists, the pilot adds it to the final server update. Currently, the pilot re-uses the 'metaData' field in the server update to send the JSON information as text.
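
As an illustration of the behaviour described above, the following is a minimal sketch and not the pilot's actual code. The function name add_job_report, the update dictionary and the workdir argument are hypothetical; only the file name jobReport.json and the 'metaData' field come from this page.

```python
import json
import os


def add_job_report(update, workdir, filename="jobReport.json"):
    """Attach the job report, if present, to the server update as text.

    'update' is a hypothetical dictionary standing in for the final
    server update; the real pilot data structures may differ.
    """
    path = os.path.join(workdir, filename)
    if not os.path.exists(path):
        # Not all user analysis jobs produce a job report.
        return update
    with open(path) as f:
        report = json.load(f)  # parsing also checks that the file is valid JSON
    # The JSON information is sent as text, not as a nested object.
    update["metaData"] = json.dumps(report)
    return update
```

If the file is absent, the update is returned unchanged, which matches the case of a user analysis job that did not create a job report.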

Job Report for production jobs

The payload / transform in a production job is expected to create a job report (a JSON dictionary) containing several fields that are needed by the pilot and by Harvester (on HPCs). In ATLAS, the report contains many additional fields that are not used by the pilot or Harvester but are used by other components, so the pilot sends the entire file along with the final server update ('metaData' field). The default file name is "jobReport.json", but it can be changed in the pilot configuration file (pilot/util/default.cfg, "jobreport"). The pilot expects to find the following fields:

exitCode - the payload exit code
exitMsg - the payload exit message

The fields expected by Harvester may be documented elsewhere. For ATLAS, several other fields are used, including dbData, dbTime, events and cpuTime. The dictionary format for the exitCode and exitMsg fields is simple:

{
   ..
   "exitCode": [integer],
   "exitMsg": "[string]",
   ..
}
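
Reading the two fields from a job report in this format can be sketched as follows. This is an assumption-laden example, not the pilot's implementation: the function name get_exit_info and the choice of fallback values for missing fields are hypothetical.

```python
import json


def get_exit_info(path="jobReport.json"):
    """Return (exitCode, exitMsg) from a job report file.

    A missing exitCode comes back as None and a missing exitMsg as an
    empty string; these fallbacks are an assumption of this sketch.
    """
    with open(path) as f:
        report = json.load(f)
    exit_code = report.get("exitCode")
    exit_msg = report.get("exitMsg", "")
    return exit_code, exit_msg
```

Any additional fields in the report (for ATLAS, for example dbData or cpuTime) are ignored by this sketch.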

Metadata XML

The metadata.xml file produced by the transform is uploaded to the server in the 'xml' field. The filename used internally is metadata-<jobId>.xml.

Pool File Catalog XML

.. (PoolFileCatalog.xml)

DDM Endpoints (JSON)

.. (agis_ddmendpoints.json)

Schedconfig JSON

.. (agis_schedconf.json)

Rucio traces

The pilot sends detailed information about file transfers to Rucio. A list with the different fields contained in the trace report can be found in the Pilot 2 wiki.

The traces are sent by the Pilot directly to the Rucio server.

Memory Monitor Summary JSON

.. (memory_monitor_summary.json)

Job Definition JSON

A Harvester job definition file (pandaJobData.out) is copied from the Pilot's home directory, renamed to job_definition.json, and placed in the job's work directory for later reference (i.e. it gets stored in the log).
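
The copy-and-rename step can be sketched as below. The function name store_job_definition and its two directory arguments are hypothetical; only the two file names are taken from this page, and the real pilot code may handle this differently.

```python
import os
import shutil


def store_job_definition(pilot_home, workdir):
    """Copy pandaJobData.out into the work directory as job_definition.json.

    Returns the destination path, or None if the source file is absent.
    """
    src = os.path.join(pilot_home, "pandaJobData.out")
    dst = os.path.join(workdir, "job_definition.json")
    if not os.path.exists(src):
        return None
    # Copy rather than move, so the original file stays in place.
    shutil.copy(src, dst)
    return dst
```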

HPC Worker Attributes JSON

.. (worker_attributes.json)

HPC Jobs JSON

? (HPCJobs.json)

HPC Worker PanDA IDs JSON

.. (worker_pandaids.json)