WMStats Server REST APIs
This wiki provides some instructions and documentation on the WMStats Server 2 RESTful APIs.
A general requirement of the WMCore REST framework is that clients must provide an Accept HTTP header in their request. Hence, a client that wants to retrieve data in JSON format needs to provide the following HTTP request header: Accept: application/json.
IMPORTANT: WMStats Server serves data from its local cache instead of always contacting the database backend (CouchDB) for the actual data. The cache is refreshed every 10 minutes, so repeated queries within a 10-minute window will likely return the same data and are strongly discouraged.
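Since the server data only changes every 10 minutes, a client can avoid redundant queries by reusing its last response for the same period. The following is a minimal sketch, not part of WMCore: `CachedFetcher` and the `fetch` callable are hypothetical names, and `fetch` stands for whatever HTTP call the client already uses.

```python
import time

CACHE_TTL_SECONDS = 600  # mirrors the 10-minute server cache refresh


class CachedFetcher:
    """Reuse a previous response for a URL until it is older than the TTL."""

    def __init__(self, fetch, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._fetch = fetch    # hypothetical callable: url -> response data
        self._ttl = ttl
        self._clock = clock    # injectable so the behaviour can be tested
        self._cache = {}       # url -> (timestamp, data)

    def get(self, url):
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]    # still fresh: do not contact the server
        data = self._fetch(url)
        self._cache[url] = (now, data)
        return data
```

With this wrapper, calling `get()` twice for the same URL within 10 minutes results in a single request to the server.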
Starting with WMCore release HG2211 (November 2022, WMCore version 2.1.4), the WMCore REST framework supports gzip-compressed responses. End users are encouraged to request compressed data, especially for heavy APIs transferring many megabytes of data, which includes most of the wmstatsserver RESTful APIs.
When creating the HTTP request, the user has to provide an extra HTTP header to tell the WMCore server that gzip'ed content is accepted by the client: Accept-Encoding: gzip.
This does not necessarily mean that the server will send compressed data, so the user must check the HTTP response headers to decide how to read the response body, which might or might not be compressed. If the server has responded with compressed data, the HTTP response header Content-Encoding: gzip will be sent back to the client, flagging that the response body is in a binary/compressed format.
In order to decompress the body data, the client can use the gzip module from the Python standard library and decompress the data as:
gzip.decompress(body)
If HTTP requests are made with the curl Unix tool, the same header has to be provided and the output data can be redirected to a file, for example:
curl -L -k --cert $X509_USER_CERT --key $X509_USER_KEY --cacert $X509_USER_CERT https://cmsweb.cern.ch/wmstatsserver/data/info -vvv -H "Accept: application/json" -H "Accept-Encoding: gzip" > out.data
To view the content of out.data (assuming the server returned gzip-compressed data), one can use the zcat tool, for example:
zcat out.data
To summarize the use of gzip: the client needs to provide the correct Accept-Encoding HTTP request header and, when parsing the HTTP response object, must check the Content-Encoding HTTP response header to decide how to deal with that object.
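The response-side check can be sketched in Python as follows. This is a minimal illustration, assuming the client already has the response headers as a dictionary and the raw body as bytes; `decode_json_body` is a hypothetical helper name, not a WMCore function.

```python
import gzip
import json


def decode_json_body(headers, body):
    """Return parsed JSON from a raw body, decompressing only if the server says so."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {key.lower(): value for key, value in headers.items()}
    encoding = normalised.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))
```

The request that produced `headers` and `body` is expected to carry both `Accept: application/json` and `Accept-Encoding: gzip`. If the server answers without `Content-Encoding: gzip`, the body is parsed as is.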
To retrieve data for active requests (active requests are requests whose status is not "*-archived"):
GET /wmstatsserver/data/filtered_requests?[key]=[value]&mask=[key] HTTP/1.1
Accept: application/json
Host: cmsweb.cern.ch
- 'key' is a property in the request document (e.g. RequestStatus, PrepID) and 'value' is the specific value to match against that property. Both key and value are case sensitive.
- Repeating the same key acts as an 'or' operator, e.g. RequestStatus=running-open&RequestStatus=running-closed selects requests where RequestStatus is "running-open" OR "running-closed".
- Different keys are combined with an 'and' operator, e.g. RequestStatus=running-open&PrepID=ABC selects requests where RequestStatus is "running-open" AND PrepID is "ABC".
- mask controls which properties are returned, e.g. mask=Campaign&mask=PrepID returns only Campaign and PrepID (RequestName is always returned, even when it is not listed in the mask).
- If a key specified in mask does not exist in the request document, null is returned for that key.
- Example:
https://cmsweb.cern.ch/wmstatsserver/data/filtered_requests?RequestStatus=new&RequestStatus=assignment-approved&Campaign=PhaseIIFall16LHEGS82&mask=MCPileup&mask=Campaign
will return something like this:
{"result": [
{
"MCPileup": null,
"RequestStatus": "new",
"RequestName": "pdmvserv_task_HIG-PhaseIIFall16LHEGS82-00018__v1_T_170228_162325_5100",
"Campaign": "PhaseIIFall16LHEGS82"
},{
"MCPileup": [
"/MinBias_TuneCUETP8M1_14TeV-pythia8/PhaseIIFall16GS82-90X_upgrade2023_realistic_v1-v1/GEN-SIM"
],
"RequestStatus": "assignment-approved",
"RequestName": "pdmvserv_task_HIG-PhaseIIFall16LHEGS82-00021__v1_T_170126_092925_1311",
"Campaign": "PhaseIIFall16LHEGS82"
},{
"MCPileup": null,
"RequestStatus": "assignment-approved",
"RequestName": "pdmvserv_task_HIG-PhaseIIFall16LHEGS82-00018__v1_T_170228_165502_3011",
"Campaign": "PhaseIIFall16LHEGS82"
},{
"MCPileup": null,
"RequestStatus": "assignment-approved",
"RequestName": "pdmvserv_task_HIG-PhaseIIFall16LHEGS82-00018__v1_T_170228_170033_676",
"Campaign": [
"PhaseIIFall16DR82",
"PhaseIIFall16LHEGS82"
]
}]}
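The query string for such a request can be assembled with the Python standard library rather than by hand, which avoids dropping a separator between parameters. A minimal sketch, in which `build_filtered_requests_url` is a hypothetical helper and the filter values are the ones from the example above:

```python
from urllib.parse import urlencode

BASE_URL = "https://cmsweb.cern.ch/wmstatsserver/data/filtered_requests"


def build_filtered_requests_url(filters, mask):
    """filters: property -> list of accepted values; mask: list of output properties."""
    params = dict(filters)
    params["mask"] = list(mask)
    # doseq=True expands each list into repeated key=value pairs.
    return BASE_URL + "?" + urlencode(params, doseq=True)


url = build_filtered_requests_url(
    {"RequestStatus": ["new", "assignment-approved"], "Campaign": ["PhaseIIFall16LHEGS82"]},
    ["MCPileup", "Campaign"],
)
print(url)
```

Listing several values under one key produces the repeated-key form that the server treats as 'or', while separate keys are combined as 'and'.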
The protectedlfns API provides a list of unmerged LFNs that are still in use by the workload management system (by retrieving the workflow property OutputModulesLFNBases). It includes transient output LFNs as well as the final unmerged LFNs.
Workflows in one of the following statuses are considered as active in the system:
['assignment-approved', 'assigned', 'staging', 'staged', 'acquired', 'failed',
'running-open', 'running-closed', 'force-complete', 'completed', 'closed-out']
The REST API is protectedlfns
GET /wmstatsserver/data/protectedlfns HTTP/1.1
Accept: application/json
Host: cmsweb.cern.ch
protectedlfns will return a 503 error if the WMStats data cache is empty.
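A client may want to treat the 503 response as a temporary condition and try again later. The sketch below is one possible approach, not a documented recommendation: `fetch_with_retry` and the `fetch` callable are hypothetical, and the retry count and wait time are arbitrary choices.

```python
import time


def fetch_with_retry(fetch, retries=3, wait_seconds=60, sleep=time.sleep):
    """Call fetch() until it returns a non-503 status or the retries run out.

    fetch is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(retries + 1):
        status, body = fetch()
        if status != 503:
            return status, body
        if attempt < retries:
            sleep(wait_seconds)  # give the server time to populate its cache
    return status, body
```

The caller still has to check the returned status, since the last response is handed back even when every attempt ended with a 503.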
The protectedlfns_final API behaves very similarly to protectedlfns; the only difference is that protectedlfns_final does not yield transient output LFNs (those defined by KeepOutput=True, and/or TransientOutputModules). It relies on the workflow property OutputDatasets and builds the unmerged LFNs from the output dataset names.
The REST API is protectedlfns_final
GET /wmstatsserver/data/protectedlfns_final HTTP/1.1
Accept: application/json
Host: cmsweb.cern.ch
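The exact rule the server uses to turn an output dataset name into an unmerged LFN is not spelled out here. The sketch below only illustrates the idea, assuming the dataset name has the form /Primary/AcquisitionEra-ProcessingString-vN/Tier and that unmerged files live under a /store/unmerged prefix; both the function name and the path layout are assumptions.

```python
def unmerged_lfn_base(dataset, lfn_prefix="/store/unmerged"):
    """Derive an unmerged LFN base from /Primary/AcqEra-ProcString-vN/Tier."""
    primary, processed, tier = dataset.strip("/").split("/")
    # Assumption: the acquisition era is everything before the first hyphen.
    acq_era, proc_version = processed.split("-", 1)
    return "/".join([lfn_prefix, acq_era, primary, tier, proc_version])


print(unmerged_lfn_base("/Cosmics/Commissioning2015-PromptReco-v1/RECO"))
# prints /store/unmerged/Commissioning2015/Cosmics/RECO/PromptReco-v1
```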
This API returns a list of datasets that are in use by workflows with the following statuses:
['assignment-approved', 'assigned', 'staging', 'staged', 'acquired', 'failed',
'running-open', 'running-closed', 'force-complete', 'completed', 'closed-out']
The REST API is globallocks
GET /wmstatsserver/data/globallocks HTTP/1.1
Accept: application/json
Host: cmsweb.cern.ch
Example Output
{"result": [
"/Cosmics/Commissioning2015-PromptReco-v1/RECO","/Cosmics/CMSSW_7_3_2-CosmicSP-DQMHLTonRAWAOD_2017_TaskChain_InclParents_reqmgr2-v11/RAW-RECO"]}
globallocks will return a 503 error if the WMStats data cache is empty.
After the retirement of Unified input/output data placement, Dynamic Data Management (DDM) will use a combination of the WMStats globallocks and protectedlfns APIs plus the ReqMgr2 parentlocks API to determine the set of global datasets and unmerged files that are in use and should not be removed.
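How a consumer combines these lists is not described in this wiki. As a rough sketch, assuming the three API results have already been fetched and unpacked from their "result" fields, and assuming that an unmerged file counts as protected when its LFN sits under one of the protected base paths (both helper functions are hypothetical):

```python
def is_dataset_locked(dataset, global_locks, parent_locks):
    """A dataset is in use if either lock list contains it."""
    return dataset in global_locks or dataset in parent_locks


def is_unmerged_file_protected(lfn, protected_lfn_bases):
    """Assumption: a file is protected when its LFN is under a protected base path."""
    return any(lfn.startswith(base.rstrip("/") + "/") for base in protected_lfn_bases)
```

Storing the lock lists as Python sets keeps the dataset lookup cheap when the lists are large.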