WMStats proposal

This wiki page collects a set of actions and thoughts about the WMStats re-design proposal.

WMStats will consist of two independent services:

WMStats cache server, which will communicate with CouchDB (the primary source of information)
- it is responsible for serving data from CouchDB to other clients
WMStats UI server, which presents WMStats data to end-users. This service should be lightweight and very responsive to user queries, filters, etc.

To implement this model we need the following set of actions:

ensure that data comes from the WMStats cache server in an appropriate data format
- we need to use a flat schema with static keys
- we need the ability to fetch data in chunks, a.k.a. pagination
- we need data streaming, so it is desirable to support both the application/json and application/ndjson data formats
- we need to apply gzip encoding to HTTP requests between server and client
For more details please refer to the appropriate section of the WMStats details document
re-evaluate data that might not be needed
enforce static schema
provide benchmarks comparing different data representations (dynamic vs. static schema), measuring CPU and RAM usage and how each option scales
we need to outline the RESTful APIs
- /fetch?idx=1&limit=10 to provide a stream of data
- /fetch/<workflow> to provide details of a single workflow
- /sites/<site> to provide site information
- /campaign/<campaign> to provide campaign information
- /agent/<agent> to provide WMAgent information
- /release/<release> to provide release information
to transition from the current to the new implementation we need to provide mock data that uses the static schema
- we should be able to generate the necessary set of records with dummy content but proper data types
we need the WMStats UI server to adopt the new static schema
we need to decide on technology, language, etc.
- which programming language to use for the WMStats implementation, e.g. the current Python server, a Go implementation, or another option
- which database layer to use
- which CSS/JS frameworks to use, e.g. the Kube CSS framework (used in ReqMgr2, WMArchive, etc.)
we need to identify the producers and consumers of this data
WMStats UI server should provide flexible filters (either generated dynamically at run time or pre-processed statically)
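
As an illustration of the data-delivery requirements listed above (pagination, application/ndjson streaming and gzip encoding), here is a minimal Python sketch using only the standard library. It is not the proposed implementation: the function names are hypothetical, and `idx` is assumed to be a zero-based offset into the record list, which the proposal does not specify.

```python
import gzip
import json
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def fetch_page(records: List[Record], idx: int, limit: int) -> List[Record]:
    """Return up to `limit` records starting at offset `idx`.

    Assumption: `idx` is a zero-based offset, mirroring /fetch?idx=..&limit=..
    """
    if idx < 0 or limit < 0:
        raise ValueError("idx and limit must be non-negative")
    return records[idx:idx + limit]


def to_ndjson(records: Iterable[Record]) -> Iterator[bytes]:
    """Yield one JSON document per line, i.e. application/ndjson."""
    for rec in records:
        yield json.dumps(rec, sort_keys=True).encode("utf-8") + b"\n"


def gzip_body(chunks: Iterable[bytes]) -> bytes:
    """Join the chunks and gzip them, as a body sent with Content-Encoding: gzip."""
    return gzip.compress(b"".join(chunks))


def parse_ndjson(body: bytes) -> List[Record]:
    """Client side: decompress the body and parse it line by line."""
    text = gzip.decompress(body).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


if __name__ == "__main__":
    data = [{"workflow": f"wf-{i}", "nevents": i} for i in range(25)]
    body = gzip_body(to_ndjson(fetch_page(data, idx=10, limit=10)))
    print(len(parse_ndjson(body)), "records received")
```

With ndjson the client can parse each line as soon as it arrives, whereas a single application/json array has to be read in full before it can be decoded.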

The current WMStats server stores documents using dynamic keys (see example). We suggest converting such records to a static schema, removing unnecessary keys and values, e.g.

[
   {"workflow": "bla-bla",
    "nevents": 1
   }
]

where all keys are predefined, follow the CamelCase naming convention, and all values have well-defined data types.
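
A static schema of this kind, together with the mock data needed for the transition, could be sketched in Python as follows. Everything here is illustrative: the `WorkflowRecord` class contains only the two keys from the example above, the dynamic-key input layout (workflow name used as the document key) is a hypothetical stand-in for the real CouchDB documents, and the helper names are assumptions rather than existing WMStats code.

```python
import random
import string
from dataclasses import asdict, dataclass, fields
from typing import Any, Dict, List


@dataclass
class WorkflowRecord:
    # Hypothetical static schema: only the keys shown in the example record.
    workflow: str
    nevents: int


def validate(doc: Dict[str, Any]) -> WorkflowRecord:
    """Reject documents with missing or unknown keys, or wrong value types."""
    expected = {f.name: f.type for f in fields(WorkflowRecord)}
    if set(doc) != set(expected):
        raise ValueError("keys %s do not match schema %s" % (sorted(doc), sorted(expected)))
    for key, typ in expected.items():
        if not isinstance(doc[key], typ):
            raise TypeError("%s must be %s" % (key, typ.__name__))
    return WorkflowRecord(**doc)


def flatten(dynamic: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert a dynamic-key document into a flat list of static records.

    Assumed input layout (hypothetical): {"<workflow name>": {"nevents": N}}
    """
    return [
        {"workflow": name, "nevents": int(body.get("nevents", 0))}
        for name, body in sorted(dynamic.items())
    ]


def mock_records(n: int, seed: int = 0) -> List[Dict[str, Any]]:
    """Generate n records with dummy content but proper data types."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        name = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        out.append(asdict(WorkflowRecord(workflow=name, nevents=rng.randint(0, 10**6))))
    return out


if __name__ == "__main__":
    for rec in flatten({"wf-b": {"nevents": 2}, "wf-a": {"nevents": 1}}) + mock_records(3):
        print(validate(rec))
```

Because every record has the same keys and types, the UI server can validate incoming data once at the boundary and build filters against a fixed set of fields.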
