FAIR Data Pipeline Command Line Interface

FAIR-CLI is the main interface for synchronising changes between local and shared remote FAIR Data Pipeline registries. It is also used to instantiate model runs and data submissions to the pipeline. Full documentation of the FAIR Data Pipeline can be found on the project website.

Installation

The package is installed using pip:

pip install fair-cli

To enable tab completion you need to modify your shell:

Bash

_FAIR_COMPLETE=bash_source fair > ~/.config/.fair-complete.bash
echo '. ~/.config/.fair-complete.bash' >> ~/.bashrc

zsh

_FAIR_COMPLETE=zsh_source fair > ~/.fair-complete.zsh
echo '. ~/.fair-complete.zsh' >> ~/.zshrc

Fish

_FAIR_COMPLETE=fish_source fair > ~/.config/fish/.fair-complete.fish
echo 'source ~/.config/fish/.fair-complete.fish' >> ~/.config/fish/config.fish

Uninstallation

To uninstall the CLI run:

fair purge --all
pip uninstall fair-cli

The User Configuration File

Job runs are configured via config.yaml files. When a project is initialised, FAIR-CLI automatically generates a starter configuration file with all requirements in place. To execute a process (e.g. perform a model run from a compiled binary or script), an additional key of either script or script_path must be provided. Alternatively, the command fair run --script can be used to append the key and run a command directly.

By default the shell used to execute a process is sh on UNIX systems and batch on Windows. This can be overridden by setting the optional shell key to one of the following values (where {0} is the script file):

Shell	Command
`bash`	`bash -eo pipefail {0}`
`java`	`java {0}`
`julia`	`julia {0}`
`powershell`	`powershell -command ". '{0}'"`
`pwsh`	`pwsh -command ". '{0}'"`
`python2`	`python2 {0}`
`python3`	`python3 {0}`
`python`	`python {0}`
`R`	`R -f {0}`
`sh`	`sh -e {0}`
`batch`	`{0}`
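
The {0} placeholder in the table above behaves like a positional format field. The following is a minimal Python sketch of that substitution, not FAIR-CLI's own implementation; the names SHELL_TEMPLATES and build_command are hypothetical, and only a few rows of the table are reproduced.

```python
# Hypothetical lookup mirroring a few rows of the table above.
# This is an illustration only, not FAIR-CLI's internal code.
SHELL_TEMPLATES = {
    "bash": "bash -eo pipefail {0}",
    "python3": "python3 {0}",
    "pwsh": "pwsh -command \". '{0}'\"",
    "sh": "sh -e {0}",
    "batch": "{0}",
}


def build_command(shell: str, script_file: str) -> str:
    """Substitute the script file into the {0} placeholder for the chosen shell."""
    try:
        template = SHELL_TEMPLATES[shell]
    except KeyError:
        raise ValueError(f"unsupported shell: {shell!r}") from None
    return template.format(script_file)


print(build_command("bash", "run_model.sh"))  # bash -eo pipefail run_model.sh
```

With the batch entry the template is just {0}, so the script file itself is the command.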

A full description of config.yaml files can be found in the project documentation.

Available Commands

`init`

Initialises a new FAIR repository within the given directory. This should ideally be the same location as the .git folder for the current project; however, an option is given during setup to specify an alternative. The command asks the user a series of questions which provide metadata for tracking run authors and allow a starter config.yaml file to be created. Initialisation also configures the CLI itself.

Custom CLI Configuration

After setup is complete, the current CLI configuration can also be saved using the command:

fair init --export

the created file can then be re-read at a later point during setup. Alternatively, if creating a configuration from scratch the YAML file should contain the following information:

namespaces:
  input: testing
  output: testing
registries:
  local:
    data_store: /path/to/local/data_store/
    directory: /local/registry/install/directory
    uri: http://127.0.0.1:8000/api/
  origin:
    data_store: /remote/registry/data/store/path/
    token: /path/to/remote/token
    uri: https://data.fairdatapipeline.org/api/
user:
  email: 'test@noreply'
  family_name: 'Test'
  given_names: 'Interface'
  orcid: null
  uuid: '2ddb2358-84bf-43ff-b2aa-3ac7dc3b49f1'
git:
  local_repo: /local/repo/path
  remote: origin
description: Testing Project

this file is then read during the initialisation:

fair init --using <cli-config.yaml file>
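
When writing such a file by hand, it can help to check that no section has been left out before passing it to fair init --using. The sketch below works on an already-loaded dictionary and is only an illustration: the function name missing_keys is hypothetical, and treating every key from the example file as required is an assumption, not documented FAIR-CLI behaviour.

```python
# Section and key names are copied from the example file above. Treating all
# of them as required is an assumption made for this sketch.
EXPECTED_SECTIONS = {
    "namespaces": ("input", "output"),
    "registries": ("local", "origin"),
    "user": ("email", "family_name", "given_names", "orcid", "uuid"),
    "git": ("local_repo", "remote"),
}


def missing_keys(config: dict) -> list:
    """Return dotted names of expected keys that are absent from config."""
    missing = []
    for section, keys in EXPECTED_SECTIONS.items():
        section_data = config.get(section)
        if not isinstance(section_data, dict):
            # The whole section is absent (or not a mapping).
            missing.append(section)
            continue
        for key in keys:
            if key not in section_data:
                missing.append(f"{section}.{key}")
    if "description" not in config:
        missing.append("description")
    return missing


partial = {"namespaces": {"input": "testing"}, "description": "Testing Project"}
print(missing_keys(partial))  # ['namespaces.output', 'registries', 'user', 'git']
```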

For integration into a CI workflow, the setup can be skipped by running:

fair init --ci

which will create temporary directories for some of the required location paths.

`run`

The purpose of run is to execute a model/submission run and submit the results to the local registry. Outputs of a run are stored within the coderun folder in the directory specified under the data_store tag in the config.yaml. By default this is $HOME/.fair/data/coderun.

fair run

If you wish to use an alternative config.yaml then specify it as an additional argument:

fair run /path/to/config.yaml

You can also launch a bash command directly. It will be written into the config.yaml automatically:

fair run --script 'echo "Hello World"'

Note that the command itself must be quoted, as it is passed as a single argument.

By default the CLI will not allow a run if the analysis repository is behind the git remote or contains uncommitted changes. To override this behaviour use the --dirty flag.

`pull`

The pull command updates any entries under the register heading of the config.yaml, creating external_object and data_product objects on the registry and downloading the data to the local data store. Any data required for a run is downloaded and stored within the local registry. In addition, any requested data products that are available on the remote registry are pulled locally.

fair pull /path/to/config.yaml

`status`

This command displays objects which are awaiting staging or have been staged, behaving in a manner similar to git status:

fair status

staged changes are displayed in green, and unstaged in red.

`add`

Before changes can be pushed to the remote registry they must be staged. This command allows you to stage objects displayed when running fair status so that they can be sent to the remote registry. Data products are displayed and staged in the form namespace:data_product_name@version:

fair add <namespace>:<data_product_name>@<version>
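
As an illustration of the namespace:data_product_name@version form, the following Python sketch splits such an identifier into its three parts. It is not the CLI's own parser; DataProductRef, parse_data_product and the example identifier are hypothetical.

```python
from typing import NamedTuple


class DataProductRef(NamedTuple):
    namespace: str
    name: str
    version: str


def parse_data_product(identifier: str) -> DataProductRef:
    """Split 'namespace:data_product_name@version' into its components."""
    # The namespace ends at the first colon; the version starts after the last '@'.
    namespace, colon, rest = identifier.partition(":")
    name, at, version = rest.rpartition("@")
    if not (colon and at and namespace and name and version):
        raise ValueError(f"expected namespace:name@version, got {identifier!r}")
    return DataProductRef(namespace, name, version)


# Hypothetical identifier, used only to show the three parts.
ref = parse_data_product("my_namespace:my/data_product@1.0.0")
print(ref.namespace, ref.name, ref.version)  # my_namespace my/data_product 1.0.0
```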

`push`

The push command will push any staged data products to the remote registry:

fair push

`purge`

The purge command removes the setup of the current project so it can be reinitialised:

fair purge

To remove all configurations entirely (including those global to all projects) run:

fair purge --global

To remove the data directory itself run:

fair purge --data

WARNING: This is not recommended as the registry may still have entries pointing to this location!

Finally to remove everything run:

fair purge --all

this will remove the current repository .fair folder and the global FAIR directory which also contains the local registry.

You can skip any confirmation messages by running:

fair purge --yes

`registry`

By default the CLI will launch the registry whenever a synchronisation or run is called. The server will only be halted once all ongoing CLI processes (in the case of multiple parallel calls) have been completed.

However, the user may also launch the registry manually, which overrides this behaviour: the server is left running so that the registry can be viewed in the browser.

The commands:

fair registry start

and

fair registry stop

will launch and halt the server respectively.

The registry can be installed using the CLI as well by running:

fair registry install

with the additional options to specify the installation location, and the data registry repository tag to install from:

fair registry install --directory ~/.fair/my_registry --version v1.0-rc5

`log`

Runs are logged locally within the local FAIR repository. A full list of runs is shown by running:

fair log

This will present a list of runs in a summary analogous to a git log call:

run 0db35c20946a1ebeaafdc3b30103cd74a57eb6b6
Author: Joe Bloggs <email>
Date:   Wed Jun 30 09:09:30 2021

NOTE
The SHA for a job is not related to a registry code run identifier as multiple code runs can be executed within a single job.

`view`

To view the stdout of a run, pass its SHA (as shown by fair log) to the command:

fair view <sha>

You do not need to specify the full SHA; the first few characters are enough, provided they are unique.
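
Abbreviated SHAs of this kind are typically resolved by prefix matching. The sketch below shows one way this could work and is not taken from the FAIR-CLI source; resolve_sha is a hypothetical name and the second SHA in the example is made up.

```python
def resolve_sha(prefix: str, known_shas: list) -> str:
    """Return the single SHA starting with prefix, or raise if none or several match."""
    matches = [sha for sha in known_shas if sha.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no run matches {prefix!r}")
    raise LookupError(f"{prefix!r} is ambiguous: {len(matches)} runs match")


runs = [
    "0db35c20946a1ebeaafdc3b30103cd74a57eb6b6",  # SHA from the log example above
    "0da1f00000000000000000000000000000000000",  # hypothetical second run
]
print(resolve_sha("0db3", runs))  # 0db35c20946a1ebeaafdc3b30103cd74a57eb6b6
```

Here the prefix 0d would be rejected as ambiguous because both runs start with it.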

Template Variables

Within the config.yaml file, template variables can be specified using the notation ${{ VAR }}. The following variables are currently recognised:

Variable	Description
`DATE`	Date in the form `%Y%m%d`
`DATETIME`	Date and time in the form `%Y-%m-%dT%H:%M:%S`
`DATETIME-%Y%H%M`	Date and time in custom format (where `%Y%H%M` can be any valid form)
`USER`	The current user as defined in the CLI
`USER_ID`	The unique identifier for the current user
`REPO_DIR`	The FAIR repository root directory
`CONFIG_DIR`	The directory containing the `config.yaml` after template substitution
`LOCAL_TOKEN`	The token for access to the local registry
`SOURCE_CONFIG`	Path of the user defined `config.yaml`
`GIT_BRANCH`	Current branch of the `git` repository
`GIT_REMOTE`	The URI of the git repository specified during setup
`GIT_TAG`	The latest tag on `git`
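
To make the ${{ VAR }} notation concrete, here is a minimal Python sketch of template substitution. It is an illustration under stated assumptions rather than FAIR-CLI's implementation: the function render and the regular expression are hypothetical, DATETIME is assumed to use the ISO 8601 style format %Y-%m-%dT%H:%M:%S, and all variables other than the date ones are simply looked up in a dictionary supplied by the caller.

```python
import re
from datetime import datetime

# Matches ${{ NAME }} with optional whitespace inside the braces.
TEMPLATE_PATTERN = re.compile(r"\$\{\{\s*(.+?)\s*\}\}")


def render(text: str, variables: dict, now: datetime) -> str:
    """Replace each ${{ VAR }} in text with its value."""

    def replace(match):
        name = match.group(1)
        if name == "DATE":
            return now.strftime("%Y%m%d")
        if name == "DATETIME":
            return now.strftime("%Y-%m-%dT%H:%M:%S")
        if name.startswith("DATETIME-"):
            # Everything after the hyphen is used as a strftime format.
            return now.strftime(name[len("DATETIME-"):])
        if name in variables:
            return str(variables[name])
        raise KeyError(f"unrecognised template variable: {name}")

    return TEMPLATE_PATTERN.sub(replace, text)


now = datetime(2021, 6, 30, 9, 9, 30)
print(render("results_${{ DATE }}_${{ USER }}", {"USER": "jbloggs"}, now))
# results_20210630_jbloggs
```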
