Restructure CLI (#20)
- Move CLI commands to separate groups
- Add click-completion
- Make imports in CLI lazy, to improve responsiveness
- Update README.md
chrisjsewell authored Mar 1, 2020
1 parent 96c161d commit 66912e3
Showing 14 changed files with 389 additions and 260 deletions.
238 changes: 149 additions & 89 deletions README.md
@@ -41,7 +41,7 @@ to come ...

## Example CLI usage

From checked-out folder:
From checked-out repository folder:

```console
$ jcache -h
@@ -50,37 +50,53 @@ Usage: jcache [OPTIONS] COMMAND [ARGS]...
The command line interface of jupyter-cache.

Options:
-v, --version Show the version and exit.
-p, --cache-path Print the current cache path and exit.
-h, --help Show this message and exit.
-v, --version Show the version and exit.
-p, --cache-path Print the current cache path and exit.
-a, --autocomplete Print the terminal autocompletion command and exit.
-h, --help Show this message and exit.

Commands:
cache-limit Change the maximum number of notebooks stored in the cache.
cache-nb Cache a notebook that has already been executed.
cache-nbs Cache notebook(s) that have already been executed.
cat-artifact Print the contents of a cached artefact.
clear Clear the cache completely.
diff-nb Print a diff of a notebook to one stored in the cache.
execute Execute outdated notebooks.
list-cached List cached notebook records in the cache.
list-staged List notebooks staged for possible execution.
remove-cached Remove notebooks stored in the cache.
show-cached Show details of a cached notebook in the cache.
show-staged Show details of a staged notebook.
stage-nb Cache a notebook, with possible assets.
stage-nbs Stage notebook(s) for execution.
unstage-nbs Unstage notebook(s) for execution.
cache Commands for adding to and inspecting the cache.
clear Clear the cache completely.
config Commands for configuring the cache.
execute Execute staged notebooks that are outdated.
stage Commands for staging notebooks to be executed.
```

**Important**: Execute this in the terminal for auto-completion:

```console
eval "$(_JCACHE_COMPLETE=source jcache)"
```

### Caching Executed Notebooks

You can cache notebooks straight into the cache. When caching, a check will be made that the notebooks look to have been executed correctly, i.e. the cell execution counts go sequentially up from 1.
```console
$ jcache cache -h
Usage: jcache cache [OPTIONS] COMMAND [ARGS]...

Commands for adding to and inspecting the cache.

Options:
-h, --help Show this message and exit.

Commands:
add-many Cache notebook(s) that have already been executed.
add-one Cache a notebook, with possible artefact files.
cat-artifact Print the contents of a cached artefact.
diff-nb Print a diff of a notebook to one stored in the cache.
list List cached notebook records in the cache.
remove Remove notebooks stored in the cache.
show Show details of a cached notebook in the cache.
```

You can add notebooks straight into the cache. When caching, a check will be made that the notebooks look to have been executed correctly, i.e. the cell execution counts go sequentially up from 1.
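
The validity check described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function name `find_execution_count_error` is hypothetical, the notebook is assumed to be a plain dict in the Jupyter notebook JSON layout, and the message format simply mirrors the error shown in the console output.

```python
from typing import Optional


def find_execution_count_error(notebook: dict) -> Optional[str]:
    """Return a message for the first out-of-sequence code cell, or None.

    Hypothetical helper: code cells are expected to carry execution
    counts 1, 2, 3, ... in document order.
    """
    expected = 1
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells carry no execution count
        actual = cell.get("execution_count")
        if actual != expected:
            return (
                f"Expected cell {index} to have "
                f"execution_count {expected} not {actual}"
            )
        expected += 1
    return None


# Illustrative notebooks (made-up content).
executed = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "execution_count": 1, "source": "a = 1"},
    {"cell_type": "code", "execution_count": 2, "source": "print(a)"},
]}
out_of_order = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "execution_count": 2, "source": "print(a)"},
]}

print(find_execution_count_error(executed))      # None
print(find_execution_count_error(out_of_order))  # Expected cell 1 to have execution_count 1 not 2
```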

```console
$ jcache cache-nbs tests/notebooks/basic.ipynb
Cache path: /Users/cjs14/GitHub/sandbox/.jupyter_cache
$ jcache cache add-many tests/notebooks/basic.ipynb
Cache path: jupyter-cache/.jupyter_cache
The cache does not yet exist, do you want to create it? [y/N]: y
Caching: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic.ipynb
Caching: jupyter-cache/tests/notebooks/basic.ipynb
Validity Error: Expected cell 1 to have execution_count 1 not 2
The notebook may not have been executed, continue caching? [y/N]: y
Success!
@@ -89,11 +105,12 @@ Success!
Or to skip validation:

```console
$ jcache cache-nbs --no-validate tests/notebooks/*.ipynb
Caching: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic.ipynb
Caching: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic_failing.ipynb
Caching: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic_unrun.ipynb
Caching: /Users/cjs14/GitHub/sandbox/tests/notebooks/complex_outputs.ipynb
jcache cache add-many --no-validate tests/notebooks/*.ipynb
Caching: jupyter-cache/tests/notebooks/basic.ipynb
Caching: jupyter-cache/tests/notebooks/basic_failing.ipynb
Caching: jupyter-cache/tests/notebooks/basic_unrun.ipynb
Caching: jupyter-cache/tests/notebooks/complex_outputs.ipynb
Caching: jupyter-cache/tests/notebooks/external_output.ipynb
Success!
```

@@ -102,60 +119,67 @@ Once you've cached some notebooks, you can look at the 'cache records' for what
Each notebook is hashed (code cells and kernel spec only), and this hash is used to compare against 'staged' notebooks. Multiple hashes for the same URI can be added (the URI is just there for inspection) and the size of the cache is limited (current default 1000) so that, at this size, the last accessed records begin to be deleted. You can remove cached records by their ID.
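
To illustrate what hashing "code cells and kernel spec only" implies, here is a rough sketch. The function name `notebook_hashkey`, the use of MD5, and the JSON payload layout are all assumptions made for this example; the source does not show how the real hash key is computed. The point is only that markdown cells and outputs do not affect the key, while code changes do.

```python
import hashlib
import json


def notebook_hashkey(notebook: dict) -> str:
    """Hash only the code cell sources and the kernel spec (assumed scheme)."""
    code_sources = [
        cell.get("source", "")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    kernelspec = notebook.get("metadata", {}).get("kernelspec", {})
    payload = json.dumps(
        {"code": code_sources, "kernelspec": kernelspec}, sort_keys=True
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Illustrative notebooks (made-up content).
original = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "source": "a = 1", "outputs": []},
    ],
}
# Same code, different markdown and outputs: the key is unchanged.
reworded = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "source": "# A new title"},
        {"cell_type": "code", "source": "a = 1", "outputs": ["1"]},
    ],
}
# Different code: the key changes.
edited = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [{"cell_type": "code", "source": "a = 2", "outputs": []}],
}

print(notebook_hashkey(original) == notebook_hashkey(reworded))  # True
print(notebook_hashkey(original) == notebook_hashkey(edited))    # False
```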

```console
$ jcache list-cached --hashkeys
ID URI Created Accessed Hashkey
---- --------------------- ---------------- ---------------- --------------------------------
4 complex_outputs.ipynb 2020-02-23 20:33 2020-02-23 20:33 800c4a057730a55a384cfe579e3850aa
3 basic_unrun.ipynb 2020-02-23 20:33 2020-02-23 20:33 818f3412b998fcf4fe9ca3cca11a3fc3
2 basic_failing.ipynb 2020-02-23 20:33 2020-02-23 20:33 72859c2bf1e12f35f30ef131f0bef320
$ jcache cache list
ID URI Created Accessed
---- ------------------------------------- ---------------- ----------------
5 tests/notebooks/external_output.ipynb 2020-02-29 03:17 2020-02-29 03:17
4 tests/notebooks/complex_outputs.ipynb 2020-02-29 03:17 2020-02-29 03:17
3 tests/notebooks/basic_unrun.ipynb 2020-02-29 03:17 2020-02-29 03:17
2 tests/notebooks/basic_failing.ipynb 2020-02-29 03:17 2020-02-29 03:17
```

You can also cache notebooks with artefacts (external outputs of the notebook execution).

```console
$ jcache cache-nb -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
Caching: /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/basic.ipynb
$ jcache cache add-one -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
Caching: jupyter-cache/tests/notebooks/basic.ipynb
Success!
```

Show a full description of a cached notebook by referring to its ID:

```console
$ jcache show-cached 1
ID: 1
URI: /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/basic.ipynb
Created: 2020-02-24 14:58
Accessed: 2020-02-24 14:58
$ jcache cache show 6
ID: 6
URI: jupyter-cache/tests/notebooks/basic.ipynb
Created: 2020-02-29 03:19
Accessed: 2020-02-29 03:19
Hashkey: 818f3412b998fcf4fe9ca3cca11a3fc3
Artifacts:
- artifact_folder/artifact.txt
```

```console
$ jcache cat-artifact 1 artifact_folder/artifact.txt
An artifact
Note artefact paths must be 'upstream' of the notebook folder:

```console
$ jcache cache add-one -nb tests/notebooks/basic.ipynb tests/test_db.py
Caching: jupyter-cache/tests/notebooks/basic.ipynb
Artifact Error: Path 'jupyter-cache/tests/test_db.py' is not in folder 'jupyter-cache/tests/notebooks''
```

These must be 'upstream' of the notebook folder:
To view the contents of an execution artefact:

```console
$ jcache cache-nb -nb tests/notebooks/basic.ipynb tests/test_db.py
Caching: /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/basic.ipynb
Artifact Error: Path '/Users/cjs14/GitHub/jupyter-cache/tests/test_db.py' is not in folder '/Users/cjs14/GitHub/jupyter-cache/tests/notebooks''
$ jcache cache cat-artifact 1 artifact_folder/artifact.txt
An artifact

```

You can directly remove a cached notebook by its ID:

```console
$ jcache remove-cached 3
Removing Cache ID = 3
$ jcache cache remove 4
Removing Cache ID = 4
Success!
```

You can also diff any of the cached notebooks with any (external) notebook:

```console
$ jcache diff-nb 2 tests/notebooks/basic.ipynb
$ jcache cache diff-nb 2 tests/notebooks/basic.ipynb
nbdiff
--- cached pk=2
+++ other: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic.ipynb
+++ other: sandbox/tests/notebooks/basic.ipynb
## inserted before nb/cells/1:
+ code cell:
+ execution_count: 2
@@ -177,96 +201,132 @@ nbdiff

### Staging Notebooks for execution

```console
$ jcache stage -h
Usage: jcache stage [OPTIONS] COMMAND [ARGS]...

Commands for staging notebooks to be executed.

Options:
-h, --help Show this message and exit.

Commands:
add-many Stage notebook(s) for execution.
add-one Stage a notebook, with possible asset files.
list List notebooks staged for possible execution.
remove-ids Un-stage notebook(s), by ID.
remove-uris Un-stage notebook(s), by URI.
show Show details of a staged notebook.
```

Staged notebooks are recorded as pointers to their URI,
i.e. no physical copying takes place until execution time.

If you stage some notebooks for execution, then you can list them to see which have existing records in the cache (by hash) and which will require execution:
If you stage some notebooks for execution,
then you can list them to see which have existing records in the cache (by hash),
and which will require execution:
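
Conceptually, matching staged notebooks against the cache by hash amounts to a lookup like the one below. This is only a sketch: `split_staged` and the `hash-a` / `hash-b` keys are hypothetical, and the real records are stored in the cache rather than in plain dicts.

```python
from typing import Dict, List, Tuple


def split_staged(
    staged: Dict[str, str], cached: Dict[str, int]
) -> Tuple[Dict[str, int], List[str]]:
    """Split staged notebooks into those with a cache record and the rest.

    staged maps URI -> hash key; cached maps hash key -> cache ID.
    """
    matched = {uri: cached[key] for uri, key in staged.items() if key in cached}
    to_execute = [uri for uri, key in staged.items() if key not in cached]
    return matched, to_execute


# Two notebooks with identical code share one hash key, and so one cache ID.
staged = {
    "basic.ipynb": "hash-a",
    "basic_unrun.ipynb": "hash-a",
    "complex_outputs.ipynb": "hash-b",
}
cached = {"hash-a": 6}

matched, to_execute = split_staged(staged, cached)
print(matched)     # {'basic.ipynb': 6, 'basic_unrun.ipynb': 6}
print(to_execute)  # ['complex_outputs.ipynb']
```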

```console
$ jcache stage-nbs tests/notebooks/*.ipynb
Staging: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic.ipynb
Staging: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic_failing.ipynb
Staging: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic_unrun.ipynb
Staging: /Users/cjs14/GitHub/sandbox/tests/notebooks/complex_outputs.ipynb
$ jcache stage add-many tests/notebooks/*.ipynb
Staging: jupyter-cache/tests/notebooks/basic.ipynb
Staging: jupyter-cache/tests/notebooks/basic_failing.ipynb
Staging: jupyter-cache/tests/notebooks/basic_unrun.ipynb
Staging: jupyter-cache/tests/notebooks/complex_outputs.ipynb
Staging: jupyter-cache/tests/notebooks/external_output.ipynb
Success!
```

```console
$ jcache list-staged
ID URI Created Cache ID
---- ------------------------------------- ---------------- -----------
4 tests/notebooks/complex_outputs.ipynb 2020-02-23 20:48 4
3 tests/notebooks/basic_unrun.ipynb 2020-02-23 20:48
2 tests/notebooks/basic_failing.ipynb 2020-02-23 20:48 2
1 tests/notebooks/basic.ipynb 2020-02-23 20:48
$ jcache stage list
ID URI Created Assets Cache ID
---- ------------------------------------- ---------------- -------- ----------
5 tests/notebooks/external_output.ipynb 2020-02-29 03:29 0 5
4 tests/notebooks/complex_outputs.ipynb 2020-02-29 03:29 0
3 tests/notebooks/basic_unrun.ipynb 2020-02-29 03:29 0 6
2 tests/notebooks/basic_failing.ipynb 2020-02-29 03:29 0 2
1 tests/notebooks/basic.ipynb 2020-02-29 03:29 0 6
```

You can remove a staged notebook by its URI or ID:

```console
$ jcache stage remove-ids 4
Unstaging ID: 4
Success!
```

You can then run a basic execution of the required notebooks:

```console
$ jcache cache remove 6
Removing Cache ID = 6
Success!
$ jcache execute
Executing: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic.ipynb
Success: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic.ipynb
Executing: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic_unrun.ipynb
Success: /Users/cjs14/GitHub/sandbox/tests/notebooks/basic_unrun.ipynb
Executing: jupyter-cache/tests/notebooks/basic.ipynb
Success: jupyter-cache/tests/notebooks/basic.ipynb
Executing: jupyter-cache/tests/notebooks/basic_unrun.ipynb
Success: jupyter-cache/tests/notebooks/basic_unrun.ipynb
Finished!
```

Successfully executed notebooks will be added to the cache,
along with any 'artefacts' that the execution creates inside the notebook folder, and data supplied by the executor.

```console
$ jcache list-staged
ID URI Created Commit ID
---- ------------------------------------- ---------------- -----------
5 tests/notebooks/basic.ipynb 2020-02-23 20:57 5
4 tests/notebooks/complex_outputs.ipynb 2020-02-23 20:48 4
3 tests/notebooks/basic_unrun.ipynb 2020-02-23 20:48 6
2 tests/notebooks/basic_failing.ipynb 2020-02-23 20:48 2
$ jcache stage list
ID URI Created Assets Cache ID
---- ------------------------------------- ---------------- -------- ----------
5 tests/notebooks/external_output.ipynb 2020-02-29 03:29 0 5
3 tests/notebooks/basic_unrun.ipynb 2020-02-29 03:29 0 6
2 tests/notebooks/basic_failing.ipynb 2020-02-29 03:29 0 2
1 tests/notebooks/basic.ipynb 2020-02-29 03:29 0 6
```

```console
jcache show-cached 5
ID: 1
URI: /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/basic.ipynb
Created: 2020-02-25 19:21
Accessed: 2020-02-25 19:21
$ jcache cache show 6
ID: 6
URI: jupyter-cache/tests/notebooks/basic_unrun.ipynb
Created: 2020-02-29 03:41
Accessed: 2020-02-29 03:41
Hashkey: 818f3412b998fcf4fe9ca3cca11a3fc3
Data:
execution_seconds: 1.4187269599999999
execution_seconds: 1.2328746560000003
```

Once executed you may leave staged notebooks, for later re-execution, or remove them:

```console
$ jcache unstage-nbs --all
$ jcache stage remove-ids --all
Are you sure you want to remove all? [y/N]: y
Unstaging: /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/basic.ipynb
Unstaging ID: 1
Unstaging ID: 2
Unstaging ID: 3
Unstaging ID: 5
Success!
```

You can also stage notebooks with assets: external files that the notebook requires during execution. As with artefacts,
these files must be in the same folder as the notebook, or a sub-folder.
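
The folder restriction on assets and artefacts can be expressed as a simple path check. The sketch below uses only `pathlib` from the standard library; the helper name `is_in_notebook_folder` is hypothetical and the project's own check may differ in detail.

```python
from pathlib import Path


def is_in_notebook_folder(file_path: str, notebook_path: str) -> bool:
    """Return True if file_path lies in the notebook's folder or a sub-folder."""
    folder = Path(notebook_path).resolve().parent
    try:
        # relative_to raises ValueError when the path is outside the folder
        Path(file_path).resolve().relative_to(folder)
    except ValueError:
        return False
    return True


print(is_in_notebook_folder(
    "tests/notebooks/artifact_folder/artifact.txt",
    "tests/notebooks/basic.ipynb",
))  # True
print(is_in_notebook_folder(
    "tests/test_db.py",
    "tests/notebooks/basic.ipynb",
))  # False
```

The second call corresponds to the case that produces the `Artifact Error` shown earlier for `tests/test_db.py`.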

```console
$ jcache stage-nb -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
$ jcache stage add-one -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
Success!
```

```console
$ jcache list-staged
$ jcache stage list
ID URI Created Assets
---- --------------------------- ---------------- --------
1 tests/notebooks/basic.ipynb 2020-02-25 10:01 1
```

```console
$ jcache show-staged 1
$ jcache stage show 1
ID: 1
URI: /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/basic.ipynb
URI: jupyter-cache/tests/notebooks/basic.ipynb
Created: 2020-02-25 10:01
Assets:
- /Users/cjs14/GitHub/jupyter-cache/tests/notebooks/artifact_folder/artifact.txt
- jupyter-cache/tests/notebooks/artifact_folder/artifact.txt
```

## Contributing
1 change: 0 additions & 1 deletion jupyter_cache/cache/__init__.py
@@ -1 +0,0 @@
from .main import JupyterCacheBase, DEFAULT_CACHE_LIMIT # noqa: F401
7 changes: 7 additions & 0 deletions jupyter_cache/cli/commands/__init__.py
@@ -1,2 +1,9 @@
import click_completion

# Activate the completion of parameter types provided by the click_completion package
click_completion.init()

from .cmd_cache import * # noqa: F401,F403
from .cmd_config import * # noqa: F401,F403
from .cmd_exec import * # noqa: F401,F403
from .cmd_stage import * # noqa: F401,F403