documentation and cleanup pass for v0.6.0
JoshKarpel committed May 21, 2020
1 parent e0fd6de commit 704e77a
Showing 25 changed files with 809 additions and 300 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -110,3 +110,4 @@ venv.bak/
!htmap-exec/singularity.d/*

prof/
docs/source/tutorials/*.txt
3 changes: 3 additions & 0 deletions binder/Dockerfile
@@ -17,6 +17,9 @@ FROM htcondor/htc-minimal-notebook:latest

USER root

RUN echo '#!/bin/bash\nfind ${HOME}/tutorials -name '\''*.ipynb'\'' -and -not -iname '\''*-checkpoint.ipynb'\''' > /usr/bin/find_notebooks \
&& chmod +x /usr/bin/find_notebooks

# Use the repository version of HTMap, not whatever was in the htc-notebook.
COPY . ${HOME}/htmap
RUN chown -R ${NB_UID}:${NB_GID} ${HOME}/htmap
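
Reviewer sketch: the `find_notebooks` helper added above lists the tutorial notebooks while skipping Jupyter's checkpoint copies. A rough Python equivalent of that `find` invocation might look like the following. This is an illustration only and not part of the commit: the Python function name `find_notebooks` and its `root` argument are hypothetical, the result is sorted for determinism (which `find` does not do), and only regular directory entries reported as files by `os.walk` are considered.

```python
import fnmatch
import os


def find_notebooks(root):
    """Return sorted paths of notebooks under root, skipping checkpoint copies."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Mirrors: -name '*.ipynb' (case-sensitive match on the file name)
            if not fnmatch.fnmatchcase(name, "*.ipynb"):
                continue
            # Mirrors: -not -iname '*-checkpoint.ipynb' (case-insensitive match)
            if fnmatch.fnmatchcase(name.lower(), "*-checkpoint.ipynb"):
                continue
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Lowercasing the name before the second match is what stands in for `-iname`; the first match stays case-sensitive, as `-name` is.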
8 changes: 8 additions & 0 deletions binder/exec.sh
@@ -0,0 +1,8 @@
#!/usr/bin/env bash

set -e

CONTAINER_TAG=htmap-binder-exec

docker build -t ${CONTAINER_TAG} --file binder/Dockerfile .
docker run --rm --mount type=bind,source="$(pwd)"/docs/source/tutorials,target=/home/jovyan/tutorials ${CONTAINER_TAG} -- bash -l -c 'sleep 5 && condor_who -wait:60 "IsReady && STARTD_State =?= \"Ready\"" && rm -r /home/jovyan/tutorials/*.txt ; for x in $(find_notebooks); do nbstripout $x && jupyter nbconvert --to notebook --inplace --execute --allow-errors --ExecutePreprocessor.timeout=None $x && htmap remove --all ; done && rm -r /home/jovyan/tutorials/*.txt'
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/autobuild → docs/autobuild.sh
@@ -5,6 +5,6 @@ set -e
export PYTHONPATH="$PWD:$PYTHONPATH"

echo "NOTE: CONNECT TO http://127.0.0.1:8000 NOT WHAT SPHINX-AUTOBUILD TELLS YOU"
sleep 1
sleep 3

sphinx-autobuild docs/source docs/_build --host 0.0.0.0 --poll --watch htmap/
4 changes: 4 additions & 0 deletions docs/source/api.rst
@@ -190,6 +190,10 @@ These functions are useful for generating machine-readable status information.

.. autofunction:: htmap.status_csv

Delivery Methods
----------------

.. autofunction:: htmap.register_delivery_method

Transplant Installs
+++++++++++++++++++
6 changes: 5 additions & 1 deletion docs/source/cli.rst
@@ -3,7 +3,11 @@
CLI Reference
=============

View the available sub-commands with this command:
HTMap provides a command line tool called ``htmap`` that exposes a subset
of functionality focused around monitoring long-running maps without needing
to run Python yourself.

View the available sub-commands by running:

.. code:: shell
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -22,7 +22,7 @@
# -- Project information -----------------------------------------------------

project = 'HTMap'
copyright = '2018, HTCondor Team, Computer Sciences Department, University of Wisconsin-Madison, WI'
copyright = '2018-2020, HTCondor Team, Computer Sciences Department, University of Wisconsin-Madison, WI'
author = 'HTCondor Team'

# The short X.Y version
127 changes: 85 additions & 42 deletions docs/source/dependencies.rst
@@ -5,39 +5,52 @@ Dependency Management

.. py:currentmodule:: htmap
Dependency management for Python programs is a thorny issue in general, and running code on computers that you don't own is even thornier.
HTMap provides several methods for ensuring that the software that your code depends on is available for your map components.
This could include other Python packages like ``numpy`` or ``tensorflow``, or external software like ``gcc``.
Dependency management for Python programs is a thorny issue in general, and
running code on computers that you don't own is even thornier.
HTMap provides several methods for ensuring that the software that your code
depends on is available for your map components.
This could include other Python packages like ``numpy`` or ``tensorflow``, or
external software like ``gcc``.

There are two halves of the dependency management game.
The first is on "your" computer, which we call **submit-side**.
This could be your laptop running a personal HTCondor poll,
This could be your laptop running a personal HTCondor pool,
or an HTCondor "submit node" that you ``ssh`` to,
or whatever other way you access your HTCondor pool.
The other side is **execute-side**, which isn't really a single place:
it is all of the execute nodes in the pool that your map components might run on.

Submit-side dependency management can be handled using standard Python package management tools.
We recommend using ``miniconda`` as your package manager (https://docs.conda.io/en/latest/miniconda.html).
Submit-side dependency management can be handled using standard Python package
management tools.
We recommend using ``miniconda`` as your package manager
(https://docs.conda.io/en/latest/miniconda.html).

HTMap itself requires that execute-side can run a Python script using a Python install that also has ``htmap`` installed.
That Python installation also needs whatever other packages your code needs to run.
For example, if you ``import numpy`` in your code, you need to have ``numpy`` installed execute-side.
HTMap itself requires that execute-side can run a Python script using a Python
install that also has ``htmap`` installed.
That Python installation also needs whatever other packages your code needs to
run.
For example, if you ``import numpy`` in your code, you need to have ``numpy``
installed execute-side.

As mentioned above, HTMap provides several "delivery methods" for getting that Python install to the execute location.
As mentioned above, HTMap provides several "delivery methods" for getting that
Python installation to the execute location.
The built-in delivery methods are

* ``docker`` - runs in a (possibly user-supplied) Docker container.
* ``singularity`` - runs in a (possibly user-supplied) Singularity container.
* ``shared`` - runs with the same Python installation used submit-side.
* ``assume`` - assumes that the dependencies have already been installed at the execute location.
* ``transplant`` - copy the submit-side Python installation to the execute location.
* ``assume`` - assumes that the dependencies have already been installed at
the execute location.
* ``transplant`` - copy the submit-side Python installation to the execute
location.

More details on each of these methods can be found below.

The default delivery method is ``docker``, with the default image ``htcondor/htmap-exec:<version>``,
The default delivery method is ``docker``, with the default image
``htcondor/htmap-exec:<version>``,
where version will match the version of HTMap you are using submit-side.
If your pool can run Docker jobs and your Python code does not depend on any custom packages
If your pool can run Docker jobs and your Python code does not depend on any
custom packages
(i.e., you never import any modules that you wrote yourself),
this default behavior will likely work for you without requiring any changes.
See the section below on Docker if this isn't the case!
@@ -77,11 +90,16 @@ At runtime:
In this mode, HTMap will run inside a Docker image that you provide.
Remember that this Docker image needs to have the ``htmap`` module installed.
The default Docker image is `htcondor/htmap-exec <https://hub.docker.com/r/htcondor/htmap-exec/>`_,
The default Docker image is
`htcondor/htmap-exec <https://hub.docker.com/r/htcondor/htmap-exec/>`_,
which is based on Python 3 and has many useful packages pre-installed.

If you want to use your own Docker image, just change the ``'DOCKER.IMAGE'`` setting.
Because of limitations in HTCondor, your Docker image needs to be pushed back to `Docker Hub <https://hub.docker.com/>`_ to be usable.
If you want to use your own Docker image, just change the ``'DOCKER.IMAGE'``
setting.
Your Docker image needs to be pushed back to
`Docker Hub <https://hub.docker.com/>`_
(or some other Docker image registry that your HTCondor pool can access)
to be usable.
For example, a very simple Dockerfile that can be used with HTMap is

.. code-block:: docker
@@ -90,18 +108,23 @@ For example, a very simple Dockerfile that can be used with HTMap is
RUN pip install --no-cache-dir htmap
This would create a Docker image with the latest versions of Python 3 and ``htmap`` installed.
From here you could install more Python dependencies, or add more layers to account for other dependencies.
This would create a Docker image with the latest versions of Python 3 and
``htmap`` installed.
From here you could install more Python dependencies, or add more layers to
account for other dependencies.

.. attention::

More information on building Docker images for use with HTMap can be found in the :doc:`recipes/docker-image-cookbook`.
More information on building Docker images for use with HTMap can be found
in the :doc:`recipes/docker-image-cookbook`.


.. warning::

This delivery mechanism will only work if your HTCondor pool supports Docker jobs!
If it doesn't, you'll need to talk to your pool administrators or use a different delivery mechanism.
This delivery mechanism will only work if your HTCondor pool supports
Docker jobs!
If it doesn't, you'll need to talk to your pool administrators or use a
different delivery mechanism.


Run Inside a Singularity Container
@@ -124,30 +147,43 @@ At runtime:
htmap.settings["SINGULARITY.IMAGE"] = "<image>"
In this mode, HTMap will run inside a Singularity image that you provide.
Remember that this Singularity image needs to have the ``cloudpickle`` module installed.
Remember that this Singularity image needs to have the ``cloudpickle`` module
installed.

Singularity can also use Docker images.
Specify a Docker Hub image as ``htmap.settings['SINGULARITY.IMAGE'] = "docker://<repository>/<image>:<tag>"`` to download a Docker image from DockerHub and automatically use it as a Singularity image.
Specify a Docker Hub image as
``htmap.settings['SINGULARITY.IMAGE'] = "docker://<repository>/<image>:<tag>"``
to download a Docker image from DockerHub and automatically use it as a
Singularity image.

For consistency with Docker delivery, the default Singularity image is `docker://continuumio/anaconda3:latest <https://hub.docker.com/r/continuumio/anaconda3/>`_, which has many useful packages pre-installed.
For consistency with Docker delivery, the default Singularity image is
`docker://continuumio/anaconda3:latest <https://hub.docker.com/r/continuumio/anaconda3/>`_,
which has many useful packages pre-installed.

If you want to use your own Singularity image, just change the ``'SINGULARITY.IMAGE'`` setting.
If you want to use your own Singularity image, just change the
``'SINGULARITY.IMAGE'`` setting.

.. warning::

This delivery mechanism will only work if your HTCondor pool supports Singularity jobs!
If it doesn't, you'll need to talk to your pool administrators or use a different delivery mechanism.
This delivery mechanism will only work if your HTCondor pool supports
Singularity jobs!
If it doesn't, you'll need to talk to your pool administrators or use a
different delivery mechanism.


.. note::

When using this delivery method, HTMap will discover ``python3`` on the system ``PATH`` and use that to run your code.
When using this delivery method, HTMap will discover ``python3`` on the
system ``PATH`` and use that to run your code.


.. warning::

This delivery method relies on the directory ``/htmap/scratch`` either existing in the Singularity image, or Singularity being able to run with ``overlayfs``.
If you get a ``stderr`` message from Singularity about a bind mount directory not existing, that's the problem.
This delivery method relies on the directory ``/htmap/scratch`` either
existing in the Singularity image, or Singularity being able to run
with ``overlayfs``.
If you get a ``stderr`` message from Singularity about a bind mount
directory not existing, that's the problem.


Run With a Shared Python Installation
@@ -196,10 +232,10 @@ At runtime:
htmap.settings["DELIVERY_METHOD"] = 'assume'
In this mode, HTMap assumes that a Python installation with all Python dependencies is already present.
This will almost surely require some additional setup by your HTCondor pool's administrators.

Additional dependencies can still be delivered via :class:`MapOptions`.
In this mode, HTMap assumes that a Python installation with all Python
dependencies is already present.
This will almost surely require some additional setup by your HTCondor
pool's administrators.


Transplant Existing Python Install
@@ -217,24 +253,31 @@ At runtime:
htmap.settings["DELIVERY_METHOD"] = 'transplant'
If you are running HTMap from a standalone Python install (like an Anaconda installation),
you can use this delivery mechanism to transfer a copy of your entire Python install.
All locally-installed packages (including ``pip -e`` "editable" installs) will be available.
If you are running HTMap from a standalone Python install
(like an Anaconda installation),
you can use this delivery mechanism to transfer a copy of your entire Python
install.
All locally-installed packages (including ``pip -e`` "editable" installs) will
be available.

For advanced transplant functionality, see :ref:`transplant-settings`.

.. note::

The first time you run a map after installing/removing packages, you will need to wait while HTMap re-zips your installation.
The first time you run a map after installing/removing packages,
you will need to wait while HTMap re-zips your installation.
Subsequent maps will use the cached version.

HTMap uses ``pip`` to check whether the cached Python is current, so make sure that ``pip`` is installed in your Python.
HTMap uses ``pip`` to check whether the cached Python is current, so make
sure that ``pip`` is installed in your Python.

.. warning::

This mechanism does not work with system Python installations (which you shouldn't be using anyway!).
This mechanism does not work with system Python installations
(which you shouldn't be using anyway!).

.. note::

When using the transplant method the transplanted Python installation will be used to run the component,
When using the transplant method the transplanted Python installation will
be used to run the component,
regardless of any other Python installations that might exist execute-side.
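
Reviewer sketch: the settings discussed in `dependencies.rst` are assigned by key, as in the `htmap.settings["SINGULARITY.IMAGE"] = "<image>"` and `htmap.settings["DELIVERY_METHOD"] = 'assume'` lines of the diff. The snippet below shows one way the Singularity-from-Docker-Hub case could be wrapped up. It is not code from this commit: the helper name `use_singularity_with_docker_image` is hypothetical, a plain `dict` stands in for `htmap.settings` so the example runs without HTMap installed, and the `'singularity'` value is assumed from the list of built-in delivery methods.

```python
def use_singularity_with_docker_image(settings, repository, image, tag="latest"):
    """Point a settings mapping at a Docker Hub image run through Singularity."""
    # Select the Singularity delivery method (name taken from the docs' list).
    settings["DELIVERY_METHOD"] = "singularity"
    # Build the docker://<repository>/<image>:<tag> form described in the docs.
    settings["SINGULARITY.IMAGE"] = f"docker://{repository}/{image}:{tag}"
    return settings


# A plain dict stands in for htmap.settings in this sketch.
settings = use_singularity_with_docker_image({}, "continuumio", "anaconda3")
print(settings["SINGULARITY.IMAGE"])  # docker://continuumio/anaconda3:latest
```

With HTMap installed, passing `htmap.settings` in place of the empty dict should perform the same two assignments, since the documentation sets these values by item assignment.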
27 changes: 20 additions & 7 deletions docs/source/devs/env.rst
@@ -17,6 +17,7 @@ in a development container.
inside (and vice versa).

Anything you pass to ``dr`` will be executed inside the container.
By default (i.e., if you pass nothing) you will get a ``bash`` shell.
The initial working directory is the ``htmap`` repository inside the container.
If you pass nothing, it will run ``bash`` with no arguments, giving you a shell
to work in.
@@ -30,7 +31,7 @@ development container with multiple workers:

.. code:: shell
$ ./dr bash # for example
$ ./dr
# ...
mapper@161b6af91d72:~/htmap$ pytest
@@ -41,25 +42,34 @@ development container with multiple workers:
mapper@161b6af91d72:~/htmap$ pytest -n 4
See `pytest-xdist <https://pypi.org/project/pytest-xdist/>`_ for more details.
The test suite is very slow when run serially; we highly recommend running
with a large number of workers (on a moderately-powerful desktop it seemed to
saturate around 10).


Building the Docs
-----------------

HTMap's documentation is served by `Read the Docs <https://readthedocs.org/>`_,
which builds the docs as well.
However, it can be helpful to build the docs locally during development.
From inside the development container,
The docs are built automatically on each commit to master.

It can be helpful to build the docs locally during development.
We use ``sphinx-autobuild`` to serve the documentation via a local webserver
and automatically rebuild the documentation when changes are made to the
package source code or the documentation itself.
To run the small wrapper script we have written around ``sphinx-autobuild``,
from inside or outside the development container run,

.. code:: shell
$ ./dr bash
$ ./dr
# ...
mapper@161b6af91d72:~/htmap$ ./docs/autobuild
mapper@161b6af91d72:~/htmap$ docs/autobuild.sh
NOTE: CONNECT TO http://127.0.0.1:8000 NOT WHAT SPHINX-AUTOBUILD TELLS YOU
# trimmed; visit URL above
Note the startup message: ignore the link that `sphinx-autobuild` gives you,
Note the startup message: ignore the link that ``sphinx-autobuild`` gives you,
and instead go to http://127.0.0.1:8000 to see the built documentation.


@@ -75,10 +85,13 @@ To test whether the Binder container is working properly, run the

.. code:: shell
$ ./binder/test.sh
$ ./binder/run.sh
It will give you a link to enter into your web browser that will land you in the
same Jupyter environment you would get on Binder.

The ``binder/edit.sh`` script will do the same, but also bind-mount the
tutorials into the container so that they can be edited in the Jupyter environment.

When preparing a release, run ``binder/exec.sh`` and commit the results into
the repository.
4 changes: 3 additions & 1 deletion docs/source/devs/release.rst
@@ -6,6 +6,8 @@ How to Release a New HTMap Version
To release a new version of HTMap:

#. Run ``binder/exec.sh`` and commit the resulting executed tutorial notebooks
into the repository.
#. Merge the version PR into ``master`` via GitHub.
#. Make a GitHub release from https://github.com/htcondor/htmap/releases .
Name it exactly ``vX.Y.Z``, and link to the release notes for that version
@@ -14,7 +16,7 @@ To release a new version of HTMap:
#. Delete anything in the ``dist/`` directory in your copy of the repository.
#. On your machine, make sure ``master`` is up-to-date, then run
``python3 setup.py sdist bdist_wheel`` to create the source distribution
and the wheel. (This is where the files in ``dist/`` are created.)
and the wheel.
#. Install Twine: ``pip install twine``.
#. Upload to PyPI:
``python3 -m twine upload dist/*``.