Update the maintenance and development guide to include info from the…

… SMG - which is now redundant. Split out the analysis plugin section to its own page.
cedadev · Sep 16, 2015 · c11f53c
1 parent 1080d9c
commit c11f53c

Showing 3 changed files with 212 additions and 123 deletions.
diff --git a/doc/analysis_plugin_development.rst b/doc/analysis_plugin_development.rst
@@ -0,0 +1,132 @@
+===========================
+Analysis plugin development
+===========================
+
+Users can write their own plugins for performing the collocation of two data sets.
+There are three different types of plugin available for collocation. First we describe the overall design and how
+these different components interact, then each is described in more detail.
+
+Basic collocation design
+========================
+
+The diagram below demonstrates the basic design of the collocation system, and the roles of each of the components.
+In the simple case of the default collocator (which returns only one value) the :ref:`Collocator <collocator_description>`
+loops over each of the sample points, calls the relevant :ref:`Constraint <constraint_description>` to reduce the
+number of data points, and then the :ref:`Kernel <kernel_description>` which returns a single value, which the
+collocator stores.
+
+.. image:: img/CollocationDiagram.png
+   :width: 600px
+
+.. _kernel_description:
+
+Kernel
+======
+
+A kernel is used to convert the constrained points into values in the output. There are two sorts of kernel: general kernels,
+which act on the final point location and a set of data points (these derive from :class:`.Kernel`), and more specific kernels,
+which act upon just an array of data (these derive from :class:`.AbstractDataOnlyKernel`, which in turn derives from :class:`.Kernel`).
+The data-only kernels are less flexible but should execute faster. To create a new kernel, inherit from :class:`.Kernel` and
+implement the abstract method :meth:`.Kernel.get_value`. To make a data-only kernel, inherit from :class:`.AbstractDataOnlyKernel`
+and implement :meth:`.AbstractDataOnlyKernel.get_value_for_data_only`, optionally overriding :meth:`.AbstractDataOnlyKernel.get_value`.
+These methods are outlined below.
+
+.. automethod:: cis.collocation.col_framework.Kernel.get_value
+    :noindex:
+
+.. automethod:: cis.collocation.col_framework.AbstractDataOnlyKernel.get_value_for_data_only
+    :noindex:
+
+.. _constraint_description:
+
+Constraint
+==========
+
+The constraint limits the data points for a given sample point.
+The user can also add a new constraint mechanism by subclassing :class:`.Constraint` and providing an implementation for
+:meth:`.Constraint.constrain_points`. If more control is needed over the iteration sequence then the
+:meth:`.Constraint.get_iterator` method can also be
+overridden. Note, however, that this may not be respected by all collocators, which may still iterate over all
+sample data points. It is possible to write your own collocator (or extend an existing one) to ensure the correct
+iterator is used - see the next section. Both these methods, and their signatures, are outlined below.
+
+.. automethod:: cis.collocation.col_framework.Constraint.constrain_points
+    :noindex:
+
+.. automethod:: cis.collocation.col_framework.Constraint.get_iterator
+    :noindex:
+
+To enable a constraint to use an :class:`.AbstractDataOnlyKernel`, the method
+:meth:`get_iterator_for_data_only` should be implemented (again though, this may be ignored by a collocator). An
+example of this is the :meth:`.BinnedCubeCellOnlyConstraint.get_iterator_for_data_only` implementation.
+
+.. _collocator_description:
+
+Collocator
+==========
+
+Another plugin which is available is the collocation method itself. A new one can be created by subclassing :class:`.Collocator` and
+providing an implementation for :meth:`.Collocator.collocate`. This method takes a number of sample
+points and applies the given constraint and kernel methods on the data for each of those points. It is responsible for
+returning the new data object to be written to the output file. As such, the user could create a collocation routine
+capable of handling multiple return values from the kernel, and hence creating multiple data objects, by creating a
+new collocation method.
+
+.. note::
+
+    The collocator is also responsible for dealing with any missing values in sample points. (Some sets of sample points may
+    include values which may or may not be masked.) Sometimes the user may wish to mask the output for such points; the
+    :attr:`missing_data_for_missing_sample` attribute is used to determine the expected behaviour.
+
+The interface is detailed here:
+
+.. automethod:: cis.collocation.col_framework.Collocator.collocate
+    :noindex:
+
+Implementation
+==============
+
+For all of these plugins any new variables, such as limits, constraint values or averaging parameters,
+are automatically set as attributes in the relevant object. For example, if the user wanted to write a new
+constraint method (``AreaConstraint``, say) which needed a variable called ``area``, this can be accessed with ``self.area``
+within the constraint object. This will be set to whatever the user specifies at the command line for that variable, e.g.::
+
+  $ ./cis.py col my_sample_file rain:"model_data_?.nc"::AreaConstraint,area=6000,fill_value=0.0:nn_gridded
+
+Example implementations of new collocation plugins are demonstrated below for each of the plugin types::
+
+
+  class MyCollocator(Collocator):
+
+      def collocate(self, points, data, constraint, kernel):
+          values = []
+          for point in points:
+              con_points = constraint.constrain_points(point, data)
+              try:
+                  values.append(kernel.get_value(point, con_points))
+              except ValueError:
+                  values.append(constraint.fill_value)
+          new_data = LazyData(values, data.metadata)
+          new_data.missing_value = constraint.fill_value
+          return new_data
+
+
+  class MyConstraint(Constraint):
+
+      def constrain_points(self, ref_point, data):
+          con_points = []
+          for point in data:
+              if point.value > self.val_check:
+                  con_points.append(point)
+          return con_points
+
+
+  class MyKernel(Kernel):
+
+      def get_value(self, point, data):
+          nearest_point = point.furthest_point_from()
+          for data_point in data:
+              if point.compdist(nearest_point, data_point):
+                  nearest_point = data_point
+          return nearest_point.val
+
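The interplay of the three plugin roles described in the new page can be illustrated with a small, self-contained sketch. It uses stand-in classes only: ``Point``, ``Plugin``, ``DistanceConstraint``, ``MeanKernel`` and ``SimpleCollocator`` are hypothetical names invented for this illustration and are not the real ``cis.collocation.col_framework`` classes. The sketch assumes plugin options arrive as keyword arguments and are stored as attributes, mirroring the ``self.area`` behaviour the page describes.

```python
# Stand-in classes only: these mimic the roles described in the docs and are
# NOT the real cis.collocation.col_framework API.
from collections import namedtuple

Point = namedtuple("Point", ["x", "val"])  # hypothetical 1-D point


class Plugin:
    """Stores keyword options as attributes, e.g. ``self.max_dist``."""

    def __init__(self, **options):
        for name, value in options.items():
            setattr(self, name, value)


class DistanceConstraint(Plugin):
    def constrain_points(self, ref_point, data):
        # Keep only the data points within max_dist of the sample point.
        return [p for p in data if abs(p.x - ref_point.x) <= self.max_dist]


class MeanKernel(Plugin):
    def get_value(self, point, data):
        # Reduce the constrained points to a single value.
        if not data:
            raise ValueError("no data points left after constraining")
        return sum(p.val for p in data) / len(data)


class SimpleCollocator(Plugin):
    def collocate(self, points, data, constraint, kernel):
        # Loop over sample points: constrain first, then apply the kernel.
        values = []
        for point in points:
            con_points = constraint.constrain_points(point, data)
            try:
                values.append(kernel.get_value(point, con_points))
            except ValueError:
                # Fall back to the fill value when the kernel cannot answer.
                values.append(constraint.fill_value)
        return values


samples = [Point(0.0, None), Point(10.0, None)]
data = [Point(0.5, 2.0), Point(1.0, 4.0), Point(5.0, 9.0)]
constraint = DistanceConstraint(max_dist=1.5, fill_value=-999.0)
result = SimpleCollocator().collocate(samples, data, constraint, MeanKernel())
print(result)
```

The first sample point has two data points within range, so the kernel averages them. The second has none, so the collocator substitutes the constraint's fill value.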
diff --git a/doc/index.rst b/doc/index.rst
@@ -26,6 +26,7 @@ Contents:
    statistics
    overlay_examples
    plugin_development
+   analysis_plugin_development
    maintenance_and_development
    CIS as a Python library (API) <api/cis>
 

diff --git a/doc/maintenance_and_development.rst b/doc/maintenance_and_development.rst
@@ -2,12 +2,34 @@
 Maintenance and Developer Guide
 ===============================
 
-Unit test suite
-===============
+Source files
+============
+
+The CIS source code is hosted at https://github.com/cedadev/jasmin_cis.git, while the conda recipes and other files are
+hosted here: https://github.com/cistools.
+
+Test suites
+===========
+
+The unit test suite can be run readily using Nose. Just go to the root of the repository (i.e. cis) and type
+``nosetests cis/test/unit`` to run the full suite of tests.
+
+A comprehensive set of integration tests is also provided. There is a folder full of test data
+at ``/group_workspaces/jasmin/cis/cis_repo_test_files``, which has been compressed and is available as a tar file inside that
+folder.
+
+To add files to the folder, simply copy them in, then delete the old tar file and create a new one with::
+
+ tar --dereference -zcvf cis_repo_test_files.tar.gz .
+
+Ignore the warning about the file changing; it appears because the tar file is in the directory being archived. Having the tar file in the
+directory, however, means the archive can be easily unpacked without creating an intermediate folder.
+To run the integration tests, copy the archive to the local machine and decompress it. Then set the
+environment variable ``CIS_DATA_HOME`` to the location of the data sets, and run ``nosetests cis/test/integration``.
 
-The unit tests suite can be ran using Nose readily. Just go the root of the repository (i.e. cis) and type ``nosetests cis/test/unit`` and this will run the full suite of tests.
-A comprehensive set of integration tests are also provided. These require data sets which can be found in the JASMIN CIS group workspace under the ``cis_repo_test_files`` directory. To run the integration tests set the environment variable ``CIS_DATA_HOME`` to the location of the data sets, and then run ``nosetests cis/test/integration``.
-There are also a number of plot tests available under the ``test/plot_tests`` directory which can be run using the ``run_all.sh`` script. These perform a diff of some standard plots against reference plots, however small changes in the platform libraries and fonts can break these tests so they shouldn't be relied on.
+There are also a number of plot tests available under the ``test/plot_tests`` directory which can be run using
+the ``run_all.sh`` script. These perform a diff of some standard plots against reference plots; however, small changes
+in the platform libraries and fonts can break these tests, so they shouldn't be relied on.
 
 
 Dependencies
@@ -19,8 +41,30 @@ A graph representing the dependency tree can be found at ``doc/cis_dependency.do
    :width: 900px
 
 
+Creating a Release
+==================
+
+To carry out intermediate releases follow this procedure:
+
+1. Check the version number and status are updated in the CIS source code (``cis/__init__.py``).
+
+2. Tag the new version on GitHub with the new version number and release notes.
+
+3. Create a tarball - use ``python setup.py egg_info sdist`` in the cis root dir.
+
+4. Install this onto the release virtual environment: this is at ``/group_workspaces/jasmin/cis/cis_dev_venv``. So activate
+   the venv, upload the tarball somewhere on the GWS and then do ``pip install <LOCATION_OF_TARBALL>``.
+
+5. Create an Anaconda build - see below.
+
+6. Ask Phil Kershaw to upload the tarball to PyPI. (Optional)
+
+For a release onto JASMIN, complete the steps above and then ask Alan Iwi to produce an RPM, deploy it on a
+test VM, confirm functionality and then roll it out across the full set of JAP and LOTUS nodes.
+
+
 Anaconda Build
-==============
+--------------
 
 The Anaconda build recipes for CIS and the dependencies which can't be found either in the core channel, or in SciTools are stored in their own github repository `here <https://github.com/cistools/conda-recipes>`_.
 To build a new CIS package clone the conda-recipes repository and then run the following command::
@@ -47,134 +91,46 @@ This will output the documentation in html under the directory ``doc/_build/html
 
 .. _analysis_plugin_development:
 
-Analysis plugin development
-===========================
-
-Users can write their own plugins for performing the collocation of two data sets.
-There are three different types of plugin available for collocation, first we will describe the overall design and how
-these different components interact, then each will be described in more detail.
-
-Basic collocation design
-------------------------
-
-The diagram below demonstrates the basic design of the collocation system, and the roles of each of the components.
-In the simple case of the default collocator (which returns only one value) the :ref:`Collocator <collocator_description>`
-loops over each of the sample points, calls the relevant :ref:`Constraint <constraint_description>` to reduce the
-number of data points, and then the :ref:`Kernel <kernel_description>` which returns a single value, which the
-collocator stores.
-
-.. image:: img/CollocationDiagram.png
-   :width: 600px
+Continuous Integration Server
+=============================
+JASMIN provide a Jenkins CI Server on which the CIS unit and integration tests are run whenever origin/master is updated.
+The integration tests take approximately 7 hours to run whilst the unit tests take about 5s. The Jenkins server is
+hosted on jasmin-sci1-dev at ``/var/lib/jenkins`` and is accessed at http://jasmin-sci1-dev.ceda.ac.uk:8080/
 
-.. _kernel_description:
+We also have a Travis cloud instance (https://travis-ci.org/cedadev/cis) which in principle allows us to build and test
+on both Linux and OS X. There are unit test builds currently working but because of a hard time limit on builds (120
+minutes) the integration tests don't currently run.
 
-Kernel
-------
+Copying files to the CI server
+------------------------------
 
-A kernel is used to convert the constrained points into values in the output. There are two sorts of kernel one
-which act on the final point location and a set of data points (these derive from :class:`.Kernel`) and the more specific kernels
-which act upon just an array of data (these derive from :class:`.AbstractDataOnlyKernel`, which in turn derives from :class:`.Kernel`).
-The data only kernels are less flexible but should execute faster. To create a new kernel inherit from :class:`.Kernel` and
-implement the abstract method :meth:`.Kernel.get_value`. To make a data only kernel inherit from :class:`.AbstractDataOnlyKernel`
-and implement :meth:`.AbstractDataOnlyKernel.get_value_for_data_only` and optionally overload :meth:`.AbstractDataOnlyKernel.get_value`.
-These methods are outlined below.
+The contents of the test folder will not be automatically copied across to the Jenkins directory, so if you add any
+files to the folder you'll need to manually copy them to the Jenkins directory or the integration tests will fail. The
+directory is ``/var/lib/jenkins/workspace/CIS Integration Tests/cis/test/test_files/``. This is not entirely simple
+because:
 
-.. automethod:: cis.collocation.col_framework.Kernel.get_value
-    :noindex:
+ * We don't have write permissions on the test folder
+ * Jenkins doesn't have read permissions for the CIS group_workspace
 
-.. automethod:: cis.collocation.col_framework.AbstractDataOnlyKernel.get_value_for_data_only
-    :noindex:
+In order to copy files across we have done the following:
 
-.. _constraint_description:
+1. Copy the files we want to /tmp
 
-Constraint
-----------
+2. Open up the CIS Integration Tests webpage and click 'Configure'
 
-The constraint limits the data points for a given sample point.
-The user can also add a new constraint mechanism by subclassing :class:`.Constraint` and providing an implementation for
-:meth:`.Constraint.constrain_points`. If more control is needed over the iteration sequence then the
-:meth:`.Constraint.get_iterator` method can also be
-overloaded. Note however that this may not be respected by all collocators, who may still iterate over all
-sample data points. It is possible to write your own collocator (or extend an existing one) to ensure the correct
-iterator is used - see the next section. Both these methods, and their signatures, are outlined below.
+3. Scroll down to 'Build' where the shell script to be executed is found and insert a line to copy the file to the
+   directory, e.g. ``cp /tmp/file.nc "/var/lib/jenkins/workspace/CIS Integration Tests/cis/test/test_files"``
 
-.. automethod:: cis.collocation.col_framework.Constraint.constrain_points
-    :noindex:
+4. Run the CIS Integration Tests
 
-.. automethod:: cis.collocation.col_framework.Constraint.get_iterator
-    :noindex:
+5. Remove the line from the build script
 
-To enable a constraint to use a :class:`.AbstractDataOnlyKernel`, the method
-:meth:`get_iterator_for_data_only` should be implemented (again though, this may be ignored by a collocator). An
-example of this is the :meth:`.BinnedCubeCellOnlyConstraint.get_iterator_for_data_only` implementation.
+6. Remove the files from /tmp
 
-.. _collocator_description:
 
-Collocator
-----------
-
-Another plugin which is available is the collocation method itself. A new one can be created by subclassing :class:`.Collocator` and
-providing an implementation for :meth:`.Collocator.collocate`. This method takes a number of sample
-points and applies the given constraint and kernel methods on the data for each of those points. It is responsible for
-returning the new data object to be written to the output file. As such, the user could create a collocation routine
-capable of handling multiple return values from the kernel, and hence creating multiple data objects, by creating a
-new collocation method.
-
-.. note::
-
-    The collocator is also responsible for dealing with any missing values in sample points. (Some sets of sample points may
-    include values which may or may not be masked.) Sometimes the user may wish to mask the output for such points, the
-    :attr:`missing_data_for_missing_sample` attribute is used to determine the expected behaviour.
-
-The interface is detailed here:
-
-.. automethod:: cis.collocation.col_framework.Collocator.collocate
-    :noindex:
-
-Implementation
---------------
+Problems with Jenkins
+---------------------
 
-For all of these plugins any new variables, such as limits, constraint values or averaging parameters,
-are automatically set as attributes in the relevant object. For example, if the user wanted to write a new
-constraint method (``AreaConstraint``, say) which needed a variable called ``area``, this can be accessed with ``self.area``
-within the constraint object. This will be set to whatever the user specifies at the command line for that variable, e.g.::
-
-  $ ./cis.py col my_sample_file rain:"model_data_?.nc"::AreaConstraint,area=6000,fill_value=0.0:nn_gridded
-
-Example implementations of new collocation plugins are demonstrated below for each of the plugin types::
-
-
-  class MyCollocator(Collocator):
-  
-      def collocate(self, points, data, constraint, kernel):
-          values = []
-          for point in points:
-              con_points = constraint.constrain_points(point, data)
-              try:
-                  values.append(kernel.get_value(point, con_points))
-              except ValueError:
-                  values.append(constraint.fill_value)
-          new_data = LazyData(values, data.metadata)
-          new_data.missing_value = constraint.fill_value
-          return new_data
-
-
-  class MyConstraint(Constraint):
-  
-      def constrain_points(self, ref_point, data):
-          con_points = []
-          for point in data:
-              if point.value > self.val_check:
-                  con_points.append(point)
-          return con_points
-  
-  
-  class MyKernel(Kernel):
-  
-      def get_value(self, point, data):
-          nearest_point = point.furthest_point_from()
-          for data_point in data:
-              if point.compdist(nearest_point, data_point):
-                  nearest_point = data_point
-          return nearest_point.val
-  
+Sometimes the Jenkins server experiences problems which make it unusable. One particular issue we've encountered more
+than once is that Jenkins occasionally loses all its stylesheets and then becomes impossible to use. Asking CEDA support
+(or Phil Kershaw) to restart Jenkins should solve this.
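
The integration-test setup described in the "Test suites" section (copy the archive locally, decompress it, set ``CIS_DATA_HOME``) could be scripted along the following lines. This is a minimal sketch for Python 3, not part of the CIS repository: the helper name ``unpack_test_data`` and its arguments are hypothetical, and the only details taken from the guide are the ``.tar.gz`` archive format and the ``CIS_DATA_HOME`` variable name.

```python
import os
import tarfile
from pathlib import Path


def unpack_test_data(archive, target_dir):
    """Extract the test-data archive and point CIS_DATA_HOME at it.

    Hypothetical helper: both arguments are supplied by the caller.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # The guide's tar command archives the folder contents directly (no
    # intermediate folder), so the files land straight in target_dir.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    # Only affects this process and any child processes it starts.
    os.environ["CIS_DATA_HOME"] = str(target)
    return target
```

Because the environment variable is set only for the current process, the tests would need to be launched from the same Python process (for example through ``subprocess``). Otherwise, export the variable in the shell as the guide describes.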