
Testing Standards


Code Testing

All models and tools within the IDAES code base are expected to have accompanying tests, which check for both coding errors and final results.

Testing Tools

All IDAES tests are written using pytest (see the pytest documentation). pytest automatically identifies test modules, classes and methods, so the following guidelines should be followed when creating tests:

  • Test modules should be contained in a folder named tests, generally as a sub-folder of the code being tested.
  • Test files must start with test_, and conversely non-test files should avoid having test in their names.
  • Test files must contain one or more test functions or methods whose names start with test_.
  • Tests can also be organized into classes, which must start with Test.
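
For example, a minimal sketch of this layout (the module and test names here are hypothetical) would be:

my_package/
    heater.py               # code being tested
    tests/
        test_heater.py      # contains test_* functions and/or Test* classes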

Types of Tests

IDAES tests are divided into three categories, which are used for organizing tests into different levels of rigor and complexity (and thus execution time). Lower level tests are expected to run frequently, and thus need to keep execution time to a minimum to avoid delays, whilst higher level tests are run less frequently and can thus take longer to complete. The three categories used by IDAES are:

  • unit: Unit tests are used to test basic functionality and execution of individual parts of the code. Within IDAES, these tests are generally used to test model construction, but not model solutions. These tests should be fast to run (less than 2 seconds), and should not require the use of a solver. Unit tests should ideally cover all lines of code within a model with the exception of those requiring a solver to complete (initialization and final solves).
  • component: Component tests are used to test model solutions for single example cases in order to check that a model can be solved. These tests require the use of a solver, but should still be relatively quick to execute (ideally less than 10 seconds).
  • integration: The final level of tests are integration tests, which are used for longer duration verification and validation tests. These tests are used to confirm model accuracy and robustness over a wide range of conditions, and as such can take longer to execute. integration tests are also used to execute all examples in the IDAES Examples repository to ensure that any changes to the core codebase do not break the examples.

As a general rule, any tool or module should have a set of unit and component tests that exercise and solve all possible options/combinations for a single (generally simple) test case to confirm that the code works as expected and can be solved. Each model or tool should also have a more extensive suite of integration tests which test and verify the accuracy and robustness of the model/tool over as wide a range of conditions as possible.
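
In practice, these categories are applied to each test as pytest marks, which is how the test runner selects which level to execute. A minimal sketch (with hypothetical test names and placeholder bodies) is shown below:

import pytest

@pytest.mark.unit
def test_build():
    # Fast, construction-only check; no solver required
    ...

@pytest.mark.component
def test_initialize_and_solve():
    # Solves a single, simple test case; requires a solver
    ...

@pytest.mark.integration
def test_verification():
    # Longer-running verification/validation case
    ...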

When and How to Run Tests

Developers should run the unit and component tests at a minimum before creating a Pull Request in order to avoid test failures on the PR. A PR must pass all tests (including integration tests) before it will be merged, so it is best to identify any failures early. It is also a good idea to run the integration tests unless you are certain your changes will not affect the examples. The IDAES tests can be run from the command line using the following command:

>>> pytest

Running Specific Tests

Pytest can be used to run tests in a specific path or file using:

>>> pytest PATH

Pytest can also be directed to run tests with specific marks:

>>> pytest -m unit  # Only run tests with the "unit" mark
>>> pytest -m "not integration"  # Run all tests in PATH that do not have the "integration" mark

Automated Testing

IDAES also uses automated testing via GitHub Actions to run the test suite on all Pull Requests to the main repository, as well as to run regularly scheduled tests of the current code.

Testing Standards

Test Coverage

A starting point for determining quality of testing in a repository is to look at the percentage of the lines of code in the project that are executed during test runs, and which lines are missed. A number of automated tools are available to do this, and IDAES uses CodeCov for this purpose, which is integrated into the Continuous Integration environment and reported for all Pull Requests. CodeCov provides a number of different reports which can be used to examine test coverage across the project and identify areas of low coverage for improvement.

80% test coverage is generally accepted as a good target, and IDAES treats this as a minimum standard for test coverage. In order to meet this target, the following requirements are placed on Pull Requests to ensure test coverage is constantly increasing:

  • A Pull Request may not cause the overall test coverage to decrease, and
  • The test coverage on the changed code (diff coverage) must be equal to or greater than that of the overall test coverage.

However, there are limitations to relying on test coverage as the sole metric for testing, as coverage only checks to see which lines of code were executed during testing, and does not consider the quality of the tests. Further, it is often necessary to test the same line of code multiple times for different user inputs (e.g. testing for cases where a user passes a string when a float was expected), which is not easily quantified. This is especially true for process models, where the same model needs to be tested under a wide range of conditions to be accepted as validated. Thus, code coverage should only be considered as an initial, lower bound on testing.
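
As a simple illustration of why a covered line is not necessarily a well-tested line, the sketch below (using a hypothetical set_flow function, not part of IDAES) exercises the same conditional with both valid and invalid inputs:

import pytest

def set_flow(val):
    # Hypothetical function used only for illustration
    if not isinstance(val, (int, float)):
        raise TypeError("val must be a number")
    return float(val)

@pytest.mark.unit
def test_set_flow_valid():
    assert set_flow(10) == 10.0

@pytest.mark.unit
def test_set_flow_wrong_type():
    with pytest.raises(TypeError, match="must be a number"):
        set_flow("ten")

Both tests execute the type check, but only together do they confirm that each branch behaves as intended.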

Testing Guidelines

Unit Tests

Unit tests are the hardest tests to write, as they primarily focus on testing aspects of the current model or tool only; that is, they do not test integration with other models or tools. This is especially challenging for testing of Unit Models, as they inherently depend upon property packages and control volumes to provide critical information and infrastructure. To help with this, IDAES provides some testing utilities (Testing Utility Functions) that include minimal property and reaction packages for testing purposes, which allows testing to focus on the Unit Model alone.

Unit tests should:

  • Only focus on code in the module being tested. Code in other necessary modules should be tested separately in tests for those modules; i.e. every module should have its own set of unit tests, and tests for multiple modules should not be combined.
  • Not involve a solver; unit tests therefore cannot test model initialization or results (this is the purpose of component tests).
  • Aim to cover as much of the code as possible (subject to limitations on testing of initialization routines).
  • Confirm model is constructed as expected; i.e. test for the presence, type and shape (indexing sets) of expected model components.
  • Test all possible if/else branches, and ideally all combinations of if/else branches.
  • Test for all Exceptions that can be raised by the code.

Unit tests should not include tests of unit consistency, as these are computationally expensive; unit consistency is tested as part of the component tests.
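
For example, a minimal construction-only unit test, sketched here with plain Pyomo components rather than a real IDAES unit model, could look like:

import pytest
from pyomo.environ import ConcreteModel, Constraint, Set, Var

@pytest.mark.unit
def test_build():
    m = ConcreteModel()
    m.time = Set(initialize=[0.0])
    m.flow = Var(m.time, initialize=1.0)
    m.flow_eqn = Constraint(m.time, rule=lambda b, t: b.flow[t] == 1.0)

    # Presence and type of expected components
    assert isinstance(m.flow, Var)
    assert isinstance(m.flow_eqn, Constraint)
    # Shape (indexing sets) of expected components
    assert set(m.flow.keys()) == {0.0}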

Component Tests

Component tests are used to test the integration of models and tools for a set of well-defined test cases for which it is known that the model can be solved. By necessity, this often involves integration of multiple models (e.g. Unit Models with property packages). Component tests are also the point where initialization routines and solutions can be tested, and thus can (but do not need to) involve the use of a solver.

Component tests should:

  • Aim to cover all lines of code not covered by unit tests; i.e. all initialization routines and other code that requires a solver.
  • Test for consistent units of measurement. Asserting unit consistency is computationally expensive, which is why these checks are not included in unit tests.
  • Include a test case for all possible model configurations (i.e. for every if/else option available during model construction).
  • Test model results/outputs against expected values to a sufficient tolerance (generally 1-2 orders of magnitude greater than solver tolerance).
  • Test values for as many key variables as possible (not just inputs and outputs).
  • Always confirm convergence before checking results in any test that involves a solver.
  • Confirm conservation of mass and energy (and momentum if applicable).
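
A minimal sketch of a component test on a toy Pyomo model (not a real IDAES unit model, and assuming the ipopt solver is available) is shown below:

import pytest
from pyomo.environ import (ConcreteModel, Constraint, Objective, SolverFactory,
                           Var, units as pyunits, value)
from pyomo.opt import SolverStatus, TerminationCondition
from pyomo.util.check_units import assert_units_consistent

@pytest.mark.component
def test_solve():
    m = ConcreteModel()
    m.flow = Var(initialize=1.0, units=pyunits.mol / pyunits.s)
    m.flow_eqn = Constraint(expr=m.flow == 2.0 * pyunits.mol / pyunits.s)
    m.obj = Objective(expr=m.flow)

    # Unit consistency is asserted here rather than in the unit tests
    assert_units_consistent(m)

    results = SolverFactory("ipopt").solve(m)

    # Always confirm convergence before checking results
    assert results.solver.termination_condition == TerminationCondition.optimal
    assert results.solver.status == SolverStatus.ok

    # Check results to a tolerance looser than the solver tolerance
    assert value(m.flow) == pytest.approx(2.0, rel=1e-5)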

Integration Tests

Integration tests should:

  • Test model performance over as wide a range of conditions and configurations as possible.
  • Compare model results/outputs with literature data, including source information (i.e. references).
  • Not use commercial modeling tools as sources of testing data; many of these have licensing clauses that prohibit their use for benchmarking and testing of other tools.
  • Test results/outputs to the accuracy of the literature data. More accurate data is always preferred where possible.
  • Always confirm solver convergence before testing results.
  • Include tests for model robustness and reliability (TBD).
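
A common way to structure such tests is to parametrize over a table of conditions and expected results. The sketch below is purely illustrative: the data values are placeholders (a real test would use literature values and cite the source), and build_and_solve is a hypothetical helper standing in for the model-specific setup and solve:

import pytest
from pyomo.environ import value
from pyomo.opt import TerminationCondition

# Placeholder data; replace with literature values and cite the reference
CASES = [
    # (feed_temperature [K], expected_conversion)
    (350.0, 0.50),
    (400.0, 0.75),
]

@pytest.mark.integration
@pytest.mark.parametrize("temperature,expected", CASES)
def test_conversion(temperature, expected):
    # build_and_solve is a hypothetical helper that builds the flowsheet,
    # fixes the feed temperature, and solves the model
    model, results = build_and_solve(temperature)
    assert results.solver.termination_condition == TerminationCondition.optimal
    assert value(model.fs.reactor.conversion) == pytest.approx(expected, rel=1e-3)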

How to Write Good Tests

Writing tests is something of an art form, and it takes a while to learn what should be tested and how best to achieve this. Below are some suggestions compiled from the experience of the IDAES developers.

Things to Include

  • All tests should include a pytest.mark to indicate the type of test.
  • Tests should be written that execute all branches in conditional statements, and should check to make sure the correct branch was taken.
  • Any Exceptions raised by the code should be tested. You can use pytest.raises(ExceptionType, match=str) to check that the correct type of Exception was raised and that the message matches the expected string.
  • When testing model solutions, always begin by checking that the solver returned an optimal solution:
from pyomo.opt import SolverStatus, TerminationCondition

results = solver.solve(model)
# Confirm the solver converged before checking any results
assert results.solver.termination_condition == TerminationCondition.optimal
assert results.solver.status == SolverStatus.ok
  • When testing model solutions, check for as many key variables as possible. Include intermediate variables as well to help narrow down any failures.
  • Also keep in mind solver tolerances when testing results. The default solver tolerance is 1e-6, so you should test to a slightly looser tolerance (we often use 1e-5). Be aware that IPOPT's tolerance is both absolute and relative.
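
For example, continuing the snippet above (the variable name here is a hypothetical placeholder):

from pyomo.environ import value

# rel=1e-5 is slightly looser than the default solver tolerance of 1e-6
assert value(model.fs.heater.outlet.temperature[0]) == pytest.approx(373.15, rel=1e-5)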

Things to Avoid

  • Tests should always contain an assert statement (or equivalent). It is easy to write a test that executes the code, but unless you add checks for specific behaviors, all the test will tell you is whether any Exceptions were raised during execution.