Add wording
shuds13 committed Oct 19, 2023
1 parent 907dcd3 commit 8f353fc
Showing 1 changed file with 29 additions and 0 deletions.
29 changes: 29 additions & 0 deletions docs/papers/joss/paper.md
@@ -50,6 +50,35 @@ design, decision, and inference studies on or across laptops and heterogeneous h

# Statement of need

There is a growing number of packages aimed at workflows, and a subset of these focuses on running ensembles of calculations on clusters and supercomputers. A dynamic ensemble is one that is steered automatically based on intermediate results. This may involve deciding simulation parameters based on numerical optimization or machine learning techniques, among other possibilities. Other packages in this space include Colmena and the RADICAL-Ensemble Toolkit.

[***merge sim/gen with this?]
[***where communications?]

libEnsemble stands out primarily through its generator-simulator paradigm, which eliminates the need for users to explicitly define task dependencies. Instead, it emphasizes data dependencies between customizable Python user functions. This modular design also lends itself to exploiting the large library of example user functions that are provided with libEnsemble, maximizing code reuse. For instance, users can readily choose an existing generator function and tailor a simulator function to their particular needs.
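
The data-dependency idea can be illustrated with a toy loop in plain Python. This is a minimal sketch and not libEnsemble's API: the names `gen_f`, `sim_f`, and `run_ensemble`, and the list-of-dicts history, are hypothetical stand-ins chosen for illustration. The only link between the two user functions is the data they exchange (inputs `x` and results `f`); no task graph is declared.

```python
import random


def gen_f(history, batch_size, rng):
    """Hypothetical generator: propose new inputs using results so far."""
    if not history:
        # No results yet: sample uniformly from the search interval.
        return [rng.uniform(-2.0, 2.0) for _ in range(batch_size)]
    # Otherwise perturb the best point seen so far.
    best = min(history, key=lambda row: row["f"])
    return [best["x"] + rng.gauss(0.0, 0.3) for _ in range(batch_size)]


def sim_f(x):
    """Hypothetical simulator: a cheap stand-in for an expensive calculation."""
    return (x - 1.0) ** 2


def run_ensemble(n_batches=20, batch_size=4, seed=0):
    """Alternate generator and simulator; only data (x, f) links the two."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_batches):
        for x in gen_f(history, batch_size, rng):
            history.append({"x": x, "f": sim_f(x)})
    return history


history = run_ensemble()
best = min(history, key=lambda row: row["f"])
print(f"evaluated {len(history)} points, best x = {best['x']:.3f}")
```

Swapping in a different `gen_f` or `sim_f` leaves the loop unchanged, which is the kind of reuse the paragraph above describes.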

Some crucial considerations relevant to these packages include:

- Portability - running on different machines with different schedulers, hardware, and MPI runners, with minimal modification to user scripts.

- Scalability - working efficiently with large simulations and many concurrent simulations.

- Dynamic resources - ability to dynamically assign machine resources to simulations.

- Ease of use - whether the software requires a complex setup.

- Ability to cancel simulations on the fly.

To achieve portability, libEnsemble employs more extensive system detection than other packages. It detects crucial system information such as scheduler details, MPI runners, core counts, and GPU counts (for different types of GPU), and uses these to produce run lines and GPU settings for these systems, without the user having to alter scripts. For example, on a system using "srun", libEnsemble will use srun options to assign GPUs, while on other systems it may assign GPUs via environment variables such as ROCR_VISIBLE_DEVICES or CUDA_VISIBLE_DEVICES; in either case, the user states only the number of GPUs needed for each simulation. Where autodetection is insufficient, the user can supply platform information or the name of a known system via scripts or an environment variable.
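
The two GPU-assignment styles described above can be sketched as follows. This is an illustration only: `build_launch` is a hypothetical helper, not a libEnsemble function, and the run lines libEnsemble actually generates may differ. The sketch assumes Slurm's `--ntasks` and `--gpus` options for `srun`, and `-np` for a generic `mpirun`.

```python
def build_launch(runner, gpu_env_var, num_procs, num_gpus, app):
    """Return (command, extra_env) for one simulation.

    Hypothetical helper: the real run lines libEnsemble produces may differ.
    """
    if runner == "srun":
        # Slurm can assign GPUs itself through srun options.
        cmd = ["srun", "--ntasks", str(num_procs), "--gpus", str(num_gpus), app]
        return cmd, {}
    # Other runners: expose GPUs 0..num_gpus-1 through an environment variable.
    cmd = [runner, "-np", str(num_procs), app]
    env = {gpu_env_var: ",".join(str(i) for i in range(num_gpus))}
    return cmd, env


# The caller states only the process and GPU counts; the rest is derived.
for runner, env_var in [
    ("srun", None),
    ("mpirun", "CUDA_VISIBLE_DEVICES"),
    ("mpirun", "ROCR_VISIBLE_DEVICES"),
]:
    cmd, env = build_launch(runner, env_var, num_procs=4, num_gpus=2, app="./sim.x")
    print(" ".join(cmd), env)
```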

By default, libEnsemble divides available compute resources amongst workers. However, when simulation parameters are created, the number of processes and GPUs can also be specified for each simulation. Combined with the portability features, this makes it very simple to transfer user scripts between platforms.

libEnsemble follows a philosophy of minimizing required dependencies, while supporting various back-end mechanisms when needed. For example, the vast majority of users do not need to run a database application or special runtime to use libEnsemble, but for those who do, Balsam can be used on the back end by substituting the Balsam executor for the regular MPI executor. This approach simplifies the user experience and reduces the initial setup and adoption costs when using libEnsemble.

The close coupling between libEnsemble generators and simulators enables the generator both to take in results and update models asynchronously, and to cancel previously issued simulations. Running simulations can be terminated and their resources recovered. This is more flexible than in other packages, where the generation of simulations is external to the dispatch of a batch of simulations.
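
A rough analogy for this pattern, using only the Python standard library, is sketched below. It is not libEnsemble code: `simulate` and `run_until_first_result` are hypothetical names, and the sketch uses cooperative cancellation of threads through a flag, whereas the text above describes terminating running simulations. Results are consumed as they arrive, and the remaining simulations are cancelled once one has completed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate(sim_id, n_steps, cancel):
    """Hypothetical simulation that polls a cancel flag between steps."""
    for _ in range(n_steps):
        if cancel.is_set():
            return sim_id, "cancelled"
        time.sleep(0.01)
    return sim_id, "completed"


def run_until_first_result(step_counts):
    """Start all simulations, then cancel the rest once one has completed."""
    flags = [threading.Event() for _ in step_counts]
    status = {}
    with ThreadPoolExecutor(max_workers=len(step_counts)) as pool:
        futures = [
            pool.submit(simulate, i, n, flags[i]) for i, n in enumerate(step_counts)
        ]
        # Results are handled in completion order, not submission order.
        for fut in as_completed(futures):
            sim_id, outcome = fut.result()
            status[sim_id] = outcome
            if outcome == "completed":
                # The "generator" decides the remaining work is no longer needed.
                for flag in flags:
                    flag.set()
    return status


status = run_until_first_result([2, 200, 200, 200])
print(status)
```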

libEnsemble supports persistent user functions that run on workers and maintain their memory. This avoids the storing and reloading of data that is required by packages supporting only a fire-and-forget approach to ensemble components.
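
The benefit of keeping state in memory can be illustrated with a Python generator, used here purely as an analogy; libEnsemble's actual persistent-function interface is not shown. In this sketch, `persistent_gen` and `sim_f` are hypothetical: the search state (`best_x`, `best_f`, `step`) lives in local variables for the lifetime of the function, so nothing is written to disk or reloaded between evaluations.

```python
def persistent_gen(start, step):
    """Hypothetical persistent generator: state lives in local variables."""
    best_x, best_f = start, float("inf")
    x = start
    while True:
        # Hand out a point and wait for its result to be sent back.
        f = yield x
        if f < best_f:
            best_x, best_f = x, f
        else:
            step = -step / 2  # overshoot: reverse direction and shrink the step
        x = best_x + step


def sim_f(x):
    """Hypothetical simulator with its minimum at x = 3."""
    return (x - 3.0) ** 2


gen = persistent_gen(start=0.0, step=1.0)
x = next(gen)  # prime the generator to obtain the first point
for _ in range(30):
    x = gen.send(sim_f(x))  # return a result, receive the next point
gen.close()
print(f"final proposal: x = {x:.6f}")
```

A fire-and-forget equivalent would have to serialize `best_x`, `best_f`, and `step` after every call and reload them before the next one.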

Examples of ways in which libEnsemble has been used in science and engineering problems include:

- Optimization of variational algorithms on quantum computers [@Liu2022layer].