Adding basic parameter sweep tool #1284

andrewlee94 · 2023-11-07T19:21:57Z

Part of diagnostics work

Summary/Motivation:

Diagnostic checks for numerical issues need to be run across a range of input values to ensure that the model is well posed across the full range of operation. The existing ConvergenceTester tool provides much of this functionality, however it it is set up as a stand-alone tool targeted at a very specific purpose.

This PR aims to generalize a lot of the core capabilities and to hopefully start aligning this with similar capabilities being developed in WaterTAP. The end goal for this is to define a standard API for setting up parameter sweep type runs in IDAES, whilst allowing users to implement workflow managers using a parallelization tool of their choice.

Changes proposed in this PR:

Move core functionality of ConvergenceTester to a new parameter sweep utility. The old ConvergenceTester will be retained for now for backward compatibility.
Update input specification class to leverage Pysmo's sampling tools
Create a ParameterSweepBase call with core functionality and API for running parameter sweeps
Create a derived SequentialSweepRunner class which derived from ParameterSweepBase to implement a simple sequential workflow manager.

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

…ter_sweep

codecov · 2023-11-07T20:05:03Z

Codecov Report

Attention: 40 lines in your changes are missing coverage. Please review.

Comparison is base (94b98f7) 77.51% compared to head (f697d85) 77.58%.

Files	Patch %	Lines
idaes/core/util/model_diagnostics.py	84.17%	18 Missing and 4 partials ⚠️
idaes/core/util/parameter_sweep.py	92.99%	12 Missing and 6 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1284      +/-   ##
==========================================
+ Coverage   77.51%   77.58%   +0.07%     
==========================================
  Files         390      391       +1     
  Lines       63884    64288     +404     
  Branches    11756    11815      +59     
==========================================
+ Hits        49517    49878     +361     
- Misses      11795    11829      +34     
- Partials     2572     2581       +9

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

idaes/core/util/parameter_sweep.py

Robbybp · 2023-11-14T03:52:59Z

idaes/core/util/parameter_sweep.py

+        """
+        Returns OrderedDict containing the results from the parameter sweep.
+        """
+        return self._results


Should this be a DataFrame for consistency with ParameterSweepSpecification?

Not sure on that - I don't see any reason why it could not be, but I am also not sure that it needs to be either. The reason the ParameterSweepSpecification uses a DataFrame is because Pysmo returns a numpy.array (which I then turned into a DataFrame so that the input names were explicitly associated with the data to avoid future issues).

If we want to combine the samples and results into one dataframe, we can do:

samples.join(pd.DataFrame(runner.results).transpose())

My current approach to generating serialized data is something like:

reslist = samples.to_dict(orient="records") for i, res in enumerate(reslist): # Add important results to the dict containing the sampled inputs # runner.results[i]["results"] is a named tuple I use to hold results res["solved"] = runner.results[i]["solved"] res["feasible"] = runner.results[i]["results"].feasible res["objective"] = runner.results[i]["results"].objective res["solve_time"] = runner.results[i]["results"].timer.timers["solve"].total_time res["error"] = results[i]["error"] with open(frame, "w") as f: json.dump(reslist, f)

No objection to the current data structure for results.

For now I think I want to leave things as they are. For one, I think it is best to keep the samples separate from the results (we've tried different things at different time and for now I prefer this).

idaes/core/util/parameter_sweep.py

Robbybp · 2023-11-14T23:34:32Z

idaes/core/util/parameter_sweep.py

+        args = self.config.build_model_arguments
+        if args is None:
+            args = {}
+
+        model = self.config.build_model(**args)


Should there be an option to send sampled parameter values to the build_model function? I could imagine this being useful if we want to set parameter values before initializing the model. (Or maybe this should be done by implementing a custom run_model function that calls model.initialize() before solving?)

That was part of my intention for the run_model method; build_model would do everything up to setting values, and then run_model would do anything that needed to follow.

I think that makes sense

andrewlee94 · 2024-02-06T16:03:55Z

@Robbybp Would you have time to take a quick look at this again? I am not sure there is much more we can do until the WaterTAP parameter sweep tool is separated from WaterTAP, so I think it might be best ot get the general API we want in place and then merge this.

andrewlee94 · 2024-02-06T16:07:10Z

@k1nshuk Would you have some time to take a look at this and see what you think. A lot of this will need to wait until the WaterTAP parameter sweep tool is in its own repo, but I would like to get the IDAES API ready so that we can start using it in our tests (with a basic sequential runner in the background).

k1nshuk

Looks good to me.

Robbybp · 2024-02-07T01:06:08Z

@andrewlee94 I will try to re-review by the end of the week.

Robbybp · 2024-02-15T15:25:39Z

FYI, I have not forgotten about reviewing this. I am working through an application using this, and will add my review once I'm done. I don't anticipate major change requests.

Robbybp

My main comment is that I don't think that build_outputs should be required, then a few other small comments below.

Robbybp · 2024-02-16T01:17:18Z

idaes/core/util/parameter_sweep.py

+        if self.config.build_outputs is None:
+            raise ConfigurationError(
+                "Please provide a method to collect results from sample run."
+            )


Is there any reason we require a build_outputs method? It seems like a reasonable default could be to just return run_stats

I suppose we could just default to run_stats - this is probably not what most users want, but they should be providing a build_outputs method anyway.

I actually thought the results object would be what the user typically wants. And with this default, if they want some other data structure, they can just return it from run_model. Now that I think about it, I'm not sure that build_outputs is necessary. Is there a reason we don't just expect users to process output in a custom run_model method?

I think it is partly compatibility with WaterTAP and partly to keep things separate.

idaes/core/util/parameter_sweep.py

Robbybp · 2024-02-16T16:45:35Z

idaes/core/util/parameter_sweep.py

+        """
+        Returns OrderedDict containing the results from the parameter sweep.
+        """
+        return self._results


If we want to combine the samples and results into one dataframe, we can do:

samples.join(pd.DataFrame(runner.results).transpose())

My current approach to generating serialized data is something like:

reslist = samples.to_dict(orient="records") for i, res in enumerate(reslist): # Add important results to the dict containing the sampled inputs # runner.results[i]["results"] is a named tuple I use to hold results res["solved"] = runner.results[i]["solved"] res["feasible"] = runner.results[i]["results"].feasible res["objective"] = runner.results[i]["results"].objective res["solve_time"] = runner.results[i]["results"].timer.timers["solve"].total_time res["error"] = results[i]["error"] with open(frame, "w") as f: json.dump(reslist, f)

No objection to the current data structure for results.

Robbybp · 2024-02-16T16:49:56Z

idaes/core/util/model_diagnostics.py

+)
+
+
+class ConvergenceAnalysis:


I haven't used this functionality, and haven't seen an example of it, so I can't offer much of a review of it.

This is a fairly specific use case for robustness checking. It is basically a sweep of predefined samples to collect stats from IPOPT (solver status, iterations, etc.) and comparing them to a baseline to ensure no drift in solver behaviour.

Is this a port of the convergence tester that Carl wrote? Maybe the name should indicate that this functionality is Ipopt-specific?

Yes, it is an update of Carl's convergence tester. I am not sure if the name needs to be changed; this is primarily an internal tool for testing.

Given that the class is not marked as private via _, I'd prefer the name to indicate that the functionality is Ipopt-specific.

…ter_sweep

idaes/core/util/parameter_sweep.py

MarcusHolly

LGTM

Co-authored-by: MarcusHolly <[email protected]>

…ter_sweep

Robbybp

I am fine with this, although I'd prefer the name of ConvergenceAnalysis to indicate that it is something Ipopt-specific. Your choice whether this is worth it.

Robbybp · 2024-02-21T17:24:52Z

idaes/core/util/model_diagnostics.py

+)
+
+
+class ConvergenceAnalysis:


Given that the class is not marked as private via _, I'd prefer the name to indicate that the functionality is Ipopt-specific.

andrewlee94 · 2024-02-21T17:56:21Z

@Robbybp Any suggestion for a good name? IPOPTConvergenceAnalysis seems cumbersome to me.

Robbybp · 2024-02-21T20:57:14Z

@andrewlee94 I would go with IpoptConvergenceAnalyzer, IpoptSolveAnalyzer, or just IpoptAnalyzer

andrewlee94 added 6 commits November 1, 2023 14:55

Initial work on parameter sweep

d994c81

Merge branch 'main' of https://github.com/IDAES/idaes-pse into parame…

5166e73

…ter_sweep

Fixing config tests

1f5e33a

Working on testing parameter sweep base class

d3e711c

Finishing parameter sweep base file

9e80e17

Adding basic sequential runner

98a2503

andrewlee94 self-assigned this Nov 7, 2023

andrewlee94 changed the title ~~Adding basic parameter sweep tool~~ [WIP] Adding basic parameter sweep tool Nov 7, 2023

andrewlee94 added enhancement New feature or request help wanted Extra attention is needed Priority:Normal Normal Priority Issue or PR core Issues dealing with core modeling components testing Issues dealing with testing of code WaterTAP WIP PySMO diagnostics labels Nov 7, 2023

Deprecate old convergence tester tools

bffa953

andrewlee94 added 5 commits November 10, 2023 09:16

Merging main

fc92607

Working on ConvergenceAnalysis

bb8c2a6

Adding methods to compare to baseline

df3b2f3

Adding doc strings

f7fe78f

Fixing spelling

89c1f50

dangunter reviewed Nov 13, 2023

View reviewed changes

idaes/core/util/parameter_sweep.py Show resolved Hide resolved

Adding parameterization to parameter sweep callbacks

0a881ad

Robbybp reviewed Nov 14, 2023

View reviewed changes

idaes/core/util/parameter_sweep.py Outdated Show resolved Hide resolved

Robbybp reviewed Nov 14, 2023

View reviewed changes

andrewlee94 mentioned this pull request Nov 16, 2023

Model Diagnostics Checklist #1222

Open

31 tasks

andrewlee94 marked this pull request as ready for review February 6, 2024 16:02

andrewlee94 requested review from eslickj and lbianchi-lbl as code owners February 6, 2024 16:02

andrewlee94 requested a review from Robbybp February 6, 2024 16:03

k1nshuk reviewed Feb 7, 2024

View reviewed changes

andrewlee94 changed the title ~~[WIP] Adding basic parameter sweep tool~~ Adding basic parameter sweep tool Feb 8, 2024

Merge branch 'main' into parameter_sweep

b738d6f

Robbybp reviewed Feb 16, 2024

View reviewed changes

andrewlee94 added 3 commits February 20, 2024 13:47

Merge branch 'main' of https://github.com/IDAES/idaes-pse into parame…

01403b6

…ter_sweep

Addressing most PR comments

1d4997d

Fixing doc string formatting

42912bc

andrewlee94 requested review from Robbybp, bpaul4 and MarcusHolly February 21, 2024 14:41

MarcusHolly reviewed Feb 21, 2024

View reviewed changes

idaes/core/util/parameter_sweep.py Outdated Show resolved Hide resolved

MarcusHolly approved these changes Feb 21, 2024

View reviewed changes

andrewlee94 and others added 3 commits February 21, 2024 10:43

Update idaes/core/util/parameter_sweep.py

cf9bf92

Co-authored-by: MarcusHolly <[email protected]>

Fixing typo in test regex

3f07eec

Merge branch 'main' of https://github.com/IDAES/idaes-pse into parame…

dfcfad0

…ter_sweep

Robbybp approved these changes Feb 21, 2024

View reviewed changes

Renaming ConvergenceAnalysis to IpoptConvergenceAnalysis

f697d85

andrewlee94 enabled auto-merge (squash) February 22, 2024 14:57

andrewlee94 merged commit 777b813 into IDAES:main Feb 22, 2024
39 checks passed

andrewlee94 deleted the parameter_sweep branch February 22, 2024 15:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding basic parameter sweep tool #1284

Adding basic parameter sweep tool #1284

andrewlee94 commented Nov 7, 2023

codecov bot commented Nov 7, 2023 •

edited

Loading

Robbybp Nov 14, 2023

andrewlee94 Nov 22, 2023

Robbybp Feb 16, 2024

andrewlee94 Feb 20, 2024

Robbybp Nov 14, 2023

andrewlee94 Nov 22, 2023

Robbybp Nov 23, 2023

andrewlee94 commented Feb 6, 2024

andrewlee94 commented Feb 6, 2024

k1nshuk left a comment

Robbybp commented Feb 7, 2024

Robbybp commented Feb 15, 2024 •

edited

Loading

Robbybp left a comment

Robbybp Feb 16, 2024

andrewlee94 Feb 16, 2024

Robbybp Feb 16, 2024

andrewlee94 Feb 16, 2024

Robbybp Feb 16, 2024

Robbybp Feb 16, 2024

andrewlee94 Feb 16, 2024

Robbybp Feb 16, 2024

andrewlee94 Feb 16, 2024

Robbybp Feb 21, 2024

MarcusHolly left a comment

Robbybp left a comment

Robbybp Feb 21, 2024

andrewlee94 commented Feb 21, 2024

Robbybp commented Feb 21, 2024

Adding basic parameter sweep tool #1284

Adding basic parameter sweep tool #1284

Conversation

andrewlee94 commented Nov 7, 2023

Part of diagnostics work

Summary/Motivation:

Changes proposed in this PR:

Legal Acknowledgement

codecov bot commented Nov 7, 2023 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrewlee94 commented Feb 6, 2024

andrewlee94 commented Feb 6, 2024

k1nshuk left a comment

Choose a reason for hiding this comment

Robbybp commented Feb 7, 2024

Robbybp commented Feb 15, 2024 • edited Loading

Robbybp left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MarcusHolly left a comment

Choose a reason for hiding this comment

Robbybp left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrewlee94 commented Feb 21, 2024

Robbybp commented Feb 21, 2024

codecov bot commented Nov 7, 2023 •

edited

Loading

Robbybp commented Feb 15, 2024 •

edited

Loading