Proposal for higher level abstraction of flowsheet models, flowsheet units, and automated flowsheet construction #1503

Open
1 task
avdudchenko opened this issue Oct 17, 2024 · 8 comments
Assignees
Labels
Priority:Normal Normal Priority Issue or PR

Comments

@avdudchenko
Contributor

avdudchenko commented Oct 17, 2024

There is a need for an abstraction layer that supports developing preconfigured unit models and flowsheets, which would enable automated construction of complex flowsheets and their use in a GUI. Currently, unit models in IDAES, WaterTAP, etc. are designed as abstract units that can be modified and used in an infinite number of ways, but many users want standard units that can be used right away for typical process simulation and analysis. For example, reverse osmosis requires a pump, an RO unit, and maybe an ERD, so a user needs to build a flowsheet with at least these three unit models to do a basic analysis, or adapt another flowsheet.
Furthermore, there is a need to easily assemble flowsheet units in any number of configurations, for example multi-stage processes or processes where the unit order is changed. Right now this is done manually, resulting in bespoke flowsheets and unit model configurations that are not portable. In many cases these flowsheets grow in complexity and become impossible to maintain or navigate.
If a standard rigid structure is adopted for how flowsheet units are configured and built, it is possible to:

  1. Develop flowsheet units that can be duplicated and used in different places on a flowsheet
  2. Provide an intuitive API for UI software or a higher-level code-based API to interact with unit models
  3. Provide an automated method to assemble flowsheets, initialize them, and perform analysis.

I propose the development of three classes within IDAES to support this capability:

  1. FlowsheetBase – provides the base configuration for the flowsheet
    o This houses utilities/properties/functions accessible to all unit models on the flowsheet, and all unit models are built on the FlowsheetBase
  2. FlowsheetUnit – houses all information for a unit designed to be built on a FlowsheetBase. Unlike a normal unit model, these units have to be preconfigured for the standard, rigid operation expected of the underlying sub-unit model.
  3. FlowsheetBuilder – a tool that would manage the FlowsheetBase and FlowsheetUnits to provide automated construction, initialization, and scaling of unit models in the order requested by the user through simple API calls (similar to the flowsheet_builder tool, for example using strings to configure the order of unit operations). This would act as a code-based UI.
    o This might not be a block constructed on a model, but a separate tool entirely
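
As a rough illustration of the builder idea, here is a minimal plain-Python sketch with no IDAES or Pyomo imports. Every name in it (`UNIT_REGISTRY`, `register_unit`, `build_flowsheet`, the `"pump"`/`"ro"`/`"erd"` keys, and the stand-in `FlowsheetUnit` dataclass) is hypothetical and is not an existing IDAES API. It only shows how an ordered list of strings might map to preconfigured units connected in series.

```python
from dataclasses import dataclass, field

# Hypothetical registry: string key -> preconfigured unit class.
UNIT_REGISTRY = {}


def register_unit(key):
    """Decorator that registers a unit class under a string key."""
    def decorator(cls):
        UNIT_REGISTRY[key] = cls
        return cls
    return decorator


@dataclass
class FlowsheetUnit:
    """Stand-in for a preconfigured unit with one inlet and one outlet."""
    name: str


@register_unit("pump")
class PumpUnit(FlowsheetUnit):
    pass


@register_unit("ro")
class ROUnit(FlowsheetUnit):
    pass


@register_unit("erd")
class ERDUnit(FlowsheetUnit):
    pass


@dataclass
class Flowsheet:
    units: list = field(default_factory=list)
    # (upstream name, downstream name) pairs
    connections: list = field(default_factory=list)


def build_flowsheet(order):
    """Build units in the requested order and connect them in series."""
    fs = Flowsheet()
    counts = {}
    for key in order:
        if key not in UNIT_REGISTRY:
            raise KeyError(f"Unknown unit type: {key!r}")
        counts[key] = counts.get(key, 0) + 1
        unit = UNIT_REGISTRY[key](name=f"{key}_{counts[key]}")
        if fs.units:
            fs.connections.append((fs.units[-1].name, unit.name))
        fs.units.append(unit)
    return fs


# A two-stage arrangement requested purely by strings.
two_stage = build_flowsheet(["pump", "ro", "pump", "ro", "erd"])
print([u.name for u in two_stage.units])
```

Changing the unit order or adding a stage is then a change to the list of strings rather than to hand-written flowsheet code, which is the portability argument made above.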

The detailed proposal, which more or less covers the functions and requirements for each class, is in the attached Word doc/PDF.

Proposal for the higher level abstraction of flowsheet models.pdf

Proposal for higher level abstraction of flowsheet models.docx

Related Issues:

@andrewlee94
Member

I'll try to get to this tomorrow if possible, but one initial comment I had was whether FlowsheetBase and FlowsheetUnit needed to be separate, or if the functionality of FlowsheetUnit could just be optional capabilities on FlowsheetBlock.

@avdudchenko
Contributor Author

Yes, I am open to suggestions on whether we want to adopt/modify existing blocks/code or inherit from them. The reality is that FlowsheetBase and FlowsheetUnit share a lot of the same innards and have only minor differences, so it's likely they should inherit from the same base class (probably FlowsheetBlock) and add functionality on top of it.

@dallan-keylogic
Contributor

dallan-keylogic commented Oct 17, 2024

My first reaction reading through the proposal PDF is that much of this can be accomplished using existing standalone flowsheets and Pyomo Config blocks. A lot of the things in the "metadata" category can be accomplished by creating sets with standard names on the FlowsheetBlock (like an inlet_ports ComponentSet, an outlet_ports ComponentSet, a bidirectional_ports ComponentSet for things that can't be categorized as an inlet or outlet, design_variables and control_variables ComponentSets, etc.).
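
The naming convention suggested here might look roughly like the following. This sketch uses plain Python sets and stand-in classes rather than Pyomo's ComponentSet and the IDAES FlowsheetBlock, so that it runs on its own. The attribute names are the ones suggested in the comment; `Port` (not `pyomo.network.Port`), `SubFlowsheet`, and `describe_connectivity` are hypothetical.

```python
class Port:
    """Stand-in for a connection point; not pyomo.network.Port."""

    def __init__(self, name):
        self.name = name


class SubFlowsheet:
    """Stand-in flowsheet exposing metadata through standard-named sets."""

    def __init__(self):
        self.feed = Port("feed")
        self.permeate = Port("permeate")
        self.brine = Port("brine")
        # Standard names that a builder or UI could rely on.
        self.inlet_ports = {self.feed}
        self.outlet_ports = {self.permeate, self.brine}
        self.bidirectional_ports = set()
        self.design_variables = set()
        self.control_variables = set()


def describe_connectivity(fs):
    """Read external connectivity using only the standard set names."""
    return {
        "inlets": sorted(p.name for p in fs.inlet_ports),
        "outlets": sorted(p.name for p in fs.outlet_ports),
        "bidirectional": sorted(p.name for p in fs.bidirectional_ports),
    }


print(describe_connectivity(SubFlowsheet()))
```

The point of the convention is that `describe_connectivity` never needs to know which concrete flowsheet it was handed.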

@andrewlee94
Member

andrewlee94 commented Oct 18, 2024

My initial thought after reading through the document is that this is really a proposal for a few different things (all important), and that it would probably pay to look at each of them individually. As I see it, the different aspects are:

  1. Formalizing how we define modular sub-flowsheets and any necessary API to do this. In many ways, the infrastructure to do this already exists (either through FlowsheetBlock or UnitModelBase), and we just need to create some standards on how to do this and examples to show it. The main thing I think we are missing right now is metadata on external connectivity and APIs to access it.
  2. Separate to this, there is a proposal for a way to define meta-methods to assemble flowsheets from existing modules (flowsheet builder), which would rely on the metadata defined in 1.
  3. There is also the idea of having pre-configured options for modular components (flowsheets or unit models); again the infrastructure to do this exists via ConfigBlocks and I think what we really need here is just some standards and examples of how to do it.
  4. More options for reporting flowsheet results. Again, I think the user-level API exists for this through the existing report method. What I think might be missing here (from early comments and discussion) is a) a way to select which key streams to display (rather than all streams), b) a way to include other key outputs in a flowsheet report (e.g. key unit or flowsheet variables, costing data, etc.), and c) documentation to make it clear that model developers can overload the report method to do whatever they want.
  5. Overarching all of this, there is a need for a way to define arbitrary metadata on ProcessModels as a whole, as well as specific expected metadata for different types of models (flowsheet, unit, property, reaction, surrogate, etc.). Addressing this I think will guide us to what is needed for the other four aspects, so I would suggest we start here.
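
To make the reporting point (item 4) a little more concrete, here is a small stand-alone sketch of selecting key streams and extra outputs. It does not use or reproduce the signature of the existing IDAES report method; `DemoFlowsheet`, `summary`, and all numeric values are made-up placeholders for illustration only.

```python
class DemoFlowsheet:
    """Stand-in flowsheet holding results as plain dicts (placeholder values)."""

    def __init__(self):
        self.streams = {
            "feed": {"flow_mass": 1.0, "pressure": 101325.0},
            "permeate": {"flow_mass": 0.5, "pressure": 101325.0},
            "brine": {"flow_mass": 0.5, "pressure": 5.0e6},
        }
        self.key_outputs = {"recovery": 0.5, "total_cost": 123.0}

    def summary(self, stream_names=None, output_names=None):
        """Return only the requested streams and outputs (all if None)."""
        names = list(self.streams) if stream_names is None else stream_names
        outs = list(self.key_outputs) if output_names is None else output_names
        return {
            "streams": {n: self.streams[n] for n in names},
            "outputs": {n: self.key_outputs[n] for n in outs},
        }


fs = DemoFlowsheet()
print(fs.summary(stream_names=["permeate"], output_names=["recovery"]))
```

A developer-defined override of this kind is what point c) refers to: the selection logic lives with the flowsheet, and callers only name what they want to see.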

The other thing I see is that we have a lot of stakeholders with an interest in this, and we should involve all of them in the planning stage to make sure we don't inadvertently exclude anyone. My suggestion is to start by asking the stakeholders involved (modular flowsheet builders, UI team, Project Ahuora, anyone else who is interested) to compile a list of use-cases of interest to them and the associated metadata/APIs/utilities they wish they had to support this. We can then use this to start building out a more concrete proposal for what metadata is needed and how to use it.

Some more specific questions/clarifications about the proposal:

  1. Is the proposed FlowsheetBase here the same as the existing FlowsheetBlock (probably with some extensions to support some new capabilities/metadata)?
  2. How does FlowsheetUnit differ from FlowsheetBase? Are these just specific realizations (sub-classes) of FlowsheetBase?
  3. Regarding Initialization and Scaling, IDAES is generally moving to using the new class-based tools for all of this. This makes a lot of sense for generic/general-purpose models, as you often need different tools/options available for different use cases. However, for flowsheets (especially the more pre-packaged variety) a method-based approach makes a lot of sense, as these tend to be a lot more case-specific. I suggest we use a class-based approach for standardization, but we could have a method-based API to access these for cases where it makes sense.

@avdudchenko
Contributor Author

@dallan-keylogic yes, I think a lot of the basic structure/tooling exists but just needs to be standardized/packaged, as right now I don't actually see many flowsheets being built with these.

@andrewlee94 : To all your points yes, I think the key thing is standardization and ensuring all the tools can work with "control API" (e.g. Ahuora, UI, etc), specifically providing access to standard expected methods and data. Much of it really is just a metadata management issue.

For specific questions:

  1. FlowsheetBase would probably inherit from FlowsheetBlock to add the needed standard features (such as initialization routines for starting/terminal points and props/packages), or we could add the features to FlowsheetBlock directly. Not sure which would be less of a headache.
  2. FlowsheetUnit would really just be a metadata difference for the control API to differentiate by. In my thinking, FlowsheetBase/Unit would inherit from FlowsheetBlock and add the features needed by control APIs, plus standardized methods for controlling the Base/Unit. They have slight differences but can share a lot of functionality, which would be added onto FlowsheetBlock.

The key practical difference between FlowsheetBase and FlowsheetUnit is in the functions needed for a control API to manage the flowsheet. For example, a key difference is the initialization routines. The FlowsheetBase needs methods to initialize/scale starting points and termination points separately, so it's two separate functions. Meanwhile, a FlowsheetUnit should have only a single initialize function that initializes everything. This is because the control API would need to inject the initialization routines of all units in between the starting and terminal point initialization. So the initialization order would be: FlowsheetBase starting points (and props/etc.) -> all FlowsheetUnits -> FlowsheetBase terminal points (and props/etc., for example, a costing prop package). This could also be done through a config option and a single initialize function. In fact, we could have a single FlowsheetBlock class that supports many configurations, but that feels like it would get messy and is a case of overgeneralization; I think explicit structure might be better here.
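
The initialization order described here can be sketched as follows. This is a minimal stand-alone illustration, not IDAES code: `Base`, `Unit`, `initialize_starting_points`, `initialize_terminal_points`, and `initialize_all` are hypothetical names, and each method only records that it was called.

```python
class Unit:
    """Stand-in FlowsheetUnit with a single initialize method."""

    def __init__(self, name, log):
        self.name = name
        self._log = log

    def initialize(self):
        self._log.append(f"unit:{self.name}")


class Base:
    """Stand-in FlowsheetBase with separate start/terminal initialization."""

    def __init__(self, log):
        self._log = log

    def initialize_starting_points(self):
        self._log.append("base:start")

    def initialize_terminal_points(self):
        self._log.append("base:terminal")


def initialize_all(base, units):
    """Control-API role: inject unit initialization between the two base steps."""
    base.initialize_starting_points()
    for unit in units:
        unit.initialize()
    base.initialize_terminal_points()


log = []
initialize_all(Base(log), [Unit("pump", log), Unit("ro", log)])
print(log)
```

The two-method split on the base is what lets an outside caller place the unit initializations in the middle; with a single initialize method, that ordering would have to live inside the flowsheet itself.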

There might also be other use-specific functionalities that are needed for FlowsheetUnits vs FlowsheetBase, but they would have to be identified by stakeholders.

  3. Yes, we should use the new tooling for scaling - I need to look into it. I think a key problem right now with iscale is how difficult it is to rescale models after the fact. In many cases, we don't allow rescaling if a given unit model is already scaled (e.g. we check if a scaling factor exists, and only calculate scaling factors if it does not), so calling iscale.calculate_scaling_factors a second time does not help.
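
The rescaling problem can be reproduced in a few lines. The function below is a hypothetical stand-in written for this illustration, not the iscale implementation: it assumes a simple 1/|value| scaling rule and an `overwrite` flag that the description above suggests is effectively missing.

```python
def calculate_scaling_factors(variables, scaling_factors, overwrite=False):
    """Assign scaling factor 1/|value| to each variable.

    With overwrite=False, an existing factor is kept, so calling this again
    after the variable values change has no effect.
    """
    for name, value in variables.items():
        if name in scaling_factors and not overwrite:
            continue  # the "already scaled" guard described above
        scaling_factors[name] = 1.0 / abs(value) if value != 0 else 1.0
    return scaling_factors


variables = {"flow": 100.0}
factors = calculate_scaling_factors(variables, {})

# The operating point changes, so the old factor is now a poor choice.
variables["flow"] = 1000.0
calculate_scaling_factors(variables, factors)  # guard skips "flow"
stale = factors["flow"]

calculate_scaling_factors(variables, factors, overwrite=True)
fresh = factors["flow"]
print(stale, fresh)
```

Whether an explicit overwrite option, a reset step, or the newer Scaler classes is the right remedy is left open here; the sketch only shows why a second call is a no-op under the guard.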

I also agree on the suggestion to get stakeholders to provide input on what's needed - and how they would want to interact with an API. The proposal lacks details on API and implementation as I want to get everyone's perspective.

I believe current stakeholders are:
UI team
Project Ahuora
Members of IDAES/WaterTAP (Prommis?)

@andrewlee94
Member

andrewlee94 commented Oct 19, 2024

@avdudchenko I am still not sure I understand the distinction between FlowsheetBase and FlowsheetUnit - to my mind the result would be the same if you had one type of object and just nested them. In writing that, however, I think I might have just realized the distinction you are making; are you envisioning that FlowsheetBase will have distinct blocks representing the starting and terminal points (as components on the flowsheet), whereas FlowsheetUnits will just have Ports which extend those of the first/last unit in the sub-flowsheet?

If so, I think that we do not need to make that distinction - feed and product states can easily be viewed as just another component in the process (especially if we encourage the use of the Feed and Product unit operations). It should be noted that Feed and Product blocks are effectively glorified StateJunctions (and maybe they should inherit from that), so it is really just a matter of the flowsheet developer (or subflowsheet developer) needing to write their initialization routines based on the structure of their flowsheet.

To put it another way, the execution of the initialization for the two approaches is functionally the same on the inside; some initial conditions are set for the starting state(s) outside the initialization function (by the modeler for a top-level flowsheet feed, or by propagating a state for a sub-flowsheet), then we call the initialization routine that does whatever is necessary to take those starting states and propagate them through the entire (sub)flowsheet and ensure that all variables have a suitable initial value. Thus, to me the differences are internal to the initialization routine, and not at the API level.

ksbeattie added the Priority:Normal label Oct 24, 2024

@avdudchenko
Contributor Author

That makes sense - I have been sort of torn between treating feed/product etc. as just another unit or not. I think the other thing I did not mention is that FlowsheetBase provides a common ground/default properties for all units on the flowsheet, including costing etc. Maybe it's not so much a FlowsheetBase as FlowsheetBaseProperties?

Specifically, FlowsheetBase needs to:

  1. Provide default property packages all units can expect to accept/use
  2. Provide base costing package/other aggregation packages/variables that would be used/accessible by every unit model.
  3. Provide methods for global scaling of variables - but this might just need to be handled by a given Feed model block.
  4. Be accessible to every unit model, so they can access global property packages/options etc.

@andrewlee94
Member

andrewlee94 commented Nov 1, 2024

OK, based on that I think we actually already have a lot of the capability you are suggesting, and what we are mostly lacking is a set of standards and examples of how to use these.

  1. For the properties, we already have this as an optional capability in FlowsheetBlock. I do not think we have any examples of using it, but users (and developers) are free to decide whether or not to use it.
  2. I think this could be handled in a similar way to the above, as an optional capability on all flowsheets.
  3. This one I think should be done via the new Scaler tools rather than the flowsheet itself. However, all IDAES models (including FlowsheetBlocks) now have a default_Scaler argument that can be used to assign a default Scaler to use.
  4. I would suggest it would be better to do this the other way around. All unit models are aware of their parent flowsheet via the flowsheet() method, and I think it is best that they only look there. If they happen to be part of a sub-flowsheet, then it should be the responsibility of the sub-flowsheet to collect the necessary property packages/options from its parent (via its own flowsheet() method), etc. This allows you to set different defaults at different layers if necessary.
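
The layered lookup described in point 4 could work along these lines. This is a stand-alone sketch with stand-in classes: only the `flowsheet()` method name comes from the comment above, while `Block`, `default_property_package`, and `resolve_property_package` are hypothetical, and the stand-in returns None at the top level as an assumption of this example.

```python
class Block:
    """Stand-in for a unit or (sub-)flowsheet that knows its parent."""

    def __init__(self, name, parent=None, default_property_package=None):
        self.name = name
        self._parent = parent
        self.default_property_package = default_property_package

    def flowsheet(self):
        """Return the parent flowsheet (None at the top level in this sketch)."""
        return self._parent

    def resolve_property_package(self):
        """Use this layer's default if set, otherwise ask the parent only."""
        if self.default_property_package is not None:
            return self.default_property_package
        parent = self.flowsheet()
        if parent is None:
            raise ValueError(f"No default property package found for {self.name}")
        return parent.resolve_property_package()


top = Block("fs", default_property_package="seawater")
stage_1 = Block("stage_1", parent=top)  # inherits the top-level default
stage_2 = Block("stage_2", parent=top, default_property_package="brine")
pump = Block("pump", parent=stage_1)
ro = Block("ro", parent=stage_2)

print(pump.resolve_property_package(), ro.resolve_property_package())
```

Each block only ever talks to its immediate parent, so a sub-flowsheet can override a default for its own units without the units knowing how deep they are nested.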
