Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethinking the GraphBuilder #176

Open
jobrachem opened this issue Jan 17, 2024 · 2 comments
Open

Rethinking the GraphBuilder #176

jobrachem opened this issue Jan 17, 2024 · 2 comments
Labels
comp:model This issue is related to the model module team Work processes and organization of the team

Comments

@jobrachem
Copy link
Contributor

In the recent weeks, there have been a number of issues that are in some way related to the lsl.GraphBuilder class.

I think it may be worthwile to have a look at the GraphBuilder and see how it might evolve in the future.

Three claims

To start off the discussion, I claim the following:

  1. The GraphBuilder class is not in fact where we build the Liesel graph.
  2. The GraphBuilder instead offers methods to change an existing graph or to change nodes in that graph.
  3. A large chunk of the functionality of the GraphBuilder is concentrated in private methods that are being used in the GraphBuilder.build_model method.

Details on the claims

Claim 1

The typical way you build a model in Liesel is by initializing variables, connecting them with calculators. Only once in this process, and most likely near the end of it, do you call the GraphBuilder by adding the "terminal nodes": nodes without outputs that can be used to build the whole graph by recursively following their inputs. In my usage, I often have a one-liner of the form:

model = lsl.GraphBuilder().add(response).build_model()

Claim 2

The graph- or var-manipulating methods of the graph builder are:

  • GraphBuilder.replace_x (x can be "node" or "var")
  • GraphBuilder.rename_x (x can be "node" or "var")
  • GraphBuilder.transform

Claim 3

Here's an excerpt from the GraphBuilder.build_model method:

        gb._set_missing_names()
        gb._add_model_log_lik_node()
        gb._add_model_log_prior_node()
        gb._add_model_log_prob_node()
        gb._add_model_seed_nodes()

        nodes, _vars = gb._all_nodes_and_vars()
        nodes_and_vars = nodes + _vars

        model = Model(nodes_and_vars, grow=False, copy=copy)

What follows from these claims?

For me, it currently seems like the GraphBuilder may not need to be its own class. I think its functionality can be allocated as follows:

  1. Var- and node-changing functionality can live directly on Vars and nodes.
  2. Model setup can be done in the Model init. As a side node, the issue addressed in Raise error if duplicate nodes are detected in GraphBuilder.transform #167 can be solved in the process by requiring users to exclusively provide the "terminal nodes" to the model class, such that the graph is always built from the set of minimally required nodes by recursively following their inputs.
  3. Methods for changing an existing model could become independent functions instead. This would only really apply to the GraphBuilder.replace_x methods, as far as I can see. I am aware that the current model class is intentionally static.

Concluding words

This issue is not intended as a fully worked-out solution to only be approved. It is instead intended to provide a basis for discussion about how the GraphBuilder might evolve.

@jobrachem jobrachem added comp:model This issue is related to the model module team Work processes and organization of the team labels Jan 17, 2024
@jobrachem
Copy link
Contributor Author

jobrachem commented Jan 17, 2024

We had a first exchange of thoughts on the matter. If we tackle the GraphBuilder, we will go through three stages:

  1. Decision about whether we want to tackle a bigger change to the GraphBuilder. We are in this stage. If "no", the process ends here.
  2. Decision about what this change should look like.
  3. Decision about the time-frame: When do we want to implement the change?

Here are some notes from today's meeting.

  • There is not much disagreement about the claims.
  • Paul originally thought of a GraphBuilder as some sort of context manager: Nodes that are created within the context manager are recognized automatically, such that you do not have to add your "terminal nodes" manually.
    • This is not how it works today: You do add your terminal node to the graph builder manually.
    • Johannes worries about using a context manager for building a large model: For one thing, the indentation is annoying, if the model code becomes large. Another point (not discussed): Using a context manager makes it hard to pre-package model parts or build up a model in a distributed way.
    • Generally, the issue is: How can we make model building both smooth and understandable?
  • Another original idea was: Every node belongs to one GraphBuilder only.
    • Personal note from me: We have not discussed this yet, but I am not entirely sure why this is important.
  • The discussion about "what follows" from the claims has not really started at this point.

@jobrachem
Copy link
Contributor Author

@wiep @GianmarcoCallegher, my notes are included above. Feel free to comment for additions or corrections.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:model This issue is related to the model module team Work processes and organization of the team
Projects
None yet
Development

No branches or pull requests

1 participant