Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeated freezes cause repeated computation #102

Open
inducer opened this issue Oct 12, 2021 · 2 comments
Open

Repeated freezes cause repeated computation #102

inducer opened this issue Oct 12, 2021 · 2 comments

Comments

@inducer
Copy link
Owner

inducer commented Oct 12, 2021

(Maybe this is an arraycontext issue? Not sure.)

Suppose a moderately complex expression is built and evaluated. Then it is evaluated again. Repeated codegen (most likely cached, though see inducer/pytato#163) and repeated evaluation ensues. This isn't ideal. Possibly, a symbolic array, once evaluated, should hold on to its value, to avoid reevaluation. If we don't deal with this somehow, then pytato's arrays can't be meaningfully memoized (e.g. with @memoize/@memoize_method from pytools), as, sure, they memoize the expression, but the actual evaluation is done again and again.

A simple approach to this would be simply stuff the evaluated value into a hidden attribute (_evaluated?) on the expression nodes. I'm aware that this introduces mutation into otherwise immutable objects... but it's the cache-y kind of mutability, and it doesn't change the semantics of the object in the slightest. So I suspect it's OK.

Another possible concern is the lifetime of the evaluated value (i.e. it might outstay its welcome). I don't think that'll be a big concern because

  • somebody held on to the symbolic value, demonstrating that there's at least potential interest in reevaluating.
  • if it becomes an issue, we could make an interface to free the stored value, enabling the memory to be freed.

Generally, codegen should stop at already-evaluated notes (and treat them as DataWrappers).

@MTCam is hitting this with cache retrievals upon freeze when exercising lazy evaluation for chemistry in https://github.com/illinois-ceesd/mirgecom.

cc @kaushikcfd

@kaushikcfd
Copy link
Collaborator

kaushikcfd commented Oct 13, 2021

Some other things to keep in mind:

  1. The evaluated results must maintain a mapping from the target to its evaluated results
  2. To ensure that an array can be statically evaluated, only arrays that exclusively depend on DataWrappers should be evaluated.
  3. This also leads to pytato making assumptions that the DataWrapper contains immutable data, which should be noted in the docs.

Slightly related to arraycontext:
Personally, I've overcome this by doing a thaw(freeze(...)) the time-stepper loop in array context, as generally for me the expressions that were meant to be folded into constants were forming the sub-expressions of a bigger expression that was actually frozen. And so, the evaluation being mentioned here wouldn't be of much relevance in those cases.

@inducer
Copy link
Owner Author

inducer commented Oct 13, 2021

Agree with all three of your points. Thinking about it more has convinced me that this actually is an arraycontext issue, and pytato probably shouldn't contribute to its solution. In particular, pytato nodes shouldn't hold references to their values (because, as you point out, they're target-specific).

A few more thoughts on this:

  • If values are computed via freeze (i.e. with compilation "on the fly", not via precompilation), a lower bound on the cost is set by needing to compare the DAG against one that was fished out of the cache. This means that computing large DAGs this way can't be efficient, so we should probably avoid it as much as possible.
  • Should we set a threshold for the size of DAGs computed in freeze above which we warn about efficiency?
  • Since the nodes themselves won't hold references to values, another option would be a WeakKeyDictionary held by the array context that keeps them alive as long as the expression is.
  • To discover that a value already exists, we need to traverse the whole DAG. Since we already need to traverse the DAG for equality comparison, maybe those two passes can be merged?

Your third point is already covered AFAICT:

https://github.com/inducer/pytato/blob/08ba9271dcf4a61b92366800aa112e9f0d4e32dc/pytato/array.py#L1587-L1588

@inducer inducer transferred this issue from inducer/pytato Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants