-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make MMM Plots tests run faster #363
Conversation
Codecov Report
@@ Coverage Diff @@
## main #363 +/- ##
=======================================
Coverage 93.62% 93.62%
=======================================
Files 20 20
Lines 1664 1664
=======================================
Hits 1558 1558
Misses 106 106 |
control_columns=["control_1", "control_2"], | ||
channel_columns=["channel_1", "channel_2"], | ||
) | ||
# fit the model | ||
mmm.fit( | ||
X=toy_X, | ||
y=toy_y, | ||
chains=2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to sample? In many CLV tests we create a dummy idata with prior draws as if they were posterior.
If you really need to sample, maybe also reduce tune?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think the bottleneck comes from https://github.com/pymc-labs/pymc-marketing/blob/main/pymc_marketing/model_builder.py#L308 as we are not use the samples
argument in the plot_prior
which is one of the slowest tests. @michaelraczycki can we add **kwargs
to be also passed to the sample_prior_predictive
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really wise to call all 3 sampling methods when "fitting" the model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this discussion might be beyond this PR as it is more about the fit
scope in the ModelBuilder
. I suggest we merge these small changes which make the ci time from 33min -> 27 min and then iterate on the prior_predictive_sample arguments. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure that is out of scope.
I still don't think we should be doing mcmc sampling at all just for a plotting test. Definitely don't need prior/posterior and posterior predictive draws?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't think we should be doing mcmc sampling at all just for a plotting test.
ok! so I then suggest we close this PR and I will open another one where I try to do it with dummy idata as suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make the additional sampling (prior and posterior predictive) optional, controlled by boolean parameters. That way we can make it more compact by people that actually want to get all samples straight away, and not force anyone to burn computing time if they don't want to. Fine with you? @juanitorduz off, I'll include it in the same PR, I'm also adjusting the docstrings in model_builder, as they need some more attention
I will try to avoid sampling at all for the plotting tests in a future PR. |
Closes: #361
📚 Documentation preview 📚: https://pymc-marketing--363.org.readthedocs.build/en/363/