Change priors for TVPs #775
Conversation
* add scaling into lift test method * test scaling methods * test change of MMM likelihood * link up beta_channel parameters * change name to sigma * reorganize and edit * closes #406 * address the feedback in docstrings * add more to docstring * format the docstring * be verbose for future devs * be explicit for the column max values * incorporate the feedback * hide cells and add to intro * add conclusion * switch to header 2 * run notebook through * move the types together * separate model estimated from empirical * fix typos
* drop python 3.9 * try python 3.12 * undo try python 3.12
* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually
* add more info to the notebook * hide plots code * fix plot y labels * fix plot outputs and remove model build * improve final note probability plots * address comments * use quickstart dataset * feedback part 3 * remove warnings manually * feedback part 4
* improve mmm docs init * add more code examples to docstrings * minor improvements * typo * better phrasing * add thomas suggestion
remove ruff E501 ignore
updates: - [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.3.7](astral-sh/ruff-pre-commit@v0.3.5...v0.3.7) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Juan Orduz <[email protected]>
* fix potential bugs * minor improvements * remove rule for nb * fix test * improve tests syntax * use stacklevel 2 for warnings * use stacklevel 1 for warnings as they are used in public methods * ignore B904 * undo change * ricardos feedback * use fit_posterior * Update pymc_marketing/clv/models/gamma_gamma.py Co-authored-by: Ricardo Vieira <[email protected]> * last fix XD --------- Co-authored-by: Ricardo Vieira <[email protected]>
* notebook opening and imports * model definition markdown * Data Load Notebook section * WIP model fitting section * added notebook to docs directory * notebook edits and graph code * ppc section and nb cleanup * demz sampling and WIP plotting * WIP predictive plots * WIP heatmap plots * predictive plots * WIP covariates and nbqa-ruff edits * covariate section * plot additions * fig sizes * remove model file
…book (#651) * add spaces, increase indentation, and fix number order * explicit with 6
* Creating plot waterfall Co-Authored-By: Carlos Trujillo <[email protected]> * requested changes * pre-commit --------- Co-authored-by: Carlos Trujillo <[email protected]>
Databricks should have a lower-case b.
updates: - [github.com/astral-sh/ruff-pre-commit: v0.4.1 → v0.4.2](astral-sh/ruff-pre-commit@v0.4.1...v0.4.2) - [github.com/pre-commit/mirrors-mypy: v1.9.0 → v1.10.0](pre-commit/mirrors-mypy@v1.9.0...v1.10.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* support for negative values and dates (if used) * fix terrible spelling * test dates in coords * cover numpy objects * consolidate the tests
* add tv intercept to readme * add to comparison table
* ignore non-implemented * remove not implemented error from abstract classes * simplify docstrings
Check out this pull request on ReviewNB: see visual diffs & provide feedback on Jupyter Notebooks.
Carlos made some recent changes, so there will have to be a rebase. Not only is there the intercept TVP kwargs (now config), but also the media TVP config.
return pm.Deterministic(f"{name}", 1 / ell_inv)

def approx_hsgp_hyperparams(x, x_center, lengthscale_range: list[float], cov_func: str):
This is taken directly from PyMC, right? Can this be imported? Or, if not, a note of its origins?
Maybe create an issue to remove it at some point, once the PyMC version guarantees its existence.
Thanks @bwengals! This is super nice, and I definitely appreciate your input on better GP priors. I left some comments that are more questions :)
It seems there is a merge conflict with a notebook so feel free to open a PR or rebase (whatever fits you well). I like the PC prior and we just need to define a nice fallback mechanism. We do not expect our users to be experts on GPs so that is a challenge. Once we crack that, we simply add a couple of tests and merge :)
One dimensional PC prior for GP lengthscales, parameterized by tail probability:
p(lengthscale < lower) = alpha.
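For anyone unfamiliar with this prior, the tail-probability parameterization above can be checked numerically. A common PC-prior form for a one-dimensional GP lengthscale has density pi(ls) = (lam/2) * ls^(-3/2) * exp(-lam * ls^(-1/2)), whose CDF is F(x) = exp(-lam * x^(-1/2)); setting lam = -sqrt(lower) * log(alpha) then gives p(lengthscale < lower) = alpha exactly. A minimal sketch, assuming that density form (the PR's exact implementation may differ):

```python
import numpy as np

def pc_lengthscale_samples(lower, alpha, size, rng):
    """Sample a PC-type lengthscale prior with P(ls < lower) = alpha.

    Assumes pi(ls) = (lam/2) * ls**-1.5 * exp(-lam * ls**-0.5),
    whose CDF F(x) = exp(-lam * x**-0.5) can be inverted directly.
    """
    lam = -np.sqrt(lower) * np.log(alpha)  # solves F(lower) = alpha
    u = rng.uniform(size=size)
    return (lam / -np.log(u)) ** 2         # inverse-CDF sampling

rng = np.random.default_rng(0)
samples = pc_lengthscale_samples(lower=1.0, alpha=0.1, size=200_000, rng=rng)
tail = np.mean(samples < 1.0)  # empirical tail mass, should be close to alpha
```

The heavy right tail (the density decays polynomially in ls) is what lets the data pick very long lengthscales when warranted, as seen in the linear-trend example below.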
This looks very nice! Would it be possible to get a bit more info about the nature of this prior and its benefits? This can help us (and our users) who are not familiar with it :)
# Note: this function sometimes fails, how to handle?
# If not using PC prior, should user set ls_mu and ls_sigma instead?
Do we have any simple heuristics for a reasonable fallback mechanism?
None that I know of. I think @ulfaslak told me he wrote something to fix this, though?
}
cov_func = eta**2 * cov_funcs[cov_func.lower()](input_dim=1, ls=ls)

gp = pm.gp.HSGP(m=[m], L=[L], cov_func=cov_func, drop_first=True)
Shall we pass drop_first from the function signature here?
I got hung up on the best default. There can be identifiability issues between the intercept and a GP, particularly for smaller intercepts (relative to the GP) and at longer lengthscales. This is evident in the HSGP: when c gets a bit larger, the first basis vector becomes more and more constant. For a default, I think the best option is to either keep the intercept parameter and set drop_first=True, or have drop_first=False and no intercept parameter.
I'm not sure which would be better, though; it depends on what the data "usually" looks like. If the intercept for the TVP is like 1000 and the GP part oscillates between -1 and 1, then there won't be identifiability issues, regardless of lengthscale. But if the intercept is like 0.2 and the GP oscillates between -1 and 1, then it'll be a problem.
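The "first basis vector becomes nearly constant" effect is easy to see numerically. The standard HSGP eigenfunctions on [-L, L] are phi_j(x) = sqrt(1/L) * sin(j * pi * (x + L) / (2L)); for j = 1 this reduces to cos(pi * x / (2L)) up to scale, which flattens toward a constant column as L grows relative to the data range. A toy check (numpy only; the basis formula is the usual HSGP sine basis, assumed here to match PyMC's):

```python
import numpy as np

def hsgp_basis_vector(x, j, L):
    """j-th HSGP eigenfunction on [-L, L] (standard sine basis)."""
    return np.sqrt(1.0 / L) * np.sin(np.pi * j * (x + L) / (2 * L))

x = np.linspace(-1, 1, 200)  # data lives on [-1, 1]

phi1_small_L = hsgp_basis_vector(x, j=1, L=1.2)  # boundary close to the data
phi1_large_L = hsgp_basis_vector(x, j=1, L=5.0)  # larger c => larger L

# Coefficient of variation: how far each column is from being constant.
cv_small = np.std(phi1_small_L) / np.mean(phi1_small_L)
cv_large = np.std(phi1_large_L) / np.mean(phi1_large_L)
# cv_large is far smaller: with L = 5 the first basis vector is nearly a
# constant column, i.e. nearly collinear with an explicit intercept term.
```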
pymc_marketing/mmm/tvp.py
Outdated
phi, sqrt_psd = gp.prior_linearized(Xs=X[:, None] - X_mid) | ||
hsgp_coefs = pm.Normal(f"{name}_hsgp_coefs", dims=hsgp_dims) | ||
f = phi @ (hsgp_coefs * sqrt_psd).T | ||
centered_f = f - f.mean(axis=0) + 1 |
Do we want this to oscillate around 1.0 as before?
I took out f - f.mean(axis=0) + 1 since I think that will make predicting into the future difficult. A better way to zero-mean this is to de-mean the basis phi directly. I think if z ~ Normal(0, 1), then exp(z) will have its median at 1, so it still oscillates around 1.0.
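The median claim is quick to verify: exp is monotone, so median(exp(z)) = exp(median(z)) = exp(0) = 1 for z ~ Normal(0, 1), even though the mean of exp(z) is exp(1/2), not 1. A numerical spot-check:

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.normal(loc=0.0, scale=1.0, size=500_000)

# exp is monotone, so the median maps through: median(exp(z)) = exp(0) = 1.
median_exp = np.median(np.exp(z))
# The mean does not: E[exp(z)] = exp(sigma^2 / 2) = exp(0.5), about 1.65.
mean_exp = np.mean(np.exp(z))
```

So the multiplier oscillates around 1.0 in the median sense, while its mean sits a bit above 1, which is worth keeping in mind when interpreting posterior summaries.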
@@ -146,7 +257,7 @@ def create_time_varying_intercept(
        )
        return pm.Deterministic(
            name="intercept",
-           var=intercept_base * multiplier,
+           var=pt.exp(intercept_base + multiplier),
I like this parametrization much better!
Hey team, can we deep-dive a bit more into this implementation?
In the original implementation, the mean of 𝛽𝑡 would be the **_beta (since the mean of the multiplier is 1). The variance of 𝛽𝑡 then depends on the variance of the multiplier.
Because of that, the initial implementation is more straightforward and interpretable (personal opinion), mostly around the response curves.
- What would be the benefit of this parameterization, if it is not more interpretable?
- Being exponential, it forces the final parameter to be positive. Depending on which parameter we want to be time dependent, this may or may not be needed.
- I think that if the result of **_base + latent is a normal distribution, there should be no problem: the result is a LogNormal distribution. But if it is Gamma or HalfNormal, the result would be an even more skewed distribution (this might be feasible because the config controls the base distributions). I'm not sure if it's a problem or not, but have we thought about how this transformation would apply to different arbitrary distributions?
A couple of questions that may be very obvious and I am a little slow to see the answers! 😅
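The distributional point can be sanity-checked by simulation: exponentiating a Normal base gives a LogNormal (the log of the result is Normal again, skewness near 0), whereas exponentiating a HalfNormal base gives a variable on [1, inf) whose log stays skewed. A rough check with a hand-rolled sample-skewness helper (not from any library; a sketch, not the PR's code):

```python
import numpy as np

def sample_skew(x):
    """Biased sample skewness: E[(x - mu)^3] / sigma^3."""
    mu, sd = np.mean(x), np.std(x)
    return np.mean((x - mu) ** 3) / sd**3

rng = np.random.default_rng(1)
normal_base = rng.normal(0.0, 1.0, size=200_000)
halfnormal_base = np.abs(rng.normal(0.0, 1.0, size=200_000))

# exp(Normal) is LogNormal: taking logs recovers a symmetric Normal.
log_of_exp_normal = np.log(np.exp(normal_base))
# exp(HalfNormal) lives on [1, inf): taking logs recovers a HalfNormal,
# which is clearly right-skewed.
log_of_exp_halfnormal = np.log(np.exp(halfnormal_base))
```

So on the log scale the Normal base stays well behaved, while skewed bases like HalfNormal or Gamma compound the asymmetry after exponentiation, which is the crux of the question above.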
# if model_config["intercept_tvp_kwargs"]["L"] is None:
#     model_config["intercept_tvp_kwargs"]["L"] = (
#         time_index_mid + DAYS_IN_YEAR / time_resolution
#     )
# if model_config["intercept_tvp_kwargs"]["ls_mu"] is None:
#     model_config["intercept_tvp_kwargs"]["ls_mu"] = (
#         DAYS_IN_YEAR / time_resolution * 2
#     )
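If these commented-out defaults are resurrected, one tidy pattern is to fill an entry only when the user left it as None. A sketch with illustrative placeholder values (the DAYS_IN_YEAR constant, time_index_mid, and time_resolution names are taken from the snippet above; their values here are assumptions for the example):

```python
DAYS_IN_YEAR = 365.25  # assumption: calendar constant used by the model

def fill_tvp_defaults(model_config, time_index_mid, time_resolution):
    """Fill L and ls_mu in-place, but only when they are unset (None)."""
    kwargs = model_config.setdefault("intercept_tvp_kwargs", {})
    if kwargs.get("L") is None:
        kwargs["L"] = time_index_mid + DAYS_IN_YEAR / time_resolution
    if kwargs.get("ls_mu") is None:
        kwargs["ls_mu"] = DAYS_IN_YEAR / time_resolution * 2
    return model_config

# User set ls_mu explicitly; only L gets the computed default.
config = {"intercept_tvp_kwargs": {"L": None, "ls_mu": 10.0}}
config = fill_tvp_defaults(config, time_index_mid=52, time_resolution=7)
```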
Is this to be deleted?
Are we using the DAYS_IN_YEAR at all? 🤔
Gotcha, thanks @wd60622. I'll rebase and check out those updates. This should benefit that as well.
View / edit / reply to this conversation on ReviewNB. juanitorduz commented on 2024-06-25T07:58:52Z, on Line #12: sys.path.insert(0, "../../../../") why do you need this :) ?
@bwengals now that we have released 0.7.0 after some refactoring, I think this is a great time to work on these improvements :) Would you mind rebasing or (maybe even better, to avoid fixing conflicts) creating a new PR to work on the integration? We can also split this into smaller PRs to work in shorter iterations :)
I will take this over using
f9a38fd to 818ba39
Description
This PR changes the TVP HSGP prior parameters for the scale and lengthscale to default to the PC prior, while still allowing a switch to the Inverse Gamma.
Posting this PR as a draft since I got pretty hung up on how best to integrate this into PyMC-Marketing, so hopefully I can get some guidance on how to do that.
I also reran the results in the TVP example nb. The first sine wave example looks OK to me, but the GP just reverts to its mean (which looks to be estimated a bit off). The second example with the linear trend looks much, much better. The PC prior allows the data to choose a very long lengthscale now. The third example shouldn't change much; a short lengthscale will work best here. I do think it'd be nice to include an example where the GP can really shine, where the TVP varies irregularly with a moderate frequency.
Related Issue
Checklist
Modules affected
Type of change
📚 Documentation preview 📚: https://pymc-marketing--775.org.readthedocs.build/en/775/