Change priors for TVPs #775
Conversation
* add scaling into lift test method * test scaling methods * test change of MMM likelihood * link up beta_channel parameters * change name to sigma * reorganize and edit * closes #406 * address the feedback in docstrings * add more to docstring * format the docstring * be verbose for future devs * be explicit for the column max values * incorporate the feedback * hide cells and add to intro * add conclusion * switch to header 2 * run notebook through * move the types together * separate model estimated from empirical * fix typos
* drop python 3.9 * try python 3.12 * undo try python 3.12
* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually
* add more info to the notebook * hide plots code * fix plot y labels * fix plot outputs and remove model build * improve final note probability plots * address comments * use quickstart dataset * feedback part 3 * remove warnings manually * feedback part 4
* improve mmm docs init * add more code examples to docstrings * minor improvements * typo * better phrasing * add thomas suggestion
remove ruff E501 ignore
updates: - [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.3.7](astral-sh/ruff-pre-commit@v0.3.5...v0.3.7) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Juan Orduz <[email protected]>
* fix potential bugs * minor improvements * remove rule for nb * fix test * improve tests syntax * use stacklevel 2 for warnings * use stacklevel 1 for warnings as they are used in public methods * ignore B904 * undo change * ricardos feedback * use fit_posterior * Update pymc_marketing/clv/models/gamma_gamma.py Co-authored-by: Ricardo Vieira <[email protected]> * last fix XD --------- Co-authored-by: Ricardo Vieira <[email protected]>
* notebook opening and imports * model definition markdown * Data Load Notebook section * WIP model fitting section * added notebook to docs directory * notebook edits and graph code * ppc section and nb cleanup * demz sampling and WIP plotting * WIP predictive plots * WIP heatmap plots * predictive plots * WIP covariates and nbqa-ruff edits * covariate section * plot additions * fig sizes * remove model file
…book (#651) * add spaces, increase indentation, and fix number order * explicit with 6
* Creating plot waterfall Co-Authored-By: Carlos Trujillo <[email protected]> * requested changes * pre-commit --------- Co-authored-by: Carlos Trujillo <[email protected]>
Databricks should have a lower-case b.
updates: - [github.com/astral-sh/ruff-pre-commit: v0.4.1 → v0.4.2](astral-sh/ruff-pre-commit@v0.4.1...v0.4.2) - [github.com/pre-commit/mirrors-mypy: v1.9.0 → v1.10.0](pre-commit/mirrors-mypy@v1.9.0...v1.10.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* support for negative values and dates (if used) * fix terrible spelling * test dates in coords * cover numpy objects * consolidate the tests
* add tv intercept to readme * add to comparison table
* ignore non-implemented * remove not implemented error from abstract classes * simplify docstrings
Check out this pull request on ReviewNB: see visual diffs & provide feedback on Jupyter Notebooks.
Carlos made some recent changes, so there will have to be a rebase. Not only is there the intercept TVP kwargs (now config), but also the media TVP config.
return pm.Deterministic(f"{name}", 1 / ell_inv)

def approx_hsgp_hyperparams(x, x_center, lengthscale_range: list[float], cov_func: str):
This is taken directly from PyMC, right? Can this be imported? Or, if not, a note of its origins?
Maybe create an issue to remove it at some point, once the PyMC version guarantees its existence.
Thanks @bwengals! This is super nice, and I definitely appreciate your input on better GP priors. I left some comments that are more questions :)
It seems there is a merge conflict with a notebook so feel free to open a PR or rebase (whatever fits you well). I like the PC prior and we just need to define a nice fallback mechanism. We do not expect our users to be experts on GPs so that is a challenge. Once we crack that, we simply add a couple of tests and merge :)
One dimensional PC prior for GP lengthscales, parameterized by tail probability:
p(lengthscale < lower) = alpha.
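For anyone unfamiliar with this prior, the tail-probability parameterization above can be checked numerically. A common PC-prior form for a one-dimensional GP lengthscale has density pi(ls) = (lam/2) * ls^(-3/2) * exp(-lam * ls^(-1/2)), whose CDF is F(x) = exp(-lam * x^(-1/2)); setting lam = -sqrt(lower) * log(alpha) then gives p(lengthscale < lower) = alpha exactly. A minimal sketch, assuming that density form (the PR's exact implementation may differ):

```python
import numpy as np

def pc_lengthscale_samples(lower, alpha, size, rng):
    """Sample a PC-type lengthscale prior with P(ls < lower) = alpha.

    Assumes pi(ls) = (lam/2) * ls**-1.5 * exp(-lam * ls**-0.5),
    whose CDF F(x) = exp(-lam * x**-0.5) can be inverted directly.
    """
    lam = -np.sqrt(lower) * np.log(alpha)  # solves F(lower) = alpha
    u = rng.uniform(size=size)
    return (lam / -np.log(u)) ** 2         # inverse-CDF sampling

rng = np.random.default_rng(0)
samples = pc_lengthscale_samples(lower=1.0, alpha=0.1, size=200_000, rng=rng)
tail = np.mean(samples < 1.0)  # empirical tail mass, should be close to alpha
```

The heavy right tail (the density decays polynomially in ls) is what lets the data pick very long lengthscales when warranted, as seen in the linear-trend example below.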
This looks very nice! Would it be possible to get a bit more info about the nature of this prior and its benefits? This can help us (and our users) who are not familiar with it :)
# Note: this function sometimes fails, how to handle?
# If not using PC prior, should user set ls_mu and ls_sigma instead?
Do we have any simple heuristics for a reasonable fallback mechanism?
None that I know of. I think @ulfaslak told me he wrote something to fix this, though?
}
cov_func = eta**2 * cov_funcs[cov_func.lower()](input_dim=1, ls=ls)

gp = pm.gp.HSGP(m=[m], L=[L], cov_func=cov_func, drop_first=True)
Shall we pass drop_first from the function signature here?
I got hung up on the best default. There can be identifiability issues between the intercept and a GP, particularly for smaller intercepts (relative to the GP) and at longer lengthscales. This is evident in the HSGP: when c gets a bit larger, the first basis vector becomes more and more constant. For a default, I think the best option is to either keep the intercept parameter and set drop_first=True, or have drop_first=False and no intercept parameter.
I'm not sure which would be better, though; it depends on what the data "usually" looks like. If the intercept for the TVP is like 1000 and the GP part oscillates between -1 and 1, then there won't be identifiability issues, regardless of lengthscale. But if the intercept is like 0.2 and the GP oscillates between -1 and 1, then it'll be a problem.
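The "first basis vector becomes nearly constant" effect is easy to see numerically. The standard HSGP eigenfunctions on [-L, L] are phi_j(x) = sqrt(1/L) * sin(j * pi * (x + L) / (2L)); for j = 1 this reduces to cos(pi * x / (2L)) up to scale, which flattens toward a constant column as L grows relative to the data range. A toy check (numpy only; the basis formula is the usual HSGP sine basis, assumed here to match PyMC's):

```python
import numpy as np

def hsgp_basis_vector(x, j, L):
    """j-th HSGP eigenfunction on [-L, L] (standard sine basis)."""
    return np.sqrt(1.0 / L) * np.sin(np.pi * j * (x + L) / (2 * L))

x = np.linspace(-1, 1, 200)  # data lives on [-1, 1]

phi1_small_L = hsgp_basis_vector(x, j=1, L=1.2)  # boundary close to the data
phi1_large_L = hsgp_basis_vector(x, j=1, L=5.0)  # larger c => larger L

# Coefficient of variation: how far each column is from being constant.
cv_small = np.std(phi1_small_L) / np.mean(phi1_small_L)
cv_large = np.std(phi1_large_L) / np.mean(phi1_large_L)
# cv_large is far smaller: with L = 5 the first basis vector is nearly a
# constant column, i.e. nearly collinear with an explicit intercept term.
```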
pymc_marketing/mmm/tvp.py
Outdated
phi, sqrt_psd = gp.prior_linearized(Xs=X[:, None] - X_mid) | ||
hsgp_coefs = pm.Normal(f"{name}_hsgp_coefs", dims=hsgp_dims) | ||
f = phi @ (hsgp_coefs * sqrt_psd).T | ||
centered_f = f - f.mean(axis=0) + 1 |
Do we want this to oscillate around 1.0 as before?
I took out f - f.mean(axis=0) + 1 since I think that will make predicting into the future difficult. A better way to zero-mean this is to de-mean the basis phi directly. I think if z ~ Normal(0, 1), then exp(z) will have its median at 1, so it still oscillates around 1.0.
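The median claim is quick to verify: exp is monotone, so median(exp(z)) = exp(median(z)) = exp(0) = 1 for z ~ Normal(0, 1), even though the mean of exp(z) is exp(1/2), not 1. A numerical spot-check:

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.normal(loc=0.0, scale=1.0, size=500_000)

# exp is monotone, so the median maps through: median(exp(z)) = exp(0) = 1.
median_exp = np.median(np.exp(z))
# The mean does not: E[exp(z)] = exp(sigma^2 / 2) = exp(0.5), about 1.65.
mean_exp = np.mean(np.exp(z))
```

So the multiplier oscillates around 1.0 in the median sense, while its mean sits a bit above 1, which is worth keeping in mind when interpreting posterior summaries.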
@@ -146,7 +257,7 @@ def create_time_varying_intercept(
        )
        return pm.Deterministic(
            name="intercept",
-           var=intercept_base * multiplier,
+           var=pt.exp(intercept_base + multiplier),
I like this parametrization much better!
Hey team, can we deep-dive a bit more into this implementation?
In the original implementation, the mean of 𝛽𝑡 would be the **_beta (since the mean of the multiplier is 1). The variance of 𝛽𝑡 then depends on the variance of the multiplier.
Because of that, the initial implementation is more straightforward and interpretable (personal opinion), mostly around the response curves.
- What would be the benefit of this parameterization, if it is not more interpretable?
- Being exponential, it forces the final parameter to be positive. Depending on which parameter we want to be time dependent, this may or may not be needed.
- I think that if the result of **_base + latent is a normal distribution, there should be no problem: the result is a LogNormal distribution. But if it is Gamma or HalfNormal, the result would be an even more skewed distribution (this might be feasible because the config controls the base distributions). I'm not sure if it's a problem or not, but have we thought about how this transformation would apply to different arbitrary distributions?
A couple of questions that may be very obvious and I am a little slow to see the answers! 😅
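The distributional point can be sanity-checked by simulation: exponentiating a Normal base gives a LogNormal (the log of the result is Normal again, skewness near 0), whereas exponentiating a HalfNormal base gives a variable on [1, inf) whose log stays skewed. A rough check with a hand-rolled sample-skewness helper (not from any library; a sketch, not the PR's code):

```python
import numpy as np

def sample_skew(x):
    """Biased sample skewness: E[(x - mu)^3] / sigma^3."""
    mu, sd = np.mean(x), np.std(x)
    return np.mean((x - mu) ** 3) / sd**3

rng = np.random.default_rng(1)
normal_base = rng.normal(0.0, 1.0, size=200_000)
halfnormal_base = np.abs(rng.normal(0.0, 1.0, size=200_000))

# exp(Normal) is LogNormal: taking logs recovers a symmetric Normal.
log_of_exp_normal = np.log(np.exp(normal_base))
# exp(HalfNormal) lives on [1, inf): taking logs recovers a HalfNormal,
# which is clearly right-skewed.
log_of_exp_halfnormal = np.log(np.exp(halfnormal_base))
```

So on the log scale the Normal base stays well behaved, while skewed bases like HalfNormal or Gamma compound the asymmetry after exponentiation, which is the crux of the question above.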
# if model_config["intercept_tvp_kwargs"]["L"] is None:
#     model_config["intercept_tvp_kwargs"]["L"] = (
#         time_index_mid + DAYS_IN_YEAR / time_resolution
#     )
# if model_config["intercept_tvp_kwargs"]["ls_mu"] is None:
#     model_config["intercept_tvp_kwargs"]["ls_mu"] = (
#         DAYS_IN_YEAR / time_resolution * 2
#     )
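If these commented-out defaults are resurrected, one tidy pattern is to fill an entry only when the user left it as None. A sketch with illustrative placeholder values (the DAYS_IN_YEAR constant, time_index_mid, and time_resolution names are taken from the snippet above; their values here are assumptions for the example):

```python
DAYS_IN_YEAR = 365.25  # assumption: calendar constant used by the model

def fill_tvp_defaults(model_config, time_index_mid, time_resolution):
    """Fill L and ls_mu in-place, but only when they are unset (None)."""
    kwargs = model_config.setdefault("intercept_tvp_kwargs", {})
    if kwargs.get("L") is None:
        kwargs["L"] = time_index_mid + DAYS_IN_YEAR / time_resolution
    if kwargs.get("ls_mu") is None:
        kwargs["ls_mu"] = DAYS_IN_YEAR / time_resolution * 2
    return model_config

# User set ls_mu explicitly; only L gets the computed default.
config = {"intercept_tvp_kwargs": {"L": None, "ls_mu": 10.0}}
config = fill_tvp_defaults(config, time_index_mid=52, time_resolution=7)
```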
Is this to be deleted?
Are we using the DAYS_IN_YEAR at all? 🤔
Gotcha, thanks @wd60622. I'll rebase and check out those updates. This should benefit that as well.
View / edit / reply to this conversation on ReviewNB. juanitorduz commented on 2024-06-25T07:58:52Z, on Line #12: sys.path.insert(0, "../../../../") why do you need this :) ?
@bwengals now that we have released 0.7.0 after some refactoring, I think this is a great time to work on these improvements :) Would you mind rebasing or (maybe even better, to avoid fixing conflicts) creating a new PR to work on the integration? We can also split this into smaller PRs to work in shorter iterations :)
I will take this over using
f9a38fd to 818ba39
Description
This PR changes the TVP HSGP prior parameters for the scale and lengthscale to default to the PC prior, while still allowing a switch to the Inverse Gamma.
Posting this PR as a draft since I got pretty hung up on how best to integrate this into PyMC-Marketing, so hopefully I can get some guidance on how to do that.
I also reran the results in the TVP example nb. The first sine wave example looks OK to me, but the GP just reverts to its mean (which looks to be estimated a bit off). The second example with the linear trend looks much, much better. The PC prior allows the data to choose a very long lengthscale now. The third example shouldn't change much; a short lengthscale will work best here. I do think it'd be nice to include an example where the GP can really shine, where the TVP varies irregularly with a moderate frequency.
Related Issue
Checklist
Modules affected
Type of change
📚 Documentation preview 📚: https://pymc-marketing--775.org.readthedocs.build/en/775/