Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time varying intercept #628

Merged
merged 53 commits into from
May 1, 2024
Merged

Time varying intercept #628

merged 53 commits into from
May 1, 2024

Conversation

ulfaslak
Copy link
Contributor

@ulfaslak ulfaslak commented Apr 15, 2024

This PR adds a time varying coefficient for the intercept. It's near-identical to #598, except it does not add support for time varying media contributions (this, we will add later).

Description

The API is now:

mmm = DelayedSaturatedMMM(
    date_column=my_date_column_name,
    channel_columns=my_media_column_names,
    time_varying_intercept=True,            # <-- This is new
)

Moreover, the model_config can now contain arguments for the time varying prior:

mmm = DelayedSaturatedMMM(
    ...
    model_config={
        ...,
        "tvp_kwargs": {
            "m": 200,
            "L": 100,  # If left to None, is estimated from the data (supporting 1 year of future OOS prediction)  
            "eta_lam": 1,
            "ls_mu": 52,  # If left to None, is estimated from the data (supporting 1 year of future OOS prediction)    
            "ls_sigma": 10,
            "cov_func": myCovFunc,  # If left to None, default Matern52 is used (good default).
        },
    }
)

Changes are also made to the plotting and data setter functions, to support out of sample predictions.

Checklist

Modules affected

  • MMM
  • CLV

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pymc-marketing--628.org.readthedocs.build/en/628/

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

codecov bot commented Apr 15, 2024

Codecov Report

Attention: Patch coverage is 94.52055% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 91.78%. Comparing base (e63929a) to head (f0e59e6).

Files Patch % Lines
pymc_marketing/mmm/base.py 83.33% 3 Missing ⚠️
pymc_marketing/mmm/delayed_saturated_mmm.py 93.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #628      +/-   ##
==========================================
+ Coverage   91.73%   91.78%   +0.05%     
==========================================
  Files          22       24       +2     
  Lines        2322     2386      +64     
==========================================
+ Hits         2130     2190      +60     
- Misses        192      196       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@wd60622 wd60622 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the API change is fine. Just some comments to package up the code a bit more with extendability and reusability in mind in the future. Hopefully then being available for other cases and useful even with upcoming API changes

Would love to hear the thoughts

pymc_marketing/mmm/utils.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/tvp.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/delayed_saturated_mmm.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/base.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/delayed_saturated_mmm.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@juanitorduz juanitorduz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulfaslak Super nice PR and fun to review! I left some small comments :)

pymc_marketing/mmm/base.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/base.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/delayed_saturated_mmm.py Show resolved Hide resolved
pymc_marketing/mmm/delayed_saturated_mmm.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/delayed_saturated_mmm.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/tvp.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/tvp.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/tvp.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/tvp.py Outdated Show resolved Hide resolved
pymc_marketing/mmm/utils.py Outdated Show resolved Hide resolved
@juanitorduz
Copy link
Collaborator

@ulfaslak can you please the notebook name in https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/docs/source/notebooks/index.md so that it gets passed tot he docs 🙏

@juanitorduz juanitorduz added docs Improvements or additions to documentation enhancement New feature or request MMM labels Apr 16, 2024
@juanitorduz
Copy link
Collaborator

@ulfaslak you might also wanna update the branch as we solved an issue in #633 🙏

Copy link

review-notebook-app bot commented Apr 16, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:09Z
----------------------------------------------------------------

can you add black lines so that the items are spaced correctly :)


Copy link

review-notebook-app bot commented Apr 16, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:10Z
----------------------------------------------------------------

Would you mind adding

warnings.filterwarnings("ignore")

az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [12, 7]
plt.rcParams["figure.dpi"] = 100

%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = "retina"

To make the style consistent :)


Copy link

review-notebook-app bot commented Apr 16, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:10Z
----------------------------------------------------------------

  • can you explain a bit what the do operator does for users who are not familiar with it?
  • Can you comment on the choice of the true_params?Think about someone who just knows the concepts from the MMM example
  • add x and y labels to the plot :)

Copy link

review-notebook-app bot commented Apr 16, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:11Z
----------------------------------------------------------------

Line #30.    plt.title("Target")

can we use the fig, ax - plt.subplots() syntax?


Copy link

review-notebook-app bot commented Apr 16, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:12Z
----------------------------------------------------------------

Line #7.        sampler_config={"chains": 2, "draws": 10, "tune": 10, "nuts_sampler": "numpyro"},

Just 10 draws?


Copy link

review-notebook-app bot commented Apr 16, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:13Z
----------------------------------------------------------------

Can you please give more explanation to these plots (titles, x and y labels ?) :D

Also, final words on the key takeaways :)


@ulfaslak
Copy link
Contributor Author

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-16T10:42:12Z ----------------------------------------------------------------

Line #7. sampler_config={"chains": 2, "draws": 10, "tune": 10, "nuts_sampler": "numpyro"},
Just 10 draws?

ya that was for debugging. I'm revising this whole notebook and making it nice now :). Thanks for all the comments!

If provided `X` to `_data_setter` was *not* the immediate sequence following the training set, the time_index would we wrong. With this fix, the `date["time_index"]` which gets set upon providing a new `X`, is inferred from the `self.date_column` column of provided `X` (by comparing it to same date column in the training data `self.X`).
@ulfaslak
Copy link
Contributor Author

There is one test failing tests/mmm/test_tvp.py::test_calling_without_default_args

I'll address this tomorrow.

SO CLOSE!!

Copy link
Collaborator

That would be an interesting observation indeed!


View entire conversation on ReviewNB

Copy link

review-notebook-app bot commented Apr 30, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-04-30T21:07:01Z
----------------------------------------------------------------

These zoom in plots are great! thanks for adding them !!!


ulfaslak commented on 2024-05-01T07:36:06Z
----------------------------------------------------------------

I'll start using ax.annotate ALL THE TIME from now on 😈

@juanitorduz
Copy link
Collaborator

There is one test failing tests/mmm/test_tvp.py::test_calling_without_default_args

I'll address this tomorrow.

SO CLOSE!!

The notebook looks amazing! There are a couple of minor comments open in the notebook, but they are not blockers, IMO (it would be nice to test if, in this case, the out-of-sample forecast accuracy breaks after one length scale). So, once you are happy with the notebook and you fix the test, we can merge this amazing addition! :)

@juanitorduz
Copy link
Collaborator

(tomorrow is 1st of May so maybe you do it on Thursday and enjoy the day :D )

Copy link
Contributor

I love the new introduction!


View entire conversation on ReviewNB

Copy link
Contributor Author

ulfaslak commented May 1, 2024

Maaa maybe. I think in reality things are more complicated than that and we can't make an observation that simple. The default length-scale is 2 years (104.36 when data is weekly, 730.5 when it is daily)... When I lower length-scale in the last example, to one year, OOS uncertainty actually get smaller. I could try to dig into the relationship between LS and OOS performance, but the finding and nuanced answer would probably be *it depends*, and then—to bake that into this notebook—would require a new section. That might clutter the narrative and would take me a bunch of time = slow down release. So while interesting that you note this, I am against adding it for now (could be material for another time though, let's take note).


View entire conversation on ReviewNB

Copy link
Contributor Author

ulfaslak commented May 1, 2024

I'll start using ax.annotate ALL THE TIME from now on 😈


View entire conversation on ReviewNB

@juanitorduz
Copy link
Collaborator

Maaa maybe. I think in reality things are more complicated than that and we can't make an observation that simple. The default length-scale is 2 years (104.36 when data is weekly, 730.5 when it is daily)... When I lower length-scale in the last example, to one year, OOS uncertainty actually get smaller. I could try to dig into the relationship between LS and OOS performance, but the finding and nuanced answer would probably be it depends, and then—to bake that into this notebook—would require a new section. That might clutter the narrative and would take me a bunch of time = slow down release. So while interesting that you note this, I am against adding it for now (could be material for another time though, let's take note).

View entire conversation on ReviewNB

You are right! Let's skip this for this first iteration :)

@ulfaslak
Copy link
Contributor Author

ulfaslak commented May 1, 2024

(tomorrow is 1st of May so maybe you do it on Thursday and enjoy the day :D )

Appreciate the gesture 🙏. I'm already two cups of coffee deep though, so I might as well get it wrapped up 😅

@ulfaslak
Copy link
Contributor Author

ulfaslak commented May 1, 2024

@juanitorduz @wd60622 @carlosagostini OK I think it's done now. Fixed the test and did some last minute copy fixes using grammarly (lotta typos when writing in free hand in code editor). Letting the CI spinners converge now, and hopefully we can merge today on May 1st 🎉.

@juanitorduz
Copy link
Collaborator

@juanitorduz @wd60622 @carlosagostini OK I think it's done now. Fixed the test and did some last minute copy fixes using grammarly (lotta typos when writing in free hand in code editor). Letting the CI spinners converge now, and hopefully we can merge today on May 1st 🎉.

How do you see Grammarly on notebooks? Or do you convert it to markdown?

@juanitorduz
Copy link
Collaborator

Fantastic 🚀 !

@wd60622 do you have any additional comments :) ?

Copy link
Contributor

@wd60622 wd60622 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Epic

@juanitorduz juanitorduz merged commit f0097e2 into main May 1, 2024
10 checks passed
@juanitorduz juanitorduz deleted the time-varying-intercept branch May 1, 2024 09:38
@ulfaslak
Copy link
Contributor Author

ulfaslak commented May 1, 2024

@juanitorduz @wd60622 @carlosagostini OK I think it's done now. Fixed the test and did some last minute copy fixes using grammarly (lotta typos when writing in free hand in code editor). Letting the CI spinners converge now, and hopefully we can merge today on May 1st 🎉.

How do you see Grammarly on notebooks? Or do you convert it to markdown?

I have Grammarly as a Chrome extension, and then I just load the notebook in the browser the old school way.

@wd60622 wd60622 mentioned this pull request Jun 3, 2024
@ulfaslak ulfaslak restored the time-varying-intercept branch September 8, 2024 11:39
twiecki pushed a commit that referenced this pull request Sep 10, 2024
* Add time-varying prior functionality to DelayedSaturatedMMM

* resolve wd's comments

* resolve failing pre-commits

* add tvp_kwargs to model_config

* fix typo

* replace softplus

* resolve minor review comments

* Add option to supply `ax` to `plot_posterior_predictive`

* bugfix: time_index was not set correctly for OOS

If provided `X` to `_data_setter` was *not* the immediate sequence following the training set, the time_index would we wrong. With this fix, the `date["time_index"]` which gets set upon providing a new `X`, is inferred from the `self.date_column` column of provided `X` (by comparing it to same date column in the training data `self.X`).

* Clean up example notebook

* Make utility function `transform_1d_array`

* 'tvp_kwargs' -> 'intercept_tvp_kwargs'

* move `infer_time_index` into utils

* add tests for new utils

* small fixes (found in tests)

* add tests to cover all added cases

* fix ruff check

* update typehints

* resolve review comments

* refactor model logic for tv intercept

* address review comment for util test

* .

* fix documentation link

* change variable name

* fix hsgp_dims

* update time_varying_prior to be centered on 1

* review fixes

* fix broken test

* add final tests

* fix coverage issues

* Update tests/mmm/test_tvp.py

Co-authored-by: Will Dean <[email protected]>

* Update pymc_marketing/mmm/tvp.py

Co-authored-by: Will Dean <[email protected]>

* Update tests/mmm/test_tvp.py

Co-authored-by: Will Dean <[email protected]>

* Update tests/mmm/test_tvp.py

Co-authored-by: Will Dean <[email protected]>

* significant improvements to notebook

* fix heading

* update notebook to make it EVEN better

* update legend, add watermark

* fix intro

* fix broken test

* copy sweep with grammarly

---------

Co-authored-by: Will Dean <[email protected]>
twiecki pushed a commit that referenced this pull request Sep 10, 2024
* Add time-varying prior functionality to DelayedSaturatedMMM

* resolve wd's comments

* resolve failing pre-commits

* add tvp_kwargs to model_config

* fix typo

* replace softplus

* resolve minor review comments

* Add option to supply `ax` to `plot_posterior_predictive`

* bugfix: time_index was not set correctly for OOS

If provided `X` to `_data_setter` was *not* the immediate sequence following the training set, the time_index would we wrong. With this fix, the `date["time_index"]` which gets set upon providing a new `X`, is inferred from the `self.date_column` column of provided `X` (by comparing it to same date column in the training data `self.X`).

* Clean up example notebook

* Make utility function `transform_1d_array`

* 'tvp_kwargs' -> 'intercept_tvp_kwargs'

* move `infer_time_index` into utils

* add tests for new utils

* small fixes (found in tests)

* add tests to cover all added cases

* fix ruff check

* update typehints

* resolve review comments

* refactor model logic for tv intercept

* address review comment for util test

* .

* fix documentation link

* change variable name

* fix hsgp_dims

* update time_varying_prior to be centered on 1

* review fixes

* fix broken test

* add final tests

* fix coverage issues

* Update tests/mmm/test_tvp.py

Co-authored-by: Will Dean <[email protected]>

* Update pymc_marketing/mmm/tvp.py

Co-authored-by: Will Dean <[email protected]>

* Update tests/mmm/test_tvp.py

Co-authored-by: Will Dean <[email protected]>

* Update tests/mmm/test_tvp.py

Co-authored-by: Will Dean <[email protected]>

* significant improvements to notebook

* fix heading

* update notebook to make it EVEN better

* update legend, add watermark

* fix intro

* fix broken test

* copy sweep with grammarly

---------

Co-authored-by: Will Dean <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Improvements or additions to documentation enhancement New feature or request MMM priority: high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants