Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sample_posterior_predictive doesn't scale X data #450

Closed
kylejcaron opened this issue Dec 1, 2023 · 3 comments
Closed

sample_posterior_predictive doesn't scale X data #450

kylejcaron opened this issue Dec 1, 2023 · 3 comments

Comments

@kylejcaron
Copy link

kylejcaron commented Dec 1, 2023

In the docs example for MMMs, the line of code to sample the posterior predictive is

mmm.sample_posterior_predictive(X, extend_idata=True, combined=True)

When X is on the scale of dollars, the data wont get automatically scaled from mmm.sample_posterior_predictive, so it causes the results to be off. The example doesn't run into this problem because channel spend is simulated as X ~ U(0,1). I personally got pretty confused adapting the code. to a real dataset and seeing the posterior predictive plot look so off and had to dig through the codebase to figure out what was going on.

I think the following is a more general approach that would lead to less confusion for now in the docs

mmm.sample_posterior_predictive(mmm.preprocessed_data['X'], extend_idata=True, combined=True)

An alternative to changing the docs could be to scale the X data in the mmm.sample_posterior_predictive method

@wd60622
Copy link
Contributor

wd60622 commented Dec 2, 2023

Yes, I am addressing this in PR #444 to scale the new data based on the data the model was fit.

The Fourier features and control channel transformations will be handled appropriately after that PR as well

That PR will be merged in after the custom prior and time varying parameters get added. So there are a few items scheduled before this will be fixed

@geirrlod
Copy link

Hi,

Also find this a bit confusing as well when I work on real non scaled spend. Do you think PR444 will be pushed out in the next couple of weeks, or should we try to find a way around the problem?

@kylejcaron
Copy link
Author

kylejcaron commented Dec 20, 2023

@geirrlod I made this quick change to get it working for me for now

mmm.sample_posterior_predictive(mmm.preprocessed_data['X'], extend_idata=True, combined=True)

also there's this event tomorrow from pymc-labs that you might be interested in here, it sounds like time varying covariates is a main bottleneck for this update, maybe they'll discuss that tomorrow?

edit: AHHH sorry I accidentally hit close issue - just reopened!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants