-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more content to the Gamma-Gamma Notebook #573
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #573 +/- ##
=======================================
Coverage 91.79% 91.79%
=======================================
Files 22 22
Lines 2267 2267
=======================================
Hits 2081 2081
Misses 186 186 ☔ View full report in Codecov by Sentry. |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-03-14T19:09:43Z Why not link directly to the CLV Quickstart csv? It's the same dataset. |
@juanitorduz looks great. Could you just install from conda to get rid of those ugly C-API warnings? |
I cleaned the outputs :) |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-03-21T13:50:22Z Typo; change to (no longer maintained).
Not sure it's worth mentioning, but juanitorduz commented on 2024-04-03T07:04:27Z Thanks! Added |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-03-21T13:50:22Z Line #5. from lifetimes.datasets import load_cdnow_summary_data_with_monetary_value Remove this and refer to comment in cell where this function is used. juanitorduz commented on 2024-04-03T07:09:33Z I think we are "forced" by the ruff notebook style guidelines to have all the imports on top |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-03-21T13:50:23Z Line #9. az.style.use("arviz-darkgrid") Maybe add a comment that lines 9-16 pertain to the plotting config for the notebook? Optional, but I like to comment code blocks like this where the purpose isn't immediately apparent in a variable name. juanitorduz commented on 2024-04-03T07:14:34Z Good point! I will add it to the other notebooks as well. |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-03-21T13:50:24Z Line #1. summary_with_money_value = load_cdnow_summary_data_with_monetary_value()
I'd prefer to keep the legacy
The same dataset is loaded from the repo in the Quickstart notebook: summary_with_money_value = pd.read_csv("https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/datasets/cdnow_transactions.csv")
juanitorduz commented on 2024-04-03T07:19:46Z Got it! Done! |
Looking good so far. In addition to minor remarks below, it may be worth adding a practical example to illustrate (#594). A synthetic monetary value column with crazy huge values like this may do the trick:
|
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-03-21T14:05:59Z Line #1. ggf = GammaGammaFitter(penalizer_coef=0) Let's remove the
. . . Actually, I take that back. Maybe we can highlight the flaws of this as an lead-in to using priors for regularization instead? juanitorduz commented on 2024-04-03T07:20:44Z No, I suggest we leave it out and focus on our model. As you mentioned, we can reduce mentioning the |
Also, it's an important modeling assumption that frequency and monetary value be uncorrelated (less than 0.3). Add this to a cell before the modeling section:
|
1975222
to
148af3a
Compare
Thanks! Added View entire conversation on ReviewNB |
I think we are "forced" by the ruff notebook style guidelines to have all the imports on top View entire conversation on ReviewNB |
Good point! I will add it to the other notebooks as well. View entire conversation on ReviewNB |
Got it! Done! View entire conversation on ReviewNB |
No, I suggest we leave it out and focus on our model. As you mentioned, we can reduce mentioning the View entire conversation on ReviewNB |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-04-04T17:34:28Z The CDNOW dataset line seems redundant as it's also mentioned in the Read Data section. juanitorduz commented on 2024-04-04T19:03:35Z Right! I removed it! |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-04-04T17:34:28Z If you re-run this cell, it should clear the NumPy Warning, which may raise questions. juanitorduz commented on 2024-04-04T19:21:02Z I actually do not know where they are coming from. Even if I run everything in conda I still get them :( ColtAllen commented on 2024-04-04T21:47:25Z A clunky workaround would be to add some code suppressing the warnings, then deleting said code prior to pushing to GitHub. |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-04-04T17:34:29Z This is important, but does it need to be in a headline font like this? juanitorduz commented on 2024-04-04T19:03:14Z hehe I do not why is it displayed like this but it looks ok on the render docs XD |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-04-04T17:34:30Z Purely optional, but this and the related code cells can be moved to the prior section. |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-04-04T17:34:31Z If you wish to mention the specific optimizer, it's juanitorduz commented on 2024-04-04T19:06:11Z Good catch! Thanks! |
View / edit / reply to this conversation on ReviewNB ColtAllen commented on 2024-04-04T17:34:32Z I believe it's worth adding some remarks mentioning how the choice of prior will impact the parameter estimates. Up to you if you also want to add a code example. juanitorduz commented on 2024-04-04T19:19:28Z Let me skip this for now as i wanna keep this relative simple. I agree is a great point to maybe we could have a prior sensitivity analysis notebook :) |
Good point! Added a small remark! View entire conversation on ReviewNB |
Great idea! View entire conversation on ReviewNB |
Let me skip this for now as i wanna keep this relative simple. I agree is a great point to maybe we could have a prior sensitivity analysis notebook :) View entire conversation on ReviewNB |
I actually do not know where they are coming from. Even if I run everything in conda I still get them :( View entire conversation on ReviewNB |
Changed the sampler params a bit and now it looks much better View entire conversation on ReviewNB |
Thanks for the review, @ColtAllen. I think I addressed all of your comments (except the pytensor warnings I am trying to remove). Let me know what you think about this version :) |
A clunky workaround would be to add some code suppressing the warnings, then deleting said code prior to pushing to GitHub. View entire conversation on ReviewNB |
c9ae8ca
to
4d261db
Compare
Are you struggling to install pytensor from conda? Shouldn't be getting warnings :) |
I am using mamba because conda is too slow 🙈 . I will delete and re-create all my envs next week 🤞 |
mamba/conda is the same. I also use mamba. Still you shouldn't be getting those warnings. You may not be using the right environment in the NB? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few unresolved formatting and style remarks in ReviewNB, but they're quite minor. Overall I think this is good to merge. Great work!
* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually
* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually
* feat(streamlit_explainer): Pushing files for Streamlit explainer app, to illustrate saturation, adstock and prior concepts in an intuitive, visual way to stakeholders and new MMMers * chore(readme): Adding a readme for the app * fix(env): Updating dependencies to include those needed for the Streamlit app * Drop python 3.9 support (#615) * drop python 3.9 * try python 3.12 * undo try python 3.12 * add lift tests check * Add more content to the Gamma-Gamma Notebook (#573) * improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually * Add more content to the BG/NBD Notebook (#571) * add more info to the notebook * hide plots code * fix plot y labels * fix plot outputs and remove model build * improve final note probability plots * address comments * use quickstart dataset * feedback part 3 * remowe warnings manually * feedback part 4 * Improve MMM Docs (#612) * improve mmm docs init * add more code examples to docstrings * minor improvemeents * typo * better phrasing * add thomas suggestion * Fix `clv` plotting bugs and edits to Quickstart (#601) * move fixtures to conftest * docstrings and moved set_model_fit to conftest * fixed pandas quickstart warnings * revert to MockModel and add ParetoNBD support * quickstart edit for issue 609 * notebook edit * [pre-commit.ci] pre-commit autoupdate (#616) * improve coords matching (#623) * python 3.12 attempt (#618) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor(saturation): Using pymc-marketing saturation functions rather than coding my own: Removing tanh, logistic and michaelis menten * refactor(saturation): Remove Hill and Root saturations, as they aren't supported by pymc-marketing currently * refactor(geometric_adstock): Removing custom adstock and using pymc-marketing adstock function to demo decay. Also updating latex to align with pymc-marketing, where decay factor is represented by alpha rather than beta * refactor(delayed_adstock): Using pymc-marketing delayed geometric function rather than custom one * fix(requirements): Adding pymc-marketing to Streamlit requirements for deployment * Added Dev Container Folder * refactor(weibull_cdf): Using pymc-marketing function for Weibull CDF * fix(weibull_cdf): Fixing incorrect dataframe var name for CDF plotting df * refactor(weibull_pdf): Using pymc-marketing function for WeibullPDF * refactor(custom_functions): Removing adstock_saturation_functions.py file now that it is no longer required * chore: Removing devcontainer created by Streamlit * fix(requirements): Adding preliz to requirements * refactor(prior_viz): Reworking the prior visualisation to use Preliz instead of custom function, as well as remove the tab-design. Prior distributions can now be specified programmatically. * refactor(prior_functions.py): Deleting the draw_samples function and replacing it with a programmatic PreliZ function, such that the distribution object is returned when the user passes in the name of a distribution * fix(requirements): Delete obsolete pymc requirement, which should fix deployment dependency conflicts * chore(readme): Updating with guidelines on how to add additional distributions or transformation functions to the app * refactor(plot_config): Moving height and width specifications into constants at top of Adstock and Saturation files, so the plot sizes are set programmatically --------- Co-authored-by: Juan Orduz <[email protected]> Co-authored-by: Colt Allen <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Trujillo <[email protected]>
* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually
* feat(streamlit_explainer): Pushing files for Streamlit explainer app, to illustrate saturation, adstock and prior concepts in an intuitive, visual way to stakeholders and new MMMers * chore(readme): Adding a readme for the app * fix(env): Updating dependencies to include those needed for the Streamlit app * Drop python 3.9 support (#615) * drop python 3.9 * try python 3.12 * undo try python 3.12 * add lift tests check * Add more content to the Gamma-Gamma Notebook (#573) * improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually * Add more content to the BG/NBD Notebook (#571) * add more info to the notebook * hide plots code * fix plot y labels * fix plot outputs and remove model build * improve final note probability plots * address comments * use quickstart dataset * feedback part 3 * remowe warnings manually * feedback part 4 * Improve MMM Docs (#612) * improve mmm docs init * add more code examples to docstrings * minor improvemeents * typo * better phrasing * add thomas suggestion * Fix `clv` plotting bugs and edits to Quickstart (#601) * move fixtures to conftest * docstrings and moved set_model_fit to conftest * fixed pandas quickstart warnings * revert to MockModel and add ParetoNBD support * quickstart edit for issue 609 * notebook edit * [pre-commit.ci] pre-commit autoupdate (#616) * improve coords matching (#623) * python 3.12 attempt (#618) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor(saturation): Using pymc-marketing saturation functions rather than coding my own: Removing tanh, logistic and michaelis menten * refactor(saturation): Remove Hill and Root saturations, as they aren't supported by pymc-marketing currently * refactor(geometric_adstock): Removing custom adstock and using pymc-marketing adstock function to demo decay. Also updating latex to align with pymc-marketing, where decay factor is represented by alpha rather than beta * refactor(delayed_adstock): Using pymc-marketing delayed geometric function rather than custom one * fix(requirements): Adding pymc-marketing to Streamlit requirements for deployment * Added Dev Container Folder * refactor(weibull_cdf): Using pymc-marketing function for Weibull CDF * fix(weibull_cdf): Fixing incorrect dataframe var name for CDF plotting df * refactor(weibull_pdf): Using pymc-marketing function for WeibullPDF * refactor(custom_functions): Removing adstock_saturation_functions.py file now that it is no longer required * chore: Removing devcontainer created by Streamlit * fix(requirements): Adding preliz to requirements * refactor(prior_viz): Reworking the prior visualisation to use Preliz instead of custom function, as well as remove the tab-design. Prior distributions can now be specified programmatically. * refactor(prior_functions.py): Deleting the draw_samples function and replacing it with a programmatic PreliZ function, such that the distribution object is returned when the user passes in the name of a distribution * fix(requirements): Delete obsolete pymc requirement, which should fix deployment dependency conflicts * chore(readme): Updating with guidelines on how to add additional distributions or transformation functions to the app * refactor(plot_config): Moving height and width specifications into constants at top of Adstock and Saturation files, so the plot sizes are set programmatically --------- Co-authored-by: Juan Orduz <[email protected]> Co-authored-by: Colt Allen <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Trujillo <[email protected]>
* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually
* feat(streamlit_explainer): Pushing files for Streamlit explainer app, to illustrate saturation, adstock and prior concepts in an intuitive, visual way to stakeholders and new MMMers * chore(readme): Adding a readme for the app * fix(env): Updating dependencies to include those needed for the Streamlit app * Drop python 3.9 support (#615) * drop python 3.9 * try python 3.12 * undo try python 3.12 * add lift tests check * Add more content to the Gamma-Gamma Notebook (#573) * improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually * Add more content to the BG/NBD Notebook (#571) * add more info to the notebook * hide plots code * fix plot y labels * fix plot outputs and remove model build * improve final note probability plots * address comments * use quickstart dataset * feedback part 3 * remowe warnings manually * feedback part 4 * Improve MMM Docs (#612) * improve mmm docs init * add more code examples to docstrings * minor improvemeents * typo * better phrasing * add thomas suggestion * Fix `clv` plotting bugs and edits to Quickstart (#601) * move fixtures to conftest * docstrings and moved set_model_fit to conftest * fixed pandas quickstart warnings * revert to MockModel and add ParetoNBD support * quickstart edit for issue 609 * notebook edit * [pre-commit.ci] pre-commit autoupdate (#616) * improve coords matching (#623) * python 3.12 attempt (#618) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor(saturation): Using pymc-marketing saturation functions rather than coding my own: Removing tanh, logistic and michaelis menten * refactor(saturation): Remove Hill and Root saturations, as they aren't supported by pymc-marketing currently * refactor(geometric_adstock): Removing custom adstock and using pymc-marketing adstock function to demo decay. Also updating latex to align with pymc-marketing, where decay factor is represented by alpha rather than beta * refactor(delayed_adstock): Using pymc-marketing delayed geometric function rather than custom one * fix(requirements): Adding pymc-marketing to Streamlit requirements for deployment * Added Dev Container Folder * refactor(weibull_cdf): Using pymc-marketing function for Weibull CDF * fix(weibull_cdf): Fixing incorrect dataframe var name for CDF plotting df * refactor(weibull_pdf): Using pymc-marketing function for WeibullPDF * refactor(custom_functions): Removing adstock_saturation_functions.py file now that it is no longer required * chore: Removing devcontainer created by Streamlit * fix(requirements): Adding preliz to requirements * refactor(prior_viz): Reworking the prior visualisation to use Preliz instead of custom function, as well as remove the tab-design. Prior distributions can now be specified programmatically. * refactor(prior_functions.py): Deleting the draw_samples function and replacing it with a programmatic PreliZ function, such that the distribution object is returned when the user passes in the name of a distribution * fix(requirements): Delete obsolete pymc requirement, which should fix deployment dependency conflicts * chore(readme): Updating with guidelines on how to add additional distributions or transformation functions to the app * refactor(plot_config): Moving height and width specifications into constants at top of Adstock and Saturation files, so the plot sizes are set programmatically --------- Co-authored-by: Juan Orduz <[email protected]> Co-authored-by: Colt Allen <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Trujillo <[email protected]>
Similar to #571.
📚 Documentation preview 📚: https://pymc-marketing--573.org.readthedocs.build/en/573/