add spaces, increase indentation, and fix number order to Pareto notebook #651

Merged
merged 2 commits into from
Apr 27, 2024
33 changes: 19 additions & 14 deletions docs/source/notebooks/clv/pareto_nbd.ipynb
@@ -476,27 +476,32 @@
"id": "e10b0672-8967-4ecc-9870-f8c08133f9ee",
"metadata": {},
"source": [
"# Model Definition\n",
"## Model Definition\n",
"The Pareto/NBD model is based on the following assumptions for each customer:\n",
"1. Customers are active for an unobserved period of time, then become permanently inactive.\n",
" \n",
"### Purchasing Process\n",
"#### Purchasing Process\n",
"\n",
"2. While active, the the number of transactions made by a customer follows a Poisson process with transaction rate $\\lambda$:\n",
" \n",
" $$P(X(t)=x|\\lambda) = \\frac{(\\lambda t)^{x}e^{-\\lambda t}}{x!}, x=0,1,2,...$$\n",
" \n",
" This is equivalent to assuming time between transactions is exponentially distributed with transaction rate $\\lambda$:\n",
" \n",
" $$f(t_{j}-t_{j-1}| \\lambda) = \\lambda e^{-\\lambda (t_{j} - t_{j - 1})}, \\quad t_{j} \\geq t_{j - 1} \\geq 0$$\n",
" \n",
" Where $t$ is the time period of the $j$th purchase.\n",
"3. Heterogeneity in $\\lambda$ follows a Gamma distribution with shape parameter $r$ and scale parameter $\\alpha$:\n",
"\n",
" $$g(\\lambda|r, \\alpha) = \\frac{\\alpha^{r}\\lambda^{r - 1}e^{-\\lambda \\alpha}}{\\Gamma(r)}$$\n",
"### Dropout Process\n",
"5. The duration of a customer's unobserved active lifetime is exponentially distributed with dropout rate $\\mu$.\n",
"#### Dropout Process\n",
"4. The duration of a customer's unobserved active lifetime is exponentially distributed with dropout rate $\\mu$.\n",
"\n",
"6. Heterogeneity in $\\mu$ also follows a Gamma distribution with shape parameter $s$ and scale parameter $\\beta$:\n",
"5. Heterogeneity in $\\mu$ also follows a Gamma distribution with shape parameter $s$ and scale parameter $\\beta$:\n",
"\n",
" $$g(\\mu|s, \\beta) = \\frac{\\beta^{s}\\mu^{s - 1}e^{-\\mu \\beta}}{\\Gamma(s)}$$\n",
"7. Transaction rate $\\lambda$ and time until dropout $\\mu$ vary independently for each customer.\n",
" \n",
"6. Transaction rate $\\lambda$ and time until dropout $\\mu$ vary independently for each customer.\n",
"\n",
"If we take the expectation across the distributions of $\\lambda$ and $\\mu$, we can derive a likelihood function to estimate parameters $r$, $\\alpha$, $s$, and $\\beta$ across the customer population. For more details on the `ParetoNBD` likelihood please refer to the [docs](https://www.pymc-marketing.io/en/stable/api/generated/pymc_marketing.clv.distributions.ParetoNBD.html#pymc_marketing.clv.distributions.ParetoNBD)."
]
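The assumptions in this cell can be read as a generative recipe. The following is a minimal simulation sketch in NumPy, not code from the notebook or from `pymc_marketing`; the function name `simulate_pareto_nbd` and the observation window `T` are hypothetical names introduced for illustration. It assumes, as the densities above imply, that $\alpha$ and $\beta$ enter the Gamma distributions as rates, so NumPy's `scale` argument is their reciprocal.

```python
import numpy as np


def simulate_pareto_nbd(n_customers, r, alpha, s, beta, T, seed=0):
    """Simulate frequency and recency per customer over a window of length T.

    Hypothetical helper for illustration only; not part of pymc_marketing.
    """
    rng = np.random.default_rng(seed)

    # Gamma heterogeneity in the purchase rate and the dropout rate.
    # alpha and beta act as rate parameters, hence scale = 1 / rate.
    lam = rng.gamma(shape=r, scale=1.0 / alpha, size=n_customers)
    mu = rng.gamma(shape=s, scale=1.0 / beta, size=n_customers)

    # Unobserved active lifetime: exponential with dropout rate mu,
    # truncated at the end of the observation window.
    lifetime = rng.exponential(scale=1.0 / mu)
    active_time = np.minimum(lifetime, T)

    frequency = np.zeros(n_customers, dtype=int)
    recency = np.zeros(n_customers)
    for i in range(n_customers):
        # Poisson process while active: draw the count, then place the
        # purchase times uniformly over the active period.
        n = rng.poisson(lam[i] * active_time[i])
        if n > 0:
            times = rng.uniform(0.0, active_time[i], size=n)
            frequency[i] = n
            recency[i] = times.max()
    return frequency, recency, lam, mu


freq, rec, lam, mu = simulate_pareto_nbd(
    1000, r=0.5, alpha=10.0, s=0.6, beta=12.0, T=39.0
)
```

Drawing the count first and then placing purchase times uniformly relies on the standard property that, conditional on the number of events, the arrival times of a Poisson process are independent and uniform on the interval. The parameter values in the usage line are arbitrary.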
@@ -506,15 +511,15 @@
"id": "bee69f5b-1b9e-4aa4-bdd4-5358c866453c",
"metadata": {},
"source": [
"# Model Fitting"
"## Model Fitting"
]
},
{
"cell_type": "markdown",
"id": "325d5448",
"metadata": {},
"source": [
"## `lifetimes` Benchmark Model\n",
"### `lifetimes` Benchmark Model\n",
"\n",
"Let's time travel back to July 2020 and use the old `lifetimes` library to fit a Pareto/NBD model with Maximum Likelihood Estimation (MLE). The `Nelder-Mead` optimizer from `scipy.optimize` is ran under the hood to estimate scalar values for $r$, $\\alpha$, $s$, and $\\beta$."
]
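To make the MLE step concrete, here is a toy sketch of Nelder-Mead estimation with `scipy.optimize.minimize`. It is not the `lifetimes` implementation and does not use the full Pareto/NBD likelihood: as a simplifying assumption it fits only the purchasing side (the Poisson-Gamma mixture, which yields a negative binomial count distribution) to synthetic data. The function name `nbd_neg_log_likelihood` and all numeric values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln


def nbd_neg_log_likelihood(log_params, x, T):
    """Negative log-likelihood of purchase counts x observed over time T.

    Parameters are optimized on the log scale so that r and alpha stay positive.
    """
    r, alpha = np.exp(log_params)
    ll = (
        gammaln(r + x) - gammaln(r) - gammaln(x + 1)
        + r * np.log(alpha / (alpha + T))
        + x * np.log(T / (alpha + T))
    )
    return -ll.sum()


# Synthetic data from known parameters (illustrative values only).
rng = np.random.default_rng(42)
true_r, true_alpha, T = 2.0, 4.0, 10.0
lam = rng.gamma(shape=true_r, scale=1.0 / true_alpha, size=5000)
x = rng.poisson(lam * T)

# Derivative-free simplex search, starting from r = alpha = 1.
result = minimize(
    nbd_neg_log_likelihood,
    x0=np.zeros(2),
    args=(x, T),
    method="Nelder-Mead",
)
r_hat, alpha_hat = np.exp(result.x)
print(f"r_hat={r_hat:.3f}, alpha_hat={alpha_hat:.3f}")
```

The result is a single point estimate per parameter, with no measure of uncertainty attached, which is the contrast the notebook draws with the Bayesian fits that follow.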
@@ -772,7 +777,7 @@
"id": "512e5ef6-8fac-43fa-8d54-d6cfa14f64a6",
"metadata": {},
"source": [
"### Prior and Posterior Predictive Checks\n",
"#### Prior and Posterior Predictive Checks\n",
"PPCs allow us to check the efficacy of our priors, and the peformance of the fitted posteriors. PPCs aren't usually an option with MAP fitted models, but here we're actually sampling from the latent $\\lambda$ and $\\mu$ Gamma distributions, so PPCs are possible for `ParetoNBDModel` regardless of the fit method!\n",
"\n",
"Let's see how the model performs in a *prior* predictive check, where we sample from the default priors before fitting the model: "
@@ -1118,7 +1123,7 @@
"id": "25724a17-538a-4ec5-9df8-8dd28b547a86",
"metadata": {},
"source": [
"# Full Bayesian Inference"
"## Full Bayesian Inference"
]
},
{
@@ -3484,7 +3489,7 @@
"id": "2ec3ff94-83fe-4a6c-99ef-e25e0ee12cef",
"metadata": {},
"source": [
"After fititing, models can be persisted for later use:"
"After fitting, models can be persisted for later use:"
]
},
{
@@ -3519,7 +3524,7 @@
"id": "370d109c",
"metadata": {},
"source": [
"# Predictive Methods\n",
"## Predictive Methods\n",
"\n",
"The Pareto/NBD model supports a variety of predictive methods:\n",
"\n",
@@ -4095,7 +4100,7 @@
"id": "7d2b668e-a53d-487b-9b05-0090a2dfa2e3",
"metadata": {},
"source": [
"# Time-Invariant Covariates"
"## Time-Invariant Covariates"
]
},
{
@@ -4436,7 +4441,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
"version": "3.10.9"
},
"nteract": {
"version": "[email protected]"