Budget allocation and optimal point estimations #329

cetagostini · 2023-07-25T17:08:58Z

Hello team!

Context

As marketers, we're perpetually hunting for ways to amp up the effectiveness of the campaigns we churn out regularly.

Hence, reliable methods for allocating our budget efficiently, so that we maximize our outcomes, are paramount to the success of our marketing investments.

Competitors

Other tools like Robyn and Lightweight MMM offer features to create budget allocation aimed at maximizing the target. This highlights the importance it holds within the community.

Goal

Integrate a budget allocator function into PyMC-Marketing that lets users conveniently set a budget and allocate it to the channels that maximize the value of their target variable.

Secondary goals (Value proposition)

Include optional variables that bring business rules into the budget optimizer, so users can tinker more freely and optimize based on their own conceptions (priors).
Incorporate additional information that helps users understand where they should not overspend and where the returns on their spend start to diminish.
Add uncertainty around the curves.

Hypothesis

Based on my interpretation of the logistic_saturation function in mmm/transformers.py, we are dealing with diminishing-return curves that have a sigmoidal shape: as x becomes very large (approaching infinity), the function approaches its upper limit.

We should find two different sections on the curve:

One where the curve rises quickly, so each extra unit of spend yields more results, even if those results are small in absolute numbers.
Another where the curve rises very slowly: results are large in absolute numbers, but each additional result comes at a higher cost.

These two sections are divided by the elbow, the point where the curve changes direction and starts to flatten.

Solution

If we know the function that defines the curve of our data, we can project it to discern which channels have the most potential for development according to our model and which do not. This empowers the budget allocator to distribute our resources non-linearly, based on the data.

Example output:

Work.

To achieve our goal, the first step was to find a function that could interpret the points in a coherent manner. Using the original function was challenging because it is scaled: it operates with values and parameters between 0 and 1, which are difficult to retrieve after the model is trained.

I chose the Michaelis-Menten function, which is expressed as follows:
y = L * x / (k + x)

where:

L is the maximum value the curve approaches as x goes to infinity (akin to the saturation level).
k is the x value at which the function reaches half its maximum value. It can be thought of as a measure of the steepness of the curve or the position of the "elbow".
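As a quick illustration of these two definitions, here is a minimal sketch of the function in plain Python. The signature mirrors the `michaelis_menten(x, L, k)` helper visible in the diff further down; the numeric values are hypothetical and chosen only to make the half-saturation property easy to see.

```python
def michaelis_menten(x: float, L: float, k: float) -> float:
    """Michaelis-Menten saturation: y = L * x / (k + x)."""
    return L * x / (k + x)


# Hypothetical parameter values, for illustration only.
L, k = 10.0, 2.0

half_saturation = michaelis_menten(k, L, k)       # at x == k the curve is at L / 2
near_plateau = michaelis_menten(1_000.0, L, k)    # large x approaches L from below

print(half_saturation, near_plateau)
```

The curve never reaches L for finite x, which is why a cut-off such as "99% of L" is needed to talk about a plateau point.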

What does it look like?

When k=0.5, the curve saturates quickly and then plateaus. This is because the "elbow" is closer to the y-axis.
As k increases, the curve starts to look more like a straight line, with the elbow moving further along the x-axis (away from the y-axis).
PS: this holds for the same value of L.
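The effect of k can be checked numerically. The sketch below, with hypothetical values, reports what fraction of L is reached at a fixed spend x for several values of k; smaller k means the curve is already closer to saturation at the same x.

```python
def michaelis_menten(x, L, k):
    """Michaelis-Menten saturation: y = L * x / (k + x)."""
    return L * x / (k + x)


L = 1.0   # same saturation level for every curve
x = 1.0   # fixed spend at which the curves are compared

# Fraction of the saturation level reached at x, for increasing k.
fractions = {k: michaelis_menten(x, L, k) / L for k in (0.5, 2.0, 10.0)}

for k, frac in fractions.items():
    print(f"k={k}: {frac:.0%} of L reached at x={x}")
```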

How do these changes affect the current curve fitting?

Since the current plot uses a quadratic fit, the fitted curve sometimes suggests that, past its maximum point, any additional input yields negative returns. That may be true of marginal returns, but our curve deals with absolute values, which should keep rising toward the saturation level.
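The difference between the two shapes can be made concrete with a small sketch. It is not the Seaborn fit used in the plots: it simply passes a parabola exactly through three points sampled from a Michaelis-Menten curve (all values hypothetical) and compares both curves beyond the sampled range.

```python
def michaelis_menten(x, L=10.0, k=2.0):
    """Michaelis-Menten saturation with hypothetical L and k."""
    return L * x / (k + x)


def quadratic_through(points):
    """Return the parabola (Lagrange form) passing through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def q(x):
        return (
            y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        )

    return q


# Three "observed" points taken from the saturating curve.
observed = [(x, michaelis_menten(x)) for x in (0.0, 2.0, 6.0)]
quadratic = quadratic_through(observed)

# Beyond the observed range the parabola turns down; the saturating curve does not.
for x in (6.0, 8.0, 10.0):
    print(f"x={x}: quadratic={quadratic(x):.2f}, michaelis_menten={michaelis_menten(x):.2f}")
```

A least-squares quadratic behaves in the same qualitative way whenever its leading coefficient is negative, which is the situation described above.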

Example 1: Current fit

Example 2: Modified fit (Michaelis-Menten)

Great to hear your opinions here: @ricardoV94 @juanitorduz

Questions

First of all, I would like your feedback on the reasoning and logic behind this, and on whether it makes sense to move in this direction with PyMC-Marketing. Based on my experience it makes a lot of sense, but I could be biased.
Even though the model fits the data with a logistic (sigmoid) function, the response curve plots use a quadratic (polynomial) fit, which is the Seaborn default. Why?

Help

Unit tests: I have already built some, but I'm not very familiar with them. It would be amazing if you could check the functions I built and validate that they work as expected.

Important

This PR is a synthesis of the one previously opened here. The idea is to take all the feedback received earlier to create this cleaner and more succinct draft for new contributors. I hope you find it appropriate. I'm also open to suggestions.

I still need to run this on some notebooks to ensure that the results are coherent. Although everything has worked as custom functions in my test notebook, I must try installing this version of the PR.

Code snippet

# Estimate your curve (Michaelis-Menten)
parameters = mmm.compute_channel_estimate_points_original_scale()

# Budget allocation based on the estimations
mmm.budget_allocation(
    total_budget=5,
    parameters=parameters,
    budget_bounds={'x1': [1, 30], 'x2': [1, 60]},
)

# Check the curve
mmm.plot_direct_contribution_curves(show_estimations=True)
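To make the allocation step less of a black box, here is a minimal sketch of the underlying idea: spend is split so that every channel has the same marginal return. This is not the optimizer in this PR. It is a closed-form solution that assumes Michaelis-Menten curves, ignores `budget_bounds`, and assumes every channel ends up with positive spend. The function name `allocate_budget` and the (L, k) values are hypothetical.

```python
import math


def michaelis_menten(x, L, k):
    """Michaelis-Menten saturation: y = L * x / (k + x)."""
    return L * x / (k + x)


def allocate_budget(total_budget, parameters):
    """Closed-form split that equalizes marginal returns across channels.

    parameters maps channel -> (L, k). The marginal return of a channel is
    L * k / (k + x) ** 2; setting it equal for all channels and using
    sum(x) == total_budget gives the expression below. Bounds are NOT handled.
    """
    sum_k = sum(k for _, k in parameters.values())
    sum_root = sum(math.sqrt(L * k) for L, k in parameters.values())
    scale = (total_budget + sum_k) / sum_root
    allocation = {
        channel: math.sqrt(L * k) * scale - k
        for channel, (L, k) in parameters.items()
    }
    if min(allocation.values()) < 0:
        raise ValueError("A channel would get negative spend; bounds are needed.")
    return allocation


# Hypothetical (L, k) pairs per channel, for illustration only.
parameters = {"x1": (10.0, 2.0), "x2": (6.0, 1.0)}
allocation = allocate_budget(5.0, parameters)
contribution = sum(
    michaelis_menten(allocation[channel], *parameters[channel])
    for channel in parameters
)
print(allocation, contribution)
```

Once bounds or other business rules enter the picture, a numerical constrained optimizer is the more practical choice; the closed form is only meant to show why the split is non-linear in the curve parameters.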

Example notebook

Google Colab: #329 PR

cc: @juanitorduz @ricardoV94 @twiecki @cluhmann

📚 Documentation preview 📚: https://pymc-marketing--329.org.readthedocs.build/en/329/

ferrine · 2023-07-26T10:22:00Z

pymc_marketing/mmm/utils.py

@@ -33,3 +36,43 @@ def generate_fourier_modes(
            for func in ("sin", "cos")
        }
    )
+
+
+def michaelis_menten(x, L, k) -> float:


This is also useful as a dedicated saturation function in mmm.transformers

I like this idea. Looking at Google's Lightweight MMM, they include several model configurations (Hill, Carryover, Adstock) that you can choose from to define saturation and lagging.
Example:

They refer to the following wiki, where the saturation function described is very similar to Michaelis-Menten. I can imagine implementing something similar here, so that different data sets can be fitted with whichever saturation and lagging functions suit their own conditions best.

juanitorduz · 2023-08-09T08:01:30Z

pymc_marketing/mmm/base.py

+        fig_estimations, ax_estimations = plt.subplots(figsize=(8, 6))
+
+        L, k = estimate_menten_parameters(channel, self.X, channel_contributions)
+        plateau_x = k * (0.99 * L / (L * 0.01))


Why do you use 0.99 and 0.01?

It's a practical solution. The Michaelis-Menten equation is given by y = (L * x) / (k + x), where k is the value of x at which y is half of L. In other words, when x = k, y = L/2. As x increases, the equation approaches y ≈ L.

This is because y becomes saturated: adding more x doesn't significantly increase y. The value L is an asymptote of the function, meaning the curve approaches L but never quite reaches it.

Using 0.99 and 0.01 calculates the x value at which y is 99% of L. Simplified, it results in 99k.
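The 99k figure follows from solving L * x / (k + x) = 0.99 * L for x, which gives x = 0.99 * k / 0.01 = 99 * k. A short numeric check, using hypothetical fitted values for L and k and the expression from the diff above:

```python
import math


def michaelis_menten(x, L, k):
    """Michaelis-Menten saturation: y = L * x / (k + x)."""
    return L * x / (k + x)


L, k = 4.0, 3.0  # hypothetical fitted values

# Expression from the diff; L cancels out, leaving 99 * k.
plateau_x = k * (0.99 * L / (L * 0.01))

# Fraction of the asymptote L reached at the plateau point.
fraction_at_plateau = michaelis_menten(plateau_x, L, k) / L

print(plateau_x, fraction_at_plateau)
print(math.isclose(plateau_x, 99 * k))
```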

Thanks! Do you want to add a shorter summary to the doc strings? :)

juanitorduz

This was very nice and easy to review, thanks! I left some small comments about code style.

I will continue testing the budget allocator in the mmm example notebook :)

pymc_marketing/mmm/base.py

juanitorduz · 2023-08-09T08:10:31Z

pymc_marketing/mmm/base.py

+        parameters: Optional[Dict[str, Tuple[float, float]]],
+        budget_bounds: Optional[Dict[str, Tuple[float, float]]],


If we use the Optional type, does it mean they can be None? Otherwise it is not an Optional type. See https://mypy.readthedocs.io/en/stable/kinds_of_types.html

Optional[...] does not mean a function argument with a default value. It simply means that None is a valid value for the argument. This is a common confusion because None is a common default value for arguments.
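A minimal sketch of the distinction, using hypothetical function names: annotating a parameter as `Optional[...]` only allows `None` as a value, while a default value is what lets the caller omit the argument.

```python
from typing import Dict, Optional, Tuple

Bounds = Dict[str, Tuple[float, float]]


def required_but_nullable(budget_bounds: Optional[Bounds]) -> str:
    # Optional[...] only widens the accepted type to include None;
    # the caller must still pass the argument explicitly.
    return "unbounded" if budget_bounds is None else "bounded"


def truly_optional(budget_bounds: Optional[Bounds] = None) -> str:
    # The default value is what allows callers to leave the argument out.
    return "unbounded" if budget_bounds is None else "bounded"


print(truly_optional())
print(required_but_nullable(None))

try:
    required_but_nullable()  # type: ignore[call-arg]
except TypeError as err:
    # Omitting an argument that has no default fails at call time.
    print(f"TypeError: {err}")
```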

Totally correct, in this case, parameters should not be optional, only budget bounds. Change budget bounds to be optional and modify parameters based on this.

def budget_allocator(
    total_budget: int = 1000,
    channels: Union[List[str], Tuple[str, ...]] = [],
    parameters: Dict[str, Tuple[float, float]] = {},
    budget_ranges: Optional[Dict[str, Tuple[float, float]]] = None,
) -> DataFrame:

pymc_marketing/mmm/base.py

pymc_marketing/mmm/budget_optimizer.py

juanitorduz · 2023-08-09T08:30:13Z

pymc_marketing/mmm/budget_optimizer.py

+    return contributions
+
+
+def objective_distribution(x, channels, parameters):


add type hints

pymc_marketing/mmm/budget_optimizer.py

juanitorduz · 2023-08-09T10:01:15Z

@cetagostini Would you mind sharing example code showing how this will be used in our example? You can assume the mmm object is already fitted. That would help the review.

I want to run:

mmm.budget_allocation(
    total_budget=1000,
)

and I get the (expected) error

TypeError                                 Traceback (most recent call last)
Cell In[43], line 1
----> 1 mmm.budget_allocation(
      2     total_budget=1000,
      3 )

TypeError: BaseMMM.budget_allocation() missing 2 required positional arguments: 'parameters' and 'budget_bounds'

So a reproducible code snippet will help the review :)

cetagostini · 2023-08-10T16:43:57Z

@juanitorduz Again, thank you very much for your review!

Before sharing, I did a small verification and found a couple of issues to be corrected. It took me a bit longer, but here is the snippet and a small Colab to check the results, so you can play with it.

I'll be working through all your comments this week; I believe it should be ready by Monday! 🚀

Code snippet

# Estimate your curve (Michaelis-Menten)
parameters = mmm.compute_channel_estimate_points_original_scale()

# Budget allocation based on the estimations
mmm.optimize_channel_budget_for_maximum_contribution(
    total_budget=5,
    parameters=parameters,
    budget_bounds={'x1': [1, 30], 'x2': [1, 60]},
)

# Check the curve
mmm.plot_direct_contribution_curves(show_estimations=True)

Example notebook

Google Colab: #329 PR

cetagostini · 2023-08-10T21:21:23Z

@juanitorduz All the changes requested are already applied.

tests/mmm/test_utils.py

juanitorduz

Looks very nice and works on the example notebook. I left minor comments regarding naming and type hints. After that I think we can merge this and add an experimental warning to get feedback from users 💪

pymc_marketing/mmm/base.py

pymc_marketing/mmm/budget_optimizer.py

juanitorduz · 2023-08-16T09:27:15Z

pymc_marketing/mmm/base.py

+            budget_ranges=budget_bounds,
+        )
+
+    def compute_channel_estimate_points_original_scale(self) -> Dict:


I would rename this method to compute_channel_plateau_points_original_scale

I switched to compute_channel_curve_parameters_original_scale, given that we can now use different curve fits. What do you think?

juanitorduz · 2023-08-16T09:29:48Z

pymc_marketing/mmm/base.py

+        }
+
+    def plot_direct_contribution_curves(
+        self, show_estimations: bool = False, x_stop=None


Can we rename the parameter show_estimations to show_michaelis_menten_fit?

By default we should keep the original range. In the example notebook when I run

fig = mmm.plot_direct_contribution_curves(show_estimations=True)
[ax.set(xlabel="x") for ax in fig.axes]

I get:

I would rename x_stop to xlim_max

Otherwise, using x_stop = 1.5 I get

juanitorduz · 2023-08-16T09:35:10Z

pymc_marketing/mmm/budget_optimizer.py

+            "estimated_contribution": calculate_expected_contribution(
+                parameters, optimal_budget
+            ),
+            "optimal_budget": optimal_budget,


From the example notebook I get

Should we add the expected total value instead of NaN?

Correct, already working as expected.

juanitorduz · 2023-08-16T09:37:27Z

@cetagostini I also suggest opening a followup documentation PR where we load the mmm model from the example (we can use the model builder here) to run and explain this optimization procedure.


review-notebook-app · 2023-09-11T12:51:17Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB


codecov · 2023-09-11T15:31:40Z

Codecov Report

Merging #329 (aeb7faf) into main (3a9db51) will decrease coverage by 5.82%.
Report is 1 commits behind head on main.
The diff coverage is 43.12%.

@@            Coverage Diff             @@
##             main     #329      +/-   ##
==========================================
- Coverage   94.52%   88.71%   -5.82%     
==========================================
  Files          20       21       +1     
  Lines        1663     1869     +206     
==========================================
+ Hits         1572     1658      +86     
- Misses         91      211     +120

Files	Coverage Δ
pymc_marketing/mmm/utils.py	`89.58% <87.80%> (-10.42%)`	⬇️
pymc_marketing/mmm/budget_optimizer.py	`84.44% <84.44%> (ø)`
pymc_marketing/mmm/base.py	`64.79% <13.60%> (-32.72%)`	⬇️


juanitorduz · 2023-09-14T11:14:11Z

Tests and lint are 🟢 ! Yay! I will take a look in the upcoming days :)

twiecki · 2023-09-14T20:55:40Z

Can we merge?


cetagostini · 2023-09-19T17:21:17Z

Can we merge?

I think we are set. Not sure if the team has other questions @twiecki

From my side:

All checks already pass except codecov. Still, I added a few new unit tests to increase coverage toward the target.
The branch is updated up to September 19th.
I ran a test on the current notebook using my branch and everything works properly (Test here).
I ran a test on the new example notebook for budget allocation and it works as well (Test here)

I added some flexibility to the curve_fit functions to give users the possibility to use them outside of the mmm class. This also gives the chance to handle non-fit errors given by the optimizer. These new options are added to the budget allocation notebook example here
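As an illustration of what a standalone estimator with explicit failure handling might look like, here is a rough sketch. It is not the curve-fitting code from this PR: it swaps in the classic linearization 1/y = 1/L + (k/L) * (1/x) and ordinary least squares, and raises `ValueError` when the data cannot be fitted. The function name `estimate_menten_linearized` is hypothetical, and the linearization is known to be sensitive to noise at small x and y.

```python
def estimate_menten_linearized(xs, ys):
    """Estimate (L, k) from the linearization 1/y = 1/L + (k/L) * (1/x)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    if any(x <= 0 for x in xs) or any(y <= 0 for y in ys):
        raise ValueError("linearization needs strictly positive x and y")

    # Ordinary least squares of v = 1/y on u = 1/x.
    u = [1.0 / x for x in xs]
    v = [1.0 / y for y in ys]
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    var_u = sum((ui - mean_u) ** 2 for ui in u)
    if var_u == 0:
        raise ValueError("x values must not all be equal")
    slope = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v)) / var_u
    intercept = mean_v - slope * mean_u

    # intercept = 1/L and slope = k/L must both be positive for a saturating curve.
    if intercept <= 0 or slope <= 0:
        raise ValueError("data do not look like a saturating curve")
    L = 1.0 / intercept
    return L, slope * L


# Noiseless synthetic data generated with L=10 and k=2.
xs = [1.0, 2.0, 4.0, 8.0]
ys = [10.0 * x / (2.0 + x) for x in xs]
L_hat, k_hat = estimate_menten_linearized(xs, ys)
print(L_hat, k_hat)
```

Raising a specific exception lets callers decide whether to skip a channel, fall back to another curve, or surface the problem to the user.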

The pre-commit looks correct when I test it locally:

The error now reported by docs/readthedocs.org:pymc-marketing is related to the new NumPy version, but I assume this should be solved from the GitHub config checks. Right?

I await your feedback!

cc: @ricardoV94 @juanitorduz

juanitorduz · 2023-09-19T17:55:39Z

Can we add a warning saying that this feature is experimental? What do you think ?

cetagostini · 2023-09-19T21:01:33Z

Can we add a warning saying that this feature is experimental? What do you think ?

Sure, where do you think we can add it? In the notebook? @juanitorduz

juanitorduz · 2023-09-22T08:01:42Z

Can we add a warning saying that this feature is experimental? What do you think ?

Sure, where do you think we can add it? In the notebook? @juanitorduz

Sorry for the late reply 🙈 ! What about the public methods regarding optimization in pymc_marketing/mmm/base.py?

cetagostini · 2023-09-30T11:24:48Z

I added an alert to the docstring of each new function in base.py. Is that okay? @juanitorduz

juanitorduz · 2023-09-30T18:33:17Z

I added an alert to the docstring of each new function in base.py. Is that okay? @juanitorduz

Thank you @cetagostini ! In addition, we can add a warning in the code itself as:

import warnings

...

warnings.warn("This budget allocator method is experimental", UserWarning)

After that we can merge from my side :)
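For reference, a self-contained sketch of the pattern suggested above, with a hypothetical stand-in function instead of the real method, plus a way to confirm in a test that the warning is emitted:

```python
import warnings


def budget_allocation_sketch(total_budget: float) -> float:
    # Hypothetical stand-in for a public optimization method.
    warnings.warn("This budget allocator method is experimental", UserWarning)
    return total_budget


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    budget_allocation_sketch(5.0)

print(caught[0].category.__name__, caught[0].message)
```

Passing `stacklevel=2` to `warnings.warn` is a common refinement, since it makes the warning point at the caller's line rather than at the library internals.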

juanitorduz

@cetagostini Thank you for this contribution! IMO let us ship this as experimental, get feedback from users, and iterate!

WDYT @ricardoV94 ? @twiecki ?

twiecki · 2023-10-02T12:04:09Z

Yes, we should merge this.

twiecki · 2023-10-02T12:04:38Z

Incredible contribution @cetagostini!

cetagostini · 2023-10-05T15:25:27Z

Hey guys! I didn't have time to reply properly a few days ago; I just want to highlight a few things.

Thank you so much for the support. I'm thrilled to see the merge finally happening 🙌🏻 I'm quite inspired by this momentum, and I'll surely try to help with the other active issues now that this one is closed.
A special mention to @juanitorduz: without all the time he dedicated to helping me move this PR forward, I would surely have stopped halfway. He is a great source of inspiration in the community, as well as a great person to debate and share ideas with.

ps: I'll probably be doing a little self-promotion this week to invite users to test.

Vamos PyMC-Marketing! 🚀

cetagostini added 4 commits July 25, 2023 17:05

Creating utils & Budget optimizer function.

b8555cd

Modifying plot

8cd82ce

Adding docstring on the function

2bbd10f

Modifying unit test

f449f6d

ferrine reviewed Jul 26, 2023


juanitorduz reviewed Aug 9, 2023


juanitorduz requested changes Aug 9, 2023


cetagostini added 6 commits August 9, 2023 18:36

modifying quantile method

01c1abc

Updating

17d9da6

Update output docstring

f0ad14a

Parameter order error

56b9a37

Format code using black

7a595e8

Modifying plot curves

a2f31d4

Applying Juan corrections

2e008ea

juanitorduz requested changes Aug 16, 2023


tests/mmm/test_utils.py (outdated, resolved)

juanitorduz requested changes Aug 16, 2023


cetagostini added 10 commits August 26, 2023 00:21

Adding new model saturation function

265ab69

Merge remote-tracking branch 'upstream/main' into dev_budget_allocati…

abcde2d

…on_branch

Checking methods, modify names

d9ca6f5

Changes

036bc1a

Adjusting error

c492005

Adding docstring to functions

98d0418

changing parameters

7d66ac9

Debugging error

2076ff3

Returning values

03fa95f

Debugging lines out

3ae0a67

update branch

1ebeddb

cetagostini added 2 commits September 11, 2023 12:58

Reverting update

a572a3a

Merge remote-tracking branch 'upstream/main' into dev_budget_allocati…

5d2832f

…on_branch

Merge remote-tracking branch 'upstream/main' into dev_budget_allocati…

38aae8e

…on_branch Updating branch

new unit test to increase coverage report

bd4cb15

cetagostini added 4 commits September 19, 2023 14:46

Merge remote-tracking branch 'upstream/main' into dev_budget_allocati…

c679b88

…on_branch

Increase ranges to find best_fit

331971a

Adding flexibility to estimation functions

2af4f0a

Small corrections

51afe7c

cetagostini added 2 commits September 28, 2023 20:21

Correcting error on worflows

975a89d

Adding Experimental note

cef8c86

cetagostini added 3 commits September 30, 2023 19:33

Adding warnings.

1c4f093

deleting by default values

7562714

Correcting position values

aeb7faf

juanitorduz approved these changes Oct 2, 2023


twiecki merged commit 8e4fe3e into pymc-labs:main Oct 2, 2023
11 of 12 checks passed

PabloRoque mentioned this pull request Sep 23, 2024

michaelis_menten transformation as pt.TensorVariable #1054

Merged

13 tasks
