plot_slopes and slopes #699

GStechschulte · 2023-07-20T09:02:38Z

This draft PR introduces slopes and plot_slopes. A slope is the "partial derivative of the regression equation with respect to (wrt) a regressor of interest" (definition taken from marginaleffects). slopes returns a summary dataframe, and plot_slopes plots the slope of the regressor of interest (wrt). This PR supports the following quantities of interest:

unit level slopes
average slopes
average by group slopes
user defined value slopes

Additionally, this PR supports interpreting slopes as an elasticity (a percent change in $x$ is associated with a percent change in $y$). If the regressor of interest is categorical, computing an elasticity is not possible; for example, there is no percent change in penguin species. Furthermore, if wrt is categorical when calling slopes, comparisons is called instead to compute the difference in contrast means.
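
For reference, the two quantities described so far can be written out as follows. The notation is an assumption for illustration ($\hat{y}$ for the model prediction, $x$ for the wrt regressor), and the elasticity form follows the marginaleffects convention rather than anything confirmed by this PR's code:

```latex
\text{dydx} = \frac{\partial \hat{y}}{\partial x},
\qquad
\text{eyex} = \frac{\partial \hat{y}}{\partial x} \cdot \frac{x}{\hat{y}}
```

The second expression is only defined for a numeric $x$, which is consistent with the restriction on categorical regressors noted above.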

slopes and comparisons are closely related. Thus, I have refactored the code such that these two functions can use the same code when building the summary dataframe and plotting. Lastly, I simplified the code to build the summary data frames.

GStechschulte · 2023-07-20T09:11:53Z

A demo is shown below:

import bambi as bmb
from bambi.plots import slopes, plot_slopes

data = bmb.load_data("mtcars")
model = bmb.Model("mpg ~ hp * wt + drat", data=data, family="gaussian")
idata = model.fit(draws=1000, chains=2)

Unit level slopes

slopes(
    model,
    idata,
    wrt="hp",
    conditional=None
).head(10)

term	estimate_type	value	drat	wt	estimate	lower_3.0%	upper_97.0%
hp	dydx	(110.0, 110.0001)	3.90	2.620	-0.046077	-0.061826	-0.027424
hp	dydx	(110.0, 110.0001)	3.90	2.875	-0.039438	-0.055346	-0.024399
hp	dydx	(93.0, 93.0001)	3.85	2.320	-0.053887	-0.074023	-0.033815
hp	dydx	(110.0, 110.0001)	3.08	3.215	-0.030586	-0.045647	-0.016320
hp	dydx	(175.0, 175.0001)	3.15	3.440	-0.024728	-0.040884	-0.009271
hp	dydx	(105.0, 105.0001)	2.76	3.460	-0.024207	-0.040335	-0.008568
hp	dydx	(245.0, 245.0001)	3.21	3.570	-0.021343	-0.039777	-0.006217
hp	dydx	(62.0, 62.0001)	3.69	3.190	-0.031237	-0.046218	-0.016832
hp	dydx	(95.0, 95.0001)	3.92	3.150	-0.032278	-0.046864	-0.017400
hp	dydx	(123.0, 123.0001)	3.92	3.440	-0.024728	-0.040884	-0.009271

Average slopes

slopes(
    model,
    idata,
    wrt="hp",
    conditional=None,
    average_by=True
)

term	estimate_type	estimate	lower_3.0%	upper_97.0%
hp	dydx	-0.030527	-0.051032	-0.01007
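
As a rough illustration of what "average slopes" means, the sketch below averages the summary columns of the unit-level rows shown earlier. This thread does not say whether Bambi averages posterior draws or the summarized columns, so treating it as a column-wise mean is an assumption, and `average_rows` is a hypothetical helper. Only the first 10 rows printed above are used, so the result will not reproduce the table above.

```python
from statistics import mean

# Rows copied from the unit-level table above (head(10) only):
# (wt, estimate, lower_3.0%, upper_97.0%)
unit_rows = [
    (2.620, -0.046077, -0.061826, -0.027424),
    (2.875, -0.039438, -0.055346, -0.024399),
    (2.320, -0.053887, -0.074023, -0.033815),
    (3.215, -0.030586, -0.045647, -0.016320),
    (3.440, -0.024728, -0.040884, -0.009271),
    (3.460, -0.024207, -0.040335, -0.008568),
    (3.570, -0.021343, -0.039777, -0.006217),
    (3.190, -0.031237, -0.046218, -0.016832),
    (3.150, -0.032278, -0.046864, -0.017400),
    (3.440, -0.024728, -0.040884, -0.009271),
]


def average_rows(rows):
    """Column-wise mean of the estimate and interval bounds (hypothetical helper)."""
    return {
        "estimate": mean(r[1] for r in rows),
        "lower_3.0%": mean(r[2] for r in rows),
        "upper_97.0%": mean(r[3] for r in rows),
    }


average = average_rows(unit_rows)
print(average)
```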

Average by group slopes (and plot)

fig, ax = plot_slopes(
    model,
    idata,
    wrt="hp",
    conditional=None,
    average_by="wt"
)
fig.set_size_inches(7, 3)
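
Averaging by group, as with average_by="wt", can be pictured as a group-by over the unit-level rows followed by a mean within each group. The sketch below uses a few rows from the unit-level table and a hypothetical `average_by` helper; it is not Bambi's implementation.

```python
from collections import defaultdict
from statistics import mean

# A few unit-level rows from the table above; wt = 3.440 appears twice.
rows = [
    {"wt": 2.620, "estimate": -0.046077},
    {"wt": 3.440, "estimate": -0.024728},
    {"wt": 3.460, "estimate": -0.024207},
    {"wt": 3.440, "estimate": -0.024728},
]


def average_by(rows, key):
    """Mean estimate within each distinct value of `key` (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["estimate"])
    # Sort by group value so the output order is stable.
    return {value: mean(estimates) for value, estimates in sorted(groups.items())}


by_wt = average_by(rows, "wt")
print(by_wt)
```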

Condition on weight and plot elasticity

fig, ax = plot_slopes(
    model,
    idata,
    wrt="hp",
    conditional="wt",
    slope="eyex"
)
fig.set_size_inches(7, 3)

User provided values

fig, ax = plot_slopes(
    model,
    idata,
    wrt="hp",
    conditional={"wt": [2, 3, 5], "drat": [3.5, 4, 4.5]},
)
fig.set_size_inches(7, 3)

tomicapretto

Providing a quick review. I couldn't find big issues with the code, it's already almost done. Still, I'm requesting some changes (stylistic and minor updates). As always, fantastic work. Thanks a lot!

Edit: It would be good to explain what the value column in the returned data frame is.

GStechschulte · 2023-07-22T08:32:09Z

Thanks a lot for the review. :) Much appreciated. In the latest commits I realised I forgot to explain what the value column is. I will add this as an inline comment and also explain it in the docs.

GStechschulte · 2023-07-22T08:35:05Z

To Do:

add tests
resolve pylint errors
add the following semi-elasticity slopes (currently slopes only supports eyex): eydx and dyex

GStechschulte · 2023-07-26T17:50:55Z

The latest commits added the following functionality:

semi-elasticities (eydx, dyex)
user provided multiple values with wrt

and added tests for plot_slopes. Below, I provide an example of added functionality (using the model and data from above):

User provided multiple values with wrt. The value column holds the pair of wrt values used to compute the derivative. For each value the user passes, a small amount $\epsilon$ (eps) is added to it, and the resulting change in the prediction is divided by eps to obtain the "instantaneous rate of change":

slopes(
    model,
    idata,
    wrt={"hp": [150, 200, 250]},
    conditional=["wt"]
)

term	estimate_type	value	wt	drat	estimate	lower_3.0%	upper_97.0%
hp	dydx	(150.0, 150.0001)	1.513000	3.596563	-0.074025	-0.104541	-0.041595
hp	dydx	(200.0, 200.0001)	1.513000	3.596563	-0.072006	-0.102318	-0.041779
hp	dydx	(250.0, 250.0001)	1.513000	3.596563	-0.069988	-0.099388	-0.040974
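
The pairs in the value column, such as (150.0, 150.0001), suggest a forward finite difference with eps = 1e-4. A minimal sketch of that idea follows; `toy_predict` is a made-up stand-in for the model's mean prediction, not the fitted model, and `finite_difference_slope` is a hypothetical helper rather than a Bambi function.

```python
def finite_difference_slope(predict, x, eps=1e-4):
    """Forward-difference approximation of d(predict)/dx evaluated at x."""
    return (predict(x + eps) - predict(x)) / eps


def toy_predict(hp, wt=3.0):
    """Made-up linear predictor with an hp:wt interaction (illustration only)."""
    return 40.0 - 0.10 * hp - 5.0 * wt + 0.02 * hp * wt


# Evaluate the slope wrt hp at each user-provided value.
slopes_at = {}
for hp in [150.0, 200.0, 250.0]:
    slopes_at[hp] = finite_difference_slope(toy_predict, hp)
    print((hp, hp + 1e-4), slopes_at[hp])
```

Because the toy predictor is linear in hp for a fixed wt, every slope here is close to -0.10 + 0.02 * 3.0 = -0.04; with a real model, the slope can differ between the values passed to wrt.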

eydx: unit increase in $x$ (wrt) is associated with a % change in $y$:

plot_slopes(
    model,
    idata,
    wrt={"hp": 150},
    conditional=["wt"],
    slope="eydx"
)

dyex: % change in $x$ (wrt) is associated with a unit increase in $y$:

fig, ax = plot_slopes(
    model,
    idata,
    wrt={"hp": 150},
    conditional=["wt"],
    slope="dyex"
)
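
The four slope types can be related to one another with a short sketch. The scaling rules below follow the marginaleffects definitions of dydx, eyex, eydx, and dyex; whether this PR computes them in exactly this way is an assumption, and `slope_as` is a hypothetical helper.

```python
def slope_as(kind, dydx, x, y):
    """Rescale a raw slope dy/dx into the requested (semi-)elasticity."""
    if kind == "dydx":
        return dydx            # unit change in x -> unit change in y
    if kind == "eyex":
        return dydx * x / y    # percent change in x -> percent change in y
    if kind == "eydx":
        return dydx / y        # unit change in x -> percent change in y
    if kind == "dyex":
        return dydx * x        # percent change in x -> unit change in y
    raise ValueError(f"unknown slope type: {kind!r}")


# Illustrative numbers only: a slope of -0.04 at hp = 150 with a predicted mpg of 20.
for kind in ["dydx", "eyex", "eydx", "dyex"]:
    print(kind, slope_as(kind, dydx=-0.04, x=150.0, y=20.0))
```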

codecov-commenter · 2023-07-26T18:21:21Z

Codecov Report

Merging #699 (541621b) into main (cbbf955) will increase coverage by 0.70%.
The diff coverage is 90.00%.

@@            Coverage Diff             @@
##             main     #699      +/-   ##
==========================================
+ Coverage   88.87%   89.58%   +0.70%     
==========================================
  Files          43       43              
  Lines        3362     3523     +161     
==========================================
+ Hits         2988     3156     +168     
+ Misses        374      367       -7

Files Changed	Coverage Δ
bambi/plots/plotting.py	84.96% <87.01%> (-3.82%)
bambi/plots/effects.py	89.24% <87.89%> (+11.08%)
bambi/plots/utils.py	86.76% <93.22%> (+5.91%)
bambi/plots/__init__.py	100.00% <100.00%> (ø)
bambi/plots/create_data.py	100.00% <100.00%> (+24.39%)

... and 1 file with indirect coverage changes

tomicapretto

Hi @GStechschulte! I'm requesting some changes. In terms of code, very minor things. But in general, I'm asking to add many docstrings.

GStechschulte · 2023-08-06T08:36:51Z

Thanks for the review! In regard to docstrings, I feel I must explain myself. The reason for me not adding docstrings to non-public modules was because I was following the PEP 8 section on Documentation Strings:

Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the "def" line.

Nonetheless, I recognise that there are docstrings for non-public modules in the Bambi codebase, and that they help in explainability. I will add them, it is no problem 😄

tomicapretto · 2023-08-07T19:15:28Z

I agree it's not necessary to add docstrings to everything, especially when it starts to become redundant. However, some internal documentation is also nice to help others understand how things work. Thanks for the flexibility in adding the requested docstrings :)

tomicapretto reviewed Jul 20, 2023

bambi/plots/__init__.py (outdated)
bambi/plots/__init__.py
bambi/plots/create_data.py (outdated)
bambi/plots/plot_types.py (outdated)
bambi/plots/plot_types.py (outdated)
bambi/plots/utils.py (outdated)
bambi/plots/plotting.py (outdated)

tomicapretto requested changes Jul 20, 2023

GStechschulte added 12 commits July 22, 2023 10:27

add slopes and plot_slopes

b125a47

common create data function for comparisons and slopes

bb4ec6a

add slopes and PredictiveDifference class for computing effects and building summary dataframe

da1b6d4

Assign colors for single covariates by @tjburch

c7f6410

add plot_slopes and common plotting function for slopes and comparisons

689641a

common VariableInfo class and default value computation for slopes and comparisons

6935ceb

reorder imports alphabetically

8bd459c

improved docstring for plot_slopes

ff60f08

move private inner functions outside of 'create_differences_data' func

4a4482b

slopes supports multiple values, added args. to 'get_estimate', and improved parsing of uncertainty dict

dd8bd43

add color='C0' to fix color bug

dbb7cc1

update to VariableInfo class to allow slopes with user provided multiple values

e326a6f

GStechschulte force-pushed the slopes branch from 562f43e to e326a6f Compare July 22, 2023 08:29

GStechschulte added 6 commits July 24, 2023 21:11

add support for semi-elasticities, move slopes and setting of variable values to other methods

392625f

update ValueError to include semi-elasticities

a8f72b3

raise ValueError is slopes not in semi-elasticities

d76e3e1

run black code formatting

b883df6

if slopes effect, convert columns that are not 'wrt' to original dtype

c04858e

pass transforms as arg. to PredictiveDifferences

eebfb34

GStechschulte marked this pull request as ready for review July 26, 2023 17:39

GStechschulte added 2 commits July 26, 2023 20:33

docstring fixes / enhancements

dc77947

run black formatting

27a6360

GStechschulte requested a review from tomicapretto July 27, 2023 05:32

bug fixed when user provided values > 3 and better error raise ValueError messages

7d6f82d

GStechschulte mentioned this pull request Aug 1, 2023

slopes documentation #701

Merged

GStechschulte added 3 commits August 2, 2023 21:27

improved error handling for and

9dbd181

improved error handling for and

e426978

run black formatting

785e88d