plot_slopes and slopes #699
A demo is shown below:

import bambi as bmb
from bambi.plots import slopes, plot_slopes

data = bmb.load_data("mtcars")
model = bmb.Model("mpg ~ hp * wt + drat", data=data, family="gaussian")
idata = model.fit(draws=1000, chains=2)

Unit level slopes

slopes(
    model,
    idata,
    wrt="hp",
    conditional=None
).head(10)

Average slopes

slopes(
    model,
    idata,
    wrt="hp",
    conditional=None,
    average_by=True
)

Average by group slopes (and plot)

fig, ax = plot_slopes(
    model,
    idata,
    wrt="hp",
    conditional=None,
    average_by="wt"
)
fig.set_size_inches(7, 3)

Condition on weight and plot elasticity

fig, ax = plot_slopes(
    model,
    idata,
    wrt="hp",
    conditional="wt",
    slope="eyex"
)
fig.set_size_inches(7, 3)

User provided values

fig, ax = plot_slopes(
    model,
    idata,
    wrt="hp",
    conditional={"wt": [2, 3, 5], "drat": [3.5, 4, 4.5]},
)
fig.set_size_inches(7, 3)
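For intuition about what the unit level, average, and average by group summaries report, here is a minimal sketch that does not use Bambi. It assumes a hand-picked set of coefficients for a mean function shaped like `mpg ~ hp * wt + drat` and approximates the slope with respect to `hp` by a forward finite difference. The names `COEF`, `predict_mpg`, and `unit_slopes`, the coefficient values, and the four data rows are all illustrative assumptions, not fitted values and not Bambi's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical coefficients (NOT fitted values) for mpg ~ hp * wt + drat.
COEF = {"intercept": 40.0, "hp": -0.10, "wt": -6.0, "hp:wt": 0.02, "drat": 1.0}


def predict_mpg(hp, wt, drat):
    """Mean function with an hp:wt interaction, using the assumed coefficients."""
    return (
        COEF["intercept"]
        + COEF["hp"] * hp
        + COEF["wt"] * wt
        + COEF["hp:wt"] * hp * wt
        + COEF["drat"] * drat
    )


def unit_slopes(rows, eps=1e-4):
    """Forward finite-difference slope with respect to hp, one value per row."""
    return [
        (predict_mpg(r["hp"] + eps, r["wt"], r["drat"])
         - predict_mpg(r["hp"], r["wt"], r["drat"])) / eps
        for r in rows
    ]


# A few made-up observations standing in for the data set.
rows = [
    {"hp": 110, "wt": 2.0, "drat": 3.9},
    {"hp": 150, "wt": 3.0, "drat": 3.1},
    {"hp": 200, "wt": 3.0, "drat": 3.5},
    {"hp": 250, "wt": 5.0, "drat": 4.0},
]

# Unit level slopes: one slope per observation.
slopes_hp = unit_slopes(rows)

# Average slope: the mean of the unit level slopes.
average_slope = mean(slopes_hp)

# Average by group: the mean of the unit level slopes within each wt value.
grouped = defaultdict(list)
for row, slope in zip(rows, slopes_hp):
    grouped[row["wt"]].append(slope)
average_by_wt = {wt: mean(values) for wt, values in grouped.items()}
```

Because of the `hp:wt` interaction, the slope of `hp` in this sketch depends on `wt`, which is why averaging by `wt` gives a different number for each group.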
Providing a quick review. I couldn't find any big issues with the code; it's almost done already. Still, I'm requesting some changes (stylistic and minor updates). As always, fantastic work. Thanks a lot!
Edit: It would be good to explain what the `value` column in the returned data frame contains.
…uilding summary dataframe
…mproved parsing of uncertainty dict
Thanks a lot for the review. :) Much appreciated. In the latest commits I realised I forgot to explain what the `value` column is. I will add this as an inline comment and also explain it in the docs.
To Do:
…e values to other methods
The latest commits added the following functionality:
and added tests for

User provided multiple values with

slopes(
    model,
    idata,
    wrt={"hp": [150, 200, 250]},
    conditional=["wt"]
)

plot_slopes(
    model,
    idata,
    wrt={"hp": 150},
    conditional=["wt"],
    slope="eydx"
)

fig, ax = plot_slopes(
    model,
    idata,
    wrt={"hp": 150},
    conditional=["wt"],
    slope="dyex"
)
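The `slope` values used in these calls (`"eydx"`, `"dyex"`, and `"eyex"` in the earlier demo) select how a raw slope is rescaled. The sketch below follows the definitions used by marginaleffects, where `"dydx"` is the raw derivative. The helper name `rescale_slope` and the input numbers are hypothetical, and whether Bambi computes these quantities in exactly this way internally is an assumption.

```python
def rescale_slope(dydx, x, y, slope="dydx"):
    """Rescale a raw slope dy/dx evaluated at the point (x, y).

    Hypothetical helper following the marginaleffects definitions:
      dydx: unit change in x    -> unit change in y
      eyex: percent change in x -> percent change in y
      eydx: unit change in x    -> percent change in y
      dyex: percent change in x -> unit change in y
    """
    if slope == "dydx":
        return dydx
    if slope == "eyex":
        return dydx * x / y
    if slope == "eydx":
        return dydx / y
    if slope == "dyex":
        return dydx * x
    raise ValueError(f"unknown slope type: {slope!r}")


# Made-up inputs: a raw slope of -0.04 mpg per hp at hp=150, mpg=20.
raw = -0.04
results = {
    kind: rescale_slope(raw, x=150.0, y=20.0, slope=kind)
    for kind in ("dydx", "eyex", "eydx", "dyex")
}
```

All three elasticity-style variants multiply or divide by the values of `x` or `y`, so they only make sense when those values are numeric and `y` is nonzero.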
Codecov Report
@@            Coverage Diff             @@
##             main     #699      +/-   ##
==========================================
+ Coverage   88.87%   89.58%   +0.70%
==========================================
  Files          43       43
  Lines        3362     3523     +161
==========================================
+ Hits         2988     3156     +168
+ Misses        374      367       -7

... and 1 file with indirect coverage changes
Hi @GStechschulte! I'm requesting some changes. In terms of code, very minor things. But in general, I'm asking to add many docstrings.
Thanks for the review! In regard to docstrings, I feel I must explain myself. The reason for me not adding docstrings to non-public modules was because I was following the PEP 8 section on Documentation Strings:
Nonetheless, I recognise that there are docstrings for non-public modules in the Bambi codebase, and that they help in explainability. I will add them; it is no problem 😄
I agree it's not necessary to add docstrings to everything, especially when it starts to become redundant. However, some internal documentation is also nice to help others understand how things work. Thanks for the flexibility in adding the requested docstrings :)
This draft PR introduces `slopes` and `plot_slopes`. Slopes can be defined as the "partial derivative of the regression equation with respect to (wrt) a regressor of interest" (definition taken from marginaleffects). `slopes` and `plot_slopes` allow the user to view a summary dataframe and/or plot the slope of the regressor of interest (`wrt`). This PR supports the following quantities of interest:

Additionally, this PR supports interpreting slopes as an elasticity (a percent change in $x$ is associated with a percent change in $y$). If the regressor of interest is categorical, computing an elasticity is not possible. For example, there is no % change in penguin species. Furthermore, if `wrt` is categorical when calling `slopes`, then `comparisons` is called to compute the difference in contrast means.

`slopes` and `comparisons` are closely related. Thus, I have refactored the code so that these two functions share the same code when building the summary dataframe and plotting. Lastly, I simplified the code that builds the summary data frames.
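The categorical case described above can be sketched as a simple dispatch: numeric `wrt` values lead to a slope, while categorical values fall back to a difference in mean predictions between levels, and requesting an elasticity for them is rejected. The functions `is_categorical`, `choose_effect`, and `contrast_of_means`, the error raised, and the prediction values are hypothetical illustrations under these assumptions, not Bambi's internal code.

```python
from statistics import mean


def is_categorical(values):
    """Treat a regressor as categorical when all of its values are strings."""
    return all(isinstance(v, str) for v in values)


def choose_effect(wrt_values, slope="dydx"):
    """Return which kind of summary would be computed for these wrt values."""
    if is_categorical(wrt_values):
        if slope != "dydx":
            # There is no percent change in a category such as a species.
            raise ValueError("elasticity is undefined for a categorical regressor")
        return "comparisons"
    return "slopes"


def contrast_of_means(predictions_by_level, reference, level):
    """Difference in mean predictions between one level and a reference level."""
    return mean(predictions_by_level[level]) - mean(predictions_by_level[reference])


# Made-up predictions (e.g. body mass in kg) for two penguin species.
predictions = {"Adelie": [3.7, 3.8], "Gentoo": [5.0, 5.1]}

numeric_case = choose_effect([150, 200, 250])
categorical_case = choose_effect(["Adelie", "Gentoo"])
difference = contrast_of_means(predictions, reference="Adelie", level="Gentoo")
```

Routing both cases through one entry point mirrors the motivation stated in the description: the two quantities are closely related, so the summary and plotting code can be shared.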