The SEXIT (Sequential Effect eXistence and sIgnificance Test) framework #237

DominiqueMakowski · 2019-10-06T03:49:45Z

DominiqueMakowski
Oct 6, 2019
Maintainer

Here are some ideas that have been in the back of my mind about this thing I've been mentioning here and there, put here to open the discussion about it and guide the development of a possible future direction:

Motivation

There's a debate between which index is the best, and which one to report. This leads to the mindless report of all possible indices, with that the reader will be satisfied, but often without the writer understanding and interpreting them. Indeed, it's complicated to juggle between so many indices at the same time, that have complicated definitions and subtle differences.

The focus should be put on intuitiveness and explicitness of the indices' interpretation (rely as little as possible on magical numbers or arbitrary rules), and practical meaningfulness and usefulness of the indices.

To that end, we suggest a system of description of parameters that would be intuitive, easy to learn and apply, mathematically accurate and useful for taking decision.

Theoretical idea

Improvement over NHST
Reframe effect testing as a sequence of questions/answers
For instance: Direction certainty -> Significance -> Uncertainty and central tendency -> Size interpretation
This IMO makes pragmatic sense. Let's take the applied context (although quite specific) of the efficiency of a given treatment. The first thing to assess is whether the treatment is not harmful (i.e., the probability that the effect is for instance positive). Once we are assured that it is not harmful, the following question is "does is actually help?" i.e., is it related to a significant improvement or a negligible one?

Ideas

Which would translate into something like give pd, if pd > 99% (for example), then assess significance (ROPE). Also, report characteristics of the posterior (median & CI). Some points of debate:

Unidirectional ROPE and inversed ROPE

Once we assessed the direction of the effect, and said that it's for instance likely positive, it seems sound to investigate a ROPE defined on the positive side, to investigate the probability that the effect is bigger than a given threshold. But then, it creates a practical issue, because let's say 99% of the posterior is positive, and 3% of this posterior is in the ROPE -0.1; 0.1. If we only take interest in the 0; 0.1 ROPE, what to do with the 1% of posterior that is negative? Thus, a more straightforward and solving approach is not to report the percentage of the posterior that is in the ROPE, but the percentage of posterior that is NOT in the ROPE (i.e., is bigger). This inversed take on ROPE, (which could be re-named Region of Practical Significance (ROPS)) might also "speak" more to people. In the case above (pd = 99%, 3% in ROPE [-0.1, 0.1]) we would say that that there is 99% of probability that the effect is positive and 97% of probability that its size is significant (i.e., bigger than the negligibility threshold).

Probabilistic ROPE

Another issue with the ROPE is the discrete and hard bounds. In the example above (threshold 0.1), 0.999 is considered as negligible and 0.10001 as not negligible. Although practical, this seems to contradict the underlying probabilistic perspective. I've tried in the past (e.g.), then abandoned, the idea of a probabilistic definition of ROPE. I am still not sure if that makes any sense, but the idea is to define the ROPE as a distribution, e.g., normal (mean 0, SD 0.1/3), in which the "weight" or density increases as we go closer to 0. What I tried to do with this distribution is to take the overlap between the posterior and the ROPE distribution, to take into account the fact that 0.0001 is different from 0.099999. The overlap might not be the best approach to it, but it keeps coming to my mind as a solution to soften the ROPE boundaries.

Probabilistic Effect size labelling (https://github.com/easystats/report/issues/24)

Instead of interpreting (i.e. labelling) the effect size of the point-estimate, it makes more sense to give the proportion of the posterior within each "category" (small, medium, large etc.).

Uncertainty interpretation A standardized (i.e., absolute) index of uncertainty? #221

Should we try establishing guidelines for actually interpreting the uncertainty, rather than just reporting the indices? I.e., for instance the width of the standardized CI, or the SD, so that we can conclude something like "the estimation yielded precise parameters" or something

Thoughts?

DominiqueMakowski · 2020-09-27T10:38:32Z

DominiqueMakowski
Sep 27, 2020
Maintainer Author

Note to future me: #339 made me think that it would be useful to have convenience functions (could be smart wrappers) for simple and straightforward tests (correlations, t-tests and this kind of jazz). These test_* functions (test_correlation(), test_difference()) could implement best practices (such as the to-be-developed SEXIT framework), nice informative output etc. Would probably have to go in a new package.

0 replies

DominiqueMakowski · 2020-10-14T10:21:33Z

DominiqueMakowski
Oct 14, 2020
Maintainer Author

Once the thresholds for non-significance (i.e., the ROPE) and the one for a "large" or "moderate" effect are explicitly defined, the SEXIT framework does not make any interpretation. I.e., it does not label the effects, but just gives the 3 sequential probabilities as-is and the description of the posterior. It provides a lot of information about the posterior distribution (the probabilities of different sections) in a clear and meaningful way.

Example of potential formulations:

There is 99% of probability that the effect of X is positive, 97% of probability that it is significant and 35% of probability that it is large (median, 95% CI)
There are 99%, 97% and 35% of probabilities that the effect of X is positive, significant and large, respectively (median, 95% CI).
The effect of X (median, 95% CI) is positive, significant and large with respectively 99%, 97%, and 35% of probability.
The effect of X (median, 95% CI) is positive (p > 0), significant (p > 0.1) and large (p > 0.3) with respectively 99%, 97%, and 35% of probability.
The effect of X (median, 95% CI) is positive (p > 0 = 99%), significant (p > 0.1 = 97%) and ... (p > 0.3 = 35%) [this one requires to label]

which one is the best?

0 replies

mattansb · 2020-10-14T10:27:53Z

mattansb
Oct 14, 2020
Maintainer

I like this very much!

The first one makes most sense to me, as saying "the effect is large (0.02%)" seems confusing.

0 replies

DominiqueMakowski · 2020-10-14T10:34:55Z

DominiqueMakowski
Oct 14, 2020
Maintainer Author

True but is also the longest :( I'm thinking about how it would fit into a manuscript when you have to report a handful of parameters for a dozen of models ^^, so being compact is a plus.

For instance, I find reminding the thresholds used (like in is positive (p > 0), significant (p > 0.1) and large (p > 0.3)) is a good idea to underline to what these values correspond and their relative arbitrariness (is that a word?). But then this kind of information might be better put once in the model's description (since it's will mostly depend for instance on the data for the ROPE) then repeating it for each parameter... so IDK 😅

0 replies

DominiqueMakowski · 2020-10-14T10:39:51Z

DominiqueMakowski
Oct 14, 2020
Maintainer Author

There are 99%, 97% and 35% of probabilities that the effect of X is positive, significant and large, respectively (median, 95% CI).
There are 99%, 97% and 35% of probabilities that the effect of X (median, 95% CI) is positive, significant and large, respectively.

Might be good compromises? the information is nicely grouped, and presented in a relatively concise way

0 replies

mattansb · 2020-10-14T10:40:00Z

mattansb
Oct 14, 2020
Maintainer

Then maybe:

"There is 99% of probability that the effect of X is positive (median, 95% CI, Pr_sig=0.97, Pr_large=0.35)"

0 replies

mattansb · 2020-10-14T10:40:50Z

mattansb
Oct 14, 2020
Maintainer

There are 99%, 97% and 35% of probabilities that the effect of X is positive, significant and large, respectively (median, 95% CI).

Or this one?

0 replies

DominiqueMakowski · 2020-10-14T10:41:24Z

DominiqueMakowski
Oct 14, 2020
Maintainer Author

"There is 99% of probability that the effect of X is positive (median, 95% CI, Prsig=0.97, Prlarge=0.35)"

But then I have the feeling that the practical significance and large probabilities get kind of lost in the parentheses with the other indices

0 replies

DominiqueMakowski · 2020-10-14T10:42:16Z

DominiqueMakowski
Oct 14, 2020
Maintainer Author

Do you think it would be relevant to add somewhere in there some BF or would it be redundant?

0 replies

DominiqueMakowski · 2020-10-14T11:08:25Z

DominiqueMakowski
Oct 14, 2020
Maintainer Author

Though this one wouldn't be bad either, it puts the base information first and then the sexit stuff

The effect of X (median, 95% CI) has 99%, 97% and 35% probability of being positive, significant and large.

And in the case of models with few parameters (like a t-test or a correlation etc.), one could just insert the thresholds:

The effect of X (median, 95% CI) has 99%, 97% and 35% probability of being positive (> 0), significant (> 0.1) and large (> 0.3).

And / or we could think about explicating the "the effect" part:

An increase of 1 of X (0.4 SD)* has 99%, 97% and 35% probability of having a positive, significant and large change (median, 95% CI) on Y.

*(in the case of non-standardized data, to give an idea of what it represents in terms of variance)

This could make the reports of interaction clearer, for instance for X1:X2:

An increase of 1 of X2 (0.4 SD) has 99%, 97% and 35% probability of having a positive, significant and large change (median, 95% CI) on the effect of X1.

0 replies

mattansb · 2020-10-14T12:28:11Z

mattansb
Oct 14, 2020
Maintainer

I like these 👆

Do you think it would be relevant to add somewhere in there some BF or would it be redundant?

Not redundant, a BF can definitely be added alongside the p-ROPE - as part of the "existence" testing, if priors have been specified.

Just to complicate things more, you can also have multiple BFs to correspond to the multiple ps: 😅

library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.1
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())
library(bayestestR)

mtcars_Z <- effectsize::standardize(mtcars)

m <- stan_glm(mpg ~ cyl + am,
              family = gaussian(),
              data = mtcars_Z,
              prior = normal(0, 1, 1),
              refresh = 0)

m_prior <- unupdate(m)
#> Sampling priors, please wait...


# point
(b <- bayesfactor_parameters(m, m_prior, null = 0))
#> Loading required namespace: logspline
#> # Bayes Factor (Savage-Dickey density ratio)
#> 
#> Parameter   |       BF
#> ----------------------
#> (Intercept) |    0.037
#> cyl         | 7446.539
#> am          |    0.791
#> 
#> * Evidence Against The Null: [0]
plot(b) + ggplot2::coord_cartesian(xlim = c(-2,2))

# rope range
(b <- bayesfactor_parameters(m, m_prior, null = c(-0.1, 0.1)))
#> # Bayes Factor (Null-Interval)
#> 
#> Parameter   |       BF
#> ----------------------
#> (Intercept) |    0.013
#> cyl         | 5081.259
#> am          |    0.513
#> 
#> * Evidence Against The Null: [-0.1, 0.1]
plot(b) + ggplot2::coord_cartesian(xlim = c(-2,2))

# medium-large slope
(b <- bayesfactor_parameters(m, m_prior, null = c(-0.5, 0.5)))
#> # Bayes Factor (Null-Interval)
#> 
#> Parameter   |        BF
#> -----------------------
#> (Intercept) | 2.810e-06
#> cyl         |    32.139
#> am          |     0.005
#> 
#> * Evidence Against The Null: [-0.5, 0.5]
plot(b) + ggplot2::coord_cartesian(xlim = c(-2,2))

^{Created on 2020-10-14 by the reprex package (v0.3.0)}

0 replies

strengejacke · 2020-10-14T17:42:54Z

strengejacke
Oct 14, 2020
Maintainer

I don't think it makes sense reporting a probability of an effect being positive / negativ, when the CI does not include zero, because then this information is redundant. pd() makes most sense for "inconsistent" estimates, or when you don't have the CI at hand.
I would not report three prob-values and then the associated "measures" (i.e. not report "There are 99%, 97% and 35% of probabilities that the effect of X is positive, significant and large, respectively (median, 95% CI)."), because it requires too much thinking which prob. belongs to which measure (positive, sig., large).
I would not vote for "An increase of 1 of X2 (0.4 SD) has 99%, 97% and 35% probability of having a positive, significant and large change (median, 95% CI) on the effect of X1.", because if an effect is positive and significant, it doesn't matter how much you increase/decrease X2 - it's always significant.

0 replies

DominiqueMakowski · 2020-10-15T02:26:53Z

DominiqueMakowski
Oct 15, 2020
Maintainer Author

1

In theory yes, as the essence of SEXIT is the sequential focus. I.e., the first thing to demonstrate is the direction, and once this is clear, then one can focus on significance/non-negligibility, and once that is clear, one can focus on whether it is big.

So technically yes, if pd > 99.9%, then one could omit it and move on to non-negligibility. But that said, I think it's just clearer if the standards for reporting are consistent, i.e., if the reported indices are the same, in the same order. And it doesn't not make it particularly heavy on the eyes IMO to have something like:

X has > 99.9%, 84% and 35% probability of having a positive, significant and large change (median, 95% CI).

(i.e., to report the indices even if they are super high). After all, we do report p-values no matter what's their value.

However, it would potentially work not to report significance/strength if direction is not certain enough. But then again, I think it's easier if people stick with reporting the same thing in the same way

2

So you prefer something like:

There is 99% of probability that the effect of X is positive, 97% of probability that it is significant and 35% of probability that it is large (median, 95% CI)

Why not, but the repetition of "of probability that it is ..." bothers me a bit ^^

And TBH I reckon one could get very very quickly used to a SEXIT reading, i.e., naturally and automatically checking the first probability, and then moving on to the second if > 99.9, then eventually moving on to the third. In the end, it's even quite light on attention and memory, as there is only one information (one value) considered at a time.

3

We had this discussion somewhere, but yes IMO it is important to give to what the parameter corresponds relative to its own variance as this impacts the coefficient's scale, and therefore the p-sig and p-large.

sexit <- function(x, negligibility=0.05, strong=0.3){
  if(median(x) < 0){
    x <- -1 * x
    direction <- "negative"
  } else{
    direction <- "positive"
  }
  
  n <- length(x)
  pd <- insight::format_value(length(x[x > 0]) / n, as_percent=TRUE)
  psig <- insight::format_value(length(x[x > negligibility]) / n, as_percent=TRUE)
  pstrong <- insight::format_value(length(x[x > strong]) / n, as_percent=TRUE)
  
  paste0(pd, ", ", psig, ", ", pstrong, " of being ", direction, ", significant and strong")
}

df <- iris
x1 <- insight::get_parameters(rstanarm::stan_glm(Sepal.Length ~ Sepal.Width, data=df, refresh=0, iter=10000))[[2]]
sexit(x1)
#> [1] "92.15%, 86.52%, 31.13% of being negative, significant and strong"

df$Sepal.Width2 <- df$Sepal.Width / 100
x2 <- insight::get_parameters(rstanarm::stan_glm(Sepal.Length ~ Sepal.Width2, data=df, refresh=0, iter=10000))[[2]]
sexit(x2)
#> [1] "92.36%, 92.31%, 92.07% of being negative, significant and strong"

^{Created on 2020-10-15 by the reprex package (v0.3.0)}

Another option is to standardize the significance and importance threshold for each parameter by the scale of the variable it refers to, but then we fall back to the problem of standardized standardization which requires a full access and knowledge about the parameter's type. So by the time we have that, I think it could be easier to add the information. Though this point is a more a report-issue than a SEXIT problem per se (as it applies to all regression models).

0 replies

mattansb · 2020-10-15T05:03:29Z

mattansb
Oct 15, 2020
Maintainer

There is 99% of probability that the effect of X is positive, 97% of probability that it is significant and 35% of probability that it is large (median, 95% CI)

Maybe this can trimmed down to (with the values upfront):
"The effect of X (median, 95% CI) has a 99% probability of being positive, 97% of being significant and 35% of being large."

0 replies

strengejacke · 2020-10-15T05:40:45Z

strengejacke
Oct 15, 2020
Maintainer

We had this discussion somewhere, but yes IMO it is important to give to what the parameter corresponds relative to its own variance as this impacts the coefficient's scale, and therefore the p-sig and p-large.

I understand your intention, but it still is somehow misleading. Assume that the standardized X2 is 1 (or -1), then following sentences are all correct:

An increase of 1 of X2 (0.4 SD) has 99%, 97% and 35% probability of having a positive, significant and large change (median, 95% CI) on the effect of X1.

An increase of 3.5 of X2 (0.4 SD) has 99%, 97% and 35% probability of having a positive, significant and large change (median, 95% CI) on the effect of X1.

A decrease of 2 of X2 (0.4 SD) has 99%, 97% and 35% probability of having a negative, significant and large change (median, 95% CI) on the effect of X1.

the "1" in this case is arbitrary, and therefore somehow misleading. It implies that "1" is significant, large etc., but what about 2, 3, 4 etc?

0 replies

DominiqueMakowski · 2020-10-26T09:55:22Z

DominiqueMakowski
Oct 26, 2020
Maintainer Author

tada 🎉 (this will facilitate my life in report easystats/report#103)

library(rstanarm)
library(bayestestR)
#> Note: The default CI width might change in future versions (see https://github.com/easystats/bayestestR/issues/250).
#> To prevent any issues, please set it explicitly when using bayestestR functions, via the 'ci' argument.

model <- rstanarm::stan_glm(mpg ~ wt * cyl,
  data = mtcars,
  iter = 800, refresh = 0
)

s <- sexit(model)
s
#> # The thresholds beyond which the effect is considered as significant (i.e., non-negligible) and large are 0.30 and 1.81 (corresponding respectively to 0.05 and 0.30 of the outcome's SD).
#> 
#> (Intercept) (Median = 52.98, 95% CI [41.89, 65.71]) has a 100.00% probability of being positive (> 0), 100.00% of being significant (> 0.30), and 100.00% of being large (> 1.81)
#> - wt (Median = -8.14, 95% CI [-13.23, -3.93]) has a 99.88% probability of being negative (< 0), 99.81% of being significant (< -0.30), and 99.31% of being large (< -1.81)
#> - cyl (Median = -3.60, 95% CI [-5.49, -1.70]) has a 100.00% probability of being negative (< 0), 99.88% of being significant (< -0.30), and 96.94% of being large (< -1.81)
#> - wt:cyl (Median = 0.74, 95% CI [0.14, 1.44]) has a 98.38% probability of being positive (> 0), 92.12% of being significant (> 0.30), and 0.00% of being large (> 1.81)
#> 
#> Parameter   | Median |          95% CI | Existence (0) | Significance (0.30) | Large (1.81) |   NA |   NA
#> ---------------------------------------------------------------------------------------------------------
#> (Intercept) |  52.98 |  [41.89, 65.71] |         41.89 |               65.71 |         1.00 | 1.00 | 1.00
#> wt          |  -8.14 | [-13.23, -3.93] |        -13.23 |               -3.93 |         1.00 | 1.00 | 0.99
#> cyl         |  -3.60 |  [-5.49, -1.70] |         -5.49 |               -1.70 |         1.00 | 1.00 | 0.97
#> wt:cyl      |   0.74 |    [0.14, 1.44] |          0.14 |                1.44 |         0.98 | 0.92 | 0.00



print(s, summary=TRUE)
#> # The thresholds beyond which the effect is considered as significant (i.e., non-negligible) and large are 0.30 and 1.81 (corresponding respectively to 0.05 and 0.30 of the outcome's SD).
#> 
#> (Intercept) (Median = 52.98, 95% CI [41.89, 65.71]) has 100.00%, 100.00% and 100.00% probability of being positive (> 0), significant (> 0.30) and large (> 1.81)
#> - wt (Median = -8.14, 95% CI [-13.23, -3.93]) has 99.88%, 99.81% and 99.31% probability of being negative (< 0), significant (< -0.30) and large (< -1.81)
#> - cyl (Median = -3.60, 95% CI [-5.49, -1.70]) has 100.00%, 99.88% and 96.94% probability of being negative (< 0), significant (< -0.30) and large (< -1.81)
#> - wt:cyl (Median = 0.74, 95% CI [0.14, 1.44]) has 98.38%, 92.12% and 0.00% probability of being positive (> 0), significant (> 0.30) and large (> 1.81)

^{Created on 2020-10-26 by the reprex package (v0.3.0)}

What do you think?

0 replies

mattansb · 2020-10-26T10:06:27Z

mattansb
Oct 26, 2020
Maintainer

Looks good!
I would just add in report why type of CI was used (ETI vs HDI).

0 replies

mattansb · 2020-10-26T10:08:49Z

mattansb
Oct 26, 2020
Maintainer

Shouldn't the values for significant and large be > |x| ? (Also computation-wise?)

0 replies

DominiqueMakowski · 2020-10-26T10:12:41Z

DominiqueMakowski
Oct 26, 2020
Maintainer Author

I would just add in report why type of CI was used (ETI vs HDI).

I can add that.

Shouldn't the values for significant and large be > |x| ? (Also computation-wise?)

Yeah I thought about that. For the column names? like Significance (> |0.30|)?

0 replies

DominiqueMakowski · 2020-10-26T10:38:08Z

DominiqueMakowski
Oct 26, 2020
Maintainer Author

library(rstanarm)
library(bayestestR)
#> Note: The default CI width might change in future versions (see https://github.com/easystats/bayestestR/issues/250).
#> To prevent any issues, please set it explicitly when using bayestestR functions, via the 'ci' argument.

model <- rstanarm::stan_glm(mpg ~ wt * cyl,
  data = mtcars,
  iter = 800, refresh = 0
)

s <- sexit(model)
s
#> # Following the SEXIT framework, we report the median of the posterior distribution and its 95% CI (Highest Density Interval), along the probability of direction (pd), the probability of significance and the probability of being large.The thresholds beyond which the effect is considered as significant (i.e., non-negligible) and large are 0.30 and 1.81 (corresponding respectively to 0.05 and 0.30 of the outcome's SD).
#> 
#> (Intercept) (Median = 52.72, 95% CI [42.22, 64.87]) has a 100.00% probability of being positive (> 0), 100.00% of being significant (> 0.30), and 100.00% of being large (> 1.81)
#> - wt (Median = -8.16, 95% CI [-12.39, -3.76]) has a 99.94% probability of being negative (< 0), 99.94% of being significant (< -0.30), and 99.81% of being large (< -1.81)
#> - cyl (Median = -3.58, 95% CI [-5.40, -1.72]) has a 100.00% probability of being negative (< 0), 100.00% of being significant (< -0.30), and 96.75% of being large (< -1.81)
#> - wt:cyl (Median = 0.73, 95% CI [0.21, 1.42]) has a 99.25% probability of being positive (> 0), 92.50% of being significant (> 0.30), and 0.06% of being large (> 1.81)
#> 
#> Parameter   | Median |          95% CI | Existence (0) | Significance (0.30) | Large (1.81) |   NA |       NA
#> -------------------------------------------------------------------------------------------------------------
#> (Intercept) |  52.72 |  [42.22, 64.87] |         42.22 |               64.87 |         1.00 | 1.00 |     1.00
#> wt          |  -8.16 | [-12.39, -3.76] |        -12.39 |               -3.76 |         1.00 | 1.00 |     1.00
#> cyl         |  -3.58 |  [-5.40, -1.72] |         -5.40 |               -1.72 |         1.00 | 1.00 |     0.97
#> wt:cyl      |   0.73 |    [0.21, 1.42] |          0.21 |                1.42 |         0.99 | 0.92 | 6.25e-04


print(s, summary=TRUE)
#> # The thresholds beyond which the effect is considered as significant (i.e., non-negligible) and large are 0.30 and 1.81 (corresponding respectively to 0.05 and 0.30 of the outcome's SD).
#> 
#> (Intercept) (Median = 52.72, 95% CI [42.22, 64.87]) has 100.00%, 100.00% and 100.00% probability of being positive (> 0), significant (> 0.30) and large (> 1.81)
#> - wt (Median = -8.16, 95% CI [-12.39, -3.76]) has 99.94%, 99.94% and 99.81% probability of being negative (< 0), significant (< -0.30) and large (< -1.81)
#> - cyl (Median = -3.58, 95% CI [-5.40, -1.72]) has 100.00%, 100.00% and 96.75% probability of being negative (< 0), significant (< -0.30) and large (< -1.81)
#> - wt:cyl (Median = 0.73, 95% CI [0.21, 1.42]) has 99.25%, 92.50% and 0.06% probability of being positive (> 0), significant (> 0.30) and large (> 1.81)

^{Created on 2020-10-26 by the reprex package (v0.3.0)}

0 replies

mattansb · 2020-10-26T11:59:22Z

mattansb
Oct 26, 2020
Maintainer

Yeah I thought about that. For the column names? like Significance (> |0.30|)?

Yes, exactly.

0 replies

DominiqueMakowski · 2020-10-26T14:26:47Z

DominiqueMakowski
Oct 26, 2020
Maintainer Author

library(rstanarm)
library(bayestestR)
#> Note: The default CI width might change in future versions (see https://github.com/easystats/bayestestR/issues/250).
#> To prevent any issues, please set it explicitly when using bayestestR functions, via the 'ci' argument.

model <- rstanarm::stan_glm(mpg ~ wt * cyl,
  data = mtcars,
  iter = 800, refresh = 0
)

s <- sexit(model)
s
#> # Following the SEXIT framework, we report the median of the posterior distribution and its 95% CI (Highest Density Interval), along the probability of direction (pd), the probability of significance and the probability of being large.The thresholds beyond which the effect is considered as significant (i.e., non-negligible) and large are 0.30 and 1.81 (corresponding respectively to 0.05 and 0.30 of the outcome's SD).
#> 
#> (Intercept) (Median = 52.58, 95% CI [39.64, 64.09]) has a 100.00% probability of being positive (> 0), 100.00% of being significant (> 0.30), and 100.00% of being large (> 1.81)
#> - wt (Median = -8.03, 95% CI [-12.57, -3.31]) has a 99.81% probability of being negative (< 0), 99.81% of being significant (< -0.30), and 99.44% of being large (< -1.81)
#> - cyl (Median = -3.57, 95% CI [-5.55, -1.52]) has a 100.00% probability of being negative (< 0), 99.88% of being significant (< -0.30), and 95.38% of being large (< -1.81)
#> - wt:cyl (Median = 0.72, 95% CI [0.09, 1.37]) has a 98.44% probability of being positive (> 0), 89.88% of being significant (> 0.30), and 0.56% of being large (> 1.81)
#> 
#> Parameter   | Median |          95% CI | Existence (> |0|) | Significance (> |0.30|) | Large (> |1.81|)
#> -------------------------------------------------------------------------------------------------------
#> (Intercept) |  52.58 |  [39.64, 64.09] |              1.00 |                    1.00 |             1.00
#> wt          |  -8.03 | [-12.57, -3.31] |              1.00 |                    1.00 |             0.99
#> cyl         |  -3.57 |  [-5.55, -1.52] |              1.00 |                    1.00 |             0.95
#> wt:cyl      |   0.72 |    [0.09, 1.37] |              0.98 |                    0.90 |         5.62e-03


print(s, summary=TRUE)
#> # The thresholds beyond which the effect is considered as significant (i.e., non-negligible) and large are 0.30 and 1.81 (corresponding respectively to 0.05 and 0.30 of the outcome's SD).
#> 
#> (Intercept) (Median = 52.58, 95% CI [39.64, 64.09]) has 100.00%, 100.00% and 100.00% probability of being positive (> 0), significant (> 0.30) and large (> 1.81)
#> - wt (Median = -8.03, 95% CI [-12.57, -3.31]) has 99.81%, 99.81% and 99.44% probability of being negative (< 0), significant (< -0.30) and large (< -1.81)
#> - cyl (Median = -3.57, 95% CI [-5.55, -1.52]) has 100.00%, 99.88% and 95.38% probability of being negative (< 0), significant (< -0.30) and large (< -1.81)
#> - wt:cyl (Median = 0.72, 95% CI [0.09, 1.37]) has 98.44%, 89.88% and 0.56% probability of being positive (> 0), significant (> 0.30) and large (> 1.81)

^{Created on 2020-10-26 by the reprex package (v0.3.0)}

Next step is to mention it in the vignettesi, guidelines, README and then popularize it ☺️

0 replies

mattansb · 2020-10-26T15:20:07Z

mattansb
Oct 26, 2020
Maintainer

Some thoughts:

For pd, I don't think Existence (> |0|) is a good visual-textual representation if the pd (in the table and the text) - perhaps just Pr(Direction) in the table (or something like that?) and without anything in the parentheses in the text?
I would also add the > |threshold| is the text as well (in the relevant parentheses).

0 replies

strengejacke · 2020-12-02T07:42:21Z

strengejacke
Dec 2, 2020
Maintainer

I believe, to make this feature more prominent, it would be better to find a different acronym... Else, it will be underutilized.

0 replies

DominiqueMakowski · 2020-12-02T08:36:15Z

DominiqueMakowski
Dec 2, 2020
Maintainer Author

We can have an alternative, one for sad people and one for fun people 🤷

Others acronym are:

Sequential Effect Existence and Significance Test: SEEST...

So if you inSEEST we can add that as an alias 😅

0 replies

mattansb · 2020-12-02T08:37:46Z

mattansb
Dec 2, 2020
Maintainer

We can have an alternative, one for sad people and one for fun people 🤷

😂😂😂😂

0 replies

strengejacke · 2020-12-02T08:40:13Z

strengejacke
Dec 2, 2020
Maintainer

I personally find the name clear and easy to remember, but think of what we needed to do when submitting the effectsize-paper... :-|

0 replies

mattansb · 2020-12-02T08:41:14Z

mattansb
Dec 2, 2020
Maintainer

How about Sequential Existence and Magnitude Inference Testing of Effects?
That way we can call anyone who is apposed to this framework an antiSEMITE?

0 replies

DominiqueMakowski · 2020-12-02T08:51:29Z

DominiqueMakowski
Dec 2, 2020
Maintainer Author

omg haha 💣

0 replies

strengejacke · 2020-12-02T14:01:41Z

strengejacke
Dec 2, 2020
Maintainer

But this is nothing we need to decide for the next update, I'd say

0 replies

The SEXIT (Sequential Effect eXistence and sIgnificance Test) framework #237

DominiqueMakowski Oct 6, 2019 Maintainer

Motivation

Theoretical idea

Ideas

Replies: 41 comments

DominiqueMakowski Sep 27, 2020 Maintainer Author

DominiqueMakowski Oct 14, 2020 Maintainer Author

mattansb Oct 14, 2020 Maintainer

DominiqueMakowski Oct 14, 2020 Maintainer Author

DominiqueMakowski Oct 14, 2020 Maintainer Author

mattansb Oct 14, 2020 Maintainer

mattansb Oct 14, 2020 Maintainer

DominiqueMakowski Oct 14, 2020 Maintainer Author

DominiqueMakowski Oct 14, 2020 Maintainer Author

DominiqueMakowski Oct 14, 2020 Maintainer Author

mattansb Oct 14, 2020 Maintainer

strengejacke Oct 14, 2020 Maintainer

DominiqueMakowski Oct 15, 2020 Maintainer Author

1

2

3

mattansb Oct 15, 2020 Maintainer

strengejacke Oct 15, 2020 Maintainer

DominiqueMakowski Oct 26, 2020 Maintainer Author

mattansb Oct 26, 2020 Maintainer

mattansb Oct 26, 2020 Maintainer

DominiqueMakowski Oct 26, 2020 Maintainer Author

DominiqueMakowski Oct 26, 2020 Maintainer Author

mattansb Oct 26, 2020 Maintainer

DominiqueMakowski Oct 26, 2020 Maintainer Author

mattansb Oct 26, 2020 Maintainer

strengejacke Dec 2, 2020 Maintainer

DominiqueMakowski Dec 2, 2020 Maintainer Author

mattansb Dec 2, 2020 Maintainer

strengejacke Dec 2, 2020 Maintainer

mattansb Dec 2, 2020 Maintainer

DominiqueMakowski Dec 2, 2020 Maintainer Author

strengejacke Dec 2, 2020 Maintainer

DominiqueMakowski
Oct 6, 2019
Maintainer

DominiqueMakowski
Sep 27, 2020
Maintainer Author

DominiqueMakowski
Oct 14, 2020
Maintainer Author

mattansb
Oct 14, 2020
Maintainer

DominiqueMakowski
Oct 14, 2020
Maintainer Author

DominiqueMakowski
Oct 14, 2020
Maintainer Author

mattansb
Oct 14, 2020
Maintainer

mattansb
Oct 14, 2020
Maintainer

DominiqueMakowski
Oct 14, 2020
Maintainer Author

DominiqueMakowski
Oct 14, 2020
Maintainer Author

DominiqueMakowski
Oct 14, 2020
Maintainer Author

mattansb
Oct 14, 2020
Maintainer

strengejacke
Oct 14, 2020
Maintainer

DominiqueMakowski
Oct 15, 2020
Maintainer Author

mattansb
Oct 15, 2020
Maintainer

strengejacke
Oct 15, 2020
Maintainer

DominiqueMakowski
Oct 26, 2020
Maintainer Author

mattansb
Oct 26, 2020
Maintainer

mattansb
Oct 26, 2020
Maintainer

DominiqueMakowski
Oct 26, 2020
Maintainer Author

DominiqueMakowski
Oct 26, 2020
Maintainer Author

mattansb
Oct 26, 2020
Maintainer

DominiqueMakowski
Oct 26, 2020
Maintainer Author

mattansb
Oct 26, 2020
Maintainer

strengejacke
Dec 2, 2020
Maintainer

DominiqueMakowski
Dec 2, 2020
Maintainer Author

mattansb
Dec 2, 2020
Maintainer

strengejacke
Dec 2, 2020
Maintainer

mattansb
Dec 2, 2020
Maintainer

DominiqueMakowski
Dec 2, 2020
Maintainer Author

strengejacke
Dec 2, 2020
Maintainer