Add function for Weighted Effect Coding? #363

Open
mattansb opened this issue Feb 8, 2023 · 4 comments

Comments

@mattansb
Member

mattansb commented Feb 8, 2023

Is this something we can add here?
(After I built the function, I saw it was implemented in the {wec} package....)

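contr.wsum <- function(x, ref, ...) {
  x <- as.factor(x)
  lvls <- levels(x)
  n <- nlevels(x)
  
  if (!missing(ref)) {
    # Compare as character so that a numeric `ref` (e.g. 8) also matches
    ref <- as.character(ref)
    if (!ref %in% lvls) stop("`ref` must be one of the levels of `x`.")
    # Move the reference level to the last position
    lvls <- c(setdiff(lvls, ref), ref)
    x <- factor(x, levels = lvls)
  } else {
    ref <- lvls[n]
  }
  
  # Start from the unweighted sum-to-zero contrasts
  M <- contr.sum(n)
  rownames(M) <- lvls
  
  # Replace the reference row with minus the ratio of each level's
  # proportion to the reference level's proportion
  tab <- proportions(table(x))
  M[ref,] <- -unname(tab[-n] / tab[n])
  M
}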

contr.wsum(mtcars$cyl)
#>         [,1] [,2]
#> 4  1.0000000  0.0
#> 6  0.0000000  1.0
#> 8 -0.7857143 -0.5

# same as:  
wec::contr.wec(factor(mtcars$cyl), "8")
#>            4    6
#> 1  1.0000000  0.0
#> 2  0.0000000  1.0
#> 3 -0.7857143 -0.5

Usage:

mtcars$cyl_f <- factor(mtcars$cyl)
contrasts(mtcars$cyl_f) <- contr.wsum(mtcars$cyl_f)
m <- lm(mpg ~ cyl_f, mtcars)
coef(m)[1]
#> (Intercept) 
#>    20.09062
mean(mtcars$mpg)
#> [1] 20.09062

Created on 2023-02-08 with reprex v2.0.2
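As a further check on the usage above, here is a minimal sketch (not part of the original reprex) of what the remaining coefficients mean. It builds the weighted contrast matrix by hand so that it runs on its own; the column labels and the object names `W`, `n_k`, and `deviations` are assumptions made for illustration.

```r
# Sketch: under weighted effect coding, each non-intercept coefficient
# should equal that level's mean minus the overall sample mean.
cyl_f <- factor(mtcars$cyl)
n_k <- as.vector(table(cyl_f))            # observations per level, in level order

# Weighted contrast matrix built by hand, with the last level as reference
W <- contr.sum(nlevels(cyl_f))
dimnames(W) <- list(levels(cyl_f), levels(cyl_f)[1:2])  # illustrative labels
W[3, ] <- -n_k[1:2] / n_k[3]

contrasts(cyl_f) <- W
m <- lm(mpg ~ cyl_f, data = mtcars)

group_means <- as.numeric(tapply(mtcars$mpg, cyl_f, mean))
deviations <- group_means - mean(mtcars$mpg)

# Compare the two non-reference coefficients with the matching deviations
all.equal(as.numeric(coef(m)[-1]), deviations[1:2])
```

The reference level has no coefficient of its own; its deviation can be recovered as `sum(W[3, ] * coef(m)[-1])`.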

@etiennebacher
Member

@easystats/core-team WDYT? (I have no idea what this is)

@bwiernik
Contributor

bwiernik commented Feb 17, 2023

Don't contrasts need to sum to 0? Or is this for something like different levels of dosage? What are these used for?

@bwiernik
Contributor

Oh, it's so that the parameters are deviations from the sample mean, rather than from the unweighted mean of the group means.

https://journal.r-project.org/archive/2017/RJ-2017-017/RJ-2017-017.pdf

I'm good with adding this, but we should be sure to include some of the language from the wec paper clearly describing what this is for and how it differs from unweighted effect coding:

In effect coding (also known as deviation contrast or ANOVA coding), parameters represent the deviation of each category from the grand mean across all categories (i.e., the sum of arithmetic means in all categories divided by the number of categories). To achieve this, the sum of all parameters is constrained to 0. This implies that the possibly different numbers of observations in categories is not taken into account. In weighted effect coding, the parameters represent the deviation of each category from the sample mean, corresponding to a constraint in which the weighted sum of all parameters is equal to zero. The weights are equal to the number of observations per category.
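The distinction in the quoted passage can be checked numerically. The following is a rough sketch on `mtcars`, assuming `cyl` as the factor and `mpg` as the outcome; the names `grand_mean` and `sample_mean` are chosen here for illustration and do not come from the paper.

```r
# Sketch: grand mean (unweighted) versus sample mean (weighted by group size)
cyl_f <- factor(mtcars$cyl)
n_k <- as.vector(table(cyl_f))
group_means <- as.numeric(tapply(mtcars$mpg, cyl_f, mean))

grand_mean  <- mean(group_means)   # every level counts equally
sample_mean <- mean(mtcars$mpg)    # every observation counts equally

# Unweighted effect coding (contr.sum) gives the grand mean as the intercept
contrasts(cyl_f) <- contr.sum(nlevels(cyl_f))
m_sum <- lm(mpg ~ cyl_f, data = mtcars)
all.equal(as.numeric(coef(m_sum)[1]), grand_mean)

# Unweighted constraint: deviations from the grand mean sum to zero
all.equal(sum(group_means - grand_mean), 0)

# Weighted constraint: deviations from the sample mean, weighted by the
# number of observations per level, sum to zero
all.equal(sum(n_k * (group_means - sample_mean)), 0)
```

With balanced groups the two means coincide, so the two codings would give the same parameters; they differ here only because the `cyl` groups have unequal sizes.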

@etiennebacher
Member

bump
