Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Revert" defaults to mirror GLM #47

Open
ablaom opened this issue Feb 26, 2024 · 3 comments
Open

"Revert" defaults to mirror GLM #47

ablaom opened this issue Feb 26, 2024 · 3 comments
Labels

Comments

@ablaom
Copy link
Member

ablaom commented Feb 26, 2024

Context

@OkonSamuel Can you please address the question posed in the linked comment above?

@ablaom
Copy link
Member Author

ablaom commented Feb 26, 2024

cc @tiemvanderdeure

@OkonSamuel
Copy link
Member

(I also just noticed that neither LinearCountRegressor nor LinearBinaryClassifier have a dropcollinear field. I think this might be an oversight?)

It wasn't an oversight. At that time the version of GLM.jl (i.e v1.4.0) only supported the dropcolinear argument when calling GLM.lm function and not GLM.glm function.

@OkonSamuel Do you remember a reason for making the new default false?

There wasn't any reason for overriding the GLM defaults. It wasn't my intent to override the defaults. My aim at that time was to replace the previous deprecated kwarg allowrankdeficient with the new kwarg dropcolinear . Both allowrankdeficient and dropcolinear keyword arguments weren't supported by GLM.glm function. So it was causing errors on something I was working on at that time.
So we can stick to the GLM defaults and mention in the docs that by default the behavior is to drop collinear columns.

Secondly, it appears the position of intercept in a formula matters. StatsModels.@formula will always put intercept as the first term. In this package, we put it at the end in glm_formula. Reversing this so the intercept is the first term will actually also fix the error (I have no clue why, but it does).

@tiemvanderdeure What Error where you referring to here?

@tiemvanderdeure
Copy link
Contributor

tiemvanderdeure commented Feb 27, 2024

@tiemvanderdeure What Error where you referring to here?

It was an error where fit would throw an error when it should have dropped collinear variables, because the position of the intercept term does not correspond to what is expected by StatsModels. But it got fixed by #45, so we don't need to worry about it.

dropcollinear got implemented for generalized models in GLM v1.8.1, but I see no reason why we couldn't require that if we make a breaking version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: priority low / straightforward
Development

No branches or pull requests

3 participants