PC prior distribution for Student T dof #252
base: main
Conversation
def tri_gamma_approx(x):
it is already implemented
This approximation will be much more performant
I saw you added trigamma recently, I'll give that a try. I used this approx because at the time the gradient wasn't implemented yet, whereas the gradient for the approx is easy. Wasn't concerned with performance at the time, but will take another look.
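For reference, a hypothetical sketch of the kind of closed-form approximation being discussed (the code below is illustrative, not necessarily the exact tri_gamma_approx in this PR): the standard asymptotic series for the trigamma function, whose gradient is elementary to write down by hand.

```python
import numpy as np
from scipy.special import polygamma


def tri_gamma_approx(x):
    # Standard asymptotic series for the trigamma function psi_1(x);
    # every term is elementary, so the derivative is easy to write by hand.
    return (
        1 / x
        + 1 / (2 * x**2)
        + 1 / (6 * x**3)
        - 1 / (30 * x**5)
        + 1 / (42 * x**7)
    )


x = np.array([3.0, 10.0, 100.0])
print(tri_gamma_approx(x))  # series approximation
print(polygamma(1, x))      # exact trigamma, for comparison
```

For moderate-to-large x the two agree to several decimal places, which is the regime that matters for a degrees-of-freedom parameter.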
@@ -216,3 +226,62 @@ def moment(rv, size, mu, sigma, xi):
    if not rv_size_is_none(size):
        mode = pt.full(size, mode)
    return mode


class PCPriorStudentT_dof_RV(RandomVariable):
needs a docstring
Usually we don't document the RV, but the Distribution class, which doesn't have a docstring either
NU_MIN = 2.0 + 1e-6
nu = np.concatenate((np.linspace(NU_MIN, 2.4, 2000), np.linspace(2.4 + 1e-4, 4000, 10000)))
return UnivariateSpline(
    studentt_kld_distance(nu).eval()[::-1],
Having an eval is a bit dangerous. If it comes up from an RV you're going to get a random value. The safe thing to do is to constant_fold and raise if it can't be done.
Or create a PyTensor Op that wraps UnivariateSpline.
don't we have such an op?
@ricardoV94 It only comes from nu, which is passed in above as a fixed numpy array, so I think eval is safe here (unless I'm missing your point). I'm using this to get a spline approximation to the inverse of this function, which is what the [::-1] bit at the end of the inputs is about.

Thanks @ferrine, will look into that. I remember needing to use UnivariateSpline this way because I needed this particular behavior ("if ext=3 of ‘const’, return the boundary value.") as nu goes to infinity.
Ah I missed the inputs were constant, nvm on my end
If it's always a known constant could you use .data instead of .eval()?
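For anyone skimming this thread, a minimal standalone sketch of the pattern under discussion: fit a spline to the *inverse* of a monotonically decreasing function by swapping the axes (the [::-1] reversal keeps the spline's x-values increasing), with ext=3 ("const") so queries outside the fitted range return the boundary value. The 1/nu stand-in below is only illustrative, not the actual studentt_kld_distance.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# A decreasing stand-in for studentt_kld_distance(nu): distance shrinks to 0
# as nu grows toward the normal limit.
nu = np.linspace(2.0 + 1e-6, 4000, 2000)
dist = 1.0 / nu

# Inverse map dist -> nu.  UnivariateSpline needs increasing x, hence [::-1];
# ext=3 ("const") clamps to the boundary value instead of extrapolating.
inv = UnivariateSpline(dist[::-1], nu[::-1], k=3, s=0, ext=3)

print(inv(1.0 / 10.0))  # ~10: recovers nu from its distance
print(inv(0.0))         # clamped to the boundary (~4000), the nu -> inf behavior
```

The ext=3 clamp is what gives the "return the boundary value as nu goes to infinity" behavior mentioned above.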
@classmethod
def get_lam(cls, alpha=None, U=None, lam=None):
    if (alpha is not None) and (U is not None):
        return -np.log(alpha) / studentt_kld_distance(U)
Suggested change:
-        return -np.log(alpha) / studentt_kld_distance(U)
+        return -pt.log(alpha) / studentt_kld_distance(U)
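For context, this line matches the usual PC-prior construction (Simpson et al.): an exponential with rate λ is placed on the distance scale d(ν) = studentt_kld_distance(ν), and the user-specified tail probability pins down λ. A sketch of the algebra, assuming d(ν) decreases toward 0 as ν → ∞ (which the [::-1] reversal above suggests):

$$
P(\nu < U) \;=\; P\bigl(d(\nu) > d(U)\bigr) \;=\; e^{-\lambda\, d(U)} \;=\; \alpha
\quad\Longrightarrow\quad
\lambda \;=\; -\frac{\log \alpha}{d(U)}
$$

which is exactly the -log(alpha) / studentt_kld_distance(U) computed in get_lam.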
To update on where this is at: tests were passing except on Windows and I couldn't figure out why. Not sure what here would depend on Windows, so I may need to dig into that in a VM or something to figure it out. Definitely some room to improve how some of this is structured, but it was working well enough that I was comfortable using it.
@bwengals the biggest difference with Windows generally is that it sometimes defaults to int32 dtypes where Linux defaults to int64, IIRC. Maybe something in the np.linspace overflows on Windows? (I didn't look at which test was even failing, so apologies if this is completely off target.)
Ah I bet that's it! The test produces inf and it shouldn't. Well, thanks for giving me a starting place.
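A small illustration of the dtype difference being described; it assumes NumPy < 2.0, where the default integer dtype is platform-dependent.

```python
import numpy as np

# On NumPy < 2.0 the default integer dtype follows the platform C long:
# int32 on Windows, int64 on Linux.  The same integer arithmetic can then
# silently wrap around on Windows, which is one way an identical test can
# produce different (e.g. inf/nan) results there.
x = np.array([100_000])  # int32 on Windows, int64 on Linux (NumPy < 2.0)
y = x * x                # 10**10 overflows int32 and wraps to a wrong value
print(y.dtype, y)
```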
@bwengals interested in updating this one?
Moving this over from here.
What is this PR about?
Adding a penalized complexity (PC) prior for the Student-T degrees of freedom parameter. Useful in models where the likelihood is normal but you need some robustness, so you switch to a Student-T likelihood. It's already implemented in INLA.
The reason this is useful for modeling is that you can "robustify" your Gaussian likelihood by making it a Student-T in a more principled way. You can use this prior to express, in a meaningful way, "I think there is a 50% (or 20%, or whatever) chance that the degrees of freedom is over 30 (~normal likelihood)". If you instead use a Gamma(2, 0.1), or worse, fix the degrees of freedom to some value, you risk watering down the information coming from the data via the likelihood.
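To make that concrete, a hypothetical usage sketch; the class name, import path, and (alpha, U) parameterization below are my assumptions based on the diff above, not the final API.

```python
import numpy as np
import pymc as pm

# Hypothetical import path and class name, inferred from PCPriorStudentT_dof_RV above.
from pymc_experimental.distributions import PCPriorStudentT_dof

data = np.random.normal(size=200)

with pm.Model():
    # Hypothetical parameterization: a tail-probability statement about the
    # degrees of freedom relative to a threshold, e.g. "a 20% chance that
    # nu is beyond 30, i.e. effectively a normal likelihood".
    nu = PCPriorStudentT_dof("nu", alpha=0.2, U=30)
    pm.StudentT("y", nu=nu, mu=0.0, sigma=1.0, observed=data)
```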
Outstanding issues

Should this subclass BoundedContinuous instead of PositiveContinuous? Went down that rabbit hole some already, might need some help here. Seems to work fine now for normal use, but I haven't tested too many weird edge cases.