
Divergences in examples/generalized_linear_models/GLM-ordinal-regression.ipynb #649

Open
erik-werner opened this issue Mar 21, 2024 · 5 comments

Comments

@erik-werner
Contributor

Regression Models with Ordered Categorical Outcomes:
https://www.pymc.io/projects/examples/en/latest/generalized_linear_models/GLM-ordinal-regression.html

Issue description

After make_model() is defined, five different models are created and sampled from. Unfortunately, models 1 and 2 produce hundreds of divergences. Models 3 and 4 sample fine. Model 5 is identical to model 4; presumably the argument logit=False is missing. If I add that argument, 2 of 4 chains get completely stuck during sampling.

These divergences were not reported automatically because the numpyro sampler was used. However, since the most recent release, divergence warnings are emitted for this sampler as well. 👍

Proposed solution

Fix the models so that they can be sampled from. If that turns out to be difficult, consider whether they can be removed, or add a discussion of the problems with these models.

@erik-werner
Contributor Author

erik-werner commented Mar 21, 2024

@NathanielF would you like to have a look at this? Or do you want me to see if I can fix it?

@NathanielF
Contributor

Feel free to have a look, @erik-werner! Thanks for bringing it up.

@NathanielF
Contributor

Just a note here @erik-werner

Had a quick look. I think the code you want to run is:


# Assumes the notebook's context: df, K, and constrainedUniform are defined earlier.
def make_model(priors, model_spec=1, constrained_uniform=False, logit=True):
    with pm.Model() as model:
        if constrained_uniform:
            cutpoints = constrainedUniform(K, 0, K)
        else:
            sigma = pm.Exponential("sigma", priors["sigma"])
            cutpoints = pm.Normal(
                "cutpoints",
                mu=priors["mu"],
                sigma=sigma,
                transform=pm.distributions.transforms.univariate_ordered,
            )

        if model_spec == 1:
            beta = pm.Normal("beta", priors["beta"][0], priors["beta"][1], size=1)
            mu = pm.Deterministic("mu", beta[0] * df.salary)
        elif model_spec == 2:
            beta = pm.Normal("beta", priors["beta"][0], priors["beta"][1], size=2)
            mu = pm.Deterministic("mu", beta[0] * df.salary + beta[1] * df.work_sat)
        elif model_spec == 3:
            beta = pm.Normal("beta", priors["beta"][0], priors["beta"][1], size=3)
            mu = pm.Deterministic(
                "mu", beta[0] * df.salary + beta[1] * df.work_sat + beta[2] * df.work_from_home
            )
        if logit:
            y_ = pm.OrderedLogistic("y", cutpoints=cutpoints, eta=mu, observed=df.explicit_rating)
        else:
            y_ = pm.OrderedProbit("y", cutpoints=cutpoints, eta=mu, observed=df.explicit_rating)
        if not logit:
            idata = pm.sample(idata_kwargs={"log_likelihood": True})
        else:
            idata = pm.sample(nuts_sampler="numpyro", idata_kwargs={"log_likelihood": True})
        idata.extend(pm.sample_posterior_predictive(idata))
    return idata, model

priors = {"sigma": 1, "beta": [0, 1], "mu": np.linspace(0, K, K - 1)}
idata1, model1 = make_model(priors, model_spec=1)
idata2, model2 = make_model(priors, model_spec=2)
idata3, model3 = make_model(priors, model_spec=3)
idata4, model4 = make_model(priors, model_spec=3, constrained_uniform=True)
idata5, model5 = make_model(priors, model_spec=3, constrained_uniform=True, logit=False)


Models 1 and 2 are somewhat underspecified, so you will get divergences. The interesting contrast is between models 3 and 4, where we use the constrained uniform prior on the cutpoints. If you fix the fifth model to use the default sampler rather than numpyro, you should be able to run the probit fit.
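For context, a minimal NumPy sketch of the idea usually behind such a constrained-uniform cutpoint prior (I'm assuming the notebook's constrainedUniform follows the standard Dirichlet-cumsum pattern; ordered_cutpoints here is a hypothetical illustration, not the notebook's actual helper):

```python
import numpy as np

def ordered_cutpoints(weights, lower, upper):
    # Normalize positive weights onto the simplex, then take the cumulative
    # sum scaled to [lower, upper]. The cumsum guarantees strictly increasing
    # cutpoints by construction, which sidesteps the ordering pathologies a
    # Normal prior with an ordered transform can run into.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return lower + (upper - lower) * np.cumsum(w)[:-1]  # drop the upper endpoint

cuts = ordered_cutpoints([1.0, 2.0, 1.0, 2.0], 0, 4)
# cuts is strictly increasing and stays strictly inside (0, 4)
```

Because ordering is baked into the construction rather than enforced by a transform, the sampler never has to navigate near-degenerate cutpoint configurations.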

This is also arguably an interesting contrast just to see how the parameter values shift...
I wouldn't remove models 1 and 2, but I'd add text discussing how a better-specified model avoids the divergences you've highlighted.


@erik-werner
Contributor Author

erik-werner commented Mar 25, 2024

@NathanielF These were basically my conclusions as well 🙂
One more thing: it seems the divergences for models 1 and 2 occur when sigma is very close to 0, so in principle one could probably eliminate most of them with a non-centered parametrisation. But that doesn't combine well with the ordered transform.
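For reference, the non-centered trick in question (sample at unit scale, then shift and rescale deterministically, so the sampler's geometry is decoupled from sigma) can be sketched in plain NumPy; this is just the algebraic idea, not a working fix for the cutpoint model:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.1  # small sigma is exactly where the centered form diverges

# Centered form: draw directly at scale sigma (what the notebook's prior does).
centered = rng.normal(mu, sigma, size=1000)

# Non-centered form: draw standard-normal offsets, then rescale.
z = rng.normal(size=1000)
noncentered = mu + sigma * z

# Both target the same Normal(mu, sigma); the non-centered version keeps the
# sampled variable (z) at unit scale even as sigma -> 0. The catch noted
# above: the ordered transform would have to apply to the rescaled values,
# not to z, which is why the two techniques don't compose cleanly.
```

The usage note above is the whole point: the reparametrisation is trivial for an unconstrained Normal, but the ordering constraint lives on the transformed scale.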

I wonder why the numpyro sampler cannot handle the probit model.

Anyway, since you have a better overview of the whole flow, and seem to have an idea of what to write, it would be great if you could update the notebook.

@NathanielF
Contributor

Sure thing @erik-werner. Will do.

On numpyro, I think it has to do with the lack of parity between the JAX implementation of the erfc function and the one in SciPy.
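For background on why erfc matters here: the probit link is the standard normal CDF, which implementations typically compute via the identity Φ(x) = ½·erfc(−x/√2). A quick stdlib check of that identity (this illustrates the relationship only; it is not numpyro's or SciPy's actual code path):

```python
import math
from statistics import NormalDist

def probit_cdf(x):
    # Phi(x) = 0.5 * erfc(-x / sqrt(2)) -- the identity probit links rely on.
    return 0.5 * math.erfc(-x / math.sqrt(2))

ref = NormalDist()  # standard normal CDF for comparison
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    assert abs(probit_cdf(x) - ref.cdf(x)) < 1e-12
```

If two backends' erfc implementations disagree in the tails, the probit log-likelihood gradients can differ enough to stall a chain, which would be consistent with the stuck chains reported above.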

Thanks again for bringing the issue to my attention.
