Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GP Mauna Loa #73

Open
OriolAbril opened this issue Mar 30, 2021 · 9 comments
Open

GP Mauna Loa #73

OriolAbril opened this issue Mar 30, 2021 · 9 comments
Labels
tracker id Issues used as trackers in the notebook update project, do not close!

Comments

@OriolAbril
Copy link
Member

File: https://github.com/pymc-devs/pymc-examples/blob/main/examples/gaussian_processes/GP-MaunaLoa.ipynb
Reviewers:

The sections below may still be pending. If so, the issue is still available, it simply doesn't
have specific guidance yet. Please refer to this overview of updates

Known changes needed

Changes listed in this section should all be done at some point in order to get this
notebook to a "Best Practices" state. However, these are probably not enough!
Make sure to thoroughly review the notebook and search for other updates.

General updates

ArviZ related

Changes for discussion

Changes listed in this section are up for discussion, these are ideas on how to improve
the notebook but may not have a clear implementation, or fix some know issue only partially.

General updates

ArviZ related

Notes

Exotic dependencies

Computing requirements

@OriolAbril OriolAbril added the tracker id Issues used as trackers in the notebook update project, do not close! label Mar 30, 2021
@danhphan
Copy link
Member

Hi @OriolAbril, I can work on this task.

Just want to confirm that the main tasks are upgrading the notebook into PyMC v4 and new ArviZ version right? Thanks

@OriolAbril
Copy link
Member Author

Thanks!

Yes, the main tasks are updating to use v4 and ArviZ and to update the metadata and style, see pymc-devs/pymc#5460 for example. Everything should be written in the jupyter style guide and the webinar from the PyMC-Data Umbrella series, if you have any questions let me know

@danhphan
Copy link
Member

danhphan commented Apr 17, 2022

I re-run this notebook with the latest pymc version 4.0.0b6.

There seems a problem in pm.find_MAP.

with pm.Model() as model:
    .........................
    # The Gaussian process is a sum of these three components
    gp = gp_seasonal + gp_medium + gp_trend
    # Since the normal noise model and the GP are conjugates, we use `Marginal` with the `.marginal_likelihood` method
    y_ = gp.marginal_likelihood("y", X=t, y=y, noise=cov_noise)
    # this line calls an optimizer to find the MAP
    mp = pm.find_MAP(include_transformed=True)

The results of estimated parameters from pymc v4 are:

['period:1.0',
 'α:2.5',
 'η_med:0.10000000000000002',
 'η_noise:0.05000000000000001',
 'η_per:1.0',
 'η_trend:2.0',
 'σ:0.05000000000000001',
 'ℓ_med:2.6666666666666665',
 'ℓ_noise:0.5',
 'ℓ_pdecay:133.33333333333337',
 'ℓ_psmooth :1.3333333333333333',
 'ℓ_trend:40.0']

While the expected (in pymc 3.9.3) should be:

['period:1.0003904457696988',
 'α:1.1826444920780308',
 'η_med:0.02432146950697816',
 'η_noise:0.007821354118990929',
 'η_per:0.09941238570897516',
 'η_trend:1.8391949691730372',
 'σ:0.0067995598981081',
 'ℓ_med:1.4999510109600107',
 'ℓ_noise:0.17053127928697115',
 'ℓ_pdecay:125.90347217600561',
 'ℓ_psmooth :0.7309756296087601',
 'ℓ_trend:53.345284025570486']

Model fitting:
bokeh_plot (1)

The wrong estimated params from find_MAP lead to a big standard deviation in model prediction

bokeh_plot

My local PyMC v4 environment config:

Python version       : 3.9.10
IPython version      : 8.2.0
pymc  : 4.0.0b6
aesara: 2.5.1
arviz : 0.12.0
pandas: 1.4.2
numpy : 1.22.3
Watermark: 2.3.0

@danhphan
Copy link
Member

There seems a similar issue at pymc-devs/pymc#5443 (comment)

@ricardoV94
Copy link
Member

ricardoV94 commented Apr 17, 2022

What happens without that include_transformed argument? And how are you printing the parameter values, I don't see any transformed variable names in the output of MAP that you printed. There would be a bunch of _log__ suffixes

@danhphan
Copy link
Member

danhphan commented Apr 18, 2022

The same [wrong] results when set include_transformed=False or remove it in pm.find_MAP()

with pm.Model() as model:
    # ........ removed .........
    # The Gaussian process is a sum of these three components
    gp = gp_seasonal + gp_medium + gp_trend
    # Since the normal noise model and the GP are conjugates, we use `Marginal` with the `.marginal_likelihood` method
    y_ = gp.marginal_likelihood("y", X=t, y=y, noise=cov_noise)
    # this line calls an optimizer to find the MAP
    mp = pm.find_MAP(include_transformed=False)

This code prints the variables:

# display the results, dont show transformed parameter values
sorted([name + ":" + str(mp[name]) for name in mp.keys() if not name.endswith("_")])

Outputs:

['period:1.0',
 'α:2.5',
 'η_med:0.10000000000000002',
 'η_noise:0.05000000000000001',
 'η_per:1.0',
 'η_trend:2.0',
 'σ:0.05000000000000001',
 'ℓ_med:2.6666666666666665',
 'ℓ_noise:0.5',
 'ℓ_pdecay:133.33333333333337',
 'ℓ_psmooth :1.3333333333333333',
 'ℓ_trend:40.0']

May be you want to see all variables, so I print it all here:

# display the results, dont show transformed parameter values
sorted([name + ":" + str(mp[name]) for name in mp.keys()])

Outputs:

['period:1.0',
 'α:2.5',
 'α_log__:0.9162907318741551',
 'η_med:0.10000000000000002',
 'η_med_log__:-2.3025850929940455',
 'η_noise:0.05000000000000001',
 'η_noise_log__:-2.995732273553991',
 'η_per:1.0',
 'η_per_log__:0.0',
 'η_trend:2.0',
 'η_trend_log__:0.6931471805599453',
 'σ:0.05000000000000001',
 'σ_log__:-2.995732273553991',
 'ℓ_med:2.6666666666666665',
 'ℓ_med_log__:0.9808292530117262',
 'ℓ_noise:0.5',
 'ℓ_noise_log__:-0.6931471805599453',
 'ℓ_pdecay:133.33333333333337',
 'ℓ_pdecay_log__:4.892852258439873',
 'ℓ_psmooth :1.3333333333333333',
 'ℓ_psmooth _log__:0.28768207245178085',
 'ℓ_trend:40.0',
 'ℓ_trend_log__:3.6888794541139363']

@danhphan
Copy link
Member

Hi @OriolAbril, I have also re-run this notebook in pymc3 v3.11.4, and it works.

Is it okie if I do some General updates and ArviZ related things and make a PR to move this one to Best practices (v3) tag first.

@OriolAbril
Copy link
Member Author

As long as notebooks move progressively towards the Done column all PRs will be welcome 😄

@quantheory
Copy link

The problem with using this notebook in v4 is possibly due to the buggy find_MAP behavior mentioned in pymc-devs/pymc#5923.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tracker id Issues used as trackers in the notebook update project, do not close!
Projects
Status: Best practices (v3)
Development

No branches or pull requests

4 participants