Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RCAL-511 Implement Uneven ramp fitting #175

Merged
merged 26 commits into from
Aug 11, 2023

Conversation

stscieisenhamer
Copy link
Collaborator

@stscieisenhamer stscieisenhamer commented Jun 22, 2023

Resolves RCAL-511

This PR implements the Casertino, et.al., 2021 algorithm for fitting unevenly spaced ramps. Initial implementation based primarily on code from romanisim.

Checklist

  • added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)

@codecov
Copy link

codecov bot commented Jun 22, 2023

Codecov Report

Patch coverage: 62.82% and project coverage change: -0.30% ⚠️

Comparison is base (b5bd209) 74.41% compared to head (2004ac5) 74.11%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #175      +/-   ##
==========================================
- Coverage   74.41%   74.11%   -0.30%     
==========================================
  Files          29       33       +4     
  Lines        5944     6100     +156     
==========================================
+ Hits         4423     4521      +98     
- Misses       1521     1579      +58     
Files Changed Coverage Δ
setup.py 0.00% <0.00%> (ø)
src/stcal/ramp_fitting/ols_cas22_fit.py 40.47% <40.47%> (ø)
src/stcal/ramp_fitting/ols_cas22_util.py 100.00% <100.00%> (ø)
tests/test_ramp_fitting_cas22.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stscieisenhamer stscieisenhamer marked this pull request as ready for review June 26, 2023 19:16
@stscieisenhamer stscieisenhamer requested a review from a team as a code owner June 26, 2023 19:16
@stscieisenhamer
Copy link
Collaborator Author

Developer note: None of the model and reference file handling has yet to be added. Currently planned on handling this in romancal. Part of the reason is to get the basic code out for further development of the jump algorithm, which is closely related

@stscieisenhamer stscieisenhamer changed the title RCAL-511 Implement Uneven ramp fitting WIP: RCAL-511 Implement Uneven ramp fitting Jun 27, 2023
@stscieisenhamer stscieisenhamer marked this pull request as draft June 27, 2023 12:06
@stscieisenhamer
Copy link
Collaborator Author

Just realized there was a commit that was neglected. Will be re-applying it shortly.

Copy link
Collaborator

@kmacdonald-stsci kmacdonald-stsci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many functions are too long and complicated.

If reasonable, attempts should be made to keep variable names across the code base the same. For example, nreads and variations were often used interchangeably with group and frame time, making the code confusing to follow. Also, not sure why use the variable name resultant, rather than group, since they seem to be used for the same thing.

The code appears to properly implement the algorithm from Casertano's paper.

for j in range(resstart[i], resend[i] + 1):
# Casertano+22 Eq. 37
# Note that we are replacing ww with kk to save memory; we don't
# need ww again.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this say "we don't need kk again"?

The first resultant in this ramp
resend : np.ndarray[nramp]
The last resultant in this ramp.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is 100 lines long and does many things. In general functions should be able to completely fit on a screen, so roughly 35-45 lines, and shorter if possible. It would be easier to read and maintain if broken down into smaller functions.

"""
firstreads = np.array([x[0] for x in ma_table])
nreads = np.array([x[1] for x in ma_table])
meantimes = read_time * firstreads + read_time * (nreads - 1) / 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is read_time frame time or group time? In other ramp fitting software, read_time or something similar was used interchangeably with group and frame time, causing confusion in how the code works.


Parameters
----------
resultants : np.ndarry[nresultants, npixel]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use the term "resultant", rather than "group"? It would be good to try to keep nomenclature across files the same, when it makes sense.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roman and JWST are using different terminology for this. Roman terms this "resultant", so its not clear how to name this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that, but this is STCAL. It's supposed to be telescope agnostic. Developing parallel code bases with different nomenclature for the same thing undermines being agnostic.

assert result == expected


def test_ramp(test_table=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of things being tested in this one test. It should be broken into smaller tests, with each test testing fewer things.


Returns
-------
par : np.ndarray[..., 2] (float)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use the variable name par?

the read noise, Poisson source noise, and total noise.
"""

# Get the Multi-accum table, either as given or from the read pattern
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very long function, over 150 lines and should be broken down into smaller functions.

@schlafly
Copy link
Collaborator

FWIW, some of the nomenclature answers:

  • read_time: I think this is frame time in JWST. It's one read of the array and is ~3.04 s for Roman. The reads/frames get averaged into resultants/groups.
  • resultant vs. group: yes, these are the same things. No, I don't know why Roman adopted the "resultant" language and JWST the "group" language, but "Roman" is pretty insistent on using resultants as names in place of groups. I don't feel strongly about global search and replace resultant for group in this codebase, but more broadly we will just have to live with the fact that the two missions went with different words for the same quantities.
  • par and var were short for parameters and variances, but these could be given other names.

Copy link
Collaborator

@ddavis-stsci ddavis-stsci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo the given comments, LGTM.
Since I don't see JWST using this for their ramp analysis so the Roman specific nomenclature should not be too confusing.

@stscieisenhamer
Copy link
Collaborator Author

Finally addressing comments. @kmacdonald-stsci : Agree with most everything.

Driver of not doing too much refactoring/naming conventions are primarily that this algorithm is still undergoing algorithmic/science discussion. Kept the implementation close to the original so that discussion would not be confused by refactor/rename issues. Also, the JWST use is still down-the-road so the terminology issue, though definitely needing to be addressed, will be so later on.

Of course, this is all modulo what RCAL would like to see in the current PR.

@schlafly
Copy link
Collaborator

Hi Jonathan,

This looks good. Thinking about next steps:

  • I would go ahead and get rid of the utility code supporting the ramp fit interpolator that has been removed. I'll keep it in the simulator if we ever need it. I think removing it would mean dropping
    • construct_covar
    • construct_ramp_fitting_matrices
    • construct_ki_and_variances
    • ki_and_variance_grid
    • resultants_to_differences
    • associated tests
  • Let's either pull in simulate_many_ramps for test_simulated_ramps or drop that test. I think I'm in favor of pulling it in; that is a pretty good test.
  • I'm happy to sign on to s/resultants/r/groups to improve terminology consistency, etc..

@stscieisenhamer
Copy link
Collaborator Author

Some external feedback on the terminology issue: At the 20230720 TIPS, the terminology issue was brought up, of which Casertano defended the current uneven/Roman centric form. Unless there is strong reason to change in this PR, I propose to deal with it in a later PR.

@WilliamJamieson
Copy link
Collaborator

Some external feedback on the terminology issue: At the 20230720 TIPS, the terminology issue was brought up, of which Casertano defended the current uneven/Roman centric form. Unless there is strong reason to change in this PR, I propose to deal with it in a later PR.

IMOP I think terminology should follow whatever the reference paper uses if possible as we may have to refer back to the paper in the future. It is unfortunate that the two telescopes could not agree on terminology.

@stscieisenhamer
Copy link
Collaborator Author

@schlafly The simulate_many_ramps has further romanisim and galsim dependencies. The current solution allows full testing if the relevant packages are present. If there is strong desire to have this test execute on CI, romanisim can be a test dependency. Otherwise, galsim.GaussianDeviate(seed) will need to be dealt with, plus copy of at least two other functions.

@schlafly
Copy link
Collaborator

Oh no, you're right, I don't want to bring in l1.apportion_counts_to_resultants. That's more detailed than we need for this. Let me rig up a simulate_many_ramps replacement that would be appropriate here; it's not so bad.

# For Roman, the read time of the detectors is a fixed value and is currently
# backed into code. Will need to refactor to consider the more general case.
# Used to deconstruct the MultiAccum tables into integration times.
READ_TIME = 3.04
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this is Roman specific. We need to pull this out as an argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems there was some recent discussion of this value, potentially coming from PPS. I am not seeing anything like this in the schema. What is the process to retrieve from PPS and add as a meta value to the Level 1 file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also note: this is already an argument. the above is the default. However, I will move this into romancal and make the argument required.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the value to use: https://github.com/spacetelescope/rad/blob/main/src/rad/resources/schemas/exposure-1.0.0.yaml#L268
I don't know if we're populating it yet usually.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed that you were just using FRAME_TIME as a default. I agree with you that it makes more sense to have that on the romancal side.

@schlafly
Copy link
Collaborator

With the caveat that I haven't tested anything, so I'm sure it's full of bugs, here's a simpler version of simulate_many_ramps...

def simulate_many_ramps(ntrial=100, flux=100, readnoise=5, ma_table=None):
    if ma_table is None:
        ma_table = [[1, 4], [5, 1], [6, 3], [9, 10], [19, 3], [22, 15]]
    nread = np.array([x[1] for x in ma_table])
    tij = ma_table_to_tij(ma_table)
    resultants = np.zeros((len(ma_table), ntrial), dtype='f4')
    buf = np.zeros(ntrial, dtype='i4')
    for i, ti in enumerate(tij):
        for t0 in ti:
            buf += np.random.poisson(READ_TIME * flux, ntrial)
        resultants[i] = (buf / len(ti)).astype('f4')
    resultants += np.random.randn(len(ma_table), ntrial) * (
        readnoise / np.sqrt(nread)).reshape(len(ma_table), 1)
    return (ma_table, flux, readnoise, resultants)

This is a good simulation of a ramp in the context of the assumptions inherent in the algorithm. That's better than trying to bring over stuff from l1.apportion_counts_to_resultants, sorry for the bad pointer.

Jonathan Eisenhamer and others added 22 commits August 2, 2023 22:24
Changes:
- fix circular import between `matable_fit` and `matable_fit_cas2022`.
- Update tests from `romanisim` to work completely within `stcal`.
- Allow test to use `romanisim` if present.
Initial implementation done. However, untested and tests
need to be implemented.
@stscieisenhamer
Copy link
Collaborator Author

Got the romanisim test dependency completed. However, I want to point out a C-ism that was causing issues. In
ols_cas22.pyx:85, which is

fabs((tbar[start + i] - tbarmid) / tscale) ** weight_power)

generates an error concerning not being able to coerce a complex number to double. This error appeared after taking in the memory enhancement changes Eddy made in the romanisim version. The condition when this occurs seems to be when tbar[start + i] == tbarmid. I can only guess that the floating math is creating a -0.0 situation. Note that this is not reproducible with pure Python.

The error does suggest setting of a cython parameter, cython.cpow(True), to mitigate the situation. Doing so, as best as I can tell, does remediate the exception, and produces the expected results.

@schlafly
Copy link
Collaborator

schlafly commented Aug 3, 2023

Oops, I hit this one too. This happened when cython 3 came out a week or two ago. I tried pinging you here but should have followed up. spacetelescope/romanisim#67 (comment)

@stscieisenhamer stscieisenhamer changed the title WIP: RCAL-511 Implement Uneven ramp fitting RCAL-511 Implement Uneven ramp fitting Aug 3, 2023
@stscieisenhamer stscieisenhamer merged commit 0c7cb32 into spacetelescope:main Aug 11, 2023
16 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants