-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JP-3683: fix abs_deriv handling of off-edge and nan values #311
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #311 +/- ##
==========================================
+ Coverage 86.21% 86.38% +0.16%
==========================================
Files 47 49 +2
Lines 8812 8899 +87
==========================================
+ Hits 7597 7687 +90
+ Misses 1215 1212 -3 ☔ View full report in Codecov by Sentry. |
Looks like the RT run didn't recognize the provided branch name (in addition to the gcleary issue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend avoid using single letter variable names and using more descriptive variable names.
Oof. Yeah I ran into "issues" :) The link should be updated to the correct one now: |
A new set of regression tests now that artifactory is back up: EDIT: maybe I spoke too soon as jwst main from last night is showing artifactory errors. We'll see. |
Romancal regtest shows 1 unrelated error and 2 failures:
That are expected since those outputs are different since this PR changes the number of detected outliers. |
JWST regtests show several expected differences due to this PR changing the number of outlier detected. Most differences are small (<1%) However some are several % with test_nirspec_mos_fs_spec3 being one of the larger (>5%). Looking at the first slit the difference is largely due to the "nan dilation" on main preventing a large number of pixels from being flagged as outliers. On main the first slit has 7447 outliers whereas with this PR there are 8171. For outlier detection the science data contains 59% finite values whereas the error (computed as the blotted median following spacetelescope/jwst#8828) contains 27% finite values. Looking at a portion of the slit with some finite pixels for data (streched to be able to see some structure in the finite pixels) |
@braingram - I'd like to look into this. Which slit was this? And which product is in the screenshots? |
Thanks! The 7447 vs 8171 outliers are for one of the It looks like I left off the |
Hmm, I can't reproduce those images, working from jwst on main and stcal on your branch. The below is what I see for the NRS2 blot file from S200A1. Top is SCI, bottom is ERR. The large area marked NaN at the right is expected: this is a consequence of the default parameters weight_type = 'ivm' and maskpt = 0.70. NIRSpec is aware of this issue and considering different default parameter combinations -- see JP-3347 for discussion. It is not expected that there would be a significant difference between the number of NaN values in the SCI and ERR arrays. In quick spot checks for the regression test outputs for MOS, MOS/FS, and FS, I haven't found any blot or median products that show a NaN discrepancy -- they all look like the below image. |
Thanks for checking. The 'sci' I was referring to was the 'sci_data' passed into "flag_resampled_crs" (which comes from the "data" of the input file) stcal/src/stcal/outlier_detection/utils.py Line 157 in dfe1d6d
This is different from the blotted median (which I believe you plotted in the upper subplot in your comment). All 3 ('sci', 'err', 'blot') contribute to outlier detection in a way that a 'nan' pixel in any of the arrays will prevent it from being an outlier. As this PR is only changing the blot_deriv (used on the 'blot', blotted median 'sci' data) I think my comments about the err being largely nan were a distraction.
|
Ah! Okay, I understand now. In that case, yes, this is expected. Blotted median sci and err will match up. The input data will not - the extra NaNs in the blotted product are due to the extra masking in that product. Either way, I agree, that's a side issue from the fixes for the abs_deriv function. The changes here to fix the extra NaN dilation in the blot deriv look good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through the differences for the regression tests for NIRSpec MOS and FS, and it looks like where the changes are significant with this branch, they are improvements. In particular, discrepant edge values in the test_nirspec_mos_fs_spec3 input are better flagged with these changes.
Thanks for the fix!
Thanks for the review! Let me check with the romancal folks to see how they would like this PR scheduled with their release (since it impacts a few regression tests there as well). |
FYI merging of this should be delayed until:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thanks, @braingram!
Resolves JP-3683
As part of outlier detection an "absolute derivative" is computed. From the outlier detection docs:
This PR addresses 2 issues with
_abs_deriv
(used in outlier detection) where it:nan
values in themax
used for combining differences between adjacent pixels (effectively dilating nan values)With the previous code:
Note that now instead of one nan at (2, 2) there are 5 nan pixels and that edge values are larger than expected. For example (1, 4) is 9 when the largest difference with all defined pixels is 5.
With this PR the result is:
Tests are added for both issues.
It is expected this will change regression tests for both jwst and romancal as
abs_deriv
is used to compute the outlier detection threshold. With this PR more outliers will be detected (since the threshold will be lower due to edge pixels having lower values and less nans in the blot images).Romancal regtests running: https://github.com/spacetelescope/RegressionTests/actions/runs/11682867663
JWST regtests running: https://github.com/spacetelescope/RegressionTests/actions/runs/11682842450
Tasks
docs/
pageno-changelog-entry-needed
)changes/
:echo "changed something" > changes/<PR#>.<changetype>.rst
(see below for change types)"git+https://github.com/<fork>/stcal@<branch>"
)jwst
regression testromancal
regression testnews fragment change types...
changes/<PR#>.apichange.rst
: change to public APIchanges/<PR#>.bugfix.rst
: fixes an issuechanges/<PR#>.general.rst
: infrastructure or miscellaneous change