-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash when MS has multiple SPWs with different sizes #93
Comments
I'm also getting this issue. Is it possible to patch? |
@Jordatious could you make a small MS for me to test this? Perhaps extract just one scan from your failing MS, and send me that + parset? (Can also put it on ILIFU somewhere where I have access.) |
Let me find out if I can share the data |
Hey Oleg, I've copied the (whole) dataset here: |
Hey @o-smirnov, can you confirm you have a copy of the data? Please let me know how you go. |
Copying it to Rhodes now. Going along at a merry clip of ~80MB/s, so shouldn't be long! |
Right, got it. |
@Jordatious, do you know how this MS came to be in such an interesting state? The spectral windows actually overlap by one channel at the edges, which is quite difficult to handle consistently, if one wants a solution with a frequency solint that spans across SPW boundaries. I'll push a fix shortly which will at least allow for one SPW at a time to be solved for. |
Hey @o-smirnov, it was output from the IDIA pipeline, which calls
From memory one SPW had an issue and I forgot to fix that before including it in the concat. Generally we don't include this overlap, and I suppose one solution for LADUMA would be to not include the overlap until spectral-line imagine, which is presumably a later split operation. |
Yeah, it's quite a mission to get the code to support overlapping SPWs. So if it can be avoided, let's try that. I did push a fix to the |
Hey @o-smirnov, thanks for that! So if I have continuous SPWs, with nchan differing by 0 or 1 channels in each SPW (but otherwise continuous), will the default (i.e. without @KshitijT, could we have this version pulled to ilifu? |
I see we wrote the code with the anticipation of having multiple SPWs with a different number of channels, as long as the frequency increases monotonically. The initial error you were getting was due to another bit of code added later, which I've now fixed. What I'll do is extract spws 0,2,4,6 from your MS and try it on that. Stay tuned. |
OK, doesn't quite cut the mustard. Let me work on another fix... |
About the padding thing - might be quite easy to do a flux-dependent padding... For the few facets containing bright sources increase the padding by some factor... |
Wrong issue. Good to know you're reading it though. ;) |
Great idea. And move to facet-dependent resolution while we're at it! We're still hijacking the issue though... |
Oh yeah - sorry !!! :) |
@Jordatious be careful with overlapping SPW. I have done some testing with CASA gaincal with combine="spw" with bda averaged data. In that case we have chunked overlapping spw at various differing resolutions which we bin the baselines into based on how much smearing can be tolerated (ska-sa/xova). CASA 5 gaincal does not handle this correctly - the solution intervals it constructs seem to assume some form of regular spacing because it flags out entire good spw in this case. So it is not just cubical that cannot operate in this regime. @JSKenyon is working on supporting overlapping SPW for the bda case for the cubical successor quartical and I know he has recently got it to work in this overlapping regime |
Thanks for the heads up @bennahugo. It's possible I've already run into this issue. That would be a nice explanation for why my current selfcal run isn't performing as usual! Although we use @o-smirnov, I'll test with SPWs that aren't overlapping, but likely with uneven numbers of channels in each SPW. Let me know when that should work here, and which version/branch to use. |
Baseline dependent averaging
…On Tue, Dec 15, 2020 at 2:29 PM Jordan Collier ***@***.***> wrote:
Thanks for the heads up @bennahugo <https://github.com/bennahugo>. It's
possible I've already run into this issue. That would be a nice explanation
for why my current selfcal run isn't performing as usual! Although we use
combine='scan'. What's bda?
@o-smirnov <https://github.com/o-smirnov>, I'll test with SPWs that
aren't overlapping, but likely with uneven numbers of channels in each SPW.
Let me know when that should work here, and which version/branch to use.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#93 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB4RE6UGJPCN5ROP4Z255L3SU5JB3ANCNFSM4EDLNIPQ>
.
--
--
Benjamin Hugo
PhD. student,
Centre for Radio Astronomy Techniques and Technologies
Department of Physics and Electronics
Rhodes University
Junior software developer
Radio Astronomy Research Group
South African Radio Astronomy Observatory
Black River Business Park
Observatory
Cape Town
|
OK, this code was less tested than I thought, but I think the Also, @landmanbester, could you please do me a favour, and run this branch against your Cyg A dataset to check that you have the same results? I'd like to make sure that nothing is broken in the case of equally-sized SPWs. |
Sure, I'll rerun and check that I get the same dirty image |
Brilliant! If you can containerise it on ilifu ( |
@KshitijT, I have pushed an image called |
I reran the calibration on Cygnus A data in A config and compared the corrected data from master and the issue-93 branch. The only part of the parset that is different between runs is the name of the output column. There is a maximum absolute difference between the columns of about 0.05 and a max relative difference of about 0.0008. I am not sure if this suffices. Do you want me to look at the difference between the dirty images as well? |
Let me give it a shot. |
Thanks @landmanbester, that's very useful. There were no changes to the solver so there should not be any difference. However, I also added a workaround for casacore/python-casacore#130, so there's a possibility master is being affected by that bug (which would be very interesting to say the least!) Let me make a branch where the workaround is disabled and try your solve with that... |
@Jordatious , I have put the new singularity cubical image in /software/astro/caracal . The image name is: cubical_frankl.sif . Let me know if there is any problem with running that. |
@KshitijT brilliant, thanks! Hopefully I'll get a chance to test this later this afternoon! |
Hold on, sorry, I'm being daft. It is just the phase ambiguity. I thought np.testing.assert_array_almost_equal compares absolute values but explicitly comparing the absolutes gives a max absolute difference of about 5e-5 so it's all good |
Ah, an even better outcome. Thanks, @landmanbester! |
So I got this working from start to end for data without the overlapping SPWs. However, the results were quite bad! I'm suspecting it's because I believe the following default of @IanHeywood's that I used here means 256 and 64 MHz, right?
This is larger than the ~30-50 MHz width of each SPW. I could try this again with a width of ~30 MHz, although I'm basically blindly copying the config here so I'm not sure if the results will be sensible. I also notice |
256 and 64 are an integer number of channels, not a bandwidth. You are
correct that the chunking only refers to how the data is partitioned and
the freq-int values refer to solution intervals.
…On Thu, 7 Jan 2021, 16:36 Jordan Collier, ***@***.***> wrote:
So I got this working from start to end for data without the overlapping
SPWs. However, the results were quite bad! I'm suspecting it's because I
believe the following default of @IanHeywood
<https://github.com/IanHeywood>'s that I used here
<https://github.com/IanHeywood/oxkat/blob/master/data/cubical/peel.parset>
means 256 and 64 MHz, right?
[g]
...
freq-int = 256
...
[de]
...
freq-int = 64
This is larger than the ~30-50 MHz width of each SPW. I could try this
again with a width of ~30 MHz, although I'm basically blindly copying the
config here so I'm not sure if the results will be sensible. I also notice freq-chunk
= 256 I presume this chunking is how it internally partitions the data,
while freq-int is the frequency interval over which it solves. Is that
correct?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#93 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABSHDWNRBKGSW7HYYD4O4DTSYXBINANCNFSM4EDLNIPQ>
.
|
Yep. And the effective freq_int can't be bigger than a chunk, so effectively it would use min(freq_int, freq_chunk). @Jordatious, if you want one solution per SPW regardless of SPW size, just set freq-int=0, freq-chunk=0. Zero in this case means "full SPW". Beyond that, there could be many reasons for bad results... one thing to try in terms of bracketing the problem -- run it on a single SPW (via |
Thanks for that. In that case, since the channel width is ~836 kHz, those intervals are 214.0 and 53.5 MHz. So maybe I try an interval / chunk of 36 (since the minimum is 30.1 MHz), or otherwise 0. Sorry, it wasn't clear whether you were suggesting I use If I run this again, will it re-write the corrected data column using the |
What I mean is, if you run with
Yes, it will overwrite CORRECTED_DATA (or whatever the output column is set to). If you run with |
As hinted in #424, this seems to have worked. However, the peel did not work as expected, which still has a lot of residual artefacts around the source (see attached). The peak goes from 175 mJy to 1.8 mJy, so the source is peeled. So I consider this done, and that the rest might be based on zeros I'm finding in the data (see below). My config is identical to the one from Oxkat, other than the following:
So I used all (non-overlapping) SPWs for this, with a frequency interval and chunk size equal to the whole SPW. It reads from corrected and writes to data. I also tried reading from corrected and writing to corrected, which achieves the same result, which is cool. The column trickery is because my I've plotted the data for Assuming this works going forward, I guess this will be pulled into master (once the errors are fixed in #424), and usable from one of the standard Stimela containers, right? |
Something seems a bit off. In particular, I am concerned that some of the sources in the left half of the image have worse artefacts after peeling, when one would expect them to be largely unaffected (unless they are part of the peeling model?). Are you certain that your model is correct? |
Exactly. They're not part of the model that should be peeled. Other than the zeros, the model should be good. The sky model is based on what was fed into selfcal, which did sensible things. It's a fairly small sky model, over only 5k, with 1.5 arcsec pixels (so ~2 degrees diameter). Not sure if that's a problem. The model for the problem source seems sensible, corresponding to a double source with one component brighter than the other, and it's quite smooth. I thought they should be brighter in the plot, so I'm not sure about that yet. |
Interesting. It is not impossible that there are calibration artefacts ("ghosts") being introduced by the DD solve, but I wouldn't anticipate it given the solution intervals (though if there are unflagged zeros, I guess that could be an issue). As a sanity check, could you possibly image your various model columns? What was the time interval for the peeled source? |
Well from an initial look, there are no zeros in the The config largely follows the one from Oxkat, which looks like it uses a 72 second time interval, assuming I've understood this correctly.
|
That's in units of integration times, which is assumed to be 8 seconds, so 72 times 8 seconds = 9.6 minutes. |
Ok thanks Ian. I imaged the model column and it came out ok, almost entirely identical to the corrected data from after selfcal, with the problem source still present, as expected. I also imaged Here is a gif blinking between these two columns. |
* partial fix for #93. Should allow for one spw at a time * fixes #93. Adds workaround for casacore/python-casacore#130 * added future-fstrings * correctly named future-fstrings * ...and install it for all pythons * back to mainstream sharedarray * lowered message verbosity * droppping support for python<3.6 * Issues 427 and 429 (#430) * Fixes #427 * Fixes #429 * Update __init__.py Bump version * Update Jenkinsfile.sh (#447) * Update Jenkinsfile.sh Remove python 2 building * Update Jenkinsfile.sh * Issue 431 (#432) * Beginning of nobeam functionality. * Remove duplicated source provider. * Change syntax for python 2.7 compatibility. * Update montblanc version in setup to avoid installation woes. Co-authored-by: JSKenyon <[email protected]> * Remove two uses of future_fstrings. * make pypi compatible (#446) * make pypi compatible * error * Update setup.py Only support up to v0.6.1 of montblanc * Update setup.py Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: JSKenyon <[email protected]> * Update __init__.py * Fix warning formatting Co-authored-by: Oleg Smirnov <[email protected]> Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: Gijs Molenaar <[email protected]> Co-authored-by: JSKenyon <[email protected]>
Thanks @SpheMakh for looking at this for LADUMA. To briefly clarify the previous post, this is also showing how the other column came out. So this blinks between data, model and corrected. I haven't managed to image the |
* Added different slope modes for slope solvers. Experimental. * initial rehashing to support a more geneal slope structure (e.g. iono+delay) * fixes #428 * fixes #433 * no-X polcal experiments * added support for arbitrary Jones terms in *first* chain position * fixed typo * Noxcal (#450) * partial fix for #93. Should allow for one spw at a time * fixes #93. Adds workaround for casacore/python-casacore#130 * added future-fstrings * correctly named future-fstrings * ...and install it for all pythons * back to mainstream sharedarray * lowered message verbosity * droppping support for python<3.6 * Issues 427 and 429 (#430) * Fixes #427 * Fixes #429 * Update __init__.py Bump version * Update Jenkinsfile.sh (#447) * Update Jenkinsfile.sh Remove python 2 building * Update Jenkinsfile.sh * Issue 431 (#432) * Beginning of nobeam functionality. * Remove duplicated source provider. * Change syntax for python 2.7 compatibility. * Update montblanc version in setup to avoid installation woes. Co-authored-by: JSKenyon <[email protected]> * Remove two uses of future_fstrings. * make pypi compatible (#446) * make pypi compatible * error * Update setup.py Only support up to v0.6.1 of montblanc * Update setup.py Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: JSKenyon <[email protected]> * Update __init__.py * Fix warning formatting Co-authored-by: Oleg Smirnov <[email protected]> Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: Gijs Molenaar <[email protected]> Co-authored-by: JSKenyon <[email protected]> * fixed bug when outer chain term was DD and incorrect model was used. Properly implemented scalar updates * added scalar and unislope update types * added auto-generated Stimela2 schemas * added schemas to manifest * use / separator for stimela schema; omit defaults in schema; fix path * minor bugfixes * added stimela_cabs file. Fixed problem with legacy-only flags on output * added stimela_cabs.yml * changed / to . * fixed problem with IFR gain flags overpropagating * added parset argument --------- Co-authored-by: Oleg Smirnov <[email protected]> Co-authored-by: Lexy Andati <[email protected]> Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: Gijs Molenaar <[email protected]>
* Added different slope modes for slope solvers. Experimental. * initial rehashing to support a more geneal slope structure (e.g. iono+delay) * fixes #428 * fixes #433 * no-X polcal experiments * added support for arbitrary Jones terms in *first* chain position * fixed typo * Noxcal (#450) * partial fix for #93. Should allow for one spw at a time * fixes #93. Adds workaround for casacore/python-casacore#130 * added future-fstrings * correctly named future-fstrings * ...and install it for all pythons * back to mainstream sharedarray * lowered message verbosity * droppping support for python<3.6 * Issues 427 and 429 (#430) * Fixes #427 * Fixes #429 * Update __init__.py Bump version * Update Jenkinsfile.sh (#447) * Update Jenkinsfile.sh Remove python 2 building * Update Jenkinsfile.sh * Issue 431 (#432) * Beginning of nobeam functionality. * Remove duplicated source provider. * Change syntax for python 2.7 compatibility. * Update montblanc version in setup to avoid installation woes. Co-authored-by: JSKenyon <[email protected]> * Remove two uses of future_fstrings. * make pypi compatible (#446) * make pypi compatible * error * Update setup.py Only support up to v0.6.1 of montblanc * Update setup.py Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: JSKenyon <[email protected]> * Update __init__.py * Fix warning formatting Co-authored-by: Oleg Smirnov <[email protected]> Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: Gijs Molenaar <[email protected]> Co-authored-by: JSKenyon <[email protected]> * fixed bug when outer chain term was DD and incorrect model was used. Properly implemented scalar updates * added scalar and unislope update types * added auto-generated Stimela2 schemas * added schemas to manifest * use / separator for stimela schema; omit defaults in schema; fix path * minor bugfixes * added stimela_cabs file. Fixed problem with legacy-only flags on output * added stimela_cabs.yml * changed / to . * fixed problem with IFR gain flags overpropagating * Update __init__.py * Stimelation (#480) * fixes to AIPS leakage plotter (more table format support); started an AIPS bandpass plotter but this is WIP * add check for INDE value in PRTAB leakages * changed to hierarchical config --------- Co-authored-by: JSKenyon <[email protected]> Co-authored-by: Jonathan <[email protected]> Co-authored-by: Lexy Andati <[email protected]> Co-authored-by: Benjamin Hugo <[email protected]> Co-authored-by: Gijs Molenaar <[email protected]> Co-authored-by: JSKenyon <[email protected]> Co-authored-by: Athanaseus Javas Ramaila <[email protected]>
table.getcol() raises an error if SPWs have different sizes. It maybe better to first get the number of SPWs then loop over them when getting data. Or something similar to how different scans are treated.
The text was updated successfully, but these errors were encountered: