Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

facet imaging picks up artifacts because no proper mask is used #151

Closed
rvweeren opened this issue Nov 1, 2016 · 57 comments
Closed

facet imaging picks up artifacts because no proper mask is used #151

rvweeren opened this issue Nov 1, 2016 · 57 comments

Comments

@rvweeren
Copy link
Collaborator

rvweeren commented Nov 1, 2016

I noticed that WSClean now runs only once when making the facet image (during the calibration). It looks like no mask is used anymore (only for the facet boundaries). Now I get clean components on artifacts around sources which subsequently end up in the "facet model". This is potentially very dangerous because if these artifacts belong to a source from a neighboring facet, which has not been calibrated yet, you can never get rid of them afterwards when that neighbor facet is calibrated. (because the artifacts now belong to another facet model which components are not added back)

I see the desire to speed up the imaging, but we should be careful here not pushing things too far here.

((Ideally we can do source detection on the fly in WSClean at each major cycle (via a sort of restart/resume option), that would remove the need to run WSClean twice.))

@darafferty
Copy link
Collaborator

There is a masking step (called "mask5" I think) that is supposed to filter the clean components so that they lie only in sources detected by PyBDSM (it's basically the same as the one used between the two WSClean runs during the full facet imaging). I will check that it is actually doing what it's supposed to do...

@aroffringa
Copy link

aroffringa commented Nov 2, 2016

A new possibly related feature: In WSClean 2.0 (and snapshot wsclean 1.12i or so) there's a "-continue" option, which might save time in an approach like 1) run shallow WSClean; 2) perform source detection on image; 3) continue WSClean with -continue option and use mask and lower threshold. To use -continue, you can't use -no-update-model-required in the first run, since WSClean will need the model column filled in the continued run. It also involves one extra inversion round compared to what Reinout suggests, but it is faster than restarting the second run from scratch.

I've also been thinking about more automated ways of cleaning without a mask. These can be ideas similar to what Reinout suggests (using source detection within a major iteration), and I think it might be interesting to discuss them during the next CITT meeting. I've been thinking about determining the threshold automatically during cleaning (determine noise in image and stop at user value x sigma), and once reached automatically continuing with a mask of the components currently covered in the model. A variable threshold over the image would also be an option, to be more robust to source artefacts.

@darafferty
Copy link
Collaborator

The filter seems to work in my tests -- you can check the *.mask5 image to see which components are included in the facet model (minus the calibrator, the model for which comes from the selfcal model images).

The ability to continue a WSClean run is a great addition -- I will do some tests and add an option to Factor to use it.

@rvweeren
Copy link
Collaborator Author

rvweeren commented Nov 2, 2016

Ah ok, that's a smart way to handle this issue.

@rvweeren rvweeren closed this as completed Nov 2, 2016
@soumyajitmandal
Copy link

I am using the latest version of Factor. I saw an issue with the masking.

screen shot 2016-11-15 at 11 23 43

during the selfcal, the masking is fine and its doing a good job.

But while the facet imaging, the masking is not picked up. So during the facet imaging the bright source is getting cleaned, but not deep enough. You can also see the diffuse emission is getting cleaned even without the mask.

screen shot 2016-11-15 at 11 24 52

screen shot 2016-11-15 at 11 24 26

@darafferty
Copy link
Collaborator

Is this the facet image from selfcal? If so, the mask is made only to filter the clean components from the model image (so no artifacts are included in the model). It is not used during cleaning, where only a clean mask that excludes regions outside of the facet is used.

@soumyajitmandal
Copy link

soumyajitmandal commented Nov 15, 2016

The second image is the: full-MFS-image.fits
the 3rd image is the: full-MFS-image.mask5

so if I specify a region in the directions.txt file, it should be picked up right?

@AHorneffer AHorneffer reopened this Nov 15, 2016
@AHorneffer
Copy link
Contributor

So you want to say, that the bright source left off the green cross is not picked up by the mask? That indeed look like a problem to me.

I guess that is because the wavelet module is not switched on in pyBDSM. (See also #159)
Well, I just saw that the wavelet module is hardcoded off for the full facet in facetselfcal_pipeline.parset.

@AHorneffer
Copy link
Contributor

Ah, yes, if you specify a clean mask for the full facet, then this is merged with the mask from pyBDSM. So that is the short-term fix / workaround for your problem.

@soumyajitmandal
Copy link

yes that is what I usually do if masking goes wrong. Since this bright source was missed only during the facet imaging (it was perfectly fine in the selfcal steps) I thought of bringing up the issue. Indeed the wavelet module is hardcoded.

may be it can be edited as an option in the parset? by the way, the calibrator is 1.4 arcmin in size.

@AHorneffer
Copy link
Contributor

Did it go wrong in the full-facet imaging in the facetselfcal step, or in the re-imaging of the facet when the selfcal for all facets is done?

@soumyajitmandal
Copy link

It went wrong in the full-facet imaging in the facetselfcal step. I am not yet to the reimaging yet.

@darafferty
Copy link
Collaborator

darafferty commented Nov 16, 2016

Sorry, I didn't explain it very well before. The full-MFS-image.fits image is made only to pick up bright, non-calibrator sources to further improve the subtraction (hence it's made with only a fraction of the full bandwidth). It's made with a clean mask that includes everything inside the facet (and excludes everything outside it). After the imaging, another mask is made that is then used to filter the clean components before subtraction. The calibrator region is excluded from this mask, as we want to use the better model from selfcal for the calibrator sources.

@soumyajitmandal
Copy link

hmm that is why the noise is much higher in the full-MFS-image.fits than the image42-MFS-image.fits

So will if I want to see the final facet image, would it be created during reimaging step?

@soumyajitmandal
Copy link

soumyajitmandal commented Nov 30, 2016

During the "Image1" step (I guess this is the new name for reimaging), the source is not getting cleaned enough even though its present in the mask. Is there an option to set this?

the full2 image:
screen shot 2016-11-30 at 17 36 18

the selfcaled image is attached earlier.

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 2, 2016

Why does the selfcal image look fine, but the facet image shows these negative stuff to the east and west of the calibrator source ? Jit, you blame it on the clean not going deep enough, but I am surprised that this negative stuff around the source would be related to the shallow depth of the cleaning, but maybe I'm wrong.... (to me it more more seems like the calibration going wrong...)

Jit: can you show the "c" checkfactor images ?

@soumyajitmandal
Copy link

hmm the selfcal image looks fine indeed but the facet one does not. Initially I thought since I was using 6 bands for subtraction, it might be bad, but after reimaging, I still see the negative stuff. also the residual image has these negative stuff. hereby is the attached selfcal steps:

screen shot 2016-12-03 at 00 39 36

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 3, 2016

So I think we can conclude the "negative emission" has nothing to do with how the masking is done (which I was already suspecting....).

The "deep" facet image (containing all blocks after re-imaging, full2?) should be more or less identical to the last DDE selfcal image. If it is not, this indicates something is not going right in terms of calibration solutions or how the source is subtracted/added back.

Can you display/show the last selfcal image (image42_iter9) and the deep re-imaged facet image side-by-side in ds9 with the same scaling ?

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 3, 2016

I see the same problem as Git is seeing in my factor runs (I updated factor a few days ago). Attached image shows the last DDE selfcal image (left) + the deep facet image (right) of the same source (same scaling). This clearly show some strange negative emission (as if the amplitudes, or slow gains, are corrupted by some process, or applied in the wrong way).

factorproblem

David: this problem seems to be present at least for the last month or so, possible longer. Essentially, the last DDE selfcal image looks better than the deep (& shallow 6 block) facet image. Given that the shallow 6 block facet image already shows the artifacts the problem is occurring somewhere before (it seems to be related to how the DDE selfcal solution are applied ?). The DDE selfcal process itself seems fine, but then something goes bad with slow gains solutions it seems. (the same problem seems to repeat itself for all facets)

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 3, 2016

BTW It is striking how similar to artifacts looks to Jit's images (I've trouble understanding what could cause that, but it clearly points to a serious problem affecting the visibilities of the facet-imaging). So far I've only seen those artifacts for the DDE calibrators, not for other sources in the field. In some way it almost looks like that the other sources in the facet have no (!) DDE correction applied to them (but not fully sure about it....)? Is that possible ?

@darafferty
Copy link
Collaborator

Now that I look closely, I can also see this problem (although it is much less severe). My shallow facet image (6 bands, on left) shows some holes N-S and some excess E-W over the sefcal image (24 bands, right):

screen shot 2016-12-05 at 13 53 10

Between the end of selfcal and the shallow image of the facet, the following is done:

  • merge the selfcal parmdbs (fast and slow) into a single parmdb (step merge_selfcal_parmdbs). This is just for convenience and is not used by Factor
  • merge and convert the selfcal parmdbs (fast and slow, plus the preapplied one if required) into a single gain solution (phase and amp) per timeslot and channel (step convert_merged_selfcal_parmdbs). This is the parmdb used to make the facet images and for subtraction of the new model. This step is done as it simplifies things quite a bit and speeds up the DPPP steps
  • all sources for the facet are added to empty datasets with dir-indep models and solutions, the dir-dep solutions are applied (from convert_merged_selfcal_parmdbs), and averaging is done
  • WSClean is then run on the averaged data

I suspect the problem is in the convert_merged_selfcal_parmdbs step, but I don't see anything obvious. Jit and/or Reinout, can you check whether there is anything strange in the solutions in this parmdb (the file should end with .convert_merged_selfcal_parmdbs)? BTW, if you want to check it for yourself, the script that makes this parmdb is factor/scripts/convert_solutions_to_gain.py.

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 5, 2016

Did a quick check: Amplitudes look the same.

(it's hard to compare the phases because parmdbplot.py does not allow me to sum the slow phases to the TEC+CSP)

A potential issue: the dimension of freq axis of convert_merged_selfcal_parmdbs seems the number of blocks. If you apply TEC to the full resolution data it should correct it all way the down the channel resolution and hence the dimension of the freq axis of convert_merged_selfcal_parmdbs should be the total number of channels in the dataset (which is much higher than the number of blocks).

@darafferty
Copy link
Collaborator

Ah yes -- I completely forgot about the need to interpolate on a finer frequency grid. I will implement that and see if it helps (though it doesn't seem like it would cause the problems above...)

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 6, 2016

I agree the freq interpolation error is probably not the cause these specific artifacts.

I'm still very mystified why the artifact pattern looks so similar between different observations (i.e., mine and Jit's).. The spoke patterns from the ionosphere are not visible, only these large-scale negative-positive patterns suggesting the gains are to blame.

Do other people see these artifact patterns as well. I think that all factor releases after ~Nov 1 (possibly before) are affected by this error ?

@soumyajitmandal
Copy link

The amplitudes look fine for me as well. By the way, I processed one more facet with this version earlier, and the effect seems to be there as well. Here is the image:

the left one is the selfcaled image and right one is from the facet image.

276_weird

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 7, 2016

I also see the effect/problem in every facet.

@darafferty
Copy link
Collaborator

Looking over ~10 facets or so of a recent run, I noticed that the severity of this problem varies a lot from facet to facet and seems to be related to the distance of the calibrator to the facet center (getting worse at larger distances). Also, the negative artifacts seem to point roughly towards the facet center. I even seem to see it now in selfcal images in which there are multiple sources. Does anyone else see this too?

@twshimwell
Copy link

Is this all done with something like nbands_selfcal_facet_image = 6. I wonder if it is worth trying nbands_selfcal_facet_image = NBANDS.

@soumyajitmandal
Copy link

soumyajitmandal commented Dec 8, 2016

I have also tried nbands_selfcal_facet_image = 23 (which is basically all of them) (but am not sure if setting it as 23 forces it to use all of them since I see files till full-0006 from full-0001)

@eretana
Copy link

eretana commented Dec 8, 2016

@soumyajitmandal could post a pic of these files when they are done?

@darafferty
Copy link
Collaborator

There was a bug that caused it to ignore the nbands setting and always use 6 channels. It was fixed last week I think, so if you're running a version before that try updating and rerunning.

@soumyajitmandal
Copy link

Left one is with nbands_selfcal_facet_image = 23 and the right one is with nbands_selfcal_facet_image = 6

nbands

I have to check the version. Any ways to check how many bands it actually used from the log?

@darafferty
Copy link
Collaborator

The number of channel images for the full image step should tell you (you should 24 of them instead of 6).

@nudomarinero
Copy link
Collaborator

Just for the record. This is the difference between the dysco and non-dysco full images for one facet. They are minimal. I reckon that dysco is not directly related to this problem.
full_dysco_no_dysco

@aroffringa
Copy link

Good to know, thanks.

@twshimwell
Copy link

Could this be related to baseline based averaging?

@soumyajitmandal
Copy link

a month ago, I shared this image with you guys. I also showed this to Andre when I visited Astron couple of weeks ago. Do you think this might be related?

left one is without baseline base averaging, the right one is.

screen shot 2016-12-09 at 01 03 12

@eretana
Copy link

eretana commented Dec 9, 2016

@twshimwell I compared selfcal casa images with facet wsclean images done without baseline averaging and they look similar. No striking difference in the artifacts.

ds9
cal_casa

@twshimwell
Copy link

your wsclean images were without baseline averaging Edwin?

@eretana
Copy link

eretana commented Dec 9, 2016

Yes. Upper one is wsclean, bottom one is casa.

@twshimwell
Copy link

OK. Well maybe it is worth trying one of the obvious examples (like the compact source in Lockman that you have Jit) with and without baseline based averaging enabled and seeing what happens.

@rvweeren
Copy link
Collaborator Author

rvweeren commented Dec 9, 2016

I'm not sure if we can draw a conclusion from Edwin images (for example if the phase centers of the facet image and the selfcal images are quite similar you could get similar results). Because if David is correct the problem gets worse when the calibrator source is far away from the facet image center...

(Also, it is not clear to me if the problem is present in Edwin's image in the first place...I do not see a similar type of error as is visible in Jit's and my image...).

darafferty added a commit that referenced this issue Dec 9, 2016
@darafferty
Copy link
Collaborator

Well, it looks like Tim was right -- it appears that the BL-dependent averaging in WSClean is the culprit, as I reran a problematic facet with wsclean_bl_averaging = False and things look much better. I am trying now to understand whether there was an inadvertent change at some point in how the nwavelengths value is calculated or whether is was always wrong...

@darafferty
Copy link
Collaborator

OK, it seems the nwavelengths calculation was ignoring bandwidth smearing, which tends to be more important in Factor than time smearing. The value of nwavelengths when considering bandwidth smearing is a factor of 2-3 times lower. I will alter the calculation and test again.

@twshimwell
Copy link

Great (Reinouts suggestion but he just didn't have his computer to hand).

@darafferty
Copy link
Collaborator

Actually, I was confused -- the BL-dependent averaging in WSClean is not done over frequency, right? So, only the time smearing should be considered. Then I'm still not sure where the error is...

@aroffringa
Copy link

No, there's no BL averaging over frequency. It seems most likely to me that the data is still over-averaged, i.e. that nwavelengths is too large. What calculation is used?

The other options are that there's a bug in WSClean's bd averaging, or in the way Factor calls WSClean when BD averaging is enabled. Of course I can't rule it out 100% but I think it's not very likely there's a bug in WSClean's bd averaging, since it has been tested quite a lot by now, and simple tests show the effect of it as it is expected. BD averaging is never used when subsequent model data is used, right? (since WSClean will never fill the model data column when BD averaging...)

@darafferty
Copy link
Collaborator

OK -- thanks for the clarification. Factor calculates the nwavelengths value using the equations on the WSClean wiki (and using max_BL ~= 1 / theta_rad, where theta_rad is the resolution). Probably this was too aggressive, so I tried dividing that value by a factor of two and it improved a lot, but I still see a little effect. I will try a factor of 4 now and likely this will work fine.

No, no BD averaging for the model data -- the model data is predicted with a separate call to WSClean (with no BD averaging).

@darafferty
Copy link
Collaborator

A factor of 4 reduction over the old value of nwavelengths seems to work well for my test facet, with no discernible difference to the case without BD averaging, so I've updated the master branch to use this reduced value. Please try the new version and report here if the effect is still present.

@twshimwell
Copy link

I guess this will slow things down a fair bit. Is it worth checking with different versions of wsclean? Or has nothing changed in that part of the code? I'd have thought this issue may have been spotted earlier as we have been using based time averaging with the same averaging parameter for a while.

@aroffringa
Copy link

I did not make any changes to the way the data are averaged in WSClean since the initial bd averaging implementation, so I expect you don't see any changes between >= 2.0 and the initial bd averaging version 1.12b or so.

WSClean outputs some information about with what factor the visibilities are reduced, as well as something like "Averaging factor for longest baseline: 2 x . For the shortest: 811 x". If your longest baseline is averaged, your setting is too strong (assuming the input data have been maximally time-averaged already). If your longest baseline is not averaged (and the longest baseline is included in the imaging, i.e. you image at max resolution), and you still see artefacts when using bd averaging, it would mean the longest baselines are already over-averaged to begin with. That's just less visible without bd averaging, because it only affects the longest baseline, while if you apply the same averaging strength in a bd way, all baselines are of course affected. What are typical numbers for nwavelength, and longest/shortest b averaging?

@wndywllms
Copy link
Collaborator

Just an update, seems like this is better in my images since you reduced the nwavelengths:

image

@rvweeren
Copy link
Collaborator Author

Yes, I can confirm that the facet image looks good now (at the cost of slowing things down).

t1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants