combining 3 nights data through factor #194
I'd prefer it if you wouldn't copy&paste parts of (generic-)pipeline logfiles here (they are bleeping hard to read and half of it is usually missing), but attach the entire logfile to the post. Indeed the smoothing fails, which is then the reason why most of the data gets flagged. (That's what NDPPP does if it encounters a NaN as a calibration value it is asked to apply to data.) @rvweeren (or @darafferty): does the @soumyajitmandal: You could try setting |
I used Now smooth_amps.py did not work out. I ran this outside Factor: |
2. invalid value encountered in greater: high_ind = numpy.where(amp > 5.0) I would like to point out that a few months ago I tried Factor for two different nights with 40 subbands, and we fixed the problem when we had two different antennas, one being flagged (so essentially two different time spans), see #76. The only differences this time are: the full set of subbands and 3 nights of data. |
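The RuntimeWarning quoted above ("invalid value encountered in greater") is what NumPy emits whenever a comparison is evaluated on NaN samples. A minimal illustration (hypothetical values, not the actual smooth_amps.py code):

```python
import numpy as np

# Hypothetical amplitude series containing a NaN, as in the failing parmdb.
amp = np.array([1.0, np.nan, 6.0, 0.5])

# This emits "RuntimeWarning: invalid value encountered in greater",
# because NaN > 5.0 is an invalid comparison (it evaluates to False).
high_ind = np.where(amp > 5.0)

# A NaN-safe variant replaces the NaNs before comparing:
amp_clean = np.where(np.isfinite(amp), amp, 0.0)
high_ind_safe = np.where(amp_clean > 5.0)
```

Both variants find the same out-of-range sample; the second one just does so without the warning.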
The previous run was with a different version of NDPPP than the recent one, so the two parmdbs were created with different LOFAR software versions. I think that previously, if the interpolation did not find a value, it used to put zeros, but now it is putting NaNs instead. Could this be the issue? |
But the error message you quoted is from |
I thought putting |
|
@rvweeren: Ah, O.K. |
I checked Maybe it fails because of that and it cannot handle a very large gap (although looking at the code the spline does not directly use that time axis in the spline fit). The easiest way to debug this is to take the parmdb and run smooth_amps_spline.py manually on it and check where it fails in the script. |
Hmm, update/correction, apparently it does use if self.parset['calibration_specific']['spline_smooth2d']: (from |
It might be that the script is failing because there are NaN input values. Otherwise I cannot see why
could give an error message. I guess you need to open the scripts and do some debugging here and figure out precisely where it goes wrong, in |
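The NaN propagation suspected here is easy to demonstrate: any smoothing that sums over a window turns every output sample whose window touches a NaN into NaN. A minimal sketch using a 3-sample boxcar average as a stand-in (the real script uses a spline fit, so this is only illustrative):

```python
import numpy as np

# One flagged (NaN) input sample corrupts every output sample whose
# smoothing window overlaps it.
amp = np.array([1.0, 1.1, np.nan, 0.9, 1.0])
smoothed = np.convolve(amp, np.ones(3) / 3.0, mode='same')
# smoothed[1:4] are all NaN: the single flagged sample has spread to
# three output samples, which NDPPP then refuses to apply.
```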
Yeah, indeed the error was with smooth_amps.py this time. So I ran: I did a print on 'ampl' after this line: where the values were NaNs. Whereas in my successful run a few months earlier, doing the same thing gave me zeros. |
You need to get back further. The questions for you to answer are: (1) does the input parmdb contain NaNs, and (2) is that the reason why it fails (because Check |
yes |
Ok, so it looks like |
hmm okay. Is it a good idea to put zeros in place of NaNs? Or is it not a good solution? |
You should try to edit smooth_amps.py so that it is NaN proof (with minimal other changes). Without having looked at it in detail I think that should not be very difficult to do. (I probably do not have time to look at it myself over the next two weeks, after that I might have time to help with that and also check |
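One minimal-change way to make the smoothing NaN-proof, as suggested above, is to interpolate over the flagged samples before fitting. The helper below is a hypothetical sketch, not the actual smooth_amps.py edit:

```python
import numpy as np

def fill_nans_1d(amp):
    """Replace NaNs in a 1-D amplitude series by linear interpolation
    over the valid samples (hypothetical helper, names are assumptions)."""
    amp = np.asarray(amp, dtype=float)
    good = np.isfinite(amp)
    if not good.any():
        # Fully flagged series: nothing to interpolate from.
        return np.zeros_like(amp)
    if good.all():
        return amp
    idx = np.arange(amp.size)
    filled = amp.copy()
    filled[~good] = np.interp(idx[~good], idx[good], amp[good])
    return filled
```

After this, the existing smoothing code can run unchanged on the filled series.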
@soumyajitmandal: Can you put a parmDB with NANs somewhere where I can find it, to test the code? @ALL: What should we do with the flagged data? Replacing the amplitudes with the median value is straight forward, but what should we do with the phases? Setting them to zero would be the most simple. Finding a useful median for phases is not only more complicated, but I also don't know if it is a good idea. |
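The proposal above (median for flagged amplitudes, zero for the corresponding phases) could look roughly like this. This is a sketch of the idea only, not Factor's actual code:

```python
import numpy as np

def replace_flagged(amp, phase):
    """Replace flagged (NaN) amplitudes with the median of the valid
    samples and set the corresponding phases to zero (illustrative)."""
    amp = np.asarray(amp, dtype=float)
    phase = np.asarray(phase, dtype=float)
    bad = ~np.isfinite(amp) | ~np.isfinite(phase)
    if bad.all():
        # Fully flagged series: fall back to unit gain, zero phase.
        return np.ones_like(amp), np.zeros_like(phase)
    amp_out, phase_out = amp.copy(), phase.copy()
    amp_out[bad] = np.median(amp[~bad])
    phase_out[bad] = 0.0
    return amp_out, phase_out
```

Whether zero is a sensible phase replacement is exactly the open question raised above; this only shows the mechanics.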
I did a test by putting zeros instead of NaNs, but the normalisation gets messed up in that process. Would using a masked array instead be useful? Attached is the parmdb. |
Including the masked array in a different part seems to be working so far in my case. I put I changed (line 146): |
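The masked-array idea avoids the normalisation problem mentioned above: if the NaNs are masked rather than replaced by zeros, statistics like the mean ignore them instead of being dragged down. An illustrative sketch (not the actual edit at line 146):

```python
import numpy as np

# Amplitude series with flagged (NaN) samples.
amp = np.array([1.0, np.nan, 3.0, np.nan, 2.0])
amp_masked = np.ma.masked_invalid(amp)

# The normalisation factor is computed from the valid samples only
# (1.0, 3.0, 2.0), so the flagged samples do not bias it.
norm = np.ma.mean(amp_masked)
amp_normalized = (amp_masked / norm).filled(np.nan)
```

Had the NaNs been replaced by zeros first, the mean would have been 1.2 instead of 2.0, which is the normalisation skew reported above.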
Well, having had a look at the parmDB you attached here, I think it would be important to find out why you have so many NaNs in the parmDB. Did you flag large parts of the data? (And why would NDPPP create parmDB entries with NaNs in that case, instead of not creating the entries at all?) Or are there parts of the data where NDPPP couldn't get a solution even though there was data? |
Since there is a time gap between the different nights, I thought that is what is producing the NaNs. In general, when I processed the different nights' data separately, I did not see the NaN issue. |
Well, the smoothing is done on single time-series (i.e. separate for antenna, polarization, and channel), and several of these time-series are fully flagged. |
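Since the smoothing runs per antenna/polarization/channel time series, a fully flagged series is easy to detect up front and skip. A hypothetical check:

```python
import numpy as np

def fully_flagged(series):
    """True if a single time series (one antenna / polarization / channel)
    contains no finite amplitude values at all (illustrative helper)."""
    return not np.isfinite(np.asarray(series, dtype=float)).any()
```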
Btw., here is a version of the script that will not only work with NaNs, but also doesn't produce the RuntimeWarnings: |
I will try this version, thanks a lot. I was trying with the temporary fix that I wrote in my previous comment (which I think you also put in the modified text file) and have an error. I reproduced the error outside the pipeline while using the
Traceback (most recent call last): Has anyone seen this earlier? |
I found and fixed a problem in |
Thanks David! I tried this code last week after we chatted at the busy week, so I do have the parmdbs created with the latest version. I will just do a git pull and rerun it. |
Hi David, it seems like convert_solutions_gain.py works now. Probably we need the same kind of fix in the
reset_amps.py L340794_SB000_uv.dppp.pre-cal_126400A74t_121MHz.pre-cal_chunk12_126407AFCt_4g.convert_merged_selfcal_parmdbs test_parm
Traceback (most recent call last):
File "./reset_amps.py", line 77, in
main(args.instrument_name, args.instrument_name_reset)
File "./reset_amps.py", line 32, in main
freqs = parms['Gain:1:1:Ampl:{s}'.format(s=antenna_list[0])]['freqs']
IndexError: list index out of range
|
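The IndexError in the traceback above occurs when the antenna list built from the parmdb comes back empty, so indexing element 0 fails. A hypothetical guard (names are illustrative, not the actual reset_amps.py fix):

```python
def get_gain_freqs(parms, antenna_list):
    """Look up the frequency axis for the first station, but fail with a
    clear message if no matching station entries were found (sketch)."""
    if not antenna_list:
        raise ValueError('no Gain:1:1:Ampl stations found in the parmdb; '
                         'check that the stations were not dropped')
    return parms['Gain:1:1:Ampl:{s}'.format(s=antenna_list[0])]['freqs']
```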
I think this problem might be fixed by commit 4dc0157. To test it, update Factor, reset the state for the pipeline so that it repeats the |
OK, I added a check for missing stations to |
Hi David, after yesterday's last update on CEP3 I get this error
I think it is related to the new commit. |
The error seems to be:
|
Thanks -- the 'gaps_ind' problem should be fixed now (and I updated it on CEP3). |
Okay, now I think it has passed the Anyway, now it's failing at the plotting-solutions step, but it's due to the size issue. Here it is: |
I've modified the |
Okay. Probably it's going to work, but it says: |
Oops -- copy and paste error. Try it now. |
The plotting step passed, and the plots look quite fine. Now it's preparing the imaging chunk dataset. I will keep you updated. :) |
Factor doesn't do anything special with data from multiple nights -- they're handled just like data from a single night. So, it will smooth the amplitudes and normalize them in the same way (so a single normalization is done across all three nights). No averaging is done. It's probably good to flag the periods during night 1 when the solutions are noisy (between hours 6-7 and after hour 8). I'm not sure whether they're the cause of the poor results, though. |
It's very odd that the background noise looks essentially at the same level even though you have 3 times the data. I'd perhaps expect the artefacts to stay similar, but the noise should really go down a fair bit. I guess the "sources" in the 3-day image that pop up near your bright source are not real? Did you check the masks throughout the calibration of the 3-day one? |
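The expectation here follows from the radiometer equation: thermal image noise scales as 1/sqrt(integration time), so three nights of equal length should reduce the noise by about sqrt(3) ≈ 1.7. A quick back-of-the-envelope check:

```python
import math

# Thermal noise scales as 1 / sqrt(integration time), so combining three
# equal-length nights should lower the noise by about sqrt(3).
nights = 3
noise_ratio = 1.0 / math.sqrt(nights)   # noise relative to a single night
improvement = math.sqrt(nights)         # expected depth improvement, ~1.73x
```

If the measured noise stays at the single-night level, something other than thermal noise (e.g. calibration artefacts) is limiting the image.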
@darafferty I am also imaging them separately now (i.e. each facet has gone through Factor; I used a single model, added that to every night's dataset, and used gaincal/applycal. The imaging is running now; by tomorrow I hope to see the result). @darafferty @twshimwell Let's wait till the facet is finished. I checked the full image in the facetselfcal directory (which only has 1/6th of the bandwidth; is there an option to include the full band? Factor does not have that setting anymore, it's 6 by default). In the facetimage directory (which is running now) it should have the whole bandwidth. I am expecting the background noise might be a bit better than what we are seeing now. |
@soumyajitmandal I was wondering: did you make sure that the same " If you have different models subtracted from the different nights, then Factor will screw up. (It will just assume that the same model was subtracted from all data and thus treat two of the nights wrong.) |
Each night should be init-subtracted separately, right? Let me explain what I did: I processed three different nights independently up to the init-subtract step. So after the init-subtract step, for each night I have (24, 24, 23, in total 71) |
No! (Feel free to add a few more exclamation marks, blinking effects or so.) Factor will use one model for all files in one frequency band. If you actually give it files for the same frequency band from which different models have been subtracted, then it will screw up. The way you started it, it will randomly(*) choose one of the three skymodels and use that for adding back the sources to the data from all three nights. So the two nights from which another model has been subtracted will get treated wrongly. My suggestion to fix that is to choose the model of one of the three nights and subtract that from the other two nights. (*) Well, not actually randomly, but it will be undefined behavior. |
Is this the behaviour even if the models are explicitly specified in the Factor parset? |
@twshimwell Yes. The internal data structure and the layout of the pipeline parsets only use one skymodel per frequency-band. |
@AHorneffer Hmm. Now I am rerunning with datasets from 2 different nights instead of 3 (should be a bit quicker). This time the msfiles contain measurement sets from both nights, and |
@soumyajitmandal Was the same skymodel subtracted from the measurement sets of both nights? Not explicitly specifying the model for each band in the parset is only a problem if Factor cannot find the (any) skymodel for a band. But then it will fail during the setup phase. |
That approach does not make sense (if I understand correctly what you are attempting), because in the init-subtract you then subtracted a different skymodel (for one of the datasets) than what will be added back in Factor. |
@AHorneffer: Can the init-subtract step produce one single skymodel for each band when the MS inputs have different names for similar bands? Or will we even need to combine the MSs for the different nights in the pre-facet phase (i.e. concatenate subbands, direction-independent calibration steps, etc.)? |
@duyhoang2014 I don't really understand your question! Some general answers:
|
@AHorneffer: Sorry for not being clear. I meant that after the prefactor runs (to do the amplitude transfer, clock-offset correction, direction-independent phase calibration, etc.) on the different nights separately, we will have concatenated MSs (of e.g. 10 subbands each, i.e. a band) for these separate nights. These MSs for each night have identical central frequencies (e.g. at 121 MHz, 123 MHz, etc.), and they also have separate instrument tables (i.e. instrument_directionindependent). When prefactor does the initial subtraction, I can use all the MSs of all nights as the input MSs (e.g. put all of the MSs into a single directory). My question is: can prefactor produce a single skymodel for all the nights at each central frequency (e.g. 121 MHz, 123 MHz, etc.)? |
Yes, you can. That is indeed the way I intended the pipelines to be used when I wrote them, and for interleaved observations it is also the way they should be used. If you have several "full" nights, then this has the drawback that the imaging takes even longer, so I am unsure what to recommend: imaging all nights together, or only imaging one night and then subtracting that skymodel from the other nights.
If you want to run Factor on the data, then you need one(!) skymodel that has been subtracted from all time-chunks (e.g. nights). I've been repeating this over and over again in the last two days, so: Do I need to repeat that if you run |
@AHorneffer: Many thanks!
Ok. So prefactor (e.g. Initial-Subtract) does work with interleaved data sets.
Yes, one skymodel (created from all time-chunks) is used for each MS (each band) to run Factor, which I usually do, but forgot when asking. Sorry for asking that again.
Currently I have two 4-hour observations of the same field, which are not too long. I would try to combine them in the Initial-Subtract step. Let's see how things go... |
Hi everyone,
I am combining 3 nights of data (full set of subbands) with Factor. It was going fine until the first amplitude calibration. Probably interpolating them across the different times flagged all the data. Here is the message: