(Old version?) PyBDSM takes forever (or hangs) inside Stimela container #675

Open
o-smirnov opened this issue Sep 16, 2020 · 5 comments

@o-smirnov
Collaborator

Got the following simple recipe courtesy of @Athanaseus:

import stimela
import sys

INPUT = '.'
OUTPUT = '.'
MSDIR = 'msdir'

IMAGE = sys.argv[1]
PIX_THRESH = int(sys.argv[2])
ISL_THRESH = int(sys.argv[3])


stimela.register_globals()

recipe = stimela.Recipe('Source Finder', ms_dir=MSDIR)
#JOB_TYPE='singularity', singularity_image_dir=os.environ["STIMELA_PULLFOLDER"])


recipe.add('cab/pybdsm', 'source_finder',
           {
               "filename": "{}:output".format(IMAGE),
               "outfile": "pybdsm_{}".format(IMAGE[:-5]),  # drops the last 5 characters (".fits")
               "thresh_pix": PIX_THRESH,
               "thresh_isl": ISL_THRESH,
               "clobber": True,
               "catalog_type": "gaul",
               "group_tol": 12,
               "adaptive_rms_box": True,
           },
           input=INPUT,
           output=OUTPUT,
           label='src_finder:: Sourcery')

recipe.run()

Running it in /net/young//home/oms/projects/OldDevils/selfcal-4C12.03/test with 1558752655-MFS-image.fits and thresholds of 50, 30, the thing loops forever late in the Gaussian fitting stage:

# Fitting islands with Gaussians .......... : [-] 357/390

There's one CPU core using 100%. Some of these runs eventually succeed (after e.g. 18 hours!!)

Running the same job natively in the KERN version of pybdsf, with the same options, the job finishes in under 10 minutes. Curiously though, it only fits 319 Gaussians.
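
For comparison, a native run with matching options might look roughly like the sketch below. This is not the exact command used here: it assumes the `bdsf` package is importable, the helper names `native_kwargs` and `run_native` are made up for illustration, and the output file name and format are assumptions. The option names mirror the recipe above.

```python
def native_kwargs(thresh_pix, thresh_isl):
    """Options mirroring the Stimela recipe, for bdsf.process_image."""
    return {
        "thresh_pix": float(thresh_pix),
        "thresh_isl": float(thresh_isl),
        "group_tol": 12.0,
        "adaptive_rms_box": True,
    }


def run_native(image, thresh_pix, thresh_isl):
    """Run PyBDSF directly (no container) and write a Gaussian list."""
    import bdsf  # imported lazily so native_kwargs works without PyBDSF

    img = bdsf.process_image(image, **native_kwargs(thresh_pix, thresh_isl))
    # Output name and format are assumptions; the recipe only fixes catalog_type.
    img.write_catalog(outfile="pybdsm_{}.gaul".format(image[:-5]),
                      catalog_type="gaul", format="ascii", clobber=True)
    return img
```

Keeping the options in one dictionary makes it easier to diff them against whatever defaults the Stimela cab applies.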

So I assume there's something different in the Stimela defaults compared to the normal PyBDSM defaults. But what? And if so, this is misleading and should be fixed.

@KshitijT
Collaborator

> Curiously though, it only fits 319 Gaussians.

Possibly some of the Gaussians were flagged during the fitting?

@o-smirnov
Collaborator Author

But why the different flagging behaviour inside/outside of Stimela then?

@KshitijT
Collaborator

> But why the different flagging behaviour inside/outside of Stimela then?

Actually, what's the number of Gaussians fitted in the other case? Is it significantly different? (I suspect version differences.)

@o-smirnov
Collaborator Author

No, sorry, the number of Gaussians is a red herring: I had my thresholds set backwards. When I set them consistently with the recipe above, I get 390 Gaussians in both cases. The native version still finishes in reasonable time.
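
A small guard in the recipe script could catch swapped thresholds early. The sketch below is hypothetical: `parse_thresholds` and the flag names are not part of Stimela or PyBDSF, and it assumes the usual convention that the pixel (peak) threshold is at least as high as the island threshold.

```python
import argparse


def parse_thresholds(argv):
    """Parse named thresholds and reject a pair that looks swapped."""
    parser = argparse.ArgumentParser(description="Source finder thresholds")
    parser.add_argument("image")
    parser.add_argument("--thresh-pix", type=float, required=True)
    parser.add_argument("--thresh-isl", type=float, required=True)
    args = parser.parse_args(argv)
    if args.thresh_pix < args.thresh_isl:
        # parser.error prints the message and exits with status 2
        parser.error("thresh_pix ({}) is below thresh_isl ({}); swapped?".format(
            args.thresh_pix, args.thresh_isl))
    return args
```

Named flags also avoid relying on the positional order of `sys.argv[2]` and `sys.argv[3]`.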

I suspect the Stimela version is super old: I can see the Stimela base is 1.2.0. I tried to update this to 1.6.0, but now the container fails with:

# ModuleNotFoundError: No module named 'bdsf'

I think this is because PyBDSM needs to be run with Python 2.7 still, while the new Stimela base uses Python 3 by default. There's a way to switch it to 2.7, but I need @SpheMakh to tell us how.
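
To confirm which interpreter the container actually runs and whether it can see the module, a quick check like the following could be dropped in next to run.py. It is only a diagnostic sketch: `can_import` is a made-up helper name, and the code is written to run under both Python 2.7 and 3.x.

```python
import sys


def can_import(name):
    """Return True if the module imports under the running interpreter."""
    try:
        __import__(name)
    except ImportError:  # ModuleNotFoundError subclasses ImportError on Python 3
        return False
    return True


if __name__ == "__main__":
    print("interpreter: {}".format(sys.executable))
    print("version: {}".format(sys.version.split()[0]))
    print("bdsf importable: {}".format(can_import("bdsf")))
```

If the import succeeds under one interpreter in the container but not the other, that would support the Python 2.7 vs 3 explanation.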

@o-smirnov changed the title from "PyBDSM takes forever (or hangs) inside Stimela container" to "(Old version?) PyBDSM takes forever (or hangs) inside Stimela container" on Sep 16, 2020
@o-smirnov
Collaborator Author

@SpheMakh bump. This is holding us up badly. Please either roll us a new PyBDSM container, or teach a man to fish, i.e. remind me how to make a Stimela container where run.py is forced to run in python 2.7 instead of 3.x.
