GHRSS/ffapipe

ffapipe: The GHRSS Survey FFA Pipeline


Contents

Citation
DM and Period Coverage
Directories
Scripts
Dependencies
Notes


This is the code used to analyse data obtained from the Giant Metrewave Radio Telescope (GMRT) as part of the GMRT High Resolution Southern Sky (GHRSS) survey, written from scratch by Ujjwal Panda in 2019. It is written in pure Python, but depends on the PRESTO package for some of its processing capabilities (such as dedispersion and folding). The pipeline uses riptide, an FFA implementation developed by Vincent Morello at the University of Manchester.

Citation

If you use this code, or the output it produces, in a scientific publication, please cite the following paper: The GMRT High Resolution Southern Sky Survey for Pulsars and Transients. III. Searching for Long-period Pulsars. The citation is also available in the CITATION.bib file, for direct use with BibTeX or BibLaTeX.

DM and Period Coverage

The pipeline searches dispersion measures (DMs) up to 500 pc cm⁻³, in steps of 0.1 pc cm⁻³ (GWB) or 0.2 pc cm⁻³ (GSB). This amounts to 5000 or 2500 DM trials, respectively.
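The trial counts follow directly from the DM range and step size; a quick sketch in plain Python (the function name is illustrative, not part of the pipeline):

```python
def dm_trial_count(max_dm, dm_step):
    """Number of DM trials for a search from 0 up to max_dm in steps of dm_step."""
    return round(max_dm / dm_step)

# GWB: finer DM steps, hence more trials.
print(dm_trial_count(500.0, 0.1))  # 5000
# GSB: coarser DM steps.
print(dm_trial_count(500.0, 0.2))  # 2500
```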

The pipeline searches 4 different period ranges, and the parameters of the FFA search are set accordingly. These are:

  • 0.1 to 0.5 seconds
  • 0.5 to 2.0 seconds
  • 2.0 to 10.0 seconds
  • 10.0 to 100.0 seconds
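Note that the four ranges tile the full 0.1–100 s search space with no gaps or overlaps; a small sanity check (plain Python, purely illustrative):

```python
# The four period ranges searched by the FFA, in seconds.
PERIOD_RANGES = [(0.1, 0.5), (0.5, 2.0), (2.0, 10.0), (10.0, 100.0)]

def covers_contiguously(ranges, lo, hi):
    """Check that the ranges tile [lo, hi] exactly, with no gaps or overlaps."""
    edges = sorted(ranges)
    if edges[0][0] != lo or edges[-1][1] != hi:
        return False
    # Each range must end exactly where the next one begins.
    return all(a[1] == b[0] for a, b in zip(edges, edges[1:]))

print(covers_contiguously(PERIOD_RANGES, 0.1, 100.0))  # True
```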

Directories

There are 4 main directories:

  1. configurations: This directory stores the configurations for the pipeline. ghrss_config.yaml contains the configuration for the main pipeline, while the ffa_config directory contains the configurations for the different parameter spaces searched by riptide: a manager_config.yaml that stores the overall configuration for riptide, and 4 separate files, one for each period space searched by the pipeline.

  2. sources: This directory stores the coordinates for all pointings that are observed as part of the GHRSS survey. These, along with the associated timestamp files, are used by the pipeline to construct the metadata for each raw file. There are two source lists: ghrss.1.list and ghrss.2.list.

  3. preprocessing: This directory stores configuration parameters for certain preprocessing tools used by the pipeline, such as GPTool. GPTool is primarily used for RFI mitigation (in both the frequency and time domains). It reads its configuration variables from the corresponding gptool.in file for each backend (see the note on backends), stored in the corresponding sub-directory here.

  4. src_scripts: This is where the main processing code resides. The primary purpose of this code is automation. It runs the necessary Python functions/scripts and shell commands for each step, and ensures that each process waits for the previous one to finish. It also uses a simple mechanism that allows the pipeline to restart from where it left off, in case of a crash.
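A restart mechanism like the one described can be as simple as recording each completed step on disk and skipping it on the next run. The sketch below shows the general idea only; the class, the step names, and the file format are all hypothetical, not the pipeline's actual code:

```python
import os
import tempfile

class Checkpoint:
    """Record completed steps in a plain text file so a crashed run can resume."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.exists(path):
            with open(path) as f:
                self.done = {line.strip() for line in f if line.strip()}

    def is_done(self, step):
        return step in self.done

    def mark_done(self, step):
        self.done.add(step)
        with open(self.path, "a") as f:
            f.write(step + "\n")

# Example: on a rerun, already-finished steps are skipped.
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.txt")
for run in range(2):
    ckpt = Checkpoint(ckpt_path)
    for step in ["rfi_mitigation", "dedispersion", "ffa_search", "folding"]:
        if ckpt.is_done(step):
            continue  # resume: skip work already recorded on disk
        # ... run the actual processing step here ...
        ckpt.mark_done(step)
print(sorted(ckpt.done))
```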

Scripts

Depending on how you want to run the pipeline, you can use either of two scripts:

  • The single_config.py script runs the pipeline on a single machine. If the machine has multiple cores, you can get a speedup by specifying the number of cores to use in the ghrss_config.yaml file.

  • The multi_config.py script. Originally, this script was intended to automate running the pipeline on multiple machines. However, I could not get a framework like paramiko to work at the time (for automating the login into each node and setting up the conda environment). This file is no different from single_config.py, except for the extra node argument.
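Settings like the core count above would be read from ghrss_config.yaml with a standard YAML loader. A minimal sketch, assuming PyYAML is available; the file name comes from the repository, but the keys shown are hypothetical placeholders, not the actual schema:

```python
import yaml  # PyYAML

# Stand-in for the contents of ghrss_config.yaml; every key below is
# a hypothetical placeholder, not the pipeline's real configuration schema.
sample_config = """
backend: GWB
dm_step: 0.1
num_cores: 4
"""

config = yaml.safe_load(sample_config)
print(config["backend"], config["dm_step"], config["num_cores"])
```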

The pipeline's run can be monitored using the the_monitor.py script. This uses the curses library to construct a simple terminal user interface, where you can see both the state of the current run of the pipeline, as well as the files it has already processed. It uses the logs produced by the pipeline, as well as file system information, to do this.
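A monitor like this boils down to parsing the run's logs into a progress summary. The sketch below assumes a made-up log format ("START <file>" / "DONE <file>"); it is a stand-in for the pipeline's actual logs, not a description of them:

```python
def summarise_progress(log_lines):
    """Return (files finished, file currently in progress, if any)."""
    finished, current = [], None
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that don't match the assumed format
        tag, name = parts
        if tag == "DONE":
            finished.append(name)
            if current == name:
                current = None
        elif tag == "START":
            current = name
    return finished, current

log = ["START scan1.raw", "DONE scan1.raw", "START scan2.raw"]
print(summarise_progress(log))  # (['scan1.raw'], 'scan2.raw')
```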

Dependencies

The pipeline relies on the following Python packages:

The best way to ensure that all these dependencies are present on your machine is to use a conda environment. Note that the the_monitor.py script relies on the curses package in the Python standard library, which in turn depends on the ncurses backend; as a result, that particular script may not run on a Windows system.

Additionally, this pipeline has the following non-Python dependencies:

There are also certain in-house scripts that this pipeline depends on for processes such as zero-DM filtering, filterbank file creation, and so on. I will try to add these scripts to this repository soon 😅. If you find a bug in the pipeline, or have any issues running it on your system, let me know in the issues 😁 👍 !

Notes

  1. There are two backends in use at GMRT: the GMRT Software Backend (GSB) and the GMRT Wideband Backend (GWB). As their names indicate, the former is narrowband, while the latter is wideband (installed as a part of the upgraded GMRT, a.k.a. uGMRT). The scripts in this repository work with data from both backends. The following table lists out some of the relevant parameters for each backend:

     | Backend | Bandwidth            | Sampling Time      | Center Frequency |
     |---------|----------------------|--------------------|------------------|
     | GSB     | 8, 16, or 32 MHz     | 61.44 microseconds | 336 MHz          |
     | GWB     | 100, 200, or 400 MHz | 81.92 microseconds | 400 MHz          |
  2. This pipeline uses an old version of riptide (v0.0.1, to be precise 😅). The code here may work with v0.0.2 and v0.0.3, but it will definitely not work with any of the newer versions (v0.1.0 and beyond), because of a massive refactor of riptide's entire codebase. A version of the pipeline that works with newer versions of riptide is in the works 🔨.
