Error on Juicer Pre, exit status 57 #5

Open
carla-hazelf opened this issue Nov 29, 2023 · 3 comments


carla-hazelf commented Nov 29, 2023

Hello,

Thank you for developing this pipeline.

I ran the pipeline on a Linux HPC system using -profile conda, with the following command:

nextflow run WarrenLab/hic-scaffolding-nf \
    -profile conda --juicer-tools-jar /path/to/juicer-tools-jar.jar  \
    --extra-yahs-args "-e GATC"    \
    --contigs /path/to/fasta.fasta \
    --r1Reads path/to/hi-c/*_1.fq.gz \
    --r2Reads path/to/hi-c/*_2.fq.gz

I received the following error:

executor >  local (7)
[b2/63a1db] process > PRINT_VERSIONS     [100%] 1 of 1 ✔
[97/c491ea] process > SAMTOOLS_FAIDX (1) [100%] 1 of 1 ✔
[27/aa6e6b] process > CHROMAP_INDEX (1)  [100%] 1 of 1 ✔
[ca/a11554] process > CHROMAP_ALIGN (1)  [100%] 1 of 1 ✔
[ef/6e00dd] process > YAHS_SCAFFOLD (1)  [100%] 1 of 1 ✔
[fe/5e01fe] process > JUICER_PRE (1)     [100%] 1 of 1, failed: 1 ✘
[0a/392d9f] process > ASSEMBLY_STATS (1) [100%] 1 of 1 ✔
ERROR ~ Error executing process > 'JUICER_PRE (1)'

Caused by:
  Process `JUICER_PRE (1)` terminated with an error exit status (57)

Command executed:

  juicer pre -a -o out_JBAT         yahs.out.bin         yahs.out_scaffolds_final.agp         contigs.fa.fai
  
  asm_size=$(awk '{s+=$2} END{print s}' contigs.fa.fai)
  java -Xmx36G -jar /nfs/home/finnca/programmes/Juicebox-2.20.00/out/artifacts/juicer_tools_jar/juicer_tools.jar         pre out_JBAT.txt out_JBAT.hic <(echo "assembly ${asm_size}")

Command exit status:
  57

Command output:
  WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
  WARN [2023-11-28T16:55:07,678]  [Globals.java:138] [main]  Development mode is enabled
  Using 1 CPU thread(s) for primary task
  Using 10 CPU thread(s) for secondary task

Command error:
  [I::main_pre] make juicer pre input from BIN file yahs.out.bin
  [I::make_juicer_pre_file_from_bin] 0 read pairs processed
  [I::main_pre] genome size: 648299410
  [I::main_pre] scale factor: 1
  [I::main_pre] chromosome sizes for juicer_tools pre -
  PRE_C_SIZE: assembly 648299410
  [I::main_pre] JUICER_PRE CMD: java -Xmx36G -jar ${juicer_tools} pre out_JBAT.txt out_JBAT.hic <(echo "assembly 648299410")
  [I::main_pre] Version: 1.1
  [I::main_pre] CMD: juicer pre -a -o out_JBAT yahs.out.bin yahs.out_scaffolds_final.agp contigs.fa.fai
  [I::main_pre] Real time: 0.004 sec; CPU: 0.003 sec; Peak RSS: 0.001 GB
  WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
  WARN [2023-11-28T16:55:07,678]  [Globals.java:138] [main]  Development mode is enabled
  Using 1 CPU thread(s) for primary task
  Using 10 CPU thread(s) for secondary task
  out_JBAT.txt does not exist or does not contain any reads.

EDIT: I am new to Hi-C, and I did not prepare this data myself. Am I missing a preprocessing step that the Hi-C Illumina data needs, or am I misunderstanding the code? Thank you.

Member

esrice commented Nov 29, 2023

First off, this is unrelated to your issue, but if you're running it on a cluster, you'll need to set some additional options to tell nextflow to run the jobs on the cluster nodes instead of the head nodes. See here for more info: https://www.nextflow.io/docs/latest/executor.html
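As a minimal sketch of what those additional options can look like, the snippet below writes a small nextflow config file and shows how to pass it with -c. It assumes a SLURM cluster, which this thread does not confirm; the file name cluster.config and the queue name are hypothetical placeholders, so check the executor docs linked above for your scheduler.

```shell
# Write a small config file telling nextflow to submit jobs to the scheduler.
# ASSUMPTION: the cluster runs SLURM; 'normal' is a placeholder queue name.
cat > cluster.config <<'EOF'
// Hypothetical settings: adjust the executor and queue for your cluster.
process.executor = 'slurm'
process.queue = 'normal'
EOF

# Then add "-c cluster.config" to the same nextflow run command as before:
#   nextflow run WarrenLab/hic-scaffolding-nf -profile conda -c cluster.config [same arguments as before]
```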

Other than that, you appear to be running the pipeline correctly. The only step that failed is the one that makes a heatmap you can open in Juicebox. However, the log says that 0 read pairs were processed, so my guess is that the scaffolding didn't work either. So first, take a look at the assembly output and the stats related to it. Does it look like the scaffolding actually worked (e.g., is the N50 bigger after scaffolding than before)? If not, how many reads did you start out with vs. how many got aligned? You can look at all the intermediate files in the work directory. The first few characters of each step's directory are in the nextflow output: for example, the alignment step's work directory should start with work/ca/a11554, so you can look there for the bam files and any error messages that step generated (in .command.err).
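The checks above can be sketched with a few shell helpers. This is a rough sketch, not part of the pipeline: the function names are made up for illustration, and it assumes gzip-compressed FASTQ with four lines per read and a .fai index made by samtools faidx (sequence length in column 2).

```shell
# N50 from a .fai index: sort lengths descending, then report the length
# at which the running sum first reaches half of the total assembly size.
n50_from_fai() {
    sort -k2,2nr "$1" | awk -v total="$(awk '{s+=$2} END{print s}' "$1")" '
        { run += $2; if (run >= total / 2) { print $2; exit } }'
}

# Number of reads in one or more gzipped FASTQ files (4 lines per read).
count_fastq_reads() {
    gzip -cd "$@" | awk 'END { print NR / 4 }'
}

# List files and print .command.err for a task, given the hash prefix
# that nextflow prints (e.g. ca/a11554 for the alignment step).
show_task_logs() {
    for dir in work/"$1"*/; do
        [ -d "$dir" ] || continue
        echo "== $dir"
        ls -la "$dir"
        cat "$dir/.command.err"
    done
}
```

For example, run n50_from_fai on contigs.fa.fai and on an index of the scaffolded FASTA to compare before and after, and show_task_logs ca/a11554 to see what the alignment step wrote.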

Hope this helps!

@gargkritika

Hi
I am also getting the same error. Were you able to resolve this issue?

Any help is appreciated.

Best
Kritika

@carla-hazelf
Author

Hi,
Sorry for the delayed response. @esrice, thank you for your quick reply at the time, it's really appreciated.
@gargkritika personally, I was having issues with nextflow more generally, so I ended up doing it manually. I followed the description here: https://github.com/GenomicsAotearoa/High-quality-genomes/blob/main/Centrostephanus/Urchin_HiCScaffolding_V4.ipynb

In short, I mapped my reads to my assembly using bwa (-5SP is the option for Hi-C data). I then marked duplicates in the .sam file using samblaster, converted it to BAM, and filtered out secondary alignments and unmapped reads. After that I sorted by coordinates or read names, as needed for downstream Hi-C analyses.
Then it was a matter of following the yahs protocol: https://github.com/c-zhou/yahs
I'm no expert! But this worked for me.
Hope this helps. If you need this nextflow pipeline, try following the suggestion above and see if it works for you. I don't recall whether I got around to trying it.
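For anyone following the manual route, the mapping steps described above can be sketched roughly as below. This is an illustrative sketch, not the exact commands from the linked notebook: the function name, file arguments, thread count, the 0x104 filter mask (unmapped plus secondary) and the choice to sort by read name are all assumptions, so check the notebook and the yahs README before relying on it.

```shell
# Sketch only. Arguments: assembly FASTA, R1 FASTQ, R2 FASTQ, output BAM,
# and an optional thread count (defaults to 8).
# Steps: index the assembly, map with bwa mem -5SP, mark duplicates with
# samblaster, drop unmapped (0x4) and secondary (0x100) alignments,
# then sort by read name (an assumption; see the yahs README).
map_hic_reads() {
    asm=$1; r1=$2; r2=$3; out=$4; threads=${5:-8}
    bwa index "$asm"
    bwa mem -5SP -t "$threads" "$asm" "$r1" "$r2" \
        | samblaster \
        | samtools view -b -F 0x104 - \
        | samtools sort -n -@ "$threads" -o "$out" -
}

# Example call with placeholder file names, followed by scaffolding:
#   map_hic_reads contigs.fa hic_1.fq.gz hic_2.fq.gz hic_to_contigs.bam
#   samtools faidx contigs.fa
#   yahs -e GATC contigs.fa hic_to_contigs.bam
```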
