Error: I/O failed while merging #68

Open
lingjoyo opened this issue Feb 22, 2024 · 8 comments

Comments

@lingjoyo

lingjoyo commented Feb 22, 2024

Hi everyone

minimac4 runs well for some chromosomes (chr1 to chr10), but from chr11 onward it reports an error in the merging step:

Writing temp files took 47 seconds
Merging temp files ...
Error: I/O failed while merging
Error: failed merging temp files

Here is my code:
chr=11
minimac4 \
1000g_phase3_v5.chr${chr}.with_parameter_estimates.msav \
1.2_preinputation_check/qc_3rd-updated-chr${chr}.vcf.gz \
--min-ratio 1e-6 \
--threads 10 \
-o c${chr}.imputed.vcf.gz

Has anyone run into the same problem?

@jonathonl
Contributor

Which version are you running (minimac4 --version)?

Is it possible that you are running out of disk space to store the output files?
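To rule out disk space, it can help to check the filesystem that actually receives the temp files, since it may not be the one holding the output. The following is a minimal sketch, not something posted in this thread. It assumes the temp files default to a location under $TMPDIR (falling back to /tmp), and `out_dir` is a placeholder for wherever the imputed VCF is written.

```shell
# Print the free space (in KiB) on the filesystem that holds a directory.
free_kib() {
    # -P forces one line per filesystem; -k reports 1024-byte blocks.
    # Line 2, column 4 is the "Available" figure.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

out_dir="${out_dir:-.}"      # placeholder: directory for the imputed VCF
tmp_dir="${TMPDIR:-/tmp}"    # assumption: default location for temp files

echo "output dir ${out_dir}: $(free_kib "$out_dir") KiB free"
echo "temp dir   ${tmp_dir}: $(free_kib "$tmp_dir") KiB free"
```

If the two numbers differ a lot, the temp location may be a small node-local disk even when the output filesystem has plenty of room.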

@lingjoyo
Author

lingjoyo commented Feb 22, 2024

Hi Jonathonl,
Thanks for your reply.

It's minimac v4.1.6.
The computing resources are:

storage 16T free space
cpu 32
memory 370G

It shouldn't be a disk space problem. So far, it has run well on chr1 and chr2.

@jonathonl
Contributor

Can you provide the full log output?

@lingjoyo
Author

Here is the log:

minimac v4.1.6

Imputing 11:1-20000000 ...
Loading target haplotypes ...
Loading target haplotypes took 1 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 2 seconds
Typed sites to imputed sites ratio: 0.00066783 (246/368357)
4426 variants are exclusive to target file and will be excluded from output
Running HMM with 1 threads ...
Completed 200 of 1401 samples
Completed 400 of 1401 samples
Completed 600 of 1401 samples
Completed 800 of 1401 samples
Completed 1000 of 1401 samples
Completed 1200 of 1401 samples
Completed 1400 of 1401 samples
Completed 1401 of 1401 samples
Running HMM took 392 seconds

Writing temp files took 49 seconds
Merging temp files ...
Error: I/O failed while merging
Error: failed merging temp files

@jonathonl
Contributor

I would try running with --temp-prefix c${chr}.tmp_ so that the temp files are written to the same directory as your output file.
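One way to keep the temp prefix next to the output, wherever the output goes, is to derive it from the output path. This is a small sketch under that assumption; `temp_prefix_for` is a hypothetical helper name, and the `c<chr>.tmp_` naming simply follows the suggestion above.

```shell
# Build a temp-file prefix in the same directory as the output VCF.
# $1 = output VCF path, $2 = chromosome label
temp_prefix_for() {
    printf '%s/c%s.tmp_\n' "$(dirname "$1")" "$2"
}

chr=11
out_vcf="c${chr}.imputed.vcf.gz"

# A bare filename has dirname ".", so this prints ./c11.tmp_
temp_prefix_for "$out_vcf" "$chr"
```

The result can then be passed as --temp-prefix "$(temp_prefix_for "$out_vcf" "$chr")", so the prefix follows the output path when it changes from a relative to an absolute one.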

@lingjoyo
Author

lingjoyo commented Feb 23, 2024

It works well if I put everything into one folder:

./minimac4 1000g_phase3_v5.chr22.with_parameter_estimates.msav \
qc_3rd-updated-chr22.vcf.gz \
-o c22.imputed.vcf.gz \
--min-r2 0.3 --min-ratio 1e-6 \
--temp-prefix c22.tmp_ 

But it reports the merging error if I give absolute paths for all inputs and outputs:

${minimac4} \
${g1k_p3}1000g_phase3_v5.chr${chr}.with_parameter_estimates.msav \
${wkdir}/1.2_preinputation_check/qc_3rd-updated-chr${chr}.vcf.gz \
-o ${wkdir}1.3_imputaion_minimac4_g1kp3/c${chr}.imputed.vcf.gz \
--min-r2 0.3 --min-ratio 1e-6 \
--temp-prefix c${chr}.tmp_  

Here is the log:

Imputing 22:1-20000000 ...
Loading target haplotypes ...
Loading target haplotypes took 0 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 1 seconds
Typed sites to imputed sites ratio: 1.53001e-05 (1/65359)
691 variants are exclusive to target file and will be excluded from output
Running HMM with 1 threads ...
Completed 200 of 1401 samples
Completed 400 of 1401 samples
Completed 600 of 1401 samples
Completed 800 of 1401 samples
Completed 1000 of 1401 samples
Completed 1200 of 1401 samples
Completed 1400 of 1401 samples
Completed 1401 of 1401 samples
Running HMM took 32 seconds
Writing temp files took 3 seconds
Merging temp files ...
Error: I/O failed while merging
Error: failed merging temp files

So the problem seems to be that minimac4 couldn't find the temp files. When I set --temp-prefix ${wkdir}/1.2_preinputation_check/c${chr}.tmp_, it reported:

minimac v4.1.6

Imputing 22:1-20000000 ...
Loading target haplotypes ...
Loading target haplotypes took 0 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 1 seconds
Typed sites to imputed sites ratio: 1.53001e-05 (1/65359)
691 variants are exclusive to target file and will be excluded from output
Running HMM with 1 threads ...
Error: could not open temp file (/full-path-to/1.3_imputaion_minimac4_g1kp3/c22.tmp_0_XXXXXX)
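One thing worth checking in the absolute-path command above: the input uses ${wkdir}/1.2_... with a slash, while the output uses ${wkdir}1.3_... without one. Whether that matters depends on whether ${wkdir} already ends in a slash, which the thread does not show, so this is only a possible cause. The sketch below uses a hypothetical `join_path` helper and a placeholder value for `wkdir` to show the difference.

```shell
# Join a base directory and a relative path with exactly one slash,
# whether or not the base already ends in one.
join_path() {
    printf '%s/%s\n' "${1%/}" "$2"
}

wkdir="/full-path-to"    # placeholder value with no trailing slash

# Plain concatenation silently produces a different directory name:
echo "${wkdir}1.3_imputaion_minimac4_g1kp3/c22.imputed.vcf.gz"
# → /full-path-to1.3_imputaion_minimac4_g1kp3/c22.imputed.vcf.gz

join_path "$wkdir" "1.3_imputaion_minimac4_g1kp3/c22.imputed.vcf.gz"
# → /full-path-to/1.3_imputaion_minimac4_g1kp3/c22.imputed.vcf.gz
```

Echoing the fully expanded output and temp paths in the job script before the minimac4 call makes this kind of mismatch easy to spot in the log.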

@lingjoyo
Author

lingjoyo commented Feb 23, 2024

I guess the problem is in how the temp files are set up. What's the right way to set --temp-prefix if I want to submit the job with sbatch?

@jonathonl
Contributor

Relative vs absolute paths shouldn't matter. I'm guessing that the output paths are invalid or unreachable from the compute node. Are you creating the full directory paths before running minimac4 (i.e., does the /full-path-to/1.3_imputaion_minimac4_g1kp3/ directory already exist)? I would add tests to your batch script before the minimac4 command to check that you can create new files in the directory where you are writing output files. This would look something like:

set -e
out_vcf="${wkdir}/1.3_imputaion_minimac4_g1kp3/c${chr}.imputed.vcf.gz"
# With set -e, the script stops here if the output file cannot be created.
touch "$out_vcf"
minimac4 "${g1k_p3}1000g_phase3_v5.chr${chr}.with_parameter_estimates.msav" \
"${wkdir}/1.2_preinputation_check/qc_3rd-updated-chr${chr}.vcf.gz" \
--min-r2 0.3 --min-ratio 1e-6 \
-o "$out_vcf"

Note: you don't need to use absolute paths in Slurm as long as the directory you call sbatch from is accessible from the compute node.
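The same check can be extended to the temp-file directory and made to say which test failed. The following is a rough sketch rather than a recommended recipe; `check_writable_dir` is a hypothetical helper, and the probe file name is arbitrary.

```shell
# Return 0 if the directory exists and a file can be created in it;
# otherwise print the reason to stderr and return 1.
check_writable_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi
    probe="$dir/.write_test.$$"
    if ! touch "$probe" 2>/dev/null; then
        echo "cannot create files in: $dir" >&2
        return 1
    fi
    # Remove the probe so no stray file is left behind.
    rm -f "$probe"
}
```

In a batch script this could run before minimac4 for both the output directory and the directory part of the --temp-prefix value, for example check_writable_dir "$(dirname "$out_vcf")" || exit 1. If the directory is simply missing, running mkdir -p on it first is usually enough.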


2 participants