What is wrong?
Some atmos_ensstat jobs fail when running GEFS at C384 with at least 2 members. A log file from a failed atmos_ensstat task can be found on WCOSS2: /lfs/h2/emc/ptmp/eric.sinsky/GEFS/COMROOT/customexp/gw_pr2788/logs/2021090900/atmos_ensstat_f003.log. The error appears to be caused by a missing file in some mpmd tasks (there is one mpmd task for each product resolution).
A sample log file from a failed mpmd task can be found here: /lfs/h2/emc/stmp/eric.sinsky/RUNDIRS/gw_pr2788/gefs.2021090900/atmos_ensstat.219722/mpmd.1.out
A sample log file from a successful mpmd task can be found here: /lfs/h2/emc/stmp/eric.sinsky/RUNDIRS/gw_pr2788/gefs.2021090900/atmos_ensstat.219722/mpmd.0.out
What should have happened?
All atmos_ensstat jobs should succeed when running C384 GEFS.
What machines are impacted?
WCOSS2
Steps to reproduce
Set up a GEFS test case with at least 2 members and with FHOUT_HF_GFS set to 3.
Run the test case. When the atmos_ensstat jobs run, some of them will fail.
Additional information
Some (not all) atmos_ensstat jobs fail both with and without replay ICs. This issue does not seem to occur for C48 runs, which explains why this bug does not affect the GEFS CI tests. So far it has only been found to affect C384 GEFS runs. This issue has been occurring since the atmos_ensstat task was added to the global workflow. Investigation is ongoing, but the root cause has not been found yet.
Do you have a proposed solution?
No solution yet.
Update: This bug may be related to setting FHOUT_HF_GFS to 3 rather than to the model resolution. atmos_ensstat jobs at forecast hours that are divisible by 6 succeed.
I found that this issue originates in the atmos_prod task. In parm/config/gefs/config.atmos_products, FHOUT_PGBS is equal to FHOUT_GFS by default. If FHOUT_GFS is equal to 6, then FHOUT_PGBS will also equal 6, which means that supplemental gfs pgb files at 1.0 and 0.5 deg will not be generated for f003, f009, f015, etc. (when FHOUT_HF_GFS=3). atmos_ensstat depends on the pgrb files for 1.0 and 0.5 deg being generated at f003, f009, f015, etc. Without them, atmos_ensstat fails at those forecast hours.
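The mismatch described above can be illustrated with a small sketch. This is not workflow code: the function names, the maximum forecast hour of 18, and the simplified rule that supplemental files exist only at hours divisible by FHOUT_PGBS are assumptions made for illustration.

```python
# Illustrative sketch only; not taken from the global workflow.
# Assumption: atmos_ensstat runs at every model output hour, while the
# supplemental 1.0 and 0.5 deg files exist only at multiples of FHOUT_PGBS.

def output_hours(fhmax: int, fhout: int) -> list[int]:
    """Forecast hours from 0 to fhmax (inclusive) in steps of fhout."""
    return list(range(0, fhmax + 1, fhout))


def missing_supplemental_hours(fhmax: int, fhout_model: int, fhout_pgbs: int) -> list[int]:
    """Hours at which ensstat would run but no supplemental file would exist."""
    needed = output_hours(fhmax, fhout_model)
    return [fh for fh in needed if fh % fhout_pgbs != 0]


# Hypothetical values: FHOUT_HF_GFS=3, FHOUT_PGBS=FHOUT_GFS=6, 18-hour window.
missing = missing_supplemental_hours(fhmax=18, fhout_model=3, fhout_pgbs=6)
print([f"f{fh:03d}" for fh in missing])  # ['f003', 'f009', 'f015']
```

Under these assumptions, setting the supplemental interval equal to the model output interval (3) leaves no missing hours, which matches the observation that only hours not divisible by 6 fail.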