The `tools/mark_duplicates_and_sort.wdl` task is a bottleneck, especially for WGS. It's expensive, partly because it is long-running and therefore prone to preemption. Do some local testing on the cluster to explore options for optimizing it:
- Right now the sort and the markdup each get 8 cores. Is that the optimal ratio? If one step is faster than the other, its cores sit idle and cycles are wasted. (See the pipeline sketch after this list for the knobs involved.)
- If we increase the overall core count, how does that affect runtime? Do we saturate I/O, and is that different between HDD and SSD? (See the benchmarking sweep below.)
- Can we avoid localizing the input files entirely, which should save an hour or so? (See the streaming sketch below.)
- Would giving more RAM to the sort step let it spill fewer temp files to disk and speed things up? (That's the per-thread sort memory knob in the first sketch.)
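A minimal sketch of the kind of pipeline this task likely wraps, assuming a samtools-based implementation (the real WDL may use Picard or sambamba instead; `mds_pipeline.sh` and all parameter defaults here are hypothetical). The point is that the sort/markdup thread split and the per-thread sort memory are independent knobs, so both the core-ratio question and the RAM question can be swept against the same script:

```bash
#!/usr/bin/env bash
# mds_pipeline.sh -- hypothetical stand-in for the command block in
# tools/mark_duplicates_and_sort.wdl, assuming samtools is the engine.
set -euo pipefail

IN_BAM=$1            # input BAM, grouped by query name (aligner output order)
OUT_BAM=$2
SORT_THREADS=${3:-8} # cores given to the coordinate sort
MD_THREADS=${4:-8}   # cores given to fixmate/markdup
SORT_MEM=${5:-768M}  # memory *per sort thread*; raising this means fewer
                     # temp-file spills to disk during the sort

# fixmate needs name-collated input and adds the mate-score tags that
# markdup uses; markdup needs coordinate-sorted input.
samtools fixmate -m -@ "$MD_THREADS" "$IN_BAM" - \
  | samtools sort -@ "$SORT_THREADS" -m "$SORT_MEM" \
      -T "${TMPDIR:-/tmp}/sort" - \
  | samtools markdup -@ "$MD_THREADS" - "$OUT_BAM"
```

Note that total sort memory is `SORT_THREADS × SORT_MEM`, so the thread-ratio and RAM experiments interact and are worth sweeping together.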
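For the core-scaling and I/O-saturation question, a rough local sweep along these lines (assuming the script above, on a Linux node with GNU time and sysstat installed) would sample disk utilization while timing each thread split; rerunning it with `TMPDIR` pointed at an HDD-backed and then an SSD-backed scratch directory gives the HDD/SSD comparison:

```bash
#!/usr/bin/env bash
set -euo pipefail

IN_BAM=$1
# Candidate "sort markdup" thread splits at a fixed 16-core budget.
for split in "4 12" "8 8" "12 4"; do
  read -r sort_t md_t <<< "$split"

  # Sample extended device stats every 5s for the duration of the run,
  # so disk saturation (high %util) is visible in the log afterwards.
  iostat -x 5 > "iostat_${sort_t}_${md_t}.log" &
  iostat_pid=$!

  # GNU time -v also reports max RSS, which helps size the VM request.
  /usr/bin/time -v ./mds_pipeline.sh "$IN_BAM" /dev/null \
      "$sort_t" "$md_t" 2> "time_${sort_t}_${md_t}.log"

  kill "$iostat_pid"
done
```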
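On skipping localization: if the inputs live in GCS and the container's htslib is built with libcurl/GCS support, samtools can stream a `gs://` path directly instead of copying it to local disk first. A sketch, where `gs://bucket/sample.bam` is a placeholder and the auth step assumes `gcloud` is available on the node:

```bash
# Token for htslib's GCS reader; assumes application-default credentials
# are configured on the machine or VM.
export GCS_OAUTH_TOKEN=$(gcloud auth application-default print-access-token)

# Stream the remote BAM straight into the pipeline -- no localization step.
samtools fixmate -m -@ 8 gs://bucket/sample.bam - \
  | samtools sort -@ 8 -m 768M -T "${TMPDIR:-/tmp}/sort" - \
  | samtools markdup -@ 8 - marked.bam
```

If this runs under Cromwell on the Google backend, I believe marking the input with `localization_optional: true` in the task's `parameter_meta` is the supported way to hand the `gs://` path through to the command untouched, but that's worth verifying against the Cromwell docs for our version.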