Parallelization across regions #279

Open
cjw85 opened this issue Mar 24, 2021 · 1 comment

Comments

@cjw85
Member

cjw85 commented Mar 24, 2021

Thank you. Would you consider adding that as a feature to have it detect a set number of regions or chromosomes and parallelize itself? Also it does not seem like medaka_haploid_variant is able to take --regions as an input. How should I run it independently on each chromosome?

Originally posted by @jpn2021 in #263 (comment)

@cjw85
Member Author

cjw85 commented Mar 24, 2021

@jpn2021 @Kirk3gaard

It looks like an oversight that medaka_haploid_variant doesn't take a --regions argument like other programs. We can look at adding that.

More generally, the medaka programs don't implement parallelization across chromosomes/regions for two reasons:
a) most tasks are trivially parallelizable, so the programs can simply be run multiple times, once per region;
b) handling hardware resources is subtle, e.g. parallelization in a CPU-only setting requires a different strategy from a single- or multi-GPU setting.

Since medaka is fundamentally a piece of algorithm research, implementing some of these niceties takes a back seat to investigating new methods. We endeavour to stick to the Unix philosophy of creating composable tools that each do one job, so that users can combine them flexibly in a manner that suits their situation.
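
As a rough illustration of point (a), here is a minimal sketch of running one process per region from the outside. It is not part of medaka: the helper names `build_consensus_cmd` and `run_regions` are hypothetical, and the command line assumes a `medaka consensus` invocation that accepts `--regions` and `--threads`, so check `medaka consensus --help` for your installed version. As noted above, `medaka_haploid_variant` does not take `--regions` at the time of writing, so this does not apply to it directly.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_consensus_cmd(bam, region, out_dir, threads=2):
    """Return the argv for one per-region call (assumed CLI shape)."""
    return [
        "medaka", "consensus", bam, f"{out_dir}/{region}.hdf",
        "--regions", region,        # assumption: flag accepted by this program
        "--threads", str(threads),  # assumption: per-process thread count
    ]


def run_regions(bam, regions, out_dir, jobs=4, runner=subprocess.run):
    """Run one process per region, with at most `jobs` running at once."""
    cmds = [build_consensus_cmd(bam, region, out_dir) for region in regions]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # check=True makes a failing region raise rather than pass silently.
        return list(pool.map(lambda cmd: runner(cmd, check=True), cmds))
```

A call such as `run_regions("calls_to_draft.bam", ["chr1", "chr2"], "results")` would then launch the per-region jobs (the file and region names here are placeholders, and `results` must already exist). Threads are enough on the Python side because the real work happens in the child processes. On a GPU machine the `jobs` value and device assignment would need more care, which is the subtlety point (b) refers to.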
