Improve performance for VCFAnnotator #266

Open
korikuzma opened this issue Sep 19, 2023 · 9 comments
Labels
2.0-alpha Issues related to VRS 2.0-alpha branch performance Improvements to performance priority:high High priority Stale-exempt

Comments

@korikuzma
Contributor

Parallelize? If we want people to use the tool, we have to improve the performance.

@korikuzma korikuzma added 2.0-alpha Issues related to VRS 2.0-alpha branch priority:medium Medium priority performance Improvements to performance labels Sep 19, 2023
@wesleygoar
Contributor

I do think we should parallelize. I think that would be an easy win.
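As a rough illustration of what chunk-level parallelism could look like, here is a minimal sketch. It is not the VCFAnnotator implementation: `annotate_record` is a hypothetical stand-in for the per-variant work, and the chunk size and worker count are arbitrary assumptions. It only shows the general pattern of splitting the input into chunks, annotating the chunks in worker processes, and reassembling the results in the original order.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator, List, Type


def chunked(items: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def annotate_record(line: str) -> str:
    """Hypothetical placeholder for the per-variant work (NOT the vrs-python API)."""
    return line + ";ANNOTATED"


def annotate_chunk(chunk: List[str]) -> List[str]:
    """Annotate one chunk; this is the unit of work sent to each worker."""
    return [annotate_record(line) for line in chunk]


def annotate_parallel(
    lines: Iterable[str],
    workers: int = 4,
    chunk_size: int = 1000,
    executor_cls: Type[Executor] = ProcessPoolExecutor,
) -> List[str]:
    """Annotate `lines` in parallel, preserving input order.

    Executor.map returns results in submission order, so the output lines
    line up with the input lines even if chunks finish out of order.
    Pass ThreadPoolExecutor as `executor_cls` for quick local experiments.
    """
    out: List[str] = []
    with executor_cls(max_workers=workers) as pool:
        for annotated in pool.map(annotate_chunk, chunked(lines, chunk_size)):
            out.extend(annotated)
    return out
```

Processes rather than threads are the default here because per-variant work in pure Python is likely CPU-bound. Whether this pays off in practice depends on how expensive it is to set up shared resources (for example, sequence lookups) in each worker, which this sketch does not model.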

@wesleygoar
Contributor

I think we should def investigate ways to improve single threaded performance as well.

@korikuzma
Contributor Author

I can take a stab at it since people really want this.

@korikuzma korikuzma added priority:high High priority and removed priority:medium Medium priority labels Sep 20, 2023
@korikuzma korikuzma self-assigned this Sep 20, 2023
@korikuzma
Contributor Author

Goal: 10k/s


This issue was marked stale due to inactivity.

@github-actions github-actions bot added the Stale label Nov 20, 2023
@korikuzma korikuzma removed their assignment Dec 8, 2023
@quinnwai
Contributor

quinnwai commented Feb 22, 2024

Using v2.0.0a3 on a MacBook Pro (13-inch, 2017, 2.3 GHz Dual-Core Intel Core i5, 8 GB memory):

  • 13.09 s ± 1.76 s for 9,889 (~10,000) variants, so ~755 variants/s. Default params were used, so both ref and alt IDs are computed. Timing is averaged over 10 rounds.
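For anyone wanting to reproduce this kind of measurement, a minimal timing sketch follows. It assumes the workload is wrapped in a zero-argument callable (`run` is a hypothetical name), and it reports the sample mean and standard deviation over several rounds. The exact tool used for the numbers above is not stated, so this is only one plausible way to get a "mean ± spread" figure and a variants/s rate.

```python
import statistics
import time
from typing import Callable, Tuple


def time_rounds(run: Callable[[], None], rounds: int = 10) -> Tuple[float, float]:
    """Time `run` for `rounds` rounds; return (mean, sample std dev) in seconds.

    `rounds` must be at least 2, because the sample standard deviation
    is undefined for a single measurement.
    """
    durations = []
    for _ in range(rounds):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


def throughput(n_variants: int, seconds: float) -> float:
    """Variants processed per second for one run of `seconds` duration."""
    return n_variants / seconds
```

With the figures reported above, `throughput(9889, 13.09)` comes out to roughly 755 variants/s.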

@wesleygoar
Contributor

wesleygoar commented Feb 22, 2024

Ran the VCF annotator script on a MacBook Pro (16-inch, 2021, M1 Pro, 32 GB memory) using default params on the main vrs-python branch (commit 64fee4c).
File: VCF including all chr1 variants from the GIAB benchmarking file

  • 1,703,455 Variants
  • Average vars/sec over 5 runs = 6,062.07 variants/sec
  • This means we are annotating over 12,000 alleles/sec (ref and alt for each variant)
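The conversion between these figures can be sketched as below. The factor of two alleles per variant (one ref, one alt) is an assumption based on the default params. Multi-allelic records would raise it.

```python
def alleles_per_second(variants_per_second: float, alleles_per_variant: int = 2) -> float:
    """Convert a variants/sec rate to alleles/sec.

    Assumes each variant yields `alleles_per_variant` annotated alleles
    (2 = ref + alt, an assumption for the default params).
    """
    return variants_per_second * alleles_per_variant


def seconds_per_run(n_variants: int, variants_per_second: float) -> float:
    """Approximate wall-clock time for one pass over `n_variants`."""
    return n_variants / variants_per_second
```

With the numbers above, `alleles_per_second(6062.07)` is about 12,124 alleles/sec, and `seconds_per_run(1_703_455, 6062.07)` is about 281 s per run.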

@korikuzma
Contributor Author

korikuzma commented Feb 22, 2024

@quinnwai @wesleygoar Thanks. I think it would be beneficial to add the commit y'all used

@wesleygoar
Contributor

> @quinnwai @wesleygoar Thanks. I think it would be beneficial to add the commit y'all used as well as the command / params you used to test

@korikuzma I have updated my post to include the commit I was on during testing.
