Profiling Example 2: MVP1 query: How could we be faster? #2373

Closed
edeutsch opened this issue Sep 11, 2024 · 2 comments

Comments

@edeutsch

This is the classic "drug treats disease" MVP1 query: what may treat Castleman disease?
https://arax.ci.transltr.io/?r=293768

Here's my analysis:

0.15s Launch, setup, and launch query to xDTD
3.9s xDTD has returned its results. Only now does Expand start sending queries to KPs
------- 0.1s since KP request: Automat-DrugCentral responds already!
------- 0.3s since KP request: MolePro responds already!
------- 0.6s since KP request: RTX-KG2 responds, nice!
----- <1.0s several other KPs are queried and respond with no edges, but do so in less than a second
------- 2.4s since KP request: Service Provider lumbers across the finish line panting heavily
------- 30.0s since KP request: knowledge-collaboratory does not respond in 30 seconds and request is abandoned
0.6s Add NGD edges to the graph
2.6s Remove general concepts from the knowledge graph
0.1s Resultify and Ranker and post processing
(did not record S3 storage but probably 0.3 seconds)

37.8s: Total processing time from receipt to begin storing Response
3.9s: Time spent obtaining xDTD results
30.0s: Time spent waiting for KPs to respond: MolePro and RTX-KG2 are sub-second; knowledge-collaboratory times out at 30s
3.9s: Other processing of data
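
As a quick check on the breakdown above, here is a minimal sketch that sums the three phases and computes each phase's share of the total. The numbers are copied from the summary; the variable names are just for illustration.

```python
# Phase timings in seconds, copied from the summary above.
phases = {
    "xDTD results": 3.9,
    "waiting for KPs": 30.0,
    "other processing": 3.9,
}

# Round to one decimal to match the precision of the measurements.
total = round(sum(phases.values()), 1)

# Percentage of the total wall time spent in each phase.
shares = {name: round(100 * secs / total, 1) for name, secs in phases.items()}

for name, secs in phases.items():
    print(f"{name:>16}: {secs:5.1f} s  ({shares[name]:4.1f}%)")
print(f"{'total':>16}: {total:5.1f} s")
```

The phases add up to the reported 37.8s, and waiting for KPs accounts for roughly 79% of it, almost all of that being the single 30s timeout.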

Two local processing steps appear to stand out:

Computing NGD edges. 0.6 seconds seems pretty reasonable, but could it be 0.06 seconds?

Removing general concepts: 2.6 seconds seems quite slow, our slowest general processing step by far. Could this be 0.26 seconds?

Conclusion: How could we be faster?

  1. We could time out our KPs faster. I think I overheard that Aragorn times out its KPs at 10s.
  2. It appears that we fetch xDTD results serially, not in parallel with the other KPs: queries to the traditional KPs only seem to begin after the xDTD code is complete. I wonder if we could treat xDTD a bit more like a regular KP and launch its "fetch code" while waiting for the other KPs. That may be tricky, but it would provide a speed boost, since xDTD is not the fastest source.
  3. We could remove general concepts faster. My sense is that this could be a lot faster, though I know nothing about what's actually happening here.
  4. We could cache the whole initial query. If this exact same query has been done before very recently, why do it again?
  5. We could cache KP queries/results. If we sent the exact same query to a KP very recently, why do it again?
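
Items 1 and 2 can be illustrated together. The following is a minimal sketch, not ARAX's actual Expand code: `query_source`, `query_with_timeout`, and `expand` are hypothetical names, the sources are simulated with `asyncio.sleep` rather than real HTTP calls, and the delays are the profile's numbers scaled down 100x (so the 10s timeout becomes 0.10s). The knowledge-collaboratory delay is an assumed value that is simply longer than the timeout, since its real response time was never observed.

```python
import asyncio
import time


async def query_source(name: str, delay_s: float) -> list[str]:
    """Hypothetical stand-in for a KP or xDTD call; answers after delay_s."""
    await asyncio.sleep(delay_s)
    return [f"{name}-edge"]


async def query_with_timeout(name: str, delay_s: float, timeout_s: float):
    """Return (name, edges), or (name, None) if the source exceeds timeout_s."""
    try:
        edges = await asyncio.wait_for(query_source(name, delay_s), timeout=timeout_s)
        return name, edges
    except asyncio.TimeoutError:
        return name, None


async def expand(sources: dict[str, float], timeout_s: float) -> dict:
    """Launch every source at once, xDTD included, each with its own timeout."""
    tasks = [query_with_timeout(n, d, timeout_s) for n, d in sources.items()]
    return dict(await asyncio.gather(*tasks))


# Simulated delays in seconds: profile values divided by 100.
# The knowledge-collaboratory value is assumed, not measured.
SOURCES = {
    "xDTD": 0.039,
    "Automat-DrugCentral": 0.001,
    "MolePro": 0.003,
    "RTX-KG2": 0.006,
    "Service Provider": 0.024,
    "knowledge-collaboratory": 0.30,
}

start = time.perf_counter()
results = asyncio.run(expand(SOURCES, timeout_s=0.10))
elapsed = time.perf_counter() - start

timed_out = sorted(name for name, edges in results.items() if edges is None)
print(f"timed out: {timed_out}, wall time: {elapsed:.2f}s")
```

Because all sources start together, the wall time is bounded by the timeout rather than by the xDTD time plus the slowest KP. Whether xDTD can really be launched this way depends on how its fetch code is structured, which this sketch does not model.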
@edeutsch edeutsch changed the title Profiling Example 2: MVP1 query Profiling Example 2: MVP1 query: How could we be faster? Sep 11, 2024
@isbluis

isbluis commented Sep 12, 2024

Linking comment from Profiling Example 1 issue, since it is also relevant:

@edeutsch

Closing this after spawning issue #2388
