Add -outputRerankerRequests option to create input for RankLLM #2463
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #2463 +/- ##
============================================
- Coverage 66.80% 66.66% -0.14%
  Complexity 1418 1418
============================================
  Files 213 213
  Lines 12206 12244 +38
  Branches 1488 1494 +6
============================================
+ Hits 8154 8163 +9
- Misses 3538 3563 +25
- Partials 514 518 +4
Let's get this merged in!
As discussed in RankLLM, we can create an intermediate JSONL file with this option.
Example usage:
java -cp `ls target/*-fatjar.jar` io.anserini.search.SearchCollection -index msmarco-v2.1-doc-segmented -topics trec2021-dl -output runs/run.msmarco-v2.1-doc-segmented.dl21.txt -outputRerankerRequests runs/retrieve_results_msmarco-v2.1-doc-segmented-dl21_top20.jsonl -bm25 -hits 20 -threads 16 -format trec
The reranker requests are written only as JSONL, since RankLLM supports that format.
The results for a single query can then be inspected with:
head -1 runs/retrieve_results_msmarco-v2.1-doc-segmented-dl21_top20.jsonl | jq .
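For anyone without jq installed, the same check can be sketched in Python. This is only a rough equivalent, not part of this PR: the helper names `read_first_record` and `format_record` are made up for illustration, and nothing is assumed about the fields inside each record beyond one JSON value per line.

```python
import json
from pathlib import Path
from typing import Any, Optional


def read_first_record(path: str) -> Optional[Any]:
    """Parse and return the first non-blank line of a JSONL file.

    Returns None if the file contains no records. Unlike `head -1`,
    blank lines at the top of the file are skipped.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                return json.loads(line)
    return None


def format_record(record: Any) -> str:
    """Pretty-print a parsed record with two-space indentation, similar to `jq .`."""
    return json.dumps(record, indent=2, ensure_ascii=False)
```

Calling `print(format_record(read_first_record("runs/retrieve_results_msmarco-v2.1-doc-segmented-dl21_top20.jsonl")))` should then show the request for the first query, assuming the file was produced by the command above.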