Added Scroll API #3705
Conversation
That id is then meant to be sent to a scroll REST API.
Successive calls to this API will then return pages incrementally.
Scroll is limited to 10k items.
Airmail can go up to 20k.
Understood. This is configurable in Elasticsearch. For the moment we don't have any limit whatsoever.
The scrolled results should be consistent with the state of the original index.
For this reason we need to capture the state of the index at the point of the original request.
If a network error happens between the client and the server at page N, there is no way for the client to request the re-emission of page N.
Do we "get it right" in the Quickwit implementation and then "expose it wrong" in the ES API?
We do it right and expose it right, actually. People can retry with the scroll id that failed due to a network error, and it will work as you would expect in the case of Quickwit.
quickwit/quickwit-common/src/lib.rs
@@ -32,6 +32,7 @@ pub mod pubsub;
pub mod rand;
pub mod rendezvous_hasher;
pub mod runtimes;
pub mod shared_consts;
nit: I think quickwit-proto is better suited for this usage since it's becoming the crate where we expose our public and internal APIs.
Hmmm... I am not sure about that one. Maybe if we renamed it quickwit-public-api or something like that.
The trouble is that both quickwit-common and quickwit-proto pull in all kinds of stuff.
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct PutKvRequest {
    #[prost(bytes = "vec", tag = "1")]
    pub key: ::prost::alloc::vec::Vec<u8>,
nit: We should use Bytes instead of Vec<u8>.
why?
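For context on the nit: `bytes::Bytes` is reference-counted, so cloning is cheap and slicing shares the underlying buffer instead of copying, which matters for payloads passed between tasks. prost can generate `Bytes` fields via `prost-build` configuration. A sketch, assuming a standard `build.rs` setup (the proto paths are illustrative, not the ones in this PR):

```rust
// build.rs sketch (assumes the prost-build crate; paths are illustrative).
fn main() -> Result<(), Box<dyn std::error::Error>> {
    prost_build::Config::new()
        // Generate `bytes::Bytes` instead of `Vec<u8>` for all `bytes` fields.
        .bytes(["."])
        .compile_protos(&["proto/kv.proto"], &["proto/"])?;
    Ok(())
}
```

With this configuration, the generated `PutKvRequest.key` field would be a `bytes::Bytes` rather than a `Vec<u8>`.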
    .count()
    .await;

if successful_replication == 0 {
nit: I'd display a warning if any attempt failed and an error if all of them failed. It would also be nice to display the number of failures and which nodes failed. Given the urgency of this PR, feel free to ignore this comment.
Excellent point.
I went for a warn whenever an attempt failed, and an error if all failed and there was at least one node.
quickwit/quickwit-search/src/root.rs
}

#[instrument(skip(search_request, cluster_client))]
async fn search_partial_hits_round_with_scroll(
Suggested change:
- async fn search_partial_hits_round_with_scroll(
+ async fn search_partial_hits_phase_with_scroll(
It's common to hear about two-phase search or execution for search engines like ES or ours.
ah yes!
    if let Ok(Some(search_after_resp)) = client.get_kv(get_request.clone()).await {
        return Some(search_after_resp);
    }
}
I'd also display a warning here.
I am not fond of a warn here. It would trigger if you add a node, for instance.
I made it an info.
See design details in docs/internals/scroll.md. Closes #3551
/// Maximum capacity of the search after cache.
///
/// For the moment this value is hardcoded.
/// TODO make configurable.
Can you open an issue on this?
Also, I don't understand the comment "/// Assuming a search context of 1MB, this can amount to up to 1GB."
Are you talking about the scroll context? If not, what is the search context?
And another question: where does the 1000 factor come from (1MB -> 1GB)?
For the Airmail project, the scroll API will be used for exporting data, up to 20k hits. Does this mean we need 20GB of RAM on a searcher?
Solution described in docs/internals/scroll.md.
Tests:
- Unit tests for the simplistic KV store.
- Unit test for the scroll API.
- REST API test comparing output with Elasticsearch.
Closes #3551