drop hints support #125
Comments
I use them to add |
It doesn't seem like |
Indeed it's not a hint but it seems to be the only way to pass it into that query from Ecto. I don't think Ecto supports a settings clause? |
It doesn't have to go into the Ecto query, settings can be passed separately from the query. |
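A minimal sketch of what "passing it as a setting" could look like at the SQL level. This is an assumption on two counts: that the modifier under discussion is FINAL, and that the server is recent enough to support the `final` query setting. The `conversations` table and its columns are borrowed from the queries quoted later in this thread, and how the setting reaches the server from Ecto is left out here.

```sql
-- Assumption: the server supports the `final` query setting, which applies
-- the FINAL modifier to the tables in the query where it is applicable.
-- The table and column names are taken from the queries quoted in this thread.
SELECT
    conversation_id,
    disconnected_at,
    user_responded_at
FROM conversations
LIMIT 10
SETTINGS final = 1;
```

Written this way, the query text carries no table-level modifier, so nothing has to be squeezed through `hints`.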
I think So I would advise to not use/promote the usage of |
Some queries cannot be performed without FINAL but I think that is tangential to this issue. |
Really? Do you have an example? I am curious because I cannot think of one, maybe because I have never encountered such a situation.
When using an |
You can use aggregating functions - |
I don't use it like that, I merge several partial records into one final record, like this (simplified):
`id` FixedString(26),
`started_at` DateTime('UTC') MATERIALIZED ULIDStringToDateTime(id),
`disconnected_at` SimpleAggregateFunction(max, DateTime('UTC')),
`user_responded_at` SimpleAggregateFunction(max, DateTime('UTC')),
`properties` SimpleAggregateFunction(groupArrayArray, Array(String))
Additional properties (in the properties column) can be added to the record at any time, and the |
But can't you just do
select id, started_at, max(disconnected_at) as disconnected_at, max(user_responded_at) as user_responded_at, groupArrayArray(properties) as properties from <table> group by id, started_at
Maybe I am not really following. Still, I don't think there is any query which you can only run with FINAL and not as a GROUP BY + aggregation.
I don't know if I can, but why would I do it that way? That is a way longer query. Perhaps you are right and FINAL is not strictly speaking necessary but I don't see the advantage of writing a longer query. |
It depends on your use case. If you have a small dataset final should be fine. If you have a larger dataset final should be avoided. If you don't run into performance issues or don't care about speed then final is also fine. |
If I query with a group by, I get
and with the
With GROUP BY, ClickHouse will probably scan every row, while FINAL can be performed in a more efficient way. When I repeat the query it is even worse:
remains. With FINAL:
So this is over 70 times slower and it looks like it will slow down further with additional records.
Group by:
select conversation_id, max(disconnected_at) as disconnected_at, max(user_responded_at) as user_responded_at from conversations group by conversation_id limit 10 format Vertical
Final:
select conversation_id, disconnected_at, user_responded_at from conversations final limit 10 format Vertical
I mean, 0.584 isn't bad depending on the use case, but I also don't see the full query and table, so I cannot say whether the query without FINAL is written in a performant way (most likely it is not).
If you say so. I will stop derailing this issue, I think I've made a clear case for |
The case is not clear because I don't see the table definition or the queries you used that result in that huge performance difference.
Edit: I am not against the PR, I am just saying that in all cases you can get at least equal performance, and in most cases better performance, without the FINAL modifier.
It doesn't seem like ClickHouse supports hints. In Plausible, query.hints are used as a workaround to pass the SAMPLE clause.
Maybe it can be similar to the input/1 helper function: