You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, after upgrade querybook from 2.4.0 to 3.28.0 we faced with missing table list from our metastore. All databases were empty in UI, but I see that update_metastore job works fine in worker's logs.
I checked webserver logs and found such warning: [2023-10-13 Fri 19:30:02] - /opt/querybook/querybook/server/lib/elasticsearch/search_utils.py - WARNING "Got ElasticSearch exception: RequestError(400, 'search_phase_execution_exception', 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [name] in order to load field data by uninverting the inverted index. Note that this can use significant memory.')
After some investigation I added query to log https://github.com/pinterest/querybook/blob/master/querybook/server/lib/elasticsearch/search_utils.py#L64
and found that querybook use sorting function for 'name' field in search_tables_v1 index {'query': {'bool': {'must': [{'function_score': {'query': {'match_all': {}}, 'boost_mode': 'sum', 'script_score': {'script': {'source': "doc['importance_score'].value * 10 + (doc['golden'].value ? 50 : 0)"}}}}], 'filter': {'bool': {'must': [{'match': {'schema': 'abasharin'}}, {'match': {'metastore_id': 3}}]}}}}, 'size': 100, 'from': 0, '_source': ['id', 'schema', 'name'], 'sort': [{'name': {'order': 'asc'}}], 'highlight': {'pre_tags': ['<mark>'], 'post_tags': ['</mark>'], 'type': 'plain', 'fields': {'columns': {'fragment_size': 20, 'number_of_fragments': 5}, 'data_elements': {'fragment_size': 20, 'number_of_fragments': 5}, 'description': {'fragment_size': 60, 'number_of_fragments': 3}}}}"
I downloaded current index mapping: curl https://<es-server>/search_tables_v1/_mapping | jq > update_indeces.json
After that I added "fielddata": true, to name field and uploaded new mapping back: curl -XPUT https://<es-server>/search_tables_v1/_mapping -H "Content-Type: application/json" -d @update_indeces.json
In helps and solved this issue. update_indeces.json
I tested it with opensearch:2.9.0 and elasticsearch:7.16.2
Also I tried to add fielddata: true to querybook/config/elasticsearch.yaml and recreate indexes but looks like config parser ignores this key.
The text was updated successfully, but these errors were encountered:
Hi, after upgrade querybook from 2.4.0 to 3.28.0 we faced with missing table list from our metastore. All databases were empty in UI, but I see that update_metastore job works fine in worker's logs.
I checked webserver logs and found such warning:
[2023-10-13 Fri 19:30:02] - /opt/querybook/querybook/server/lib/elasticsearch/search_utils.py - WARNING "Got ElasticSearch exception: RequestError(400, 'search_phase_execution_exception', 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [name] in order to load field data by uninverting the inverted index. Note that this can use significant memory.')
After some investigation I added query to log https://github.com/pinterest/querybook/blob/master/querybook/server/lib/elasticsearch/search_utils.py#L64
and found that querybook use sorting function for 'name' field in search_tables_v1 index
{'query': {'bool': {'must': [{'function_score': {'query': {'match_all': {}}, 'boost_mode': 'sum', 'script_score': {'script': {'source': "doc['importance_score'].value * 10 + (doc['golden'].value ? 50 : 0)"}}}}], 'filter': {'bool': {'must': [{'match': {'schema': 'abasharin'}}, {'match': {'metastore_id': 3}}]}}}}, 'size': 100, 'from': 0, '_source': ['id', 'schema', 'name'], 'sort': [{'name': {'order': 'asc'}}], 'highlight': {'pre_tags': ['<mark>'], 'post_tags': ['</mark>'], 'type': 'plain', 'fields': {'columns': {'fragment_size': 20, 'number_of_fragments': 5}, 'data_elements': {'fragment_size': 20, 'number_of_fragments': 5}, 'description': {'fragment_size': 60, 'number_of_fragments': 3}}}}"
I downloaded current index mapping:
curl https://<es-server>/search_tables_v1/_mapping | jq > update_indeces.json
After that I added
"fielddata": true,
toname
field and uploaded new mapping back:curl -XPUT https://<es-server>/search_tables_v1/_mapping -H "Content-Type: application/json" -d @update_indeces.json
In helps and solved this issue.
update_indeces.json
I tested it with opensearch:2.9.0 and elasticsearch:7.16.2
Also I tried to add
fielddata: true
toquerybook/config/elasticsearch.yaml
and recreate indexes but looks like config parser ignores this key.The text was updated successfully, but these errors were encountered: