-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manage index settings in Vespa application package #921
base: mainline
Are you sure you want to change the base?
Conversation
1246404
to
4664309
Compare
5b444b8
to
88a14f5
Compare
88a14f5
to
3e280ff
Compare
logger.warn(f"Multiple schemas with name {name} found in services.xml, removing all") | ||
if should_deploy is None or should_deploy is True: | ||
prepare_response = self.vespa_client.prepare(prepare_url) | ||
# TODO handle prepare configChangeActions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This configChangeActions
field in the response from Vespa contains a list of actions (possibly empty) that must be performed in order to apply some config changes between the current active application and this next prepared application. hese actions are organized into three categories; restart, reindex, and refeed See https://docs.vespa.ai/en/reference/deploy-rest-api-v2.html#prepare-session for an example.
What should we do when this list is not empty?
- For requests modifying schemas (in the future), if we get any action in the list, should we reject the request?
- For bootstrap and rollback (in the future if they use deployment session after the Vespa bugfix), should we allow restart?
@farshidz, what's your thought on this?
services.xml file only has `node` config and empty `document-api` and `search` elements initially. Please | ||
note that any manual config change in container section will be overwritten. | ||
""" | ||
container_element = self._ensure_only_one('container') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our approach of fresh bootstrapping components config of container is based on following assumptions:
- There's only one container element in the services.xml file.
- All configs in the container element except the nodes section is fully controlled by Marqo.
These assumption might not be true in following cases:
- non-HA to HA migration in Marqo Cloud. A transiting state will have two container elements. In this case, we'd like to fail the bootstrapping process anyways to avoid entering any weird state. Also Marqo Cloud will make sure the upgrade of Marqo is blocked when processing migration
- OS customers could have multiple container clusters defined. (one use case is the separate the index and search traffic). Do we need to support this?
- OS customers could have other configuration in container section. should we keep those?
If we have to support the cases above, I'll need to change the fresh bootstrapping process to an incremental one, which will make the logic more complicated. @farshidz , what's your thought?
# Migrate existing index settings from previous versions of Marqo | ||
if not self.is_configured: | ||
if existing_index_settings is not None and len(existing_index_settings) > 0: | ||
logger.info(f'Importing existing index settings {[index.json for index in existing_index_settings]}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@farshidz, Do we have any security concerns of logging out the existing index settings?
self._store.save_file(content, 'validation-overrides.xml') | ||
|
||
def _validate_services_xml_for_rollback(self, services_xml_backup: str) -> None: | ||
services_xml_old = ServiceXml(services_xml_backup) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the rollback, if we detect the changes in 'content/documents' or any 'nodes' elements in services.xml, we will abort the rollback. This is to avoid rolling back to an inconsistent state. Should we capture any diffs of the services.xml file in the log for debugging purpose? @farshidz
@@ -0,0 +1,4 @@ | |||
<query-profile id="default"> | |||
<field name="maxHits">1000</field> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be 1,000,000?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used for testing. I used a different value to test if the bootstrapping process overrides this value
3e280ff
to
1867a84
Compare
8a4bddb
to
82900d4
Compare
4d1b53d
to
abdc84a
Compare
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
Reliability improvements
What is the current behavior? (You can also link to an open issue here)
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
No, there are no changes to the external APIs or customer-facing behaviour of Marqo
Have unit tests been run against this PR? (Has there also been any additional testing?)
Yes
Related Python client changes (link commit/PR here)
N/A
Related documentation changes (link commit/PR here)
N/A
Other information:
Please check if the PR fulfills these requirements