Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Support for Experimental Codecs #13723

Open
sarthakaggarwal97 opened this issue May 17, 2024 · 17 comments · May be fixed by #13992
Open

[Feature Request] Support for Experimental Codecs #13723

sarthakaggarwal97 opened this issue May 17, 2024 · 17 comments · May be fixed by #13992
Labels
enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing

Comments

@sarthakaggarwal97
Copy link
Contributor

Is your feature request related to a problem? Please describe

With the introduction of new custom codecs, it gets essential to bake the new codecs before marking them ready for production. This helps us to avoid potential index corruption risks, and allows for the community feedback to flow in before we make the codecs as generally available.

Earlier, we used to have sandbox for codecs OpenSearch, with the introduction of the separate repository for custom codecs, it gets important to have a similar functionality available in the codecs plugin as well.

Describe the solution you'd like

We can extend the existing CodecSettings interface to know whether the codec is experimental upon index creation.

Related component

Indexing

Describe alternatives you've considered

No response

Additional context

Coming from opensearch-project/custom-codecs#148

@sarthakaggarwal97 sarthakaggarwal97 added enhancement Enhancement or improvement to existing feature or request untriaged labels May 17, 2024
@github-actions github-actions bot added the Indexing Indexing, Bulk Indexing and anything related to indexing label May 17, 2024
@andrross
Copy link
Member

andrross commented May 17, 2024

@sarthakaggarwal97 Could the custom-codecs plugin itself create an register a static setting like codecs.experimental.enabled and only create and add the new codec if that setting is explicitly set to true?

@sarthakaggarwal97
Copy link
Contributor Author

@andrross Do you mean a static cluster setting? It would be like a feature flag right. Please correct me if I am wrong.

We might have to utilize this interface so that we can mark codecs as experimental. Another benefit of an interface change is that it gives a way for other plugins as well to use it. We have plugins like Security Analytics, K-nn who also currently maintain their own codecs, and might have similar use cases in the future.

I feel it will be a combination of both the setting and a clean way to mark the new codecs as experimental. We can then have assertions in place to block the create index call.

@andrross
Copy link
Member

@sarthakaggarwal97 Yeah it would functionally work the same way as a feature flag, but just be scoped to the custom-codecs plugin in this case. I think in order to determine exactly how we implement this feature we'll want to clearly define the user experience. If we extend the CodecSettings interface, how do we implement the behavior? What's a system admin's experience for enabling and disabling experimental codecs?

@sarthakaggarwal97
Copy link
Contributor Author

@andrross since the validation to select the codec happens in EngineConfig, we might need to implement the feature flag within OpenSearch itself. CodecSettings interface would help us know whether the particular codec is experimental or not, while the feature flag would make experimental codecs available via CodecService.

Tagging @reta @mgodwan @backslasht for inputs.

@reta
Copy link
Collaborator

reta commented May 30, 2024

@andrross since the validation to select the codec happens in EngineConfig, we might need to implement the feature flag within OpenSearch itself. CodecSettings interface would help us know whether the particular codec is experimental or not, while the feature flag would make experimental codecs available via CodecService.

I would 💯 agree with @andrross that experimental codecs should be contained in custom-codecs plugin, including the experimental support. The CodecSettings is optional and we should not expect any codec to implement it (vs Apache Lucene's Codec which is mandatory).

@sarthakaggarwal97
Copy link
Contributor Author

sarthakaggarwal97 commented May 30, 2024

I guess we should be able to have the setting in custom-codecs. We might have to still expose the experimental via the CodecSettings interface. If we would not validate the experimental nature of codecs in the EngineConfig, and the codecs are not available in CodecService (because they are experimental), the shards would fail.

Apologies if I am missing something here, please correct me if I am wrong.

@reta
Copy link
Collaborator

reta commented May 30, 2024

Apologies if I am missing something here, please correct me if I am wrong.

I would encourage to not rely on CodecSettings anyhow (for this specific feature)

@andrross
Copy link
Member

If we would not validate the experimental nature of codecs in the EngineConfig, and the codecs are not available in CodecService (because they are experimental), the shards would fail.

@sarthakaggarwal97 Can you elaborate a bit on what you mean here? I get that there are sharp edges on the user experience (i.e. if you disable an experimental codec then any existing indexes using that codec will fail), but that is sort of expected for these opt-in experimental features until they are stabilized and fully supported. To me, the simplest option here is to just let custom-codecs choose whether to register experimental codecs at startup based on the existence of a static setting. What about that implementation doesn't work with the experience you have in mind?

@sarthakaggarwal97
Copy link
Contributor Author

@andrross in EngineConfig when the validation happens upon index creation, it loads the codec from NamedSPI itself. The codecs are registered with NamedSPI through resources.

Is it possible to load a different resource file based on an experimental settings?

@reta
Copy link
Collaborator

reta commented Jun 3, 2024

Is it possible to load a different resource file based on an experimental settings?

It is not possible, the service loader mechanism does not support that (afaik)

@sarthakaggarwal97
Copy link
Contributor Author

yeah, even I am not aware.
@andrross @reta what do you guys suggest in this case?

@reta
Copy link
Collaborator

reta commented Jun 4, 2024

@andrross @reta what do you guys suggest in this case?

To me, the most logical way to move forward is to contain the change in custom-codecs (opensearch-project/custom-codecs#148): it provides own CodecService (CustomCodecService) and can filter out experimental codecs (driven by plugin specific settings).

@sarthakaggarwal97
Copy link
Contributor Author

The challenge would be make the experimental codecs not available in the index.codec settings validations.

@reta
Copy link
Collaborator

reta commented Jun 4, 2024

The challenge would be make the experimental codecs not available in the index.codec settings validations.

Not sure I get it: If experimental codec is not enabled (in custom-codecs), any attempts to configure it (including validating settings) would fail (since this is managed by our own CodecSettings)

@sarthakaggarwal97
Copy link
Contributor Author

If the codecs are registered by NamedSPI, they would be available over here

@reta
Copy link
Collaborator

reta commented Jun 4, 2024

If the codecs are registered by NamedSPI, they would be available over here

Same answer as for CodecSettings, the CodecAliases is ours and any attempts to use it (including validating settings) would fail. We cannot do this over any out of the box Apache Lucene codecs but we could do that for any codec in custom-codecs plugin (some code changes may be need of cause)

@sarthakaggarwal97 sarthakaggarwal97 linked a pull request Jun 5, 2024 that will close this issue
9 tasks
@peternied
Copy link
Member

[Triage - attendees 1 2 3 4 5 6 7]
@sarthakaggarwal97 Thanks for creating this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants