Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[jaeger-v2] Align Cassandra Storage Config With OTEL #5928

Closed
6 tasks done
mahadzaryab1 opened this issue Sep 2, 2024 · 11 comments · Fixed by #5949
Closed
6 tasks done

[jaeger-v2] Align Cassandra Storage Config With OTEL #5928

mahadzaryab1 opened this issue Sep 2, 2024 · 11 comments · Fixed by #5949
Labels
area/storage changelog:new-feature Change that should be called out as new feature in CHANGELOG enhancement storage/cassandra v2

Comments

@mahadzaryab1
Copy link
Contributor

mahadzaryab1 commented Sep 2, 2024

Requirement

In this issue, we want to align the configuration for cassandra storage with OTEL (part of #5229)

Tasks / outcomes

  • All config fields are tagged with mapstructure such that they can be addressable from YAML via "good" names
  • Configs implement Validate()
  • Configs via YAML support the same defaults as in jaeger-v1
  • Configs reuse standard elements of OTEL configs whenever possible, e.g. TLS, clientcfg, etc.
  • Configuration looks "logical" and properly grouped, not flattened with long weird names
  • We create a migration guide document with a table that lists v1 CLI flags and the corresponding v2 config fields
@dosubot dosubot bot added area/storage changelog:new-feature Change that should be called out as new feature in CHANGELOG storage/cassandra v2 labels Sep 2, 2024
@mahadzaryab1
Copy link
Contributor Author

@yurishkuro I had a question regarding the reuse of standard OTEL configs. For https://github.com/mahadzaryab1/jaeger/blob/main/pkg/cassandra/config/config.go#L37, should the type of this field be configtls.ClientConfig from go.opentelemetry.io/collector/config/configtls?

@yurishkuro
Copy link
Member

Yes

@mahadzaryab1
Copy link
Contributor Author

hey @yurishkuro! i had a couple of questions about the TLS configs:

  1. I don't see an equivalent for https://github.com/jaegertracing/jaeger/blob/main/pkg/config/tlscfg/options.go#L160-L165 in OTEL's configtls. Can we simply just remove these calls to Stop now?
  2. How should we handle methods like https://github.com/jaegertracing/jaeger/blob/main/pkg/config/tlscfg/flags.go#L66? Do we want to make new methods that init from viper into the OTEL TLS type? Should this still live in the tlscfg package?

@yurishkuro
Copy link
Member

yurishkuro commented Sep 6, 2024

  1. The reason they don't have it is because OTEL uses different ways of watching for file changes. Jaeger actually has a watcher (integrated with events from the file system) that notifies Jaeger immediately when the file is changed, but that comes at the cost of running a separate goroutine for receiving the events, which is why we need the Close function to clean it up. OTEL uses a simpler approach - they have a fixed timeout and every time they read TLS certs (e.g. when a new connection is established) they check if the timeout has expired, and if so they explicitly check if the file was changed and if so they reload the certs. Their approach does not require a goroutine, but it does introduce an additional one-off latency in the request processing.
  2. Our tlscfg struct is mostly compatible with OTEL's TLS, so there are different ways we can handle the translation. The simplest is we just copy tlscfg into otel TLS as needed.

@mahadzaryab1
Copy link
Contributor Author

mahadzaryab1 commented Sep 6, 2024

@yurishkuro whats the reason that we have tags withmapstructure:"-" (ex. https://github.com/jaegertracing/jaeger/blob/main/pkg/cassandra/config/config.go#L36C2-L36C22?). Are these fields just temporarily disabled to be accepted via the config files?

@mahadzaryab1
Copy link
Contributor Author

@yurishkuro here's my proposal for the groupings of the cassandra v2 configs. Do you have any feedback? Let me know if you prefer to review it on the PR instead.

type Configuration struct {
	Connection    Connection    `mapstructure:"connection"`
	Data          Data          `mapstructure:"data"`
	Timeout       Timeout       `mapstructure:"timeout"`
	ProtoVersion  int           `mapstructure:"proto_version"`
	Authenticator Authenticator `mapstructure:"auth"`
}

type Connection struct {
	Servers              []string               `valid:"required,url" mapstructure:"servers"`
	LocalDC              string                 `mapstructure:"local_dc"`
	Port                 int                    `mapstructure:"port"`
	DisableAutoDiscovery bool                   `mapstructure:"-"`
	ConnectionsPerHost   int                    `mapstructure:"connections_per_host"`
	ReconnectInterval    time.Duration          `mapstructure:"reconnect_interval"`
	SocketKeepAlive      time.Duration          `mapstructure:"socket_keep_alive"`
	MaxRetryAttempts     int                    `mapstructure:"max_retry_attempts"`
	TLS                  configtls.ClientConfig `mapstructure:"tls"`
}

type Data struct {
	Keyspace           string `mapstructure:"keyspace"`
	DisableCompression bool   `mapstructure:"disable_compression"`
	Consistency        string `mapstructure:"consistency"`
}

type Timeout struct {
	Query   time.Duration `mapstructure:"-"`
	Connect time.Duration `mapstructure:"connection_timeout"`
}

type Authenticator struct {
	Basic BasicAuthenticator `yaml:"basic" mapstructure:",squash"`
}

type BasicAuthenticator struct {
	Username              string   `yaml:"username" mapstructure:"username"`
	Password              string   `yaml:"password" mapstructure:"password" json:"-"`
	AllowedAuthenticators []string `yaml:"allowed_authenticators" mapstructure:"allowed_authenticators"`
}

@yurishkuro
Copy link
Member

whats the reason that we have tags with mapstructure:"-"

No reason / laziness. v1 did not make any use of mapstructure tags, but v2 does, so we need to define them. I don't see why DisableAutoDiscovery would not be needed.

here's my proposal for the groupings of the cassandra v2 config

Overall looks good to me, but a few thoughts:

  • what specific benefits do you see in separating connection and data?
  • Timeout and ProtoVersion at the top level are a bit odd, won't they conceptually fit under Connection? Also, timeout needs to be renamed to a more concrete name - is it connection timeout, request timeout?
  • Would you mind collaborating with @akstron on Create Cassandra db schema on session initialization #5922 where new config options are being proposed (only used during schema creation), which might affect how you are thinking about logical separation of fields.

@mahadzaryab1
Copy link
Contributor Author

mahadzaryab1 commented Sep 7, 2024

What specific benefits do you see in separating connection and data?

I mostly just did for readability. Should I move the data groupings under connection?

Timeout and ProtoVersion at the top level are a bit odd, won't they conceptually fit under Connection? Also, timeout needs to be renamed to a more concrete name - is it connection timeout, request timeout?

I can move Timeout and ProtoVersion under connection. Timeout is a higher level grouping and it has two fields underneath; Query and Connect. Does that make sense? Let me know if you want any improvements here.

Would you mind collaborating with @akstron on #5922 where new config options are being proposed (only used during schema creation), which might affect how you are thinking about logical separation of fields.

Sure!

@yurishkuro
Copy link
Member

I would flatten timeouts into Connection

@yurishkuro
Copy link
Member

Why wouldn't auth go under connection?

Consistency seems a connection concept, not schema's

@mahadzaryab1
Copy link
Contributor Author

Thanks for your feedback! @yurishkuro. I'll address these and push up my changes.

yurishkuro added a commit that referenced this issue Sep 26, 2024
## Which problem is this PR solving?
-  Resolves #5928 

## Description of the changes
- Revamped configuration for Cassandra to be logically grouped 
- Added unit tests for the `Validate` method 
- Changed the configuration to use OTEL's `configtls.ClientConfig` type
- [Migration
guide](https://docs.google.com/document/d/1mCcQJTYC1eEhNVI9HSR-F6A9gxK-GlEtSz0wiEbRxlU/edit)

## How was this change tested?
- New unit tests 
- CI

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/storage changelog:new-feature Change that should be called out as new feature in CHANGELOG enhancement storage/cassandra v2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants