-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Frrist/metrics/include instance #4450
base: main
Are you sure you want to change the base?
Conversation
- clients pass their instance and installation id in headers, this is added to the jobs metadata when submitted - server includes their instance and installation id in meta of job. Now, given a job spec it may be determined which node submitted the job and which node receieved the submission
cmd/cli/serve/serve.go
Outdated
installationID, err := metastore.ReadInstallationID() | ||
if err != nil { | ||
log.Trace().Err(err).Msg("failed to read installationID") | ||
} | ||
instanceID, err := metastore.ReadInstanceID() | ||
if err != nil { | ||
log.Trace().Err(err).Msg("failed to read instanceID") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trace are mainly for more granular and request level logging. It might be useful to use debug level here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing to debug
cmd/util/api.go
Outdated
installationID, err := metastore.ReadInstallationID() | ||
if err != nil { | ||
return nil, err | ||
} | ||
instanceID, err := metastore.ReadInstanceID() | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be best effort instead of failing the request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will modify to log a debug message, and only include in headers if found and non-empty.
cmd/util/api.go
Outdated
apimodels.HTTPHeaderBacalhauBuildInstallationID: {installationID}, | ||
apimodels.HTTPHeaderBacalhauInstanceID: {instanceID}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be HTTPHeaderBacalhauInstallationID
instead of HTTPHeaderBacalhauBuildInstallationID
. Also should we always send the headers or only if they have values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will only include if value is present: #4450 (comment)
pkg/node/requester.go
Outdated
type MetaStore interface { | ||
ReadInstallationID() (string, error) | ||
ReadInstanceID() (string, error) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why introducing an interface instead of just passing the actual metastore to NewRequesterNode
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To prevent the receivers of the type from calling methods that can modify the contents of the store - this is a read only store given this interface. It will also make testing easier in the future.
pkg/node/requester.go
Outdated
instanceID, err := metaStore.ReadInstanceID() | ||
if err != nil { | ||
return nil, err | ||
} | ||
installationID, err := metaStore.ReadInstallationID() | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a common pattern I am seeing in different places in the code where we should treat these as best effort instead of failing. My recommendation is to have the metastore handle the failures internally by just logging and returning an empty id. If there are cases where you need to fail, then you can introduce MustReadInstanceID()
methods.
What would be nice is:
- Have the metastore read the ids from disk only once the first time a getter is called, store the entries in-memory and then serve subsequent requests from memory
- If a failure happens while reading from disk, then it should log only once, and then all future calls will just get an empty id without more logging
pkg/orchestrator/endpoint.go
Outdated
job.Meta[models.MetaClientInstallationID] = request.ClientInstallationID | ||
job.Meta[models.MetaClientInstanceID] = request.ClientInstanceID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we only set these if the request has these metadata?
func OrchestratorInstanceID(instanceID string) JobTransformer { | ||
f := func(ctx context.Context, job *models.Job) error { | ||
job.Meta[models.MetaServerInstanceID] = instanceID | ||
return nil | ||
} | ||
return JobFn(f) | ||
} | ||
|
||
func OrchestratorInstallationID(installationID string) JobTransformer { | ||
f := func(ctx context.Context, job *models.Job) error { | ||
job.Meta[models.MetaServerInstallationID] = installationID | ||
return nil | ||
} | ||
return JobFn(f) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same here
pkg/repo/sysmeta.go
Outdated
func (fsr *MetadataStore) ReadInstallationID() (string, error) { | ||
sysmeta, err := fsr.readMetadata() | ||
if err != nil { | ||
return "", err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
installationID should no longer be in the meta store. It is fine to keep the method here if you prefer, but the source is not system_metadata.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping it in the store prevents it from being scattered in different files. This was the design we decided here: https://www.notion.so/expanso/Rethinking-Configuration-435fbe87419148b4bbc5119d413786eb?pvs=4#5d5f7298b9ef4ab4bcb1fd6a44ca8c30. Additionally, for users migrating from v1.4.0 to v1.5.0, the installationID will be moved to the metadata store from its previous location the config file.
I know we discussed this in our meeting; How would you like to proceed here?
We can continuing doing what we do now and keep it in the store, or we can modify the metadata store to check both locations - its internal file and the file produced by the installation script. The latter feels like unnecessary complexity and coupling with the implementation of the install script.
Decouples the metadata store from the repo, allow service that accept only a config to read the contents of metadata stored in the repo.
feat: client and server include ids in job meta
added to the jobs metadata when submitted
and which node receieved the submission