RHINENG-11927 - New API for uploading tsdb blocks #748
base: main
Conversation
All the PR checks are failing because the Go version has been updated. The Go version is auto-updated by
Thanks for your work on this!
I think the use case we are trying to enable here is uploading TSDB blocks from "disconnected" environments. In that case, I don't think doing the entire flow through the Observatorium API would be correct.
Also, this does quite a lot of block relabeling during the request itself, which really should be a background job driven by some queue. You've limited the compressed block size to 200MB, but we might get larger blocks.
So I would re-evaluate the design here a bit, and consider whether we even need to do this in the API itself. cc: @moadz
if bkt != nil {
	r.Group(func(r chi.Router) {
		r.Use(server.StripTenantPrefix("/api/metrics/v1"))
		r.Post(UploadRoute, func(w http.ResponseWriter, r *http.Request) {
This route is doing a lot of processing that could be done in the background.
Yes, I agree we should do this asynchronously. We will need to decide whether we should use an external service like Kafka or RabbitMQ, where we can park these incoming requests and process them asynchronously, or a simple goroutine that processes the requests in the background.
The problem with doing that in a goroutine is that it's not durable. If the process dies for any reason (e.g. rolling out a new version), the goroutine goes away and the blocks are never relabeled.
Yup, I meant that doing this in some completely separate process is preferable.
Can we do something like this:
When an upload request is received, we upload the .tar.gz archive to the object store and then push a JSON message (containing the archive link) to a Kafka topic.
Then we create a separate Kafka consumer service that consumes these JSON messages, downloads the .tar.gz file from the object store, relabels the TSDB blocks, and uploads them to the new object store bucket that the Thanos store gateway reads from.
Thanks for the review @saswatamcode
Hey @patilsuraj767, thanks for raising this PR! I have a few potential contentions with this approach with respect to how it impacts observatorium-api's statelessness. Generally, obs-api is a completely stateless authn/authz proxy. My biggest concern is that this process would make obs-api stateful, as it has to store the blocks on disk during the upload and mutate the block before it is ready to query. These blocks can be gigabytes in size, and we would be paying for ingress to obs-api and then for the upload, when ingress is generally free on direct upload. My suggestion is as follows:
With respect to compression, my intuition is that we should not upload a compressed archive, but decompress on disk (where the upload is happening) and upload directly using the signed URL. Ingress is generally free for S3 providers, and this would simplify the marking later on. Let me know if you'd like to schedule a call to discuss this further.
Hello @moadz, thanks for reviewing the PR. Do you mean we will add external labels in
And yes, we can schedule a call to discuss this further.
Hello, I got the answers to the above questions, please ignore them. I have a doubt regarding
We deploy Thanos behind Observatorium (https://github.com/observatorium/api), where we manage tenancy based on the presence of the tenant_id label. Marking a block would simply add the tenant_id as an external label to the block, to identify those metrics as belonging to the ROS tenant. In terms of querying, it won't have much impact, as we expect the tenant label to exist in the metrics. So perhaps we need to spike this and explore whether simply adding the tenant as an external label is sufficient to make it queryable.
Thanks @moadz, got your point.
As long as the external label is set on the block for the cluster-id before you upload, it should be fine. It's feasible that someone may want them not to be compacted together: if you imagine they all belong to the same tenant, they will have the tenant label marked and will be compacted together independent of the cluster label. So it's on the uploader to determine the sharding they want for blocks at rest.
This PR adds a new /api/v1/upload API where a user can upload TSDB blocks packed in a .tar.gz archive. Below is a short summary of how the upload API will work.
The user uploads the .tar.gz file. The user can also add an external_labels field in the multipart request. These external labels will then be added to all time series present in the TSDB block.