
RHINENG-11927 - New API for uploading tsdb blocks #748

Open · wants to merge 1 commit into base: main

Conversation

patilsuraj767

This PR adds a new /api/v1/upload API where users can upload TSDB blocks packed in a .tar.gz archive.

Below is a short summary of how the upload API works:

  1. The user uploads the TSDB blocks packed in a .tar.gz file. The user can also add an external_labels field in the multipart request; these external labels are then added to all time series present in the TSDB blocks. (A sketch of the request parsing follows this list.)
    curl -v -F "external_labels={\"cluster_id\":\"123\"}" -F "file=@/tmp/tmp/test.tar.gz" -H "Authorization: Bearer ${TOKEN}" \
      http://localhost:8080/api/metrics/v1/test-oidc/api/v1/upload
    
  2. Once obs-api receives the request, it checks the file size and then extracts the tar archive.
  3. After extraction, each TSDB block is read and a new block is created with the external_labels added.
  4. Finally, these new blocks are uploaded to object storage.
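For illustration, a minimal sketch (not the PR's exact code) of how the multipart parsing in steps 1-2 could look in Go; the handler name, route registration, and in-memory buffer size are assumptions:

    package main

    import (
        "encoding/json"
        "net/http"
    )

    // uploadHandler sketches steps 1-2 above: cap the request size, parse the
    // multipart form, and pull out the external_labels JSON and the .tar.gz part.
    func uploadHandler(w http.ResponseWriter, r *http.Request) {
        const maxUploadBytes = 200 << 20 // assumed 200 MB compressed-archive limit
        r.Body = http.MaxBytesReader(w, r.Body, maxUploadBytes)

        // Hold up to 32 MB in memory; larger parts spill to temporary files.
        if err := r.ParseMultipartForm(32 << 20); err != nil {
            http.Error(w, "request too large or malformed", http.StatusBadRequest)
            return
        }

        // external_labels arrives as a JSON object, e.g. {"cluster_id":"123"}.
        var externalLabels map[string]string
        if raw := r.FormValue("external_labels"); raw != "" {
            if err := json.Unmarshal([]byte(raw), &externalLabels); err != nil {
                http.Error(w, "invalid external_labels", http.StatusBadRequest)
                return
            }
        }

        // The uploaded archive containing the TSDB blocks.
        file, _, err := r.FormFile("file")
        if err != nil {
            http.Error(w, "missing file part", http.StatusBadRequest)
            return
        }
        defer file.Close()

        // Steps 3-4 (extract, relabel with externalLabels, upload to object
        // storage) would follow here.
        w.WriteHeader(http.StatusAccepted)
    }

    func main() {
        http.HandleFunc("/api/v1/upload", uploadHandler)
        http.ListenAndServe(":8080", nil)
    }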

@patilsuraj767 (Author)

All the PR checks are failing because the Go version has been updated. The Go version was auto-updated by go mod tidy because the new dependency github.com/thanos-io/thanos requires go 1.22.0.
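For context, this is roughly the kind of go.mod change being described; the Thanos version shown is hypothetical:

    module github.com/observatorium/api

    go 1.22.0

    require github.com/thanos-io/thanos v0.35.1 // hypothetical version pulled in by this PR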

@saswatamcode (Member) left a comment

Thanks for your work on this!

I think the use case we are trying to enable here is uploading TSDB blocks from "disconnected" environments. In this case, I don't think doing the entire flow through the Observatorium API would be correct.

Also, this does quite a lot of block relabeling during the request itself, which really should be a background job based on some queue. You've limited the compressed block size to 200 MB, but we might get larger blocks.

So I would re-evaluate the design here a bit, and consider whether we even need to do this in the API itself. cc: @moadz

    if bkt != nil {
        r.Group(func(r chi.Router) {
            r.Use(server.StripTenantPrefix("/api/metrics/v1"))
            r.Post(UploadRoute, func(w http.ResponseWriter, r *http.Request) {
Member

This route is doing a lot of processing that could be done in the background.

Author

Yes, I agree we should do this async. We will need to decide whether to use an external service like Kafka or RabbitMQ, where we can park these incoming requests and process them asynchronously, or a simple goroutine that processes the request in the background.

Member

The problem with doing that in a goroutine is that it's not durable. If the process dies for any reason (e.g. rolling out a new version), then the goroutine goes away and the blocks are never relabeled.

Member

Yup, I meant that doing this in some completely separate process instead is preferable.

Author

Can we do something like this:
When an upload request is received, we upload the .tar.gz archive to the object store and then push a JSON message (containing the archive link) to a Kafka topic.
Then we create a separate Kafka consumer service which consumes these JSON messages, downloads the .tar.gz file from the object store, relabels the TSDB blocks, and uploads them to the new object store bucket where the Thanos store gateway is configured.
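If it helps, here is a minimal sketch of the enqueue half of that idea, using the segmentio/kafka-go client; the broker address, topic name, and message schema are all assumptions:

    package main

    import (
        "context"
        "encoding/json"
        "log"

        "github.com/segmentio/kafka-go"
    )

    // uploadEvent is a hypothetical message schema: just enough for the consumer
    // to fetch the archive, relabel the blocks, and upload them to the query bucket.
    type uploadEvent struct {
        Tenant     string `json:"tenant"`
        ArchiveURL string `json:"archive_url"` // object-store link to the uploaded .tar.gz
    }

    func main() {
        w := &kafka.Writer{
            Addr:  kafka.TCP("localhost:9092"), // assumed broker address
            Topic: "block-uploads",             // assumed topic name
        }
        defer w.Close()

        payload, err := json.Marshal(uploadEvent{
            Tenant:     "test-oidc",
            ArchiveURL: "s3://uploads/test-oidc/test.tar.gz", // hypothetical link
        })
        if err != nil {
            log.Fatal(err)
        }

        // Park the request on the topic; the consumer service does the heavy work.
        if err := w.WriteMessages(context.Background(), kafka.Message{Value: payload}); err != nil {
            log.Fatal(err)
        }
    }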

@patilsuraj767 (Author)

Thanks for the review @saswatamcode.
Regarding the disconnected env: in the case of ROS, the OpenShift cluster/cost-operator will not directly upload the TSDB blocks to obs-api. It uploads them first to an intermediate service, which then uploads the blocks to obs-api. In the case of disconnected clusters, users upload TSDB blocks manually to the intermediate service, which then forwards them to Observatorium.

@moadz commented Aug 28, 2024

Hey @patilsuraj767 thanks for raising this PR!

I have a few potential contentions with this approach with respect to how it impacts observatorium-api's statelessness:

Generally, obs-api is a completely stateless authn/authz proxy. My biggest concern is that this process would make obs-api stateful, as it has to store the blocks on disk during the upload process and undertake mutations on the block before it is ready to query. These blocks can be gigabytes in size, and we would be paying for ingress to obs-api and then again for the upload, when ingress is generally free on direct upload.

My suggestion is as follows:

  • We expose a getSignedUrl endpoint under the tenant metrics blocks API, e.g. {tenant_id}/metrics/api/v1/blocks/getSignedUrl?block-id={SOME_BLOCK_ID}, that takes a block-id (which should be consistent with the uploaded block ID) and returns a single-use signed URL to object storage that permits the client to upload the block independently of obs-api. The signed URL would be tied to a request and scoped to the block-id, so a signed URL cannot be reused for uploading other blocks. (A presigned-URL sketch follows this comment.)
  • After the block is uploaded, we create an endpoint for marking a block as 'ready for querying', e.g. {tenant_id}/metrics/api/v1/blocks/mark?block-id={SOME_BLOCK_ID}, that takes the block-id (which is now already uploaded) and manipulates the manifest external labels to make it queryable by Thanos. We can then use HTTP error codes to reflect whether the block was not found, found but not relabeled, or relabeled successfully.

With respect to compression, my intuition is that we should not upload a compressed archive, but decompress on disk (wherever the upload is happening) and upload directly using the signed URL. Ingress is generally free for S3 providers, and we would simplify the marking later on. Let me know if you'd like to schedule a call to discuss this further.
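To make the getSignedUrl idea above concrete, here is a minimal sketch using minio-go; the endpoint, bucket, key layout, and expiry are assumptions. Note that a TSDB block is a directory of several objects (meta.json, index, chunks/...), so a real implementation would need one URL per object or an archive, and presigned URLs in most S3 implementations expire but are not strictly single-use, so one-time use would need extra bookkeeping in obs-api:

    package main

    import (
        "context"
        "fmt"
        "log"
        "time"

        "github.com/minio/minio-go/v7"
        "github.com/minio/minio-go/v7/pkg/credentials"
    )

    func main() {
        // Assumed object-store endpoint and credentials.
        client, err := minio.New("s3.example.com", &minio.Options{
            Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
            Secure: true,
        })
        if err != nil {
            log.Fatal(err)
        }

        blockID := "01HYPOTHETICALULID" // the block-id from the request
        // Scope the object key to tenant + block ID so the URL cannot be
        // reused to upload arbitrary objects.
        u, err := client.PresignedPutObject(context.Background(),
            "tenant-blocks", "test-oidc/"+blockID+"/meta.json", 15*time.Minute)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(u) // hand this URL back to the client for a direct PUT
    }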

@patilsuraj767 (Author)

Hello @moadz, thanks for reviewing the PR.
I got the first point of your suggestion regarding the getSignedUrl endpoint, but I didn't completely understand the second point.
In particular, I didn't understand what you mean by "manipulates the manifest external labels to make it queryable by Thanos". I have very limited knowledge of how Thanos works. Can you please elaborate on it for me?

Do you mean we will add external labels to the meta.json file leveraging this struct: https://github.com/thanos-io/thanos/blob/main/pkg/block/metadata/meta.go#L76?
If yes, then I have further questions:

  1. Suppose we add cluster_id = 123 to external_labels in the meta.json file; will the Querier consider those labels while querying metrics? If I query a time series inside that TSDB block, like metrics_name{cluster_id="123"}, will this work?
  2. If there are two TSDB blocks, say BLOCK-1 and BLOCK-2, where BLOCK-1's meta.json has meta.Thanos.Labels: cluster_id=111 and BLOCK-2's meta.json has meta.Thanos.Labels: cluster_id=222, will the compactor make sure that it won't merge these two blocks?

And yes, we can schedule a call to discuss this further.
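For reference, a minimal sketch of the relabeling being discussed, using the Thanos metadata package linked above; the block directory and label values are hypothetical:

    package main

    import (
        stdlog "log"

        gokitlog "github.com/go-kit/log"
        "github.com/thanos-io/thanos/pkg/block/metadata"
    )

    func main() {
        blockDir := "/tmp/blocks/01HYPOTHETICALULID" // hypothetical extracted block

        // Read the block's meta.json.
        meta, err := metadata.ReadFromDir(blockDir)
        if err != nil {
            stdlog.Fatal(err)
        }

        // thanos.labels in meta.json act as external labels for the whole block;
        // this is where cluster_id (and later tenant_id) would be set.
        if meta.Thanos.Labels == nil {
            meta.Thanos.Labels = map[string]string{}
        }
        meta.Thanos.Labels["cluster_id"] = "123"

        // Rewrite meta.json in place.
        if err := meta.WriteToDir(gokitlog.NewNopLogger(), blockDir); err != nil {
            stdlog.Fatal(err)
        }
    }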

@patilsuraj767 (Author)

Hello, I got the answers to the above questions; please ignore them.

I have a doubt regarding the {tenant_id}/metrics/api/v1/blocks/mark?block-id={SOME_BLOCK_ID} API. How do we mark a block 'ready for querying' or 'not ready for querying'?
I didn't find anything under BlockMeta that we can use to mark a block as ready or not ready for querying.

@moadz commented Sep 5, 2024

I have a doubt regarding the {tenant_id}/metrics/api/v1/blocks/mark?block-id={SOME_BLOCK_ID} API. How do we mark a block 'ready for querying' or 'not ready for querying'?

We deploy Thanos behind Observatorium (https://github.com/observatorium/api), where we manage tenancy based on the presence of the tenant_id label. Marking a block would simply add the tenant_id as an external label to the block, to identify those metrics as belonging to the ROS tenant. In terms of querying, it won't impact much, as we expect the tenant label to exist in the metrics. So perhaps we need to spike this and explore whether simply adding the external label as tenant is sufficient to make it queryable.

@patilsuraj767 (Author)

Thanks @moadz, got your point.
Regarding the Thanos Compactor: do you think the compactor could cause any issue here? In the time between when the block is uploaded and when the mark API is called, if the compactor runs and tries to compact blocks, it might combine two blocks from different tenants.

@moadz commented Sep 8, 2024

Thanks @moadz, got your point. Regarding the Thanos Compactor: do you think the compactor could cause any issue here? In the time between when the block is uploaded and when the mark API is called, if the compactor runs and tries to compact blocks, it might combine two blocks from different tenants.

As long as the external label is set on the block for the cluster-id before you upload, it should be fine. It's feasible that someone may want certain blocks not to be compacted together, but if you imagine they all belong to the same tenant, they will have the tenant label marked and will be compacted together independent of the cluster label. So it's on the uploader to determine the sharding they want for blocks at rest.
