about About Multiple Indexers with kafka source config #3927

Closed
yangshike opened this issue Oct 9, 2023 · 13 comments
Labels
documentation Improvements or additions to documentation

Comments


yangshike commented Oct 9, 2023

Hi,
I want to configure multiple indexers for a Kafka source, but I couldn't find any documentation on configuring multiple indexers.
I have a node that is started with quickwit run --config config.yaml (it already runs an indexer, the Kafka source, and some other ingest indexing methods).

I would like to start a separate indexer service that joins this cluster and serves only as a second indexer for the Kafka source.

I tried starting the second node, but on the Kafka side I saw that only one node was consuming data at any given time.

yangshike added the documentation label Oct 9, 2023

fmassot commented Oct 9, 2023

Hi @yangshike

We indeed need a good example in our docs on this (cf #3923).

version: 0.6
source_id: kafka-source
source_type: kafka
max_num_pipelines_per_indexer: 1
desired_num_pipelines: 4
params:
  topic: logs
  client_params:
    bootstrap.servers: broker-1:9092,broker-2:9092

See this docs page for details about the source parameters.
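
As a rough illustration of how the two pipeline parameters interact, the sketch below assumes that the cluster can run at most max_num_pipelines_per_indexer pipelines on each indexer and never more than desired_num_pipelines in total. The function name and the indexer counts are hypothetical; this is back-of-the-envelope arithmetic, not Quickwit's actual scheduling logic.

```python
def effective_num_pipelines(desired: int, max_per_indexer: int, num_indexers: int) -> int:
    """Upper bound on the pipelines a cluster can run for one source.

    Assumption: each indexer runs at most `max_per_indexer` pipelines and
    the cluster never runs more than `desired` pipelines in total.
    """
    if desired < 0 or max_per_indexer < 0 or num_indexers < 0:
        raise ValueError("all arguments must be non-negative")
    return min(desired, max_per_indexer * num_indexers)


# Values from the source config above, with hypothetical indexer counts.
for num_indexers in (1, 2, 4, 8):
    count = effective_num_pipelines(desired=4, max_per_indexer=1, num_indexers=num_indexers)
    print(f"{num_indexers} indexer(s) -> up to {count} pipeline(s)")
```

Under that assumption, two indexers with this config would run at most two pipelines, one per indexer.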

Does this help?

@yangshike

Yes, I configured it almost the same way, but I want to know whether the indexer needs any special configuration.

This is my indexer config:

node_id: indexer-2
listen_address: 0.0.0.0
peer_seeds:
  - 172.16.2.22
version: 0.6
metastore_uri: s3://xxx/meta


fmassot commented Oct 9, 2023

No, the indexer does not need a special configuration.

You should check that your indexers form a cluster though. You can have a look at the logs and the output of api/v1/cluster and share that with us if it does not work as you expect.
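
A minimal sketch of such a check, assuming the REST API listens on Quickwit's default port 7280 and that the api/v1/cluster response lists nodes under a "ready_nodes" key as objects carrying a "node_id" field. The response shape is an assumption, and the sample payload and the node id "node-1" are hypothetical; adjust the key names to whatever your version actually returns.

```python
import json
from urllib.request import urlopen


def cluster_url(host: str, port: int = 7280) -> str:
    """URL of the cluster endpoint (7280 is Quickwit's default REST port)."""
    return f"http://{host}:{port}/api/v1/cluster"


def fetch_cluster_state(host: str, port: int = 7280, timeout: float = 5.0) -> dict:
    """Fetch the cluster state as a dict. Requires a reachable node."""
    with urlopen(cluster_url(host, port), timeout=timeout) as resp:
        return json.load(resp)


def node_ids(cluster_state: dict, key: str = "ready_nodes") -> list[str]:
    """Sorted node ids found under `key` (assumed response shape)."""
    return sorted(node["node_id"] for node in cluster_state.get(key, []))


def missing_nodes(cluster_state: dict, expected: list[str]) -> list[str]:
    """Expected node ids that are absent from the ready nodes."""
    ready = set(node_ids(cluster_state))
    return [node_id for node_id in expected if node_id not in ready]


# Hypothetical payload, used here so the parsing can be tried offline.
SAMPLE_STATE = {
    "ready_nodes": [{"node_id": "node-1"}, {"node_id": "indexer-2"}],
}

print(node_ids(SAMPLE_STATE))
print(missing_nodes(SAMPLE_STATE, ["node-1", "indexer-2", "indexer-3"]))
```

Against a live cluster, you would pass the result of fetch_cluster_state("172.16.2.22") instead of the sample payload. If an indexer you started does not show up, the nodes have not formed a cluster.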


fmassot commented Oct 9, 2023

Additional question: did you start one or several nodes that play the roles of metastore/control plane/janitor?

@yangshike

> Additional question: did you start one or several nodes that play the roles of metastore/control plane/janitor?

No, I only have one node that has started these services.


fmassot commented Oct 9, 2023

> No, I only have one node that has started these services

Ok, so it should work if this node and the two indexers form a cluster.

@yangshike

max_num_pipelines_per_indexer: 1
desired_num_pipelines: 4

I just added these two parameters. My configuration did not have them before.


fmassot commented Oct 9, 2023

Ok! Can you confirm if it works with those parameters?


yangshike commented Oct 9, 2023

> Ok! Can you confirm if it works with those parameters?

Yes, it works normally now.

I have two more questions:

  1. I cannot see the offsets committed by Quickwit on the Kafka server.

  2. Can the reset checkpoint API reset to a given position, for example the latest offset, a specific offset, or a start_time?
    In addition, I recreated the Kafka source with auto.offset.reset: latest, but I think it still starts consuming from the earliest offset.


fmassot commented Oct 9, 2023

> I cannot see the offset information submitted by Quickwit on the Kafka server

You can see the offsets by partition in the metastore. Send a GET /api/v1/indexes/{your-index} request to get that information.

If you're using 0.6.4, Quickwit does not inform Kafka that documents have been indexed up to a given offset (that is, it does not commit offsets on the Kafka side). This was added recently (#3638).
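
A minimal sketch of reading those offsets, assuming the index metadata returned by that endpoint contains a "checkpoint" object keyed by source id, then by partition, with positions stored as zero-padded strings. That shape is an assumption, and the sample payload, the index id "my-index", and the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def index_url(host: str, index_id: str, port: int = 7280) -> str:
    """URL of the index metadata endpoint (7280 is the default REST port)."""
    return f"http://{host}:{port}/api/v1/indexes/{quote(index_id, safe='')}"


def fetch_index_metadata(host: str, index_id: str, port: int = 7280) -> dict:
    """Fetch the index metadata as a dict. Requires a reachable node."""
    with urlopen(index_url(host, index_id, port), timeout=5.0) as resp:
        return json.load(resp)


def source_checkpoint(index_metadata: dict, source_id: str) -> dict:
    """Map each partition to its position for one source (assumed shape).

    Digit-only positions are converted to int; anything else is kept as-is.
    """
    raw = index_metadata.get("checkpoint", {}).get(source_id, {})
    return {
        partition: int(position) if position.isdigit() else position
        for partition, position in raw.items()
    }


# Hypothetical payload, used here so the parsing can be tried offline.
SAMPLE_METADATA = {
    "checkpoint": {
        "kafka-source": {
            "00000000000000000000": "00000000000000001234",
            "00000000000000000001": "00000000000000000987",
        }
    }
}

for partition, position in source_checkpoint(SAMPLE_METADATA, "kafka-source").items():
    print(f"partition {partition}: position {position}")
```

Against a live cluster, you would pass the result of fetch_index_metadata(host, "my-index") instead of the sample payload.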


fmassot commented Oct 9, 2023

> Can resetting a location point through the reset checkpoint API indicate location information? For example, the latest or specific offset or start_time

Currently, it's not possible to start indexing from a specific offset in Quickwit, unless you enter that information manually in the metastore.

reset_checkpoint only deletes all checkpoints, just as if you had deleted and recreated the source.
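
For reference, a sketch of how the reset request could be built, assuming the endpoint is PUT /api/v1/indexes/{index}/sources/{source}/reset-checkpoint on the default REST port 7280. Check the path against the REST API docs for your version. The index id "my-index" and the helper name are hypothetical. The code only constructs the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request


def reset_checkpoint_request(host: str, index_id: str, source_id: str, port: int = 7280) -> Request:
    """Build (but do not send) the reset-checkpoint request.

    Assumed endpoint: PUT /api/v1/indexes/<index>/sources/<source>/reset-checkpoint
    """
    url = (
        f"http://{host}:{port}/api/v1/indexes/{quote(index_id, safe='')}"
        f"/sources/{quote(source_id, safe='')}/reset-checkpoint"
    )
    return Request(url, method="PUT")


request = reset_checkpoint_request("127.0.0.1", "my-index", "kafka-source")
print(request.get_method(), request.full_url)
```

Sending it with urllib.request.urlopen(request) would delete every checkpoint for the source, so indexing would restart as if the source had just been created.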

@yangshike

ok, thank you very much!


fmassot commented Oct 9, 2023

You're welcome!

Closing in favor of #3923

fmassot closed this as completed Oct 9, 2023