
Support READONLY to allow exporting data from replica nodes #160

Open
atkretsch opened this issue Oct 9, 2024 · 1 comment

@atkretsch

Currently, if you run riot in non-cluster mode and configure it to point to a replica node URI, you will get MOVED responses and the export/replication will fail.

However, it would be useful to be able to export/replicate data directly from replicas so as to avoid putting additional load on the primary nodes. This would theoretically also allow greater throughput because we could run a separate parallel riot process for each shard in the cluster.

I could see this working in one of two ways:

  1. a --read-only flag (or similar) that would tell riot to send a READONLY command before any scan/read operations; then, the source URI could point to a replica node without fear of MOVED responses. In this case, it would be up to the caller of riot to understand the cluster topology and pass in the correct replica URIs, decide whether to run in parallel or in sequence, etc.
  2. abstract this behavior behind some additional flag(s) when running in -c mode, such that riot itself could handle any potential parallelization (perhaps with tuning, e.g. --replica-read-threads N). Conceptually, this would mean the caller wouldn't need to know the details of the cluster topology when invoking riot, but this is probably a much more complex change to implement.
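
To make option 1 concrete, here is a rough sketch of the connection-level behavior it describes: send READONLY once on the connection, then iterate with SCAN. It speaks raw RESP over any file-like stream so it has no dependencies. The helper names (`encode_command`, `read_reply`, `scan_replica`) are made up for illustration and are not riot internals; AUTH, TLS, and reconnect handling are left out.

```python
from typing import BinaryIO, Iterator, List, Union

Reply = Union[bytes, int, None, List["Reply"]]


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def read_reply(stream: BinaryIO) -> Reply:
    """Read one RESP2 reply; error replies (e.g. MOVED) raise RuntimeError."""
    line = stream.readline()
    if not line.endswith(b"\r\n"):
        raise ConnectionError("connection closed mid-reply")
    kind, body = line[:1], line[1:-2]
    if kind == b"+":  # simple string, e.g. +OK
        return body
    if kind == b"-":  # error, e.g. -MOVED 3999 10.0.0.1:6379
        raise RuntimeError(body.decode())
    if kind == b":":  # integer
        return int(body)
    if kind == b"$":  # bulk string; -1 means nil
        size = int(body)
        return None if size == -1 else stream.read(size + 2)[:-2]
    if kind == b"*":  # array; -1 means nil
        size = int(body)
        return None if size == -1 else [read_reply(stream) for _ in range(size)]
    raise ValueError(f"unexpected RESP type byte: {kind!r}")


def scan_replica(reader: BinaryIO, writer: BinaryIO, count: int = 1000) -> Iterator[bytes]:
    """Send READONLY once on this connection, then yield keys from SCAN."""
    writer.write(encode_command("READONLY"))
    writer.flush()
    if read_reply(reader) != b"OK":
        raise RuntimeError("READONLY was not acknowledged")
    cursor = "0"
    while True:
        writer.write(encode_command("SCAN", cursor, "COUNT", str(count)))
        writer.flush()
        next_cursor, keys = read_reply(reader)
        yield from keys
        cursor = next_cursor.decode()
        if cursor == "0":  # a returned cursor of 0 ends the iteration
            break
```

Against a real replica, both `reader` and `writer` could be the object returned by `socket.create_connection((host, port)).makefile("rwb")`. READONLY applies per connection, so it has to be sent again on every new connection, and the scan only covers the keys held by that one node.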

@atkretsch

Looking through the output for file-export --help, I see the --read-from REPLICA option. This covers the ability to read from replicas to keep load off the primary nodes, but doesn't help with parallelization across shards.
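
For the per-shard part, the caller would first need one replica URI per shard. A minimal sketch of that step, assuming the plain-text output of the Redis `CLUSTER NODES` command as input; `replicas_by_shard` is a hypothetical helper and not something riot provides.

```python
from typing import Dict


def replicas_by_shard(cluster_nodes: str) -> Dict[str, str]:
    """Map each primary's node ID to the URI of one healthy replica.

    Expects CLUSTER NODES lines of the form:
    <id> <ip:port@cport> <flags> <master-id> <ping> <pong> <epoch> <link-state> [slots...]
    """
    chosen: Dict[str, str] = {}
    for line in cluster_nodes.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        address, flags = fields[1], fields[2].split(",")
        master_id, link_state = fields[3], fields[7]
        if "slave" not in flags:
            continue  # primaries are the nodes we want to keep load off
        if "fail" in flags or "fail?" in flags or link_state != "connected":
            continue  # skip replicas that are failing or unreachable
        host_port = address.split("@")[0]  # drop the cluster bus port
        # Keep the first healthy replica seen for each primary.
        chosen.setdefault(master_id, f"redis://{host_port}")
    return chosen
```

Each URI in the result could then be handed to its own riot process. This assumes IPv4 addresses or hostnames; an IPv6 address would need brackets in the URI.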
