feat: add a RetryInterval setting #1654
Draft
This PR proposes adding a retryInterval setting to pipeline and vertex manifests, with the default set to 1ms.
Reasons to raise retryInterval above the default: fewer spammy logs on failures, lower resource use during long component outages, and, in most cases, no significant performance gain from a 1ms retryInterval over a 500ms one. Reasons to keep the default: highly available and scalable components, critical response times, or reliance on CPU/RAM limits to prevent resource waste.
If retry with backoff is implemented later (as mentioned in an existing comment), this change may be reverted.
Also, adding a new parameter every time may hurt usability, so I'm open to finding a compromise or dropping this change if the community deems it of little use.