Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added delay capability to retry #97

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

krotte1
Copy link

@krotte1 krotte1 commented Sep 22, 2019

In some cases a user wants to add a delay between retries that are performed. While continue to support the original usage of retry, this adds three new attributes to RetryStep: useTimeDelay, timeDelay, and units. If useTimeDelay is true, then the timeDelay will be applied between retries.

@krotte1
Copy link
Author

krotte1 commented Sep 24, 2019

@jglick @basil @dwnusbaum @abayer I figured it might be worth reaching out to a few of you with recent activity on the workflow-basic-steps-plugin. I have added the ability to introduce a delay in between retries and would like you to review this pull-request. Thank you

@basil
Copy link
Member

basil commented Sep 24, 2019

Hey @krotte1, unlike Jesse and Devin I am not (yet!) an official committer to this project, so I defer to them for any official feedback. My comments below are merely suggestions, not blocking issues.

The feature seems useful, and I've wanted it in my own pipelines several times. I haven't reviewed the implementation yet, but I took a look at the proposed API change to the retry step and I have some ideas about that. Your proposed syntax is as follows:

retry(count: 3, timeDelay: 10, unit: 'SECONDS', useTimeDelay: true) {
  [...]
}

Whenever I'm adding a new API, I try to study existing examples of similar APIs to learn from their mistakes and possibly gain insights that could benefit the design of my own API. In this case, I took a look at the API offered by Tenacity, which is (to me) the gold standard in retry APIs. I use it frequently in my Python-based build scripts.

Tenacity offers several options when it comes to waiting before retries. I've used several of them myself, depending on the use case. Some examples:

  • Fixed period: @retry(wait=wait_fixed(2))
  • Randomness: @retry(wait=wait_random(min=1, max=2))
  • Exponential backoff: @retry(wait=wait_exponential(multiplier=1, min=4, max=10))
  • Fixed waits and jitter: @retry(wait=wait_fixed(3) + wait_random(0, 2))
  • Exponentially increasing jitter: @retry(wait=wait_random_exponential(multiplier=1, max=60))
  • Chain of backoffs: @retry(wait=wait_chain(*[wait_fixed(3) for i in range(3)] + [wait_fixed(7) for i in range(2)] + [wait_fixed(9)]))

As you can see, Tenacity is the "Cadillac" of retry APIs. Am I suggesting that you implement all of this in your PR? I am emphatically not suggesting that.

What I am suggesting, though, is that you design for extension so that these additional features could be added later.

For example, it might be worth sketching out some possible syntax for the fancier forms of time-based retry (such as backoff and jitter) and making sure that those future improvements could be made without having to break compatibility with the syntax being proposed in this change. That way, we can move forward with what you're proposing here with more confidence that future extensions won't run into implementation issues.

What do you think?

@krotte1
Copy link
Author

krotte1 commented Sep 25, 2019

Thanks @basil. I like where you are going with this and can see the usefulness of the implementation you suggested. Before I go through the effort of refactoring the code to make this type of implementation possible, I would like to know from @jglick @dwnusbaum @abayer would even allow this enhancement or the more extensible implementation you propose into the baseline.

@krotte1
Copy link
Author

krotte1 commented Sep 26, 2019

I was really hoping to hear from one of the maintainers (@jglick @dwnusbaum @kohsuke @abayer) about the possibility of incorporating this time of feature in the baseline. I have started implementing what @basil talked about on a different branch and am almost finished.... I don't want to continue down this line if the simple case won't even be accepted.

@krotte1
Copy link
Author

krotte1 commented Sep 29, 2019

I now have the extensible version implemented that was mentioned by @basil in a previous post. @jglick @dwnusbaum @kohsuke @abayer ... I was just wondering if you could tell me if either of these implementations would be accepted. The simple version in this pull request justs adds a fixed delay option to the retry step. The extensible version that I could merge into the pull-request allows for a FixedDelay, RandomDelay, IncrementalDelay, ExponentialDelay, and a RandomExponential delay. It can also be easily extended to add other delay algorithms.

@krotte1
Copy link
Author

krotte1 commented Oct 1, 2019

I looked around and see that some pull requests have reviewers added to them, but I can't figure out how to do it myself. I beginning to think that only folks like @jglick @dwnusbaum @kohsuke @abayer that may have some type of elevated privileges can do it. All of the normal interactions with the Reviewers tab are not available to me so I can't really request someone to review this.

@jglick jglick requested a review from dwnusbaum October 1, 2019 12:11
@jglick
Copy link
Member

jglick commented Oct 1, 2019

Indeed you need write permission to a repository to modify reviewers. There is a new “triage” permission on GitHub that we are proposing to roll out; see the Jenkins dev list for more.

@jglick jglick requested a review from basil October 1, 2019 12:13
@krotte1
Copy link
Author

krotte1 commented Oct 2, 2019

@jglick - thank you for adding the reviewers to this pull request. I really appreciate it..

@basil - I went ahead and implemented the recommendations you made earlier in this pull request. Do you want me to go ahead and merge that into this pull request before you review it? I made the delay extensible and implemented many of the items you mentioned above: FixedDelay, RandomDelay, IncrementalDelay, ExponentialDelay, and a RandomExponential.

@dwnusbaum
Copy link
Member

@krotte1 we don't currently have a great process for evaluating new features like this, but we are working on improving it. For now, can you create a Jira ticket, component workflow-basic-steps, mention me in the description (@dnusbaum), and we can discuss the idea/design there?

@krotte1
Copy link
Author

krotte1 commented Oct 5, 2019

A new Jira ticket (https://issues.jenkins-ci.org/browse/JENKINS-59678) was created for this pull request to discuss the desired solution (simple fixed retry in the pull request or extensible solution that has already been implemented and can be added to this pull request).

@krotte1
Copy link
Author

krotte1 commented Oct 13, 2019

@dwnusbaum - now that https://issues.jenkins-ci.org/browse/JENKINS-59678 has been out there for a week, what happens next? Which solution do we want to go with - simple fixed time delay between retries or an extensible method with different algorithms for retry delays?

@krotte1
Copy link
Author

krotte1 commented Oct 27, 2019

@dwnusbaum - just wondering what the status is... I am really hoping to get one of these delay solutions incorporated into the baseline.

Copy link
Member

@timja timja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great feature, we've implemented this internally in a shared library, but it took us some time, testing and iterations before we had it working nicely. It would be great if this could be supported in the plugin.

I've taken a look and added a few comments, ignore or action as you wish, I'm not a maintainer here

}

public int getCount() {
return count;
}

@DataBoundSetter public void setUseTimeDelay(boolean useTimeDelay) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? shouldn't a non null / non default time delay mean this is enabled?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem to be a bit extra here

@Override
public Set<? extends Class<?>> getRequiredContext() {
return Collections.singleton(TaskListener.class);
}

}

private static final long serialVersionUID = 1L;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normally constants are at the top of a class in java

}

@Override
public boolean start() throws Exception {
@Override public boolean start() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override would normally go above a method, and it already was above there before you changed it so I would suggest you move it back

}

@Override
public boolean start() throws Exception {
@Override public boolean start() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override would normally go above a method, and it already was above there before you changed it so I would suggest you move it back

@v1v
Copy link
Member

v1v commented Feb 5, 2020

Awesome! Great work!

I just found this PR, as I actually had to create a few retry/sleep steps and came up with some internal shared library step that it works but I wish we could have this feature in place.

Any plans to move forward this PR?

Thanks

@jakauppila
Copy link

Is there anything holding up this PR from being merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants