Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert GitHub refactor (#2379) (including followups) #2669

Closed
wants to merge 5 commits into from

Conversation

rosecodym
Copy link
Collaborator

@rosecodym rosecodym commented Apr 4, 2024

Description:

The GitHub source refactor in #2379 reduced the overall number of requests to GitHub, but compressed them into a smaller period of time, resulting in more rate limit pressure that is causing noticeable degradation in usability in some cases. We've decided to back it out until we can figure out how to make improvements without this interim usability degradation. That PR had multiple followup PRs (#2608, #2614, #2625), which this PR also reverts.

Note that this doesn't revert the go-github version bump from 57 to 61 that has since occurred.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@rosecodym rosecodym requested a review from a team as a code owner April 4, 2024 15:24
@rosecodym
Copy link
Collaborator Author

@rgmz we've decided to pull the trigger on this. I believe this is a straightforward stack of reverts but please speak up if anything looks amiss.

if s.handleRateLimit(err) {
continue
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be continue.

Copy link
Collaborator Author

@rosecodym rosecodym Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading the original refactor commit correctly, this is an appropriate change to revert it. Did I miss some consideration?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revert was correct. However, some of these changes are fixes for things that didn't work prior. I'm just pointing out things that need to be re-fixed.

E.g., using break instead of continue means it will never retry these API requests.

if s.handleRateLimit(err) {
continue
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be continue.

options := &github.ListOptions{
PerPage: defaultPagination,
Page: initialPage,
}
for {
comments, _, err := s.apiClient.Gists.ListComments(ctx, gistID, options)
if s.handleRateLimit(err) {
continue
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be continue.

if s.handleRateLimit(err) {
continue
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be continue.

if s.handleRateLimit(err) {
continue
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be continue.

@@ -1569,3 +1607,11 @@ func (s *Source) scanTarget(ctx context.Context, target sources.ChunkingTarget,

return common.CancellableWrite(ctx, chunksChan, chunk)
}

func removeURLAndSplit(url string) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would result in a regression compared to getRepoURLParts. e.g., it doesn't work with http:// urls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so the original refactor also fixed a bug? Is that fix something we can independently deploy?

@rgmz
Copy link
Contributor

rgmz commented Apr 4, 2024

@rgmz we've decided to pull the trigger on this. I believe this is a straightforward stack of reverts but please speak up if anything looks amiss.

I did a quick scan through and left comments on any obvious issues.

The GitHub source refactor in #2379 reduced the overall number of requests to GitHub, but compressed them into a smaller period of time, resulting in more rate limit pressure that is causing noticeable degradation in usability in some cases.

This is surprising to hear. I'm not aware of any interaction that would result in more traffic over a smaller period of time, but I'm also not aware of all the use cases.

@rosecodym
Copy link
Collaborator Author

@rgmz we've decided to pull the trigger on this. I believe this is a straightforward stack of reverts but please speak up if anything looks amiss.

I did a quick scan through and left comments on any obvious issues.

The GitHub source refactor in #2379 reduced the overall number of requests to GitHub, but compressed them into a smaller period of time, resulting in more rate limit pressure that is causing noticeable degradation in usability in some cases.

This is surprising to hear. I'm not aware of any interaction that would result in more traffic over a smaller period of time, but I'm also not aware of all the use cases.

The refactor added a new API call that's invoked for each repository enumerated. When many repositories are enumerated, this results in show-stopping rate limiting.

@rgmz
Copy link
Contributor

rgmz commented Apr 4, 2024

The refactor added a new API call that's invoked for each repository enumerated. When many repositories are enumerated, this results in show-stopping rate limiting.

Ah, I see. That was unintentional and should have been fixed in #2625. It seems like that didn't land until the most recent release (v3.72.0), though.

The change definitely could have used another pass with a fresh pair of eyes before being merged. I think part of the issue was that I hadn't touched it for quite a while prior to it being merged.

@rosecodym
Copy link
Collaborator Author

Ah, got it. With that pointer we dug a little further and it appears that we were just bitten by some very unfortunate release timing - the rate limit issue has, in fact, been resolved. Thanks for your patience and your eyes on this @rgmz!

@rosecodym rosecodym closed this Apr 4, 2024
@rosecodym rosecodym deleted the revert-2379-github-refactor branch April 4, 2024 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants