Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Internal] Update Jobs GetRun API to support paginated responses for jobs and ForEach tasks with >100 runs #324

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

ricardoc-db
Copy link
Contributor

Changes

Jobs GetRun 2.2 behind the scenes pagination

Tests

Unit tests

Comment on lines 0 to 1
37e2bbe0cbcbbbe78a06a018d4fab06314a26a40 No newline at end of file
universe:/Users/ricardo.costa/universe
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looking to get an official SHA here.

@ricardoc-db ricardoc-db changed the title Jobs GetRun 2.2 behind the scenes pagination [Internal] Jobs GetRun 2.2 behind the scenes pagination Jul 25, 2024
Comment on lines 38 to 42
if (paginatingIterations) {
run.getIterations().addAll(currRun.getIterations());
} else {
run.getTasks().addAll(currRun.getTasks());
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need this condition because in case of For each run, the singular task keeps showing up in the tasks array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we're only ever paginating one of the two fields, and never both at the same time.
This way we avoid having duplicate entries in the array that is not being paginated.

}

// now that we've added all pages to the Run, the token is useless
run.setNextPageToken(null);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous page token too I suppose

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


@Override
public Run getRun(GetRunRequest request) {
Run run = super.getRun(request);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this calling API 2.2 or 2.1? I can't find where the version is defined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently 2.1. The implementation is supposed to work with both (as long as the types are there).

Copy link

@gkiko10 gkiko10 Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly you will override the String path in JobsImpl.java variable later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JobsImpl.java is generated. It follows that the String path there is also generated. My understanding is that when the version is updated on the proto, the next generation will automatically update those paths.

Copy link

@gkiko10 gkiko10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change

@ricardoc-db ricardoc-db force-pushed the jobs-get-run-2.2-behind-the-scenes-pagination branch from 08727a7 to 6981221 Compare August 5, 2024 13:25
Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Were you able to test this out against a real workspace?

for (long runId : taskRunIds) {
tasks.add(new RunTask().setRunId(runId));
}
run.setIterations(tasks);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean run.setTasks I assume?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thanks for spotting that. The test was passing previously because the expected object was also being constructed in the same way.
Fixed: 9967447

}

@Override
public Run getRun(GetRunRequest request) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, we may want to have a separate doccomment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here

I may have missed another comment of yours.
I added a docstring to this method in this commit: e9eec92 Is this what you were looking for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you were referring to this? databricks/databricks-sdk-py#725 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved further to explicitly mention the 100 task/iteration limit: 057f140

@ricardoc-db
Copy link
Contributor Author

Were you able to test this out against a real workspace?

Did not attempt.

@ricardoc-db
Copy link
Contributor Author

@mgyucht tested manually against a job with 100+ tasks as well as a ForEach run with 100+ iterations, hardcoding both v 2.1 and 2.2. All combinations worked as expected.

@mgyucht mgyucht changed the title [Internal] Jobs GetRun 2.2 behind the scenes pagination [Feature] Added support for ForEach tasks and jobs with >100 tasks in the GetRun API Aug 16, 2024
@ricardoc-db ricardoc-db changed the title [Feature] Added support for ForEach tasks and jobs with >100 tasks in the GetRun API [Internal] Update Jobs GetRun API to support paginated responses for jobs and ForEach tasks with >100 runs Aug 19, 2024
@mgyucht mgyucht added this pull request to the merge queue Aug 19, 2024
@mgyucht mgyucht removed this pull request from the merge queue due to a manual request Aug 19, 2024
@renaudhartert-db renaudhartert-db removed their request for review September 11, 2024 09:18
@renaudhartert-db
Copy link
Contributor

Removing myself from the reviewer list as it seems my review is not explicitly needed/requested. Please add me back if that is incorrect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants