This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Feature Implementation: AWS Glue Job Execution Support #308
Feature Implementation: AWS Glue Job Execution Support #308
Changes from 10 commits
1f82b74
4fb7a2c
8f9e9af
9d6ac4f
5af0057
5c46568
7adaf5b
d2183db
71552b9
f44c66f
956c7da
c726304
bae5e12
71934c2
e71b947
68b02b8
3b501e2
1dceb73
8390a94
6b3ff0b
c0d611f
3bd4a17
7ea43d4
491191e
9e06097
12299d4
41367dd
0d04479
cb36769
7b9cbad
32a43c0
18e71d7
bd69c20
3bf96cb
f086f03
7048c2a
f3c7d2c
705ffe4
993076b
d92238d
5a4fd2d
12781be
d5c37e2
9f71f03
474c045
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
BaseJobConfiguration
class has aname
attribute that automatically gets populated with the flow name. I think it'd be worthwhile to use that attribute rather than introducing another name attribute.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
job_name
is necessarily to use start glue job.https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/start_job_run.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, does this require a Glue job to exist already? Usually, workers dynamically create jobs to execute flow runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am user very interested in using this feature, I actually I have forked this into my repo already. I think it's expected that the glue job exists already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@feliperazeek are you using the worker to run Glue jobs or the block in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name is good and will work, but I think the implementation of the worker needs to change to operate the same as our other workers. I'd expect a Prefect Glue worker to create a Glue job for each flow run that it picks up and then monitor that Glue job for completion.
@knakazawa99 I think we can take this PR in one of two directions:
JobBlock
interface instead of theInfrastructure
interface. This will make it more suited for executing existing Glue jobs (which I'm realizing might have been the original intent of the PR, which I misunderstood).Let me know which direction you'd like to go with this PR, and I will do whatever I can to help. Thank you for your patience and sticking with this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running glue jobs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@desertaxle
Is the following direction correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delay. Yes, that is the direction that I'm suggesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@desertaxle
I have fixed it. However, the test is failing due to the boto3. I have not been able to reproduce this error in my local environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how flexible this field is, but it'd be nice to have some stronger typing while maintaining flexibility. Introducing a variables class that contains some values that will often vary between deployments and creating a default template can help with that. You can checkout the ECS worker's variables class as an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fix it.
Does this match what you had in mind?
e71b947