Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add mlflow modules #5

Merged
merged 21 commits into from
Feb 20, 2024
Merged

feat: add mlflow modules #5

merged 21 commits into from
Feb 20, 2024

Conversation

kukushking
Copy link
Contributor

@kukushking kukushking commented Feb 14, 2024

Description of changes:

  • Add mlflow-image module that builds mlflow container image and pushes to ECR.
  • Add mlflow-fargate module to run mlflow on AWS Fargate from the image above. By default, uses EFS for persistence.

Refer to the architecture and module README.MD for details.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
@kukushking kukushking self-assigned this Feb 14, 2024
Signed-off-by: Anton Kukushkin <[email protected]>
Copy link

@dgraeber dgraeber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things:

  • unit tests?
  • you need to add a change log to the repo and begin to track all PR's (see the other module repos)
  • looks like we need to add in the git workflows to executing the linting checks and the pytest

environment={
"BUCKET": f"s3://{artifacts_bucket_name}",
# TODO: Add persistence
# "HOST": database.db_instance_endpoint_address,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be configurable (i.e. add in RDS persistence if configured)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Also in case user wants to use the default file store for persistence we probably need an EBS volume -- checking this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added EFS by default

modules/mlflow/mlflow-fargate/stack.py Outdated Show resolved Hide resolved
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
@kukushking kukushking marked this pull request as ready for review February 16, 2024 16:40
modules/mlflow/mlflow-image/src/Dockerfile Outdated Show resolved Hide resolved
modules/mlflow/mlflow-fargate/stack.py Outdated Show resolved Hide resolved

Tags.of(scope=cast(IConstruct, self)).add(key="Deployment", value=app_prefix[:64])

role = iam.Role(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to name this role and prefix with the dep spec name to allow multiple instances of deployment of same module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am relying on the autogenerated name that will be unique across stacks/deployments. Does that work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test multiple instances of this module deployment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, just did.

container.add_port_mappings(port_mapping)

# Add EFS
fs = efs.FileSystem(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a good idea to externalize EFS from this module and consume from IDF kind of a generic repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it s a good idea, but only after we have a second use case for it, to make sure we actually have a generalized version of it

modules/mlflow/mlflow-fargate/stack.py Outdated Show resolved Hide resolved
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
Signed-off-by: Anton Kukushkin <[email protected]>
vpc=vpc,
encrypted=True,
throughput_mode=efs.ThroughputMode.ELASTIC,
performance_mode=efs.PerformanceMode.GENERAL_PURPOSE,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the retain policy? will we have efs volumes out there when the module destroys?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's RETAIN by default

@kukushking kukushking merged commit 20500a7 into awslabs:main Feb 20, 2024
5 checks passed
@kukushking kukushking deleted the feat/mlflow branch February 20, 2024 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants