Add SageMaker Notebook Instance module #12

Merged
merged 18 commits into from
Feb 29, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -16,6 +16,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- added `mlflow-image` and `mlflow-fargate` modules
- added `sagemaker-studio` module
- added `sagemaker-endpoint` module
- added `sagemaker-notebook` module

### **Changed**

10 changes: 5 additions & 5 deletions README.md
@@ -18,11 +18,11 @@ All modules in this repository adhere to the module structure defined in the

### SageMaker Modules

| Type | Description |
|-----------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [SageMaker Endpoint Module](modules/sagemaker/sagemaker-endpoint/README.md) | Creates SageMaker real-time inference endpoint for the specified model package or latest approved model from the model package group |
| [SageMaker Studio Module](modules/sagemaker/sagemaker-studio/README.md) | Provisions secure SageMaker Studio Domain environment, creates example User Profiles for Data Scientist and Lead Data Scientist linked to IAM Roles, and adds lifecycle config |

| Type | Description |
|------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [SageMaker Endpoint Module](modules/sagemaker/sagemaker-endpoint/README.md) | Creates SageMaker real-time inference endpoint for the specified model package or latest approved model from the model package group |
| [SageMaker Studio Module](modules/sagemaker/sagemaker-studio/README.md) | Provisions secure SageMaker Studio Domain environment, creates example User Profiles for Data Scientist and Lead Data Scientist linked to IAM Roles, and adds lifecycle config |
| [SageMaker Notebook Instance Module](modules/sagemaker/sagemaker-notebook/README.md) | Creates SageMaker Notebook Instances |

### Mlflow Modules

20 changes: 20 additions & 0 deletions examples/manifests/sagemaker-notebook-modules.yaml
@@ -0,0 +1,20 @@

name: notebook
path: modules/sagemaker/sagemaker-notebook
parameters:
  - name: notebook_name
    value: dummy
  - name: instance_type
    value: ml.t2.xlarge
  - name: subnet_ids # Optional parameter, you can remove it safely
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: PrivateSubnetIds
  - name: vpc_id # Optional parameter, you can remove it safely
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: VpcId
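
As a rough illustration of how these manifest parameters reach the module: the settings classes in this PR read environment variables with a `SEEDFARMER_PARAMETER_` prefix. The sketch below assumes that each parameter name is simply upper-cased after that prefix and that non-string values (such as the list of subnet IDs) arrive JSON-encoded. `manifest_params_to_env` and the subnet IDs are hypothetical and not part of SeedFarmer.

```python
import json
from typing import Any, Dict

ENV_PREFIX = "SEEDFARMER_PARAMETER_"  # prefix read by the module's settings class


def manifest_params_to_env(params: Dict[str, Any]) -> Dict[str, str]:
    """Map resolved manifest parameters to the env vars the module reads.

    Hypothetical helper for illustration only: strings pass through
    unchanged, any other value is JSON-encoded.
    """
    env: Dict[str, str] = {}
    for name, value in params.items():
        key = ENV_PREFIX + name.upper()
        env[key] = value if isinstance(value, str) else json.dumps(value)
    return env


# Values as they might look after `valueFrom` references are resolved
# (the subnet IDs are placeholders).
resolved = {
    "notebook_name": "dummy",
    "instance_type": "ml.t2.xlarge",
    "subnet_ids": ["subnet-aaa", "subnet-bbb"],
}
env = manifest_params_to_env(resolved)
print(env["SEEDFARMER_PARAMETER_SUBNET_IDS"])
```

Encoding lists as JSON matches how pydantic-settings parses complex field types from environment variables, which is what lets `subnet_ids` be declared as `Optional[List[str]]` in the settings class.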
57 changes: 57 additions & 0 deletions modules/sagemaker/sagemaker-notebook/README.md
@@ -0,0 +1,57 @@
# SageMaker Notebook Instance

## Description

This module creates a SageMaker Notebook instance.

### Architecture

![SageMaker Notebook Instance Architecture](docs/_static/architecture.drawio.png "SageMaker Notebook Instance Architecture")

## Inputs/Outputs

### Input Parameters

#### Required

- `notebook_name`: The name of the new notebook instance. Must be at most 50 characters; the module appends a `-<unix timestamp>` suffix to it
- `instance_type`: The type of ML compute instance to launch for the notebook instance

#### Optional

- `direct_internet_access`: Sets whether SageMaker provides internet access to the notebook instance, by default `Enabled`
- `root_access`: Whether root access is enabled or disabled for users of the notebook instance, by default `Disabled`
- `volume_size_in_gb`: The size, in GB, of the ML storage volume to attach to the notebook instance, by default None
- `imds_version`: The Instance Metadata Service (IMDS) version, by default `2`
- `subnet_ids`: A list of subnet IDs in the VPC to connect the notebook instance to, by default None. Only the first subnet ID is used.
- `vpc_id`: The ID of the VPC to connect the notebook instance to, by default None
- `kms_key_arn`: The ARN of an AWS KMS key that SageMaker uses to encrypt data on the attached storage volume, by default None
- `code_repository`: The Git repository associated with the notebook instance as its default code repository, by default None
- `additional_code_repositories`: An array of up to three Git repositories associated with the notebook instance, by default None
- `role_arn`: The ARN of an IAM role that SageMaker assumes to perform tasks on your behalf, by default None
- `tags`: Extra tags to apply to the SageMaker notebook instance, by default None

### Sample manifest declaration

```yaml
name: notebook
path: modules/sagemaker/sagemaker-notebook
targetAccount: primary
parameters:
  - name: notebook_name
    value: dummy123
  - name: instance_type
    value: ml.t2.xlarge
```

### Module Metadata Outputs

- `SageMakerNotebookArn`: the SageMaker Notebook instance ARN.

#### Output Example

```json
{
  "SageMakerNotebookArn": "arn:aws:sagemaker:xxxxxxx:123412341234:notebook-instance/xxxxx"
}
```
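
A downstream consumer of this metadata may want the region, account, or instance name rather than the whole ARN. The following is a minimal sketch that splits the value using the standard ARN layout (`arn:partition:service:region:account-id:resource`). `NotebookArn`, `parse_notebook_arn`, and the sample ARN are hypothetical and not part of this module.

```python
from typing import NamedTuple


class NotebookArn(NamedTuple):
    """Components of a SageMaker notebook instance ARN."""

    partition: str
    region: str
    account_id: str
    notebook_name: str


def parse_notebook_arn(arn: str) -> NotebookArn:
    """Split a notebook instance ARN into its components.

    Hypothetical helper for illustration; raises ValueError for anything
    that is not a SageMaker notebook-instance ARN.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"not a SageMaker ARN: {arn!r}")
    resource_type, _, name = parts[5].partition("/")
    if resource_type != "notebook-instance" or not name:
        raise ValueError(f"not a notebook instance ARN: {arn!r}")
    return NotebookArn(parts[1], parts[3], parts[4], name)


# Sample value only; region and name are placeholders.
parsed = parse_notebook_arn(
    "arn:aws:sagemaker:us-east-1:123412341234:notebook-instance/dummy123-1709200000"
)
print(parsed.region, parsed.notebook_name)
```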
25 changes: 25 additions & 0 deletions modules/sagemaker/sagemaker-notebook/app.py
@@ -0,0 +1,25 @@
#!/usr/bin/env python3
"""Create a SageMaker Notebook Instance stack."""
import aws_cdk as cdk

from sagemaker_notebook.settings import ApplicationSettings
from sagemaker_notebook.stack import SagemakerNotebookStack

# Load application settings from env vars.
app_settings = ApplicationSettings()

env = cdk.Environment(
    account=app_settings.default.account,
    region=app_settings.default.region,
)

app = cdk.App()

stack = SagemakerNotebookStack(
    scope=app,
    construct_id=app_settings.settings.app_prefix,
    env=env,
    **app_settings.parameters.model_dump(),
)

app.synth()
3 changes: 3 additions & 0 deletions modules/sagemaker/sagemaker-notebook/coverage.ini
@@ -0,0 +1,3 @@
[run]
omit =
    tests/*
26 changes: 26 additions & 0 deletions modules/sagemaker/sagemaker-notebook/deployspec.yaml
@@ -0,0 +1,26 @@
publishGenericEnvVariables: true
deploy:
  phases:
    install:
      commands:
        - env
        # Install whatever additional build libraries
        - npm install -g [email protected]
        - pip install -r requirements.txt
    build:
      commands:
        # execute the CDK
        - cdk deploy --require-approval never --progress events --app "python app.py" --outputs-file ./cdk-exports.json
        # Export metadata
        - seedfarmer metadata convert -f cdk-exports.json || true
destroy:
  phases:
    install:
      commands:
        # Install whatever additional build libraries
        - npm install -g [email protected]
        - pip install -r requirements.txt
    build:
      commands:
        # execute the CDK
        - cdk destroy --force --app "python app.py"
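
The deploy phase writes stack outputs to `cdk-exports.json` via `--outputs-file`. CDK structures that file as a JSON object keyed by stack name, with each stack mapping output names to string values. The sketch below reads such a file and looks up one output. It assumes the ARN is exposed as a top-level output named `SageMakerNotebookArn`, which this diff does not confirm (the stack file is not shown). The stack name, sample ARN, and helper functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional

StackOutputs = Dict[str, Dict[str, str]]  # {stack name: {output name: value}}


def load_stack_outputs(path: Path) -> StackOutputs:
    """Read a file written by `cdk deploy --outputs-file`."""
    return json.loads(path.read_text())


def find_output(outputs: StackOutputs, key: str) -> Optional[str]:
    """Return the first output called `key` in any stack, or None."""
    for stack_outputs in outputs.values():
        if key in stack_outputs:
            return stack_outputs[key]
    return None


# Sample file content; the stack name and ARN are placeholders.
sample = {
    "proj-dev-notebook": {
        "SageMakerNotebookArn": "arn:aws:sagemaker:us-east-1:123412341234:notebook-instance/dummy"
    }
}
with tempfile.TemporaryDirectory() as tmp:
    exports_path = Path(tmp) / "cdk-exports.json"
    exports_path.write_text(json.dumps(sample))
    outputs = load_stack_outputs(exports_path)

notebook_arn = find_output(outputs, "SageMakerNotebookArn")
print(notebook_arn)
```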
@@ -0,0 +1 @@
<mxfile host="Electron" modified="2024-02-27T20:02:21.834Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/20.2.3 Chrome/102.0.5005.167 Electron/19.0.11 Safari/537.36" etag="IxBhroUwZb-r2ZDYP9zm" version="20.2.3" type="device"><diagram id="nOkpBZcZqbjnXywVXBLF" name="Page-1">7Vttc6JIEP41flyL4UX0I77tpi7Zym2ubm8/WSOMOBtkLBiN3q+/HhhQhjHGqNHbNUmlmGZeoPt5unsaaFi92epzgufTBxaQqGEawaph9Rum2W4Z8F8I1rnAaUtBmNAgF6GN4In+S6Sw6LagAUkrHTljEafzqtBncUx8XpHhJGEv1W4TFlVXneOQ1ARPPo7q0u804FN5W46xkX8hNJwWKyNDnpnhorMUpFMcsJctkTVoWL2EMZ4fzVY9EgndFXrJxw13nC0vLCExf8uAn3+G/qO3nHhWt33fCvhqYfz4JGdZ4mghb9j7/gSCXsQWgbxuvi6UMWc05plCnS78wXo9o+HAmZ5oNU1HEahttypA9ZaYoypQ225VgNTpkbI+Ui9wS1BrVaY3lPWNrQuEP6vLFjyiMemV0DNAGCY4oGCSHotYArKYxaC97pTPImghOHyZUk6e5tgXWn0B1oBswmIuwY/Moi0VL2YFeHMMayVyjswSJBksSW6QvE8U4XlKx+WohPiLJKVL8o2k+eRCCkCci+PZKhSUbeKX1G6GCVvMs8u/g7W0Z0dwOPIFMEY44mIinrBnUtxow7TgdyjA153QKFIUsCQJp8ArL6KhmJ8zsRyWrYhMshlBKzQO77NW3zKkJnRLBDidkkDekkQxLEFWO+mBStKBsyJsRniyhi7FgIK40lHZsvmyYb1d+K7pFuPtwlNh6WnCcuoNGeFA8vEAbpo1bv792Ltx8sbJbU4u576WiXa7jeyzMtHzum63/S4m7g5FO+lpVtlpadhpa9hpts7FTqvGzseELjEngp6LcUz4jao3qm5TNRXTUb4ebTo/ZbwtJlZJjGx30PUUEoN80Bqaw/bpmFyucyImm0cy2XQ0TEbtczHZqTG5xlyNF61bq+22WtZOq6iQ1OU1/aPUbh2rdqRTu30utbdqan+CzdQDfgaWmsZXxsmYsWc4vItTjmOgu2qV9Jlwfyr1qXWuWgerc7JaR1t3tpVumfvTrKAKdTK3LkT1boXHrAt1Ml14UEcjzWikjN7tnFW/USBeddpwzva6ZsfbOten4E45zZxjzBIBOtWvIdTqg/E03JpkP6qHKoh2j8ckemQpldPPaBBEukynPKG4yK3osi+S4HSe62NCV+JC9OEgISlbJD7JgwEEklQXFlLA+yzDu9WNlHtIcgoq3tu5rINwNAmWcyb30K5XJmb4X9CNaQy6TzdfcGW+oNyXa3xBy/Asyz3MF5iuC+7geF8wZpyz2d5cySciH7ycKyARTuHyRuOI+c+jlLOEbMh/dEL2OvGRU2U+Muz65kpX+ThXZtB5hfq9enJ2o/71Un/otgeGfRj1+4bTQ+7vQ33f/DCqt66M6ahe48yfP9x5D/D/G4v2ZP1H4VPu5BT4dYdGWxN5ZOerQ55aykA70ciiD4wphts03QrYHKeGNctp2rqc8mxwq2/1s5K6MYgDGTg+Gm1233Q979dDGyk1CoNIQsGAJHmqYO9USHy93ISKHYqEobZ0/JE+z9RUjhfjiPq3wnE1tbkVjt9bONY+/RGF4w64ptbpCsflOh9TOO7srVAgS8Pk81WOTbtG5a/eXyD4jDl5wetbNDlRNIkxH4VSp0cB7PWsxa6GivaFIwVyawAiQUgKlYrdEwtZjKPBRtoFFxEHpZU3fe6ZsF+m4p+E87U0F
F5wVrUiWVH+jxjedGTrx9aZ/krOnDXWFcqLi3sH4eEGs23SK/2kXThOQvLqM2RHb9+ERJiDP65cnc5ecuijAOZWDqEAA3UUk+c3IEdtrO4liXACZbdNjvm2dWqvdB3WHw7yK9hAsNTJEU6vdQlUbvuW1x5FpIAR7okX/xpbj+pANqTiNq8KrXn4OAKtxwWv+t77TnjjLAG9RbATRzAqVbsVxj5m/2NZ17b/af/m/mO/X9hh0SOjmOUoUcy13xTF9k5Ue8kqv8PaRKeKQKhzCQS9BRll7rSVOW3yKH3uJBqPRYXkSIShN0YotCMPPhJhZrEhKnZoDmp2tn/eh7cSqMW0aoHw3HirV2zkW/IsgAg1m9F6yeb2UEot71zPQym778LJxkEPpeDH7nZ0xZZf8qGUD8j2c2Sf8Z1epPDaMi+bmqD6O2mfKf+yGNfpLVVLZ9n3Q/vNnb3f08X+c5gFogI/AZngRfYxxQ48qDbNFvQKqVFI4HjKufgmyhMqMId+EH+iYNj00zwOmxPh1qEFdp3BSUc8dxsK+sE/SH6ac5GGns3O5ScSV2Pni6SghyQQRbnlahMIQ2/xYwstxnkSCOfCCYSxI4H44+H2Jtv/KXMYOrZrdw7LHHoustDwt8kcnsl6NMOxeLUVrmSUkmRJhSs5W3RRNweXDi7mRXan11DfKL7P3lulb50leFhutSxR8/InqtLX1nFMBTQHVd2hufkIPO+++ZLeGvwH</diagram></mxfile>
35 changes: 35 additions & 0 deletions modules/sagemaker/sagemaker-notebook/pyproject.toml
@@ -0,0 +1,35 @@
[tool.black]
line-length = 120
target-version = ["py36", "py37", "py38"]
exclude = '''
/(
    \.eggs
  | \.git
  | \.hg
  | \.mypy_cache
  | \.tox
  | \.venv
  | \.env
  | _build
  | buck-out
  | build
  | dist
  | codeseeder.out
)/
'''

[tool.isort]
multi_line_output = 3
include_trailing_comma = true
force_grid_wrap = 0
use_parentheses = true
ensure_newline_before_comments = true
line_length = 120
py_version = 36
skip_gitignore = false

[tool.pytest.ini_options]
addopts = "-v --cov=. --cov-report term --cov-config=coverage.ini --cov-fail-under=80"
pythonpath = [
    "."
]
6 changes: 6 additions & 0 deletions modules/sagemaker/sagemaker-notebook/requirements.txt
@@ -0,0 +1,6 @@
aws-cdk-lib~=2.126.0
cdk-nag~=2.28.27
constructs~=10.3.0
pydantic~=2.5.3
pydantic-settings~=2.0.3
configobj~=5.0.8
Empty file.
@@ -0,0 +1,93 @@
"""Defines the stack settings."""

import time
from abc import ABC
from typing import Dict, List, Optional

from pydantic import Field, computed_field, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class CdkBaseSettings(BaseSettings, ABC):
    """Defines common configuration for settings."""

    model_config = SettingsConfigDict(
        case_sensitive=False,
        env_nested_delimiter="__",
        protected_namespaces=(),
        extra="ignore",
        populate_by_name=True,
    )


class SeedFarmerParameters(CdkBaseSettings):
Contributor review comment: I love this!

    """Seedfarmer Parameters.

    These parameters are required for the module stack.
    """

    model_config = SettingsConfigDict(env_prefix="SEEDFARMER_PARAMETER_")

    notebook_name: str
    instance_type: str

    direct_internet_access: str = Field(default="Enabled")
    root_access: str = Field(default="Disabled")
    volume_size_in_gb: Optional[int] = Field(default=None)
    imds_version: str = Field(default="2")
    subnet_ids: Optional[List[str]] = Field(default=None)
    code_repository: Optional[str] = Field(default=None)
    additional_code_repositories: Optional[List[str]] = Field(default=None)
    vpc_id: Optional[str] = Field(default=None)
    kms_key_arn: Optional[str] = Field(default=None)
    role_arn: Optional[str] = Field(default=None)
    tags: Optional[Dict[str, str]] = Field(default=None)

    @field_validator("notebook_name")
    @classmethod
    def validate_name_length(cls, v: str) -> str:
        """Validate the notebook_name length and append a timestamp suffix."""
        if len(v) <= 50:
            # Append the current epoch seconds to the validated name.
            return f"{v}-{int(time.time())}"

        raise ValueError(f"'notebook_name' length must be <= 50, got '{len(v)}'")


class SeedFarmerSettings(CdkBaseSettings):
    """Seedfarmer Settings.

    These parameters come from seedfarmer by default.
    """

    model_config = SettingsConfigDict(env_prefix="SEEDFARMER_")

    project_name: str = Field(default="")
    deployment_name: str = Field(default="")
    module_name: str = Field(default="")

    @computed_field  # type: ignore
    @property
    def app_prefix(self) -> str:
        """Application prefix."""
        prefix = "-".join([self.project_name, self.deployment_name, self.module_name])
        return prefix


class CdkDefaultSettings(CdkBaseSettings):
    """CDK Default Settings.

    These parameters come from AWS CDK by default.
    """

    model_config = SettingsConfigDict(env_prefix="CDK_DEFAULT_")

    account: str
    region: str


class ApplicationSettings(CdkBaseSettings):
    """Application settings."""

    settings: SeedFarmerSettings = Field(default_factory=SeedFarmerSettings)
    parameters: SeedFarmerParameters = Field(default_factory=SeedFarmerParameters)
    default: CdkDefaultSettings = Field(default_factory=CdkDefaultSettings)
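
To make the two derived values in this file easier to reason about, here is a dependency-free sketch of the same logic: the `notebook_name` validator (length check plus timestamp suffix) and the `app_prefix` join. It uses plain functions as a stand-in for the pydantic models. `suffixed_notebook_name`, the injectable `now` argument, and the sample values are hypothetical and exist only for illustration.

```python
import time
from typing import Optional

MAX_NAME_LENGTH = 50  # same limit the validator enforces


def suffixed_notebook_name(name: str, now: Optional[float] = None) -> str:
    """Mirror the validator: reject long names, then append epoch seconds.

    `now` is a hypothetical hook that makes the result reproducible here.
    """
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"'notebook_name' length must be <= {MAX_NAME_LENGTH}, got '{len(name)}'")
    timestamp = int(time.time() if now is None else now)
    return f"{name}-{timestamp}"


def app_prefix(project_name: str = "", deployment_name: str = "", module_name: str = "") -> str:
    """Mirror the computed field: join the three names with hyphens."""
    return "-".join([project_name, deployment_name, module_name])


print(suffixed_notebook_name("dummy", now=1709200000))  # dummy-1709200000
print(app_prefix("proj", "dev", "notebook"))  # proj-dev-notebook
print(repr(app_prefix()))  # '--'
```

Two behaviours stand out. Because the suffix comes from the current time, every settings load produces a different `notebook_name`, which may matter on redeploys. And with all three SeedFarmer names left at their empty defaults, the prefix collapses to `--`, so the stack ID is only meaningful when those environment variables are set.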