Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sub]feat: refactor image transfer and add tests #759

Closed
wants to merge 32 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
2b739f3
feat: decouple image builder from worker
SdgJlbl Jul 11, 2023
be9f486
fix: update skaffold config
guilhem-barthes Jul 18, 2023
4cc02e8
feat: add `ServiceAccount` and modify role
guilhem-barthes Jul 19, 2023
5b9bd8c
fix: improve `wait_for_image_built`
guilhem-barthes Jul 19, 2023
5051de2
feat: build image in new pod
guilhem-barthes Jul 19, 2023
187b5b5
chore: rename `deployment-builder.yaml` to `stateful-builder.yaml`
guilhem-barthes Jul 19, 2023
d857ddf
chore: rename `stateful-builder.yaml` to `statefulset-builder.yaml`
guilhem-barthes Jul 19, 2023
8f8f688
chore: centralize params
guilhem-barthes Jul 19, 2023
a3ecb53
feat: create `BuildTask`
guilhem-barthes Jul 20, 2023
91967cf
feat: move some values to `builder` module
guilhem-barthes Jul 20, 2023
d25c65b
feat: move more code to `builder`
guilhem-barthes Jul 24, 2023
1f326ff
fix: remove TaskProfiling as Celery task + save Entrypoint in DB
SdgJlbl Jul 31, 2023
307df7c
fix: extract entrypoint from registry
SdgJlbl Aug 1, 2023
331ff59
fix: make doc for helm chart
SdgJlbl Aug 3, 2023
a0aa2cb
feat: build function at registration (#707)
guilhem-barthes Aug 9, 2023
4b1453a
feat: share images between backends (#708)
SdgJlbl Aug 17, 2023
08015a6
chore: release 0.40.0-alpha.1 (#710)
ThibaultFy Aug 21, 2023
8e3219d
chore: update helm worklfow
ThibaultFy Aug 21, 2023
a630ba5
release: 0.40.0-alpha.2
ThibaultFy Aug 21, 2023
4b5c58b
chore: add .DS_Store to gitignore
ThibaultFy Aug 22, 2023
16ac6b7
chore: rm DS_Store
ThibaultFy Aug 22, 2023
ef2ec80
chore: release 0.40.0-alpha.3
ThibaultFy Aug 22, 2023
ba328be
chore: rm .DS_Store
ThibaultFy Aug 22, 2023
0484a01
[sub]fix: add missing migration poc (#728)
guilhem-barthes Sep 7, 2023
62d93e8
[sub]feat: add function events (#714)
guilhem-barthes Sep 19, 2023
b8c8032
[sub]fix(app/orchestrator/resources): FunctionStatus.FUNCTION_STATUS_…
thbcmlowk Sep 29, 2023
82b5e4e
fix: builder using builder SA (#754)
guilhem-barthes Oct 6, 2023
d52438e
fix: rebase changelog
guilhem-barthes Oct 6, 2023
2eef17a
chore: clean image transfer and add tests
ThibaultFy Oct 9, 2023
d6aa882
chore: to pydantic v2
ThibaultFy Oct 9, 2023
125f6a1
chore: launch serial test on a separate process
ThibaultFy Oct 9, 2023
5987add
test: add serial marker
ThibaultFy Oct 12, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion .github/workflows/helm.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ on:
- "main"
paths:
- "charts/**"
release:
types: [published, edited]

concurrency:
group: "${{ github.workflow }}-${{ github.ref }}"
Expand Down Expand Up @@ -85,7 +87,7 @@ jobs:
publish:
name: Publish
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
if: github.ref == 'refs/heads/main' && github.event_name == 'push' || github.event_name == 'release'
needs:
- test
- generate-chart-readme
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ wheels/
.installed.cfg
*.egg
MANIFEST
.DS_Store

# PyInstaller
# Usually these files are written by a python script from a template
Expand Down
12 changes: 12 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Field `asset_type` on `AssetFailureReport` (based on protobuf enum `orchestrator.FailedAssetKind`) ([#727](https://github.com/Substra/substra-backend/pull/727))
- Celery task `FailableTask` that contains the logic to store the failure report, that can be re-used in different assets. ([#727](https://github.com/Substra/substra-backend/pull/727))
- Add `FunctionStatus` enum ([#714](https://github.com/Substra/substra-backend/pull/714))
- BREAKING: Add `status` on `api.Function` (type `FunctionStatus`) ([#714](https://github.com/Substra/substra-backend/pull/714))

### Changed

- `ComputeTaskFailureReport` renamed in `AssetFailureReport` ([#727](https://github.com/Substra/substra-backend/pull/727))
- Field `AssetFailureReport.compute_task_key` renamed to `asset_key` ([#727](https://github.com/Substra/substra-backend/pull/727))

## [0.42.1](https://github.com/Substra/substra-backend/releases/tag/0.42.1) 2023-10-06

### Added
Expand Down
6 changes: 4 additions & 2 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,9 @@ db: ## Set up test database

.PHONY: test
test: ## Run tests
cd backend && pytest --cov-report=
cd backend && \
pytest -n auto --cov-report= -m "not serial" && \
pytest --cov-report= -m "serial"

.PHONY: coverage
coverage: test ## Report test coverage
Expand All @@ -32,7 +34,7 @@ format: ## Format code
lint: ## Perform a static analysis of the code
flake8 $(SRC_DIRS)
bandit --ini=.bandit
mypy backend/substrapp/tasks/
mypy

.PHONY: shell
shell: ## Start a Python shell for the Django project
Expand Down
30 changes: 26 additions & 4 deletions backend/api/events/sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
from api.serializers import PerformanceSerializer
from orchestrator import client as orc_client
from orchestrator import computetask
from orchestrator import failure_report_pb2

logger = structlog.get_logger(__name__)

Expand Down Expand Up @@ -85,13 +86,19 @@ def _create_function(channel: str, data: dict) -> None:
def _on_update_function_event(event: dict) -> None:
"""Process update function event to update local database."""
logger.debug("Syncing function update", asset_key=event["asset_key"], event_id=event["id"])
_update_function(key=event["asset_key"], data=event["function"])
function = event["function"]
_update_function(key=event["asset_key"], name=function["name"], status=function["status"])


def _update_function(key: str, data: dict) -> None:
def _update_function(key: str, *, name: Optional[str] = None, status: Optional[str] = None) -> None:
"""Process update function event to update local database."""
function = Function.objects.get(key=key)
function.name = data["name"]

if name:
function.name = name
if status:
function.status = status

function.save()


Expand Down Expand Up @@ -376,7 +383,22 @@ def _disable_model(key: str) -> None:
def _on_create_failure_report(event: dict) -> None:
"""Process create failure report event to update local database."""
logger.debug("Syncing failure report create", asset_key=event["asset_key"], event_id=event["id"])
_update_computetask(key=event["asset_key"], failure_report=event["failure_report"])

asset_key = event["asset_key"]
failure_report = event["failure_report"]
asset_type = failure_report_pb2.FailedAssetKind.Value(failure_report["asset_type"])

if asset_type == failure_report_pb2.FAILED_ASSET_FUNCTION:
# Needed as this field is only in ComputeTask
compute_task_keys = ComputeTask.objects.values_list("key", flat=True).filter(
function_id=asset_key,
status__in=[ComputeTask.Status.STATUS_TODO.value, ComputeTask.Status.STATUS_DOING.value],
)

for task_key in compute_task_keys:
_update_computetask(key=str(task_key), failure_report={"error_type": failure_report.get("error_type")})
else:
_update_computetask(key=asset_key, failure_report=failure_report)


EVENT_CALLBACKS = {
Expand Down
30 changes: 30 additions & 0 deletions backend/api/migrations/0053_function_status.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
# Generated by Django 4.2.3 on 2023-08-23 13:18

from django.db import migrations
from django.db import models


class Migration(migrations.Migration):
dependencies = [
("api", "0052_remove_metric_from_performance"),
]

operations = [
migrations.AddField(
model_name="function",
name="status",
field=models.CharField(
choices=[
("FUNCTION_STATUS_UNKNOWN", "Function Status Unknown"),
("FUNCTION_STATUS_CREATED", "Function Status Created"),
("FUNCTION_STATUS_BUILDING", "Function Status Building"),
("FUNCTION_STATUS_READY", "Function Status Ready"),
("FUNCTION_STATUS_CANCELED", "Function Status Canceled"),
("FUNCTION_STATUS_FAILED", "Function Status Failed"),
],
default="FUNCTION_STATUS_UNKNOWN",
max_length=64,
),
preserve_default=False,
),
]
28 changes: 28 additions & 0 deletions backend/api/migrations/0054_alter_function_status.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
# Generated by Django 4.2.3 on 2023-09-27 16:09

from django.db import migrations
from django.db import models


class Migration(migrations.Migration):
dependencies = [
("api", "0053_function_status"),
]

operations = [
migrations.AlterField(
model_name="function",
name="status",
field=models.CharField(
choices=[
("FUNCTION_STATUS_UNKNOWN", "Function Status Unknown"),
("FUNCTION_STATUS_WAITING", "Function Status Waiting"),
("FUNCTION_STATUS_BUILDING", "Function Status Building"),
("FUNCTION_STATUS_READY", "Function Status Ready"),
("FUNCTION_STATUS_CANCELED", "Function Status Canceled"),
("FUNCTION_STATUS_FAILED", "Function Status Failed"),
],
max_length=64,
),
),
]
6 changes: 6 additions & 0 deletions backend/api/models/function.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import orchestrator.common_pb2 as common_pb2
from api.models.utils import AssetPermissionMixin
from api.models.utils import URLValidatorWithOptionalTLD
from orchestrator import function_pb2


class FunctionInput(models.Model):
Expand Down Expand Up @@ -43,6 +44,10 @@ class Meta:
class Function(models.Model, AssetPermissionMixin):
"""Function represent a function and its associated metadata"""

Status = models.TextChoices(
"Status", [(status_name, status_name) for status_name in function_pb2.FunctionStatus.keys()]
)

key = models.UUIDField(primary_key=True)
name = models.CharField(max_length=100)
description_address = models.URLField(validators=[URLValidatorWithOptionalTLD()])
Expand All @@ -57,6 +62,7 @@ class Function(models.Model, AssetPermissionMixin):
creation_date = models.DateTimeField()
metadata = models.JSONField()
channel = models.CharField(max_length=100)
status = models.CharField(max_length=64, choices=Status.choices)

class Meta:
ordering = ["creation_date", "key"] # default order for relations serializations
1 change: 1 addition & 0 deletions backend/api/serializers/function.py
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ class Meta:
"permissions",
"inputs",
"outputs",
"status",
]

def to_representation(self, instance):
Expand Down
37 changes: 28 additions & 9 deletions backend/api/tests/asset_factory.py
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,7 @@

import datetime
import uuid
from typing import Optional

from django.core import files
from django.utils import timezone
Expand All @@ -80,9 +81,10 @@
from api.models import Model
from api.models import Performance
from api.models import TaskProfiling
from substrapp.models import ComputeTaskFailureReport as ComputeTaskLogs
from substrapp.models import AssetFailureReport
from substrapp.models import DataManager as DataManagerFiles
from substrapp.models import DataSample as DataSampleFiles
from substrapp.models import FailedAssetKind
from substrapp.models import Function as FunctionFiles
from substrapp.models import Model as ModelFiles
from substrapp.utils import get_hash
Expand Down Expand Up @@ -236,6 +238,7 @@ def create_function(
creation_date=timezone.now(),
owner=owner,
channel=channel,
status=Function.Status.FUNCTION_STATUS_WAITING,
**get_permissions(owner, public),
)

Expand Down Expand Up @@ -534,20 +537,36 @@ def create_model_files(
return model_files


def create_computetask_logs(
compute_task_key: uuid.UUID,
logs: files.File = None,
) -> ComputeTaskLogs:
def create_asset_logs(
asset_key: uuid.UUID,
asset_type: FailedAssetKind,
logs: Optional[files.File] = None,
) -> AssetFailureReport:
if logs is None:
logs = files.base.ContentFile("dummy content")

compute_task_logs = ComputeTaskLogs.objects.create(
compute_task_key=compute_task_key,
asset_logs = AssetFailureReport.objects.create(
asset_key=asset_key,
asset_type=asset_type,
logs_checksum=get_hash(logs),
creation_date=timezone.now(),
)
compute_task_logs.logs.save("logs", logs)
return compute_task_logs
asset_logs.logs.save("logs", logs)
return asset_logs


def create_computetask_logs(
compute_task_key: uuid.UUID,
logs: Optional[files.File] = None,
) -> AssetFailureReport:
return create_asset_logs(compute_task_key, FailedAssetKind.FAILED_ASSET_COMPUTE_TASK, logs)


def create_function_logs(
function_key: uuid.UUID,
logs: Optional[files.File] = None,
) -> AssetFailureReport:
return create_asset_logs(function_key, FailedAssetKind.FAILED_ASSET_FUNCTION, logs)


def create_computetask_profiling(compute_task: ComputeTask) -> TaskProfiling:
Expand Down
1 change: 0 additions & 1 deletion backend/api/tests/views/test_views_computetask.py
Original file line number Diff line number Diff line change
Expand Up @@ -259,7 +259,6 @@ class GenericTaskViewTests(ComputeTaskViewTests):
def setUp(self):
super().setUp()
self.url = reverse("api:task-list")
self.maxDiff = None

todo_task = self.compute_tasks[ComputeTask.Status.STATUS_TODO]
waiting_task = self.compute_tasks[ComputeTask.Status.STATUS_WAITING]
Expand Down
Loading