Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement SLASensor #177

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Conversation

chaelant
Copy link

@chaelant chaelant commented Nov 1, 2024

Description

This implements an SLASensor to monitor the on-time performance of Databricks workflows. It is a special instance of the WorkflowTaskDependencySensor that accepts an expected SLA time in the form of a UTC datetime object or a quartz cron expression in the workflow tags. The user provides the target workflow name and the task to monitor in that workflow that would indicate the workflow has completed. If the workflow does not finish by the specified time, an alert will be sent to a user-provided email and/or slack channel.

Related Issue

This PR closes #166.

Motivation and Context

Current alerting functionality only allows for alerts to be sent if a workflow runs longer than a specified amount of time. This change adds the functionality to be able to be alerted if a workflow does not finish by a certain time, regardless of when the workflow was kicked off.

How Has This Been Tested?

These changes were tested with make test and by building a wheel file and testing within a workflow. This implementation does not affect other features of brickflow.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Signed-off-by: Michael Espiritu <[email protected]>
databricks_token=api_token_key,
custom_description="message to provide additional context",
slack_webhook_url="https://hooks.slack.com/your/webhook/url",
email_params={
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make email_params in the class initialization as a pydantic object or a type object to validate the email_params.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @asingamaneni - updated sla_sensor.py to have an EmailParams pydantic object and use that to validate the email params passed to the SLASensor in the initialization. Will throw a ValidationError if missing required fields

Copy link

codecov bot commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.08%. Comparing base (35d0824) to head (cdf9a01).
Report is 26 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #177      +/-   ##
==========================================
+ Coverage   90.33%   92.08%   +1.75%     
==========================================
  Files          22       22              
  Lines        3384     4322     +938     
==========================================
+ Hits         3057     3980     +923     
- Misses        327      342      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Implement an SLA Miss Sensor for workflows
2 participants