feat: add draft next-gen Python workflows #25

Draft · wants to merge 1 commit into master
3 changes: 3 additions & 0 deletions draft/python/README.md
@@ -0,0 +1,3 @@
# Advanced Python workflows

**Keep these outside of `.github` to ensure they cannot run by mistake.**
292 changes: 292 additions & 0 deletions draft/python/_py-test.yaml
@@ -0,0 +1,292 @@
# SPDX-FileCopyrightText: Copyright 2024, the RCSB PDB, University of California, Rutgers University, and contributors
# SPDX-FileCopyrightText: Copyright 2020-2024, Contributors to CICD
# SPDX-PackageHomePage: https://github.com/dmyersturnbull/cicd
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright 2020-2023, Contributors to Tyrannosaurus
# SPDX-PackageHomePage: https://github.com/dmyersturnbull/tyrannosaurus
# SPDX-License-Identifier: Apache-2.0

# This workflow is designed to run various types of tests and analyses on the codebase.
# It is triggered from other workflows using `workflow_call`, with specific ref inputs.
# Different jobs cover tests across multiple operating systems (Ubuntu, Windows, macOS) and Python versions.
**Contributor:** As we do not have CI/CD runners for Windows and macOS environments right now, this may not be possible. Is this a strict need for RCSB Python projects?

**Author:** I definitely think it's ideal for packages we intend for wider audiences. How hard would this be?

**Contributor:** It's not a question of difficulty; it's a question of effort versus value. When we get to a point where we want to start distributing and maintaining packages for more environments, we can evaluate this integration.

# Some jobs are conditional, triggered only by specific comments on pull requests or specific events,
# such as '/please-review', '/please-run-tests', '/please-check-docker-build', etc.
# TODO: Allow tests after failure to run as long as not cancelled
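Since `on:` below declares only `workflow_call`, the comment commands described above must be matched by a *calling* workflow that listens for `issue_comment` events and forwards the ref. A minimal sketch of such a caller, assuming a hypothetical repository path:

```yaml
on:
  issue_comment:
    types: [created]
jobs:
  test:
    if: contains(github.event.comment.body, '/please-run-tests')
    uses: rcsb/cicd/.github/workflows/_py-test.yaml@main  # hypothetical path
    with:
      ref: refs/pull/${{ github.event.issue.number }}/head
```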

name: Test
run-name: >
${{ github.workflow }}
${{ github.ref_name }}
(triggered by ${{ github.ref_type }})

on:
workflow_call:
inputs:
ref:
type: string
description: Ref to check out
default: main

permissions:
contents: read
statuses: write
actions: read
security-events: write

jobs:

preliminary:
name: Start Markdown table
runs-on: ubuntu-latest
**Contributor:** FYI, the current process is to create and define Docker images for runners within the https://github.com/rcsb/devops-runner-images repo, associate them with a label to runners within our Kubernetes cluster (https://github.com/rcsb/rcsb-kube/blob/master/helm/gh-actions-runner-controller/values/west.yaml#L45), then refer to them in jobs by their label. We'll want to avoid or minimize the use of GitHub-hosted runners as much as possible.
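As a sketch, a job bound to such a cluster runner would replace `runs-on: ubuntu-latest` with the label; the label name below is hypothetical:

```yaml
jobs:
  preliminary:
    runs-on: python-runner  # hypothetical label mapped to a devops-runner-images image
```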

**Author:** What would be a good way to generate textual reports (like this Markdown table)?

**Contributor:** This would probably be the easier option: https://github.blog/news-insights/product-news/supercharging-github-actions-with-job-summaries/. Otherwise, building a Markdown report as a build artifact could work, provided you use an IDE or some other tool to view the Markdown content.
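For reference, the artifact alternative might look roughly like the following sketch (the report file name is illustrative):

```yaml
- name: Write Markdown report
  run: printf '| Test | Outcome |\n| --- | --- |\n' > report.md
- name: Upload Markdown report
  uses: actions/upload-artifact@v4
  with:
    name: test-report
    path: report.md
```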

steps:
- run: |
printf '\n| Test | Outcome |\n| --- | --- |\n' >> "${GITHUB_STEP_SUMMARY}"

pytest:
# This job runs unit tests on multiple operating systems (Ubuntu, Windows, macOS) with Python 3.12.
# It is triggered by pull requests, specific comments ('/please-review', '/please-run-tests'),
# or when called directly from another workflow.
# The job uploads coverage reports as artifacts for each combination of OS and Python version.
if: >
github.event_name == 'pull_request' ||
github.event.comment.body == '/please-review' ||
github.event.comment.body == '/please-run-tests' ||
github.event_name == 'workflow_call'
strategy:
matrix:
os:
- ubuntu-latest
- windows-latest
- macos-latest
python:
- '3.12'
name: Run tests for ${{ matrix.os }}, Python ${{ matrix.python }}
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.ref || github.ref }}
- uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python }}
**Contributor:** Hmm, do you envision that the pipeline will need to support multiple Python versions within the same run? If so, we'll want to look into something like pyenv within the containerized runner environment, rather than setting up Python each time on each run.
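A rough sketch of the pyenv approach, assuming pyenv and the interpreters are already baked into the runner image:

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Select preinstalled interpreter
    run: |
      # Assumes pyenv and Python ${{ matrix.python }} ship with the runner image.
      pyenv global ${{ matrix.python }}
      python --version
```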

**Author:** No hard objection, and that would undoubtedly be faster, but using only pyenv is limited, and using both a matrix and pyenv gets complicated pretty quickly. For instance, to test both FastAPI v0.x.x and v1.x.x, where v1 requires Python > 3.10:

strategy:
  matrix:
    os:
      - ubuntu-latest
      - windows-latest
      - macos-latest
    python:
      - '3.10'
      - '3.13'
    fastapi-major:
      - '0'
      - '1'
    exclude:
      - python: '3.10'
        fastapi-major: '1'

**Contributor:** If we've reached this point for the FastAPI example, then more likely than not the matrix would be defined within the workflow of the specific project, which then calls the workflow in this library, passing the appropriate inputs for a build of a specific Python version and dependency.
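In other words, the project repository would own the matrix and call this reusable workflow per cell, roughly like the sketch below. The workflow path and the per-version `python` input are hypothetical; the draft currently only accepts `ref`:

```yaml
jobs:
  test:
    strategy:
      matrix:
        python: ['3.10', '3.13']
        fastapi-major: ['0', '1']
        exclude:
          - python: '3.10'
            fastapi-major: '1'
    uses: rcsb/cicd/.github/workflows/_py-test.yaml@main  # hypothetical path
    with:
      ref: ${{ github.ref }}
      # python: ${{ matrix.python }}  # would require adding this input to the draft
```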

- name: Install dependencies and run tests
id: test
run: |
pip install hatch~=1.12
**Contributor:** Related to another comment in this review: package installs will be handled in the Docker runner-image build process. This way, we can avoid running the same package-install commands multiple times across different jobs, and ensure that the versions of the packages used are consistent throughout the pipeline.

**Author:** Checking I understand. Currently, multiple _test.yaml jobs install Python, build the main package (say, pkg), and download and install pkg's dependencies along with some extra packages:

  • pytest (builds pkg; installs dependencies, Hatch, and test frameworks)
  • docker (builds pkg; installs dependencies and Hatch)
  • docs (installs Hatch and doc-building tools)
  • security and codeql also set up Python

The pytest and docker redundancy isn't great. We could run tests inside Docker to avoid it. The downside concerns testing non-containerized RCSB Python packages (i.e., libraries, clients, etc.): currently, excluding a Dockerfile disables that step while tests still run. If we move tests into Docker, things are less straightforward.

As for having to build separately for all 4 jobs: if I understand correctly, you're proposing a solution for this?

**Contributor:** hatch, test frameworks, and doc-building tools would be installed as part of the Python runner image. When the pipeline runs for pkg, it'll pull and use the Python runner image and avoid having to download and install those common build tools. It'll just have to install the pkg-specific dependencies and build pkg.
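Under that model, a job in this workflow might reduce to something like the following sketch (the runner label and preinstalled tooling are hypothetical):

```yaml
pytest:
  runs-on: python-runner  # hypothetical label for the preinstalled Python runner image
  steps:
    - uses: actions/checkout@v4
    - run: hatch run test  # hatch and test frameworks ship with the image
```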

hatch run test && status=0 || status=$?
printf '| Pytest on %s, Python %s | %s |\n' \
"${{ matrix.os }}" "${{ matrix.python }}" \
"$( (( status == 0 )) && echo ✅ || echo ❌ )" \
>> "${GITHUB_STEP_SUMMARY}"
exit "${status}"
- name: Upload coverage report
uses: actions/upload-artifact@v4
with:
name: coverage-${{ matrix.os }}-${{ matrix.python }}
path: coverage.json
if-no-files-found: error
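A downstream job could later collect the per-matrix coverage files, e.g. (the job name is illustrative):

```yaml
coverage-report:
  needs: pytest
  runs-on: ubuntu-latest
  steps:
    - uses: actions/download-artifact@v4
      with:
        pattern: coverage-*
        merge-multiple: true
        path: coverage/
```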

docker:
# This job builds Docker images on Ubuntu and Windows if a Dockerfile is present.
# It is triggered by pull requests, specific comments ('/please-review', '/please-check-docker-build'),
# or when called directly from another workflow.
# macOS is not included due to known limitations with Docker on macOS.
if: >
github.event_name == 'pull_request' ||
github.event.comment.body == '/please-review' ||
github.event.comment.body == '/please-check-docker-build' ||
github.event_name == 'workflow_call'
strategy:
matrix:
os:
- ubuntu-latest
- windows-latest
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.ref || github.ref }}
- name: Check for Dockerfile
id: check-docker
run: |
if [[ -f Dockerfile ]]; then echo "found=0" >> "${GITHUB_ENV}"; else echo "found=1" >> "${GITHUB_ENV}"; fi
- name: Build Docker image
if: env.found == '0'
run: |
docker build . && status=0 || status=$?
printf '| **Docker build** on %s | %s |\n' \
"${{ matrix.os }}" \
"$( (( status == 0 )) && echo ✅ || echo ❌ )" \
>> "${GITHUB_STEP_SUMMARY}"
exit "${status}"

codeql:
# This job runs CodeQL analysis for code quality and security checks.
# It is triggered by pull requests, specific comments ('/please-review', '/please-check-code-quality'),
# or when called directly from another workflow.
if: >
github.event_name == 'pull_request' ||
github.event.comment.body == '/please-review' ||
github.event.comment.body == '/please-check-code-quality' ||
github.event_name == 'workflow_call'
name: Analyze with CodeQL
continue-on-error: true
runs-on: ubuntu-latest
steps:
- name: Check out repository
uses: actions/checkout@v4
with:
ref: ${{ inputs.ref || github.ref }}
- name: Run ruff --no-fix
id: ruff
run: |
pip install hatch~=1.12
hatch run check
- name: Run CodeQL init
uses: github/codeql-action/init@v3
- name: Run CodeQL autobuild
uses: github/codeql-action/autobuild@v3
- name: Run CodeQL analyze
id: codeql
uses: github/codeql-action/analyze@v3
- name: Log Ruff outcome
if: always()
run: |
printf '| **Ruff linting** | %s |\n' \
"$( [[ "${{ steps.ruff.outcome }}" == "success" ]] && echo ✅ || echo ⚠️ )" \
>> "${GITHUB_STEP_SUMMARY}"
- name: Log CodeQL outcome
if: always()
run: |
printf '| **CodeQL analysis** | %s |\n' \
"$( [[ "${{ steps.codeql.outcome }}" == "success" ]] && echo ✅ || echo ⚠️ )" \
>> "${GITHUB_STEP_SUMMARY}"

security:
# This job runs security linters (Bandit) to check for security issues in the codebase.
# It is triggered by pull requests, specific comments ('/please-review', '/please-check-security'),
# or when called directly from another workflow.
if: >
github.event_name == 'pull_request' ||
github.event.comment.body == '/please-review' ||
github.event.comment.body == '/please-check-security' ||
github.event_name == 'workflow_call'
name: Run security linters
runs-on: ubuntu-latest
outputs:
test_result: ${{ steps.bandit.outcome }}
steps:
- name: Check out code
uses: actions/checkout@v4
with:
ref: ${{ inputs.ref || github.ref }}
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Run Bandit security linter
id: bandit
run: |
pip install hatch~=1.12
hatch run check-security && status=0 || status=$?
printf '| **Security checks** | %s |\n' \
"$( (( status == 0 )) && echo ✅ || echo ❌ )" \
>> "${GITHUB_STEP_SUMMARY}"
exit "${status}"

docs:
# This job tests building the documentation to ensure there are no errors.
# It is triggered by pull requests and the comments '/please-review' and '/please-check-docs'.
if: >
github.event_name == 'pull_request' ||
github.event.comment.body == '/please-review' ||
github.event.comment.body == '/please-check-docs' ||
github.event_name == 'workflow_call'
name: Test building docs
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v4
with:
ref: ${{ inputs.ref || github.ref }}
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install dependencies and build docs
id: build_docs
run: |
pip install hatch~=1.12
# Use ⚠️ if the build succeeded but emitted warnings.
mkdocs_output=$(hatch run build-docs 2>&1) && status=0 || status=$?
if (( status != 0 )); then
emoji="❌"
elif grep --quiet "^WARNING " <<< "${mkdocs_output}"; then
emoji="⚠️"
else
emoji="✅"
fi
printf '| **Documentation build** | %s |\n' "${emoji}" >> "${GITHUB_STEP_SUMMARY}"
echo "${mkdocs_output}"
exit "${status}"

links:
# This job runs a link checker (linkspector) to verify that all links in the documentation are valid.
# It is triggered by pull requests and the comments '/please-review' and '/please-check-links'.
# Failures will NOT cause the workflow run to fail.
# Note that it uses Reviewdog and will leave a PR review.
name: Check for broken links
if: >
github.event_name == 'pull_request' ||
github.event.comment.body == '/please-review' ||
github.event.comment.body == '/please-check-links' ||
github.event_name == 'workflow_call'
runs-on: ubuntu-latest
continue-on-error: true
steps:
- name: Check out code
uses: actions/checkout@v4
- name: Run linkspector
id: linkspector
uses: umbrelladocs/action-linkspector@v1
with:
github_token: ${{ secrets.github_token }}
filter_mode: added
reporter: ${{ github.event_name == 'schedule' && 'github-check' || 'github-pr-review' }}
- name: Log
if: always()
run: |
printf '| Markdown links | %s |\n' \
"$( [[ "${{ steps.linkspector.outcome }}" == "success" ]] && echo ✅ || echo ⚠️ )" \
>> "${GITHUB_STEP_SUMMARY}"

quick-test:
# This job performs a quick test run for fast feedback.
# **IMPORTANT:** This job is ONLY triggered via /please-check-quickly !
if: github.event.comment.body == '/please-check-quickly'
name: Quick test on Ubuntu, Python 3.12
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v4
with:
ref: ${{ inputs.ref || github.ref }}
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Run quick tests
run: |
pip install hatch~=1.12
hatch run test && status=0 || status=$?
printf '| Pytest on ubuntu-latest, Python 3.12 | %s |\n' \
"$( (( status == 0 )) && echo ✅ || echo ❌ )" \
>> "${GITHUB_STEP_SUMMARY}"
exit "${status}"
77 changes: 77 additions & 0 deletions draft/python/py-check-pr-metadata.yaml
@@ -0,0 +1,77 @@
# SPDX-FileCopyrightText: Copyright 2024, the RCSB PDB, University of California, Rutgers University, and contributors

# TODO: This is not complete.

name: Validate PR metadata
run-name: ${{ github.workflow }} on ${{ github.ref_name }}

on:
pull_request:
types:
- ready_for_review
- edited
issue_comment:
types:
- created

jobs:

commitizen-check:
runs-on: ubuntu-latest

steps:

- name: Check out repository
uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'

- name: Install Commitizen
run: pip install commitizen~=3.27

- name: Check if comment contains /please-check-pr-metadata
if: github.event_name == 'issue_comment'
id: check-comment
run: |
# Fail the step (and stop the job) unless the comment contains the trigger phrase.
[[ "${{ github.event.comment.body }}" == *"/please-check-pr-metadata"* ]]

- name: Validate proposed commit message with Commitizen
id: validate-commit-message
run: |
commit_title=$(jq --raw-output '.pull_request.title' "${GITHUB_EVENT_PATH}")
commit_body=$(jq --raw-output '.pull_request.body' "${GITHUB_EVENT_PATH}")
commit_message="${commit_title}"$'\n\n'"${commit_body}"
if cz_output=$(cz check --message "${commit_message}" 2>&1); then
printf '**✅** The PR title+description meets commit message requirements.\n\n' \
>> "${GITHUB_STEP_SUMMARY}"
else
{
echo "fail_msg<<EOF"
echo "The PR title+description does not meet commit message requirements: ${cz_output}"
echo "EOF"
} >> "${GITHUB_ENV}"
{
printf '**❌** The PR title+description does not meet commit message requirements:\n\n'
printf '```\n%s\n```\n\n' "${cz_output}"
} >> "${GITHUB_STEP_SUMMARY}"
exit 1
fi

- name: Set validation status
if: always()
uses: actions/github-script@v7
with:
script: |
// `github` and `context` are injected by actions/github-script; no require needed.
const status = '${{ job.status }}' === 'success' ? 'success' : 'failure';
const description = status === 'success'
? 'Validation passed'
: 'Validation failed: ' + (process.env.fail_msg || '');
await github.rest.repos.createCommitStatus({
owner: context.repo.owner,
repo: context.repo.repo,
sha: context.payload.pull_request.head.sha,
state: status,
description: description.slice(0, 140), // commit-status descriptions are length-limited
context: 'PR Metadata Validation'
});