Add Feature for Subprocess based Job Execution #90

Open. Wants to merge 21 commits into base: main
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,27 @@
# Change Log
Note: During the development phase, expect breaking API and YAML schema changes in minor updates. Patch updates are guaranteed to be backward compatible.

## 0.2.0
### API
#### GET /jobs/:jobID/logs
- In the response body, the `container_logs` key is replaced by `process_logs`
#### PUT|POST /processes/:processID
- Request payload schema has changed (See Process YAML Schema changes below)

### Process YAML Schema
- `command` is now a first-class object and has moved out of `container`
- A `config` object is added
- `maxResources` and `envVars` have moved under the `config` object
- `image` has moved under `host`
- The `container` object is removed
- Valid options for `host.type` have changed from `local` | `aws-batch` to `docker` | `aws-batch` | `subprocess`
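
As a rough illustration of the `host.type` change, the sketch below validates a value against the three new options. The function name `validateHostType` is hypothetical and not taken from this PR, and treating `docker` as the successor of the old `local` option is an assumption based on the README wording.

```go
package main

import (
	"errors"
	"fmt"
)

// validateHostType checks a host.type value against the options listed for 0.2.0.
// Hypothetical helper: the name and error messages are not part of the API.
func validateHostType(hostType string) error {
	switch hostType {
	case "docker", "aws-batch", "subprocess":
		return nil
	case "local":
		// Assumption: "docker" replaces the pre-0.2.0 "local" option.
		return errors.New(`host.type "local" is no longer valid; use "docker"`)
	default:
		return fmt.Errorf("unsupported host.type %q", hostType)
	}
}

func main() {
	for _, t := range []string{"docker", "subprocess", "local"} {
		fmt.Printf("%s: %v\n", t, validateHostType(t))
	}
}
```

A check like this would let old configuration files fail with a message that points at the migration, rather than a generic schema error.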

### Features
- Processes of type `subprocess` can now be executed through the API. They must be registered like other processes and are run using OS subprocess calls.

### Documentation
- A `CHANGELOG.md` file is added to the repo.
- Process templates are provided for all three host types in the `./process_templates` folder
- Windows setup instructions are added to `README.md`

## 0.1.0
2 changes: 2 additions & 0 deletions DEV_GUIDE.md
@@ -20,6 +20,8 @@
- Requests from the admin role can retrieve information for all jobs; non-admins can only retrieve information for jobs that they submitted.
- Only admins can add/update/delete processes.

## Inputs
- If the `/execution` payload contains `"Inputs": {}`, nothing is appended to the process command. This allows running processes that do not have any inputs.
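
The rule above can be sketched as a small pure function. This is a minimal sketch, not the PR's code: `buildArgs` is a hypothetical name, and it assumes that the appended argument is simply the inputs encoded as JSON (the real payload may carry more fields).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildArgs returns the process command with the inputs appended as one JSON
// argument, or a copy of the command unchanged when the inputs are empty.
// Hypothetical helper; the shape of the real payload is an assumption.
func buildArgs(command []string, inputs map[string]any) ([]string, error) {
	args := append([]string{}, command...) // copy so the caller's slice is not modified
	if len(inputs) == 0 {
		return args, nil
	}
	payload, err := json.Marshal(inputs)
	if err != nil {
		return nil, fmt.Errorf("encoding inputs: %w", err)
	}
	return append(args, string(payload)), nil
}

func main() {
	noInputs, _ := buildArgs([]string{"python", "run.py"}, map[string]any{})
	withInputs, _ := buildArgs([]string{"python", "run.py"}, map[string]any{"name": "x"})
	fmt.Println(len(noInputs), len(withInputs))
}
```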

## Scope
- Logging behavior is unknown for AWS Batch processes whose job definitions allow more than one attempt.
32 changes: 22 additions & 10 deletions README.md
@@ -4,7 +4,7 @@
[![E2E Tests](https://github.com/dewberry/process-api/actions/workflows/e2e-tests.yml/badge.svg?event=push)](https://github.com/Dewberry/process-api/actions/workflows/e2e-tests.yml)
[![Update Sequence Diagrams Wiki](https://github.com/Dewberry/process-api/actions/workflows/update-squence-wiki.yml/badge.svg)](https://github.com/Dewberry/process-api/actions/workflows/update-squence-wiki.yml)

A lightweight, extensible, OGC compliant Process API for local and cloud based containerized processing.
A lightweight, extensible API, compliant with OGC API - Processes, for local and cloud-based task processing.

For more information on the specification visit the [OGC API - Processes - Part 1: Core](https://docs.ogc.org/is/18-062r2/18-062r2.html#toc0).

@@ -15,6 +15,8 @@ https://developer.ogc.org/api/processes/index.html

## Getting Started

### Linux using Docker

1. Create docker network `docker network create process_api_net`

1. Build docker images for example plugins
@@ -25,14 +27,21 @@ chmod +x build.sh &&
```
1. Create a `.env` file (example below) at the root of this repo.
![](imgs/readme/getting-started.gif)
1. Add process configuration file(s) (yaml) to the [plugins](plugins/) directory
1. run `docker compose up`
1. Add or delete process configuration file(s) (YAML) in the [plugins](plugins/) directory as needed
1. Run `docker compose up`
1. Create a bucket in the minio console (http://localhost:9001).
1. Test endpoints using the swagger documentation page. (http://localhost:5050/swagger/index.html)

![](imgs/readme/swagger-demo.gif)

*API docs created using [swaggo](https://github.com/swaggo/swag)*
### Windows
*`docker` and `aws-batch` jobs have not been tested yet on Windows*

1. Download and run MinIO (https://min.io/docs/minio/windows/index.html).
2. In a separate Command Prompt window, change into the `api` folder by running `cd api`
3. Build the API by running `go build -o papi.exe main.go`
4. Create a `.env` file (example below). Update paths in the env file as needed.
5. Run the API with `papi -e .env`

---

@@ -45,7 +54,7 @@ The API is the main orchestrator for all the downstream functionality and a sing

### Processes
![](imgs/readme/processes.png)
Processes are computational tasks described through a configuration file that can be executed in a container. Each configuration file contains information about the process such as the title of this process, its description, execution mode, execution resources, secrets required, inputs, and outputs. Each config file is to be unmarshalled to register a process in the API. These processes then can be called several times by the users to run jobs.
Processes are computational tasks, described through a configuration file, that can be executed as a subprocess or in a container. Each configuration file contains information about the process such as its title, description, execution mode, execution resources, required secrets, inputs, and outputs. Each config file is unmarshalled to register a process in the API. Users can then call a registered process any number of times to run jobs.


### Jobs
@@ -64,15 +73,17 @@ Execution platforms are hosts that can provide resources to run a job. The This

![](imgs/readme/design.svg)

At the start of the app, all the `.yaml` `.yml` (configuration) files are read and processes are registered. Each file describes what resources the process requires and where it wants to be executed. There are two execution platforms available; local processes run in a docker container, hence they must specify a docker image and the tag. The API will download these images from the repository and then run them on the host machine. Commands specified will be appended to the entrypoint of the container. The API responds to the request of local processes synchronously.
When the app starts, all `.yaml` and `.yml` configuration files are read and processes are registered. Each file describes the resources the process requires and where it should be executed. There are three execution platforms available. Docker processes run in a Docker container, so they must specify a Docker image and tag. The API downloads these images from the repository and runs them on the host machine. The specified commands are appended to the entrypoint of the container. The API responds to the request of local processes synchronously.

Cloud processes are executed on the cloud using a workload management service. AWS Batch was chosen as the provider for its wide user base. Cloud processes must specify the provider type, job definition, job queue, and job name. The API will submit a request to run the job to the AWS Batch API directly.

The containerized processes must expect a JSON load as the last argument of the entrypoint command and write results as the last log message in the format `{"plugin_results": results}`. It is the responsibility of the process to write these results correctly if the process succeeds. The API will store logs of the container and will try to parse the last log for results when the client requests results for jobs.
Subprocess based processes are executed natively using an OS subprocess call.

All processes must expect a JSON payload as the last argument of the command and write results as the last log message in the format `{"plugin_results": results}`. It is the responsibility of the process to write these results correctly if the process succeeds. The API stores the logs of the process and tries to parse the last log message for results when the client requests the results of a job.
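
The results contract described above could be parsed roughly as follows. This is a sketch under stated assumptions rather than the API's implementation: `parseResults` is a hypothetical name, logs are assumed to be available as a slice of lines, and skipping trailing blank lines is an added convenience.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// parseResults extracts the value of "plugin_results" from the last log message.
// Hypothetical helper; how the API actually stores and reads logs is not shown here.
func parseResults(logs []string) (json.RawMessage, error) {
	for i := len(logs) - 1; i >= 0; i-- {
		line := strings.TrimSpace(logs[i])
		if line == "" {
			continue // assumption: trailing blank lines are not log messages
		}
		var envelope struct {
			Results json.RawMessage `json:"plugin_results"`
		}
		if err := json.Unmarshal([]byte(line), &envelope); err != nil {
			return nil, fmt.Errorf("last log message is not valid JSON: %w", err)
		}
		if envelope.Results == nil {
			return nil, errors.New(`last log message has no "plugin_results" key`)
		}
		return envelope.Results, nil
	}
	return nil, errors.New("no log messages")
}

func main() {
	logs := []string{"starting", "working", `{"plugin_results":{"count":3}}`}
	results, err := parseResults(logs)
	fmt.Println(string(results), err)
}
```

Only the last message is inspected, so a process that logs anything after its results line would be reported as having no results.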

When a job is submitted, a local container is fired up immediately for sync jobs, and a job request is submitted to the AWS batch for async jobs. When a local job reaches a finished state (successful or failed), the local container is removed. Similarly, if an active job is explicitly dismissed using DEL route, the job is terminated, and resources are freed up. If the server is gracefully shut down, all currently active jobs are terminated, and resources are freed up.
When a local job (docker or subprocess) reaches a finished state (successful or failed), its artifacts, such as the container, are removed. Similarly, if an active job is explicitly dismissed using the DELETE route, the job is terminated and its resources are freed. If the server is gracefully shut down, all currently active jobs are terminated and their resources are freed.
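
One common way to get this terminate-on-dismiss behavior for subprocess jobs in Go is `exec.CommandContext` with a cancelable context. The sketch below is illustrative only: the `job` type and its methods are hypothetical, and the PR may manage process lifetimes differently. The `go version` command in `main` assumes the Go toolchain is on `PATH`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
)

// job is a hypothetical wrapper around a subprocess whose lifetime is tied to a context.
type job struct {
	command []string
	ctx     context.Context
	cancel  context.CancelFunc
}

func newJob(command []string) *job {
	ctx, cancel := context.WithCancel(context.Background())
	return &job{command: command, ctx: ctx, cancel: cancel}
}

// run starts the subprocess, waits for it, and returns its combined stdout and stderr.
func (j *job) run() ([]byte, error) {
	defer j.cancel() // release the context once the job has finished
	if len(j.command) == 0 {
		return nil, errors.New("empty command")
	}
	cmd := exec.CommandContext(j.ctx, j.command[0], j.command[1:]...)
	return cmd.CombinedOutput()
}

// dismiss cancels the context; a running subprocess is killed, a pending one never starts.
func (j *job) dismiss() { j.cancel() }

func main() {
	j := newJob([]string{"go", "version"}) // assumption: `go` is on PATH
	out, err := j.run()
	fmt.Printf("output=%q err=%v\n", out, err)
}
```

A graceful shutdown could call `dismiss` on every active job before the server exits.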

The API responds to all GET requests (except `/jobs/<jobID>/results`) as HTML or JSON depending upon if the request is being originated from Browser or not or if it specifies the format using query parameter ‘f’.
The API responds to all GET requests as HTML or JSON, depending on whether the request originates from a browser or specifies the format using the query parameter `f`.
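
The format selection can be pictured as a small decision function. This is a sketch only: `responseFormat` is a hypothetical name, and two details are assumptions not stated in the README, namely that a browser is recognized by an `Accept` header listing `text/html` and that an explicit `f` parameter takes precedence.

```go
package main

import (
	"fmt"
	"strings"
)

// responseFormat picks "html" or "json" for a GET request.
// fParam is the value of the `f` query parameter; accept is the Accept header.
func responseFormat(fParam, accept string) string {
	switch f := strings.ToLower(fParam); f {
	case "html", "json":
		return f // assumption: an explicit `f` wins over browser detection
	}
	// Assumption: browsers are detected via text/html in the Accept header.
	if strings.Contains(strings.ToLower(accept), "text/html") {
		return "html"
	}
	return "json"
}

func main() {
	fmt.Println(responseFormat("", "text/html,application/xhtml+xml"))
	fmt.Println(responseFormat("json", "text/html"))
	fmt.Println(responseFormat("", "*/*"))
}
```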

### Logs
![](imgs/readme/logs.png)
@@ -87,4 +98,5 @@ Similar to logs, metadata is not included in the OGC-API Processes specification
An env file is required and should be available at the root of this repository (`./.env`). See the [example.env](example.env) for a guide.

## Notes
*NOTE: This server was adapted for ogc-compliance from an existing api developed by @albrazeau*
1. This server was adapted for OGC compliance from an existing API developed by @albrazeau
2. API docs created using [swaggo](https://github.com/swaggo/swag)
4 changes: 2 additions & 2 deletions api/Dockerfile
@@ -1,5 +1,5 @@
# -------------------------------
FROM golang:1.19.12 AS dev
FROM golang:1.23.2 AS dev

# RUN go install github.com/swaggo/swag/cmd/[email protected]

@@ -16,7 +16,7 @@ ENTRYPOINT ["CompileDaemon", "--build=go build main.go", "--command=./main"]
# -------------------------------

# -------------------------------
FROM debian:12.1-slim as prod
FROM debian:12.6-slim as prod

# Copy the CA certificates from the dev stage
COPY --from=dev /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
31 changes: 22 additions & 9 deletions api/go.mod
@@ -5,7 +5,7 @@ go 1.19
require (
github.com/aws/aws-sdk-go v1.44.214
github.com/docker/docker v23.0.1+incompatible
github.com/google/uuid v1.3.0
github.com/google/uuid v1.6.0
github.com/joho/godotenv v1.5.1
github.com/labstack/echo/v4 v4.10.2
github.com/labstack/gommon v0.4.0
@@ -14,6 +14,20 @@ require (
github.com/swaggo/echo-swagger v1.3.5
github.com/swaggo/swag v1.8.1
gopkg.in/yaml.v3 v3.0.1
modernc.org/sqlite v1.33.1
)

require (
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/hashicorp/golang-lru/v2 v2.0.7 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
modernc.org/gc/v3 v3.0.0-20240107210532-573471604cb6 // indirect
modernc.org/libc v1.55.3 // indirect
modernc.org/mathutil v1.6.0 // indirect
modernc.org/memory v1.8.0 // indirect
modernc.org/strutil v1.2.0 // indirect
modernc.org/token v1.1.0 // indirect
)

require (
@@ -40,8 +54,7 @@ require (
github.com/josharian/intern v1.0.0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/mattn/go-colorable v0.1.13 // indirect
github.com/mattn/go-isatty v0.0.17 // indirect
github.com/mattn/go-sqlite3 v1.14.17
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/moby/term v0.0.0-20221205130635-1aeaba878587 // indirect
github.com/morikuni/aec v1.0.0 // indirect
github.com/opencontainers/go-digest v1.0.0 // indirect
@@ -51,13 +64,13 @@ require (
github.com/swaggo/files v0.0.0-20220728132757-551d4a08d97a // indirect
github.com/valyala/bytebufferpool v1.0.0 // indirect
github.com/valyala/fasttemplate v1.2.2 // indirect
golang.org/x/crypto v0.6.0 // indirect
golang.org/x/mod v0.6.0 // indirect
golang.org/x/net v0.7.0 // indirect
golang.org/x/sys v0.5.0 // indirect
golang.org/x/text v0.7.0 // indirect
golang.org/x/crypto v0.21.0 // indirect
golang.org/x/mod v0.16.0 // indirect
golang.org/x/net v0.22.0 // indirect
golang.org/x/sys v0.22.0 // indirect
golang.org/x/text v0.14.0 // indirect
golang.org/x/time v0.3.0 // indirect
golang.org/x/tools v0.2.0 // indirect
golang.org/x/tools v0.19.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gotest.tools/v3 v3.4.0 // indirect
)