Workflow and Datasets Automation Blog

Repository for the code and configuration files referenced in the Workflow Automation for Nextflow Tower Pipelines blog post.

The material provided can be built as an AWS Lambda-compatible container image which can also be run on your local machine.

Quickstart

To run this code on your local machine, do the following:

1. Clone the repository

   $ git clone https://github.com/seqeralabs/datasets-automation-blog.git

2. Install Python 3.9
3. Install Docker
4. Install the AWS CLI
5. Configure the AWS CLI
6. Build the Docker image

   $ docker build --tag lambda_tutorial:v1.0 .

7. Run the container

   $ docker run --rm -it -v ~/.aws:/root/.aws:ro -p 9000:8080 lambda_tutorial:v1.0

8. Send a transaction to the container

   $ curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d @PATH_TO_YOUR_JSON_TEST_EVENT

NOTE: Transactions will receive error messages until you add the necessary configuration items to your AWS Account (see below).
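
The curl call in the Quickstart can also be scripted, which is handy when cycling through several test events. The following is a minimal sketch, assuming the container is running with port 9000 mapped as shown in the docker run command. The helper names build_invocation_request and invoke_local are hypothetical and are not part of the repository.

```python
import json
import urllib.request

# Local endpoint exposed by the container when started with `-p 9000:8080`.
LOCAL_ENDPOINT = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invocation_request(event: dict, url: str = LOCAL_ENDPOINT) -> urllib.request.Request:
    """Wrap a test event in a POST request for the local Lambda emulator."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_local(event_path: str, url: str = LOCAL_ENDPOINT, timeout: float = 30.0) -> str:
    """Read a JSON test event from disk, POST it, and return the raw response body."""
    with open(event_path, encoding="utf-8") as handle:
        event = json.load(handle)
    request = build_invocation_request(event, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")

# Example (requires the container to be running):
#   print(invoke_local("testing/test_event_good.json"))
```

Building the request separately from sending it keeps the payload easy to inspect without a running container.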

Required Configuration

The provided code relies on the presence of specific artefacts in your AWS Account and Tower instance.

Please see the related blog for step-by-step instructions to create the following:

AWS
1. S3 Bucket
2. IAM Role
  - lambda_tutorial
3. Secrets Manager
  - lambda_tutorial/tower_PAT
4. Systems Manager Parameter Store
  - /lambda_tutorial/tower_api_endpoint
  - /lambda_tutorial/workspace_id
  - /lambda_tutorial/target_pipeline_name
  - /lambda_tutorial/s3_root_prefix
  - /lambda_tutorial/samplesheet_file_types
  - /lambda_tutorial/logging_level
5. ECR
  - lambda_tutorial

Nextflow Tower
1. Personal Access Token
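
Before sending test events, it can help to confirm that the Parameter Store entries listed above actually exist. The sketch below is one possible pre-flight check, not code from the repository. It assumes the boto3 SSM client and its get_parameters call, which reports unknown names under InvalidParameters. The workspace ID parameter is assumed to be named /lambda_tutorial/workspace_id, and the function name find_missing_parameters is hypothetical.

```python
from typing import Iterable, List

# Parameter names from the Required Configuration list
# (workspace_id path is an assumption; adjust to match your account).
REQUIRED_PARAMETERS = [
    "/lambda_tutorial/tower_api_endpoint",
    "/lambda_tutorial/workspace_id",
    "/lambda_tutorial/target_pipeline_name",
    "/lambda_tutorial/s3_root_prefix",
    "/lambda_tutorial/samplesheet_file_types",
    "/lambda_tutorial/logging_level",
]


def find_missing_parameters(ssm_client, names: Iterable[str] = REQUIRED_PARAMETERS) -> List[str]:
    """Return the parameter names that Parameter Store does not recognise."""
    names = list(names)
    missing: List[str] = []
    # get_parameters accepts at most 10 names per call, so query in batches.
    for start in range(0, len(names), 10):
        response = ssm_client.get_parameters(Names=names[start:start + 10])
        missing.extend(response.get("InvalidParameters", []))
    return missing


def check_account() -> None:
    """Run the check against the account configured for the AWS CLI."""
    import boto3  # third-party; imported lazily so the helper above works with any client

    missing = find_missing_parameters(boto3.client("ssm"))
    if missing:
        print("Missing Parameter Store entries:", ", ".join(missing))
    else:
        print("All required Parameter Store entries were found.")
```

Passing the client in as an argument means the check can be exercised with a stub before any AWS credentials are configured.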

Deploying to AWS Lambda

To deploy the code to the AWS Lambda Service, please see the related blog for step-by-step instructions.

NOTE: Do not deploy the image until you have built and run the container locally and created all the required configuration items.

Folder Structure

The code is organized as follows:

$ tree
.
├── Dockerfile
├── LICENSE
├── README.md
├── app.py
├── aws-lambda-rie-x86_64
├── datafiles
│   └── samplesheet_full.csv
├── entry_script.sh
├── iam
│   ├── lambda_tutorial_all_permissions.json
│   └── trust_policy.json
├── requirements.txt
└── testing
    ├── test_event_bad_file.json
    ├── test_event_bad_prefix.json
    └── test_event_good.json

Salient features

- The iam folder contains the policies you can attach to your AWS IAM Role.
- The datafiles folder contains an example sample sheet for the https://github.com/nf-core/rnaseq pipeline (the pipeline used during the creation of this material).
- The testing folder contains sample S3 Put notification events of the kind the AWS Lambda Service receives from S3. These can be used when testing locally and/or when testing in the Lambda Service.
  - If you conduct tests with these files, be sure to replace YOUR_AWS_REGION and YOUR_S3_BUCKET with your own values. Also ensure that each positive test case has a corresponding file in your S3 location so that the function can successfully retrieve it.
- The aws-lambda-rie-x86_64 and entry_script.sh files allow your container to emulate AWS Lambda while testing locally.
- The app.py file is the Python 3.9 code that will be executed by your Lambda function.
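
Replacing the YOUR_AWS_REGION and YOUR_S3_BUCKET placeholders in the test events can be scripted rather than done by hand. The sketch below is an illustration only. It substitutes the two placeholders in the raw text, then lists the bucket and key each record points at so you can confirm the object exists. It assumes the standard S3 notification layout (Records[].s3.bucket.name and Records[].s3.object.key); the function names are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def fill_placeholders(template_text: str, region: str, bucket: str) -> dict:
    """Substitute the two placeholders in a test event and parse it as JSON."""
    filled = (
        template_text
        .replace("YOUR_AWS_REGION", region)
        .replace("YOUR_S3_BUCKET", bucket)
    )
    return json.loads(filled)


def s3_targets(event: dict) -> list:
    """List (bucket, key) pairs referenced by an S3 notification event.

    S3 URL-encodes object keys in notifications, so keys are decoded here.
    """
    return [
        (record["s3"]["bucket"]["name"], unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]

# Example, with a region and bucket name of your own:
#   with open("testing/test_event_good.json", encoding="utf-8") as handle:
#       event = fill_placeholders(handle.read(), "eu-west-1", "my-bucket")
#   print(s3_targets(event))
```

Working on the raw text means every occurrence of a placeholder is replaced, including any embedded in ARNs.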

While most parameters are externalized to supporting AWS Services, three values were hardcoded for ease of development. If you choose to use different names for your configuration items than directed, please ensure that you have updated them in the code as well.
- execution_role
  The Role used by AWS Lambda to interact with other AWS Services. Set to lambda_tutorial.
- params_to_retrieve
  An array populated with the AWS Systems Manager Parameter Store parameters.
- secret_name
  The AWS Secrets Manager key containing the value of your Tower PAT.
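
To make the role of these three values concrete, here is a hedged sketch of how such constants might be declared and used to load settings. It is not the code in app.py. The constant names, the load_settings function, and the shortened parameter list are assumptions for illustration, and the secret is assumed to be stored as a plain string. The boto3 calls used are get_parameters (Systems Manager) and get_secret_value (Secrets Manager).

```python
from typing import Dict, List

# Hypothetical constants mirroring the three hardcoded values described above.
EXECUTION_ROLE = "lambda_tutorial"        # IAM role assumed by the Lambda function
SECRET_NAME = "lambda_tutorial/tower_PAT"  # Secrets Manager key holding the Tower PAT
PARAMS_TO_RETRIEVE: List[str] = [
    "/lambda_tutorial/tower_api_endpoint",
    "/lambda_tutorial/logging_level",
    # Add the remaining Parameter Store names you created (at most 10 per call).
]


def load_settings(
    ssm_client,
    secrets_client,
    param_names: List[str] = PARAMS_TO_RETRIEVE,
    secret_name: str = SECRET_NAME,
) -> Dict[str, str]:
    """Fetch parameters and the Tower PAT, keyed by the last path segment."""
    response = ssm_client.get_parameters(Names=param_names, WithDecryption=True)
    if response.get("InvalidParameters"):
        raise KeyError(f"Missing parameters: {response['InvalidParameters']}")
    settings = {
        item["Name"].rsplit("/", 1)[-1]: item["Value"]
        for item in response["Parameters"]
    }
    secret = secrets_client.get_secret_value(SecretId=secret_name)
    settings["tower_PAT"] = secret["SecretString"]
    return settings

# Example (requires boto3 and AWS credentials):
#   import boto3
#   settings = load_settings(boto3.client("ssm"), boto3.client("secretsmanager"))
```

If you rename a configuration item in AWS, a layout like this keeps the matching change in the code to a single line.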

Caveat

This code was written to demonstrate the art of the possible for clients of Nextflow Tower. It has not yet been optimized for efficiency or to minimize unnecessary retries.

Given that AWS Lambda bills for allocated memory in 1 ms increments of execution time, and that the resulting pipeline invocations will incur charges with your batch computing provider, you are advised to conduct further testing and refinement before deploying to Production, so as to minimize the risk of unexpected billing.
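
The billing model described above can be turned into a rough back-of-the-envelope estimate. The sketch below covers only the Lambda compute charge (memory multiplied by duration). It excludes per-request fees and all downstream pipeline compute costs. The function name is hypothetical, and the price must be supplied by you from the current AWS pricing page; the value in the example comment is a placeholder, not a quoted rate.

```python
import math


def estimate_lambda_compute_cost(
    memory_mb: int,
    duration_ms: float,
    invocations: int,
    price_per_gb_second: float,
) -> float:
    """Estimate the Lambda compute charge for a number of invocations.

    Duration is rounded up to the next whole millisecond per invocation,
    and memory is converted from MB to GB.
    """
    memory_gb = memory_mb / 1024
    billed_seconds = math.ceil(duration_ms) / 1000
    return memory_gb * billed_seconds * invocations * price_per_gb_second

# Example with a placeholder price (check current AWS pricing for your region):
#   estimate_lambda_compute_cost(memory_mb=512, duration_ms=1000,
#                                invocations=1000, price_per_gb_second=0.00002)
```

In most deployments of this kind the pipeline runs triggered by the function are likely to dominate the bill, so treat this figure as a lower bound on total cost.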
