OxCGRT data merge, npi model computation docker deployment (#523)
* 504 - Extending the NPI model data
* added a dummy feature for every intervention, which is turned on when the intervention is turned off for the first time and turned off when the original intervention is put in place again.
* extended the data for another month by using the last known countermeasures in each region and data from Johns Hopkins
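
As a rough illustration of the two data changes described in this item, the sketch below derives a "lifted" dummy from a binary intervention series and pads a series with its last known value. The function names and the list-based representation are assumptions made for illustration, not code from this commit. It also flags every lifting of an intervention, whereas the commit message only mentions the first one.

```python
from typing import List


def lifted_dummy(active: List[int]) -> List[int]:
    """Return 1 on days when an intervention is off after having been on before.

    Assumption: `active` is a daily 0/1 series for one intervention in one region.
    """
    seen_active = False
    dummy = []
    for value in active:
        if value:
            seen_active = True
        # The dummy is on only while a previously active intervention is lifted.
        dummy.append(1 if seen_active and not value else 0)
    return dummy


def extend_with_last_known(values: List[int], extra_days: int) -> List[int]:
    """Pad a series by repeating its last known value for `extra_days` days."""
    if not values:
        return []
    return values + [values[-1]] * extra_days


if __name__ == "__main__":
    series = [0, 1, 1, 0, 0, 1, 0]
    print(lifted_dummy(series))
    print(extend_with_last_known(series, 3))
```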

* fixed workflow yaml

* fixed workflow yml

* maybe resolving failing dependency installation in github actions by updating pip and setuptools

* fixed the extension of data, removed the canceled columns from intervention json

* Introduce new intervention icons

* Fix linting

* Show NPI model chart only for model channel

* extrapolating data

* fixed data preprocessing - removing deaths from countermeasures

* short-term model improvements
* extrapolation

* added extrapolation date

* Merge OxCGRT countermeasure data

* poetry update

* Merged NPI data from OxCGRT

Prepared docker container which can be run on GCP compute instances from github pipelines

defined conda environment to accelerate the computation of the npi model

* fixed invalid github workflow yml

* add upload data step, removed decrypt secrets to workflow

* fixed upload data step

* moved GCP setup in workflow, added extrapolation period to the model

* fixed extrapolation period argument

* removed steps from compute npi workflow
* don't run previous steps in workflows, instead download the latest r_estimates.csv inside docker. The other steps are quick
Created a script which deletes the instance when docker exits. This script is copied to the GCP console, but I included it in the repo for consistency

* fixed create-with-container command in workflow

* added env file in compute-npi-model to hopefully fix the gcloud command

* fixed workflow

* fixed workflow

* set region for compute

* use different service account

* update gcloud in workflow

* changed order of arguments

* screw instance templates, it just refuses to work - defining the instance in the command

* set gcp project in workflow

* fixed typo

* pass foretold channel env variable directly

* fixed parameter name

* appending newline to env file

* debugging workflow

* redefine the machine

* the cpu has to also be specified

* dropped the machine type added vm type instead

* added scopes to the vm instance, so that it can pull docker image

* fixed syntax

* building docker container, debugging startup script

* use url to pass the startup script to the instance

* extracting branch name, refactoring workflow

* reformatted workflow yml

* fixed workflow step

* make sure the startup script won't block model

* trying it without the startup script

* another try to not block the npi model by startup script

* more disk-space (conda image is large), debugging startup script

* fixed preprocessing of countermeasures, debugging startup script

* removed the startup script, killing the instance from the docker container, reformatted code
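
One way a container could remove its own VM once the model run finishes is to call `gcloud compute instances delete` as its last step. The sketch below only assembles that command. The helper name and the project id are hypothetical, and the actual script in this commit is not shown here, so treat this as an assumption about the general approach rather than the author's implementation.

```python
import shlex
from typing import List


def delete_instance_command(instance: str, zone: str, project: str) -> List[str]:
    """Build the gcloud command that deletes a Compute Engine instance.

    --quiet suppresses the interactive confirmation prompt, which matters
    when the command runs unattended inside a container.
    """
    return [
        "gcloud", "compute", "instances", "delete", instance,
        "--zone", zone,
        "--project", project,
        "--quiet",
    ]


if __name__ == "__main__":
    # "npi-model" and "us-west1-c" mirror the workflow's IMAGE_NAME and RUN_REGION;
    # "my-project" is a placeholder project id.
    cmd = delete_instance_command("npi-model", "us-west1-c", "my-project")
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```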

* fixed linting after black update

* Filter out subregions from OxCGRT data
* OxCGRT added data for subregions (e.g. US states) which broke the pipeline. Fix is to filter it out, but we might use them in the future
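
A minimal sketch of the subregion filter, assuming the OxCGRT CSV marks subnational rows with a non-empty `RegionName` column and leaves it empty for country-level rows. The sample data and the helper name are made up for illustration.

```python
import csv
import io
from typing import Dict, List

# Tiny stand-in for the OxCGRT export (assumed column layout).
SAMPLE_CSV = """CountryName,RegionName,Date,C1_School closing
United States,,20200901,3
United States,California,20200901,3
France,,20200901,1
"""


def national_rows(csv_text: str) -> List[Dict[str, str]]:
    """Keep only country-level rows, dropping subregions such as US states."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if not (row.get("RegionName") or "").strip()]


if __name__ == "__main__":
    for row in national_rows(SAMPLE_CSV):
        print(row["CountryName"], row["Date"])
```

Keeping the filter in one place makes it easy to relax later if the subregion data is used.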

* fixed key passing to the container

* Strip the quotes from the key - they are necessary when passing them in the env file
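
Because the workflow writes the key into the env file wrapped in single quotes, the value seen inside the container still carries those quotes. A sketch of removing one matching pair of surrounding quotes before parsing the key follows. The helper name is hypothetical, and the commit's own script may do this differently (for example in shell).

```python
import json


def strip_env_quotes(value: str) -> str:
    """Remove one pair of matching surrounding quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


if __name__ == "__main__":
    # Placeholder payload; a real service account key has many more fields.
    raw = "'{\"type\": \"service_account\"}'"
    key = json.loads(strip_env_quotes(raw))
    print(key["type"])
```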

* fixed run model script

* fixed extrapolation date, changed channel, run on 40 countries

* small fixes
* Triggering the npi-model computing workflow manually
* More tune iterations of the model (to hopefully shrink the confidence interval)
* Made sure that each NUTS sampling process created by pymc3 only uses one thread - The parallelization doesn't work and this greatly speeds up the computation
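
A common way to keep each sampling process single-threaded is to cap the BLAS/OpenMP thread pools through environment variables before the numeric libraries are imported. Whether this commit does it this way is an assumption, and the helper name below is hypothetical.

```python
import os

# Environment variables read by OpenMP, Intel MKL and OpenBLAS at import time.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_numeric_threads(n: int = 1) -> None:
    """Cap the number of threads used by the underlying numeric libraries."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n)


limit_numeric_threads(1)

# Import numpy / theano / pymc3 only after this point so the limits take effect.
# Chains can then still run as separate processes, each using a single thread.
```

With every chain pinned to one thread, the processes do not compete for the same cores, which is one plausible reason the computation became faster.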

* fixed linting

* pre-PR clean-up

* lgtm based fixes

Co-authored-by: Marek Pukaj <[email protected]>
JanataPavel and marekpukaj committed Sep 11, 2020
1 parent 78ff0b6 commit e397f2e
Showing 23 changed files with 6,628 additions and 4,066 deletions.
73 changes: 73 additions & 0 deletions .github/workflows/compute-npi-model.yml
@@ -0,0 +1,73 @@
name: Compute npi-model and upload

on:
  workflow_dispatch:
    inputs:
      channel_name:
        description: 'Name of the channel to which the results of the model will be uploaded'
        required: true
        default: 'model'

jobs:
  NPI-model-computation:
    runs-on: ubuntu-latest
    env:
      RUN_REGION: us-west1-c
      IMAGE_NAME: npi-model

    steps:
      - name: Checkout
        uses: actions/checkout@master

      - name: Checkout data repo
        uses: actions/checkout@v2
        with:
          repository: epidemics/epimodel-covid-data
          path: data-pipeline/data

      - uses: GoogleCloudPlatform/github-actions/setup-gcloud@master
        name: Setup Google Cloud Platform
        with:
          version: '290.0.1'
          service_account_email: ${{ secrets.COMPUTE_SA_EMAIL }}
          service_account_key: ${{ secrets.GOOGLE_COMPUTE_CREDENTIALS }}

      # Configure docker to use the gcloud command-line tool as a credential helper
      - run: |
          gcloud auth configure-docker
      # Build the Docker image
      - name: Build Docker
        working-directory: data-pipeline
        run: |
          docker build -t gcr.io/${{ secrets.GKE_PROJECT }}/$IMAGE_NAME:$GITHUB_SHA -f Dockerfile.conda .
      # Push the Docker image to Google Container Registry
      - name: Publish Docker
        run: |
          docker push gcr.io/${{ secrets.GKE_PROJECT }}/$IMAGE_NAME:$GITHUB_SHA
      - name: Google cloud run setup
        env:
          GCP_KEY: ${{ secrets.NPI_MODEL_SERVICE_ACCOUNT_KEY }}
        run: |
          echo -E "GCP_KEY='${GCP_KEY}'" > .env
          echo "FORETOLD_CHANNEL=${{ secrets.FORETOLD_CHANNEL }}" >> .env
          echo "INSTANCE_NAME=$IMAGE_NAME" >> .env
          echo "PROJECT_NAME=${{ secrets.GKE_PROJECT }}" >> .env
          gcloud config set compute/zone $RUN_REGION
          gcloud config set project ${{ secrets.GKE_PROJECT }}
      - name: Run model and upload results
        run: |
          gcloud compute instances create-with-container $IMAGE_NAME \
            --zone $RUN_REGION \
            --image-project cos-cloud \
            --image-family cos-stable \
            --boot-disk-size 15GB \
            --machine-type n2-custom-2-20480-ext \
            --scopes default \
            --container-restart-policy never \
            --container-image "gcr.io/${{ secrets.GKE_PROJECT }}/${IMAGE_NAME}:${GITHUB_SHA}" \
            --container-env-file .env \
            --container-arg ${{ github.event.inputs.channel_name }}
228 changes: 228 additions & 0 deletions conda-requirements.txt
@@ -0,0 +1,228 @@
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main
_pytorch_select=0.2=gpu_0
appdirs=1.4.4=py_0
arviz=0.8.3=py_0
attrs=19.3.0=py_0
backcall=0.2.0=py_0
binutils_impl_linux-64=2.33.1=he6710b0_7
binutils_linux-64=2.33.1=h9595d00_15
black=19.10b0=py_0
blas=1.0=mkl
bleach=3.1.5=py_0
blinker=1.4=py36_0
blosc=1.19.0=hd408876_0
bzip2=1.0.8=h7b6447c_0
ca-certificates=2020.6.20=hecda079_0
cachetools=3.1.1=py_0
cairo=1.14.12=h8948797_3
certifi=2020.6.20=py36h9f0ad1d_0
cffi=1.14.0=py36h2e261b9_0
cfgv=3.1.0=py_0
cftime=1.1.3=py36h785e9b2_0
chardet=3.0.4=py36_1003
click=7.1.2=py_0
colorama=0.4.3=py_0
coloredlogs=14.0=py36h9f0ad1d_1
contextvars=2.4=py_0
cryptography=2.9.2=py36h1ba5d50_0
cudatoolkit=10.1.243=h6bb024c_0
cudnn=7.6.5=cuda10.1_0
curl=7.71.1=hbc83047_1
cycler=0.10.0=py36_0
dbus=1.13.16=hb2f20db_0
decorator=4.4.2=py_0
defusedxml=0.6.0=py_0
distlib=0.3.0=pyh9f0ad1d_0
docutils=0.16=py36_1
editdistance=0.5.3=py36h831f99a_1
entrypoints=0.3=py36_0
expat=2.2.9=he6710b0_2
fastprogress=0.2.3=py_0
filelock=3.0.12=py_0
fontconfig=2.13.0=h9420a91_0
freetype=2.10.2=h5ab3b9f_0
fribidi=1.0.9=h7b6447c_0
future=0.18.2=py36_1
gcc_impl_linux-64=7.3.0=habb00fd_1
gcc_linux-64=7.3.0=h553295d_15
gitdb=4.0.5=py_0
gitpython=3.1.3=py_1
glib=2.63.1=h5a9c865_0
google-auth=1.17.2=py_0
google-auth-oauthlib=0.4.1=py_2
graphite2=1.3.14=h23475e2_0
graphviz=2.40.1=h21bd128_2
gspread=3.6.0=pyh9f0ad1d_0
gst-plugins-base=1.14.0=hbbd80ab_1
gstreamer=1.14.0=hb453b48_1
gxx_impl_linux-64=7.3.0=hdf63c60_1
gxx_linux-64=7.3.0=h553295d_15
h5py=2.10.0=py36h7918eee_0
harfbuzz=1.8.8=hffaf4a1_0
hdf4=4.2.13=h3ca952b_2
hdf5=1.10.4=hb1b8bf9_0
httplib2=0.18.1=pyh9f0ad1d_0
humanfriendly=8.2=py36_0
icu=58.2=he6710b0_3
identify=1.4.19=pyh9f0ad1d_0
idna=2.9=py_1
immutables=0.14=py36h8c4c3a4_0
importlib-metadata=1.7.0=py36_0
importlib_metadata=1.7.0=0
importlib_resources=1.4.0=py36_0
intel-openmp=2020.1=217
ipykernel=5.3.0=py36h5ca1d4c_0
ipython=7.15.0=py36_0
ipython_genutils=0.2.0=py36_0
ipywidgets=7.5.1=py_0
jedi=0.17.0=py36_0
jinja2=2.11.2=py_0
jpeg=9b=h024ee3a_2
json5=0.9.5=py_0
jsonschema=3.2.0=py36_0
jupyter=1.0.0=py36_7
jupyter_client=6.1.6=py_0
jupyter_console=6.1.0=py_0
jupyter_core=4.6.3=py36_0
jupyterlab=2.1.5=py_0
jupyterlab_server=1.2.0=py_0
kiwisolver=1.2.0=py36hfd86e86_0
krb5=1.18.2=h173b8e3_0
ld_impl_linux-64=2.33.1=h53a641e_7
libcurl=7.71.1=h20c2e04_1
libedit=3.1.20191231=h14c3975_1
libffi=3.2.1=hd88cf55_4
libgcc-ng=9.1.0=hdf63c60_0
libgfortran-ng=7.3.0=hdf63c60_0
libgpuarray=0.7.6=h14c3975_0
libnetcdf=4.7.3=hb80b6cc_0
libpng=1.6.37=hbc83047_0
libsodium=1.0.18=h7b6447c_0
libssh2=1.9.0=h1ba5d50_1
libstdcxx-ng=9.1.0=hdf63c60_0
libtiff=4.1.0=h2733197_1
libuuid=1.0.3=h1bed415_2
libxcb=1.14=h7b6447c_0
libxml2=2.9.10=he19cac6_1
lockfile=0.12.2=py36_0
luigi=2.8.13=py36h9f0ad1d_0
lz4-c=1.9.2=he6710b0_1
lzo=2.10=h7b6447c_2
mako=1.1.3=py_0
markupsafe=1.1.1=py36h7b6447c_0
matplotlib=3.2.2=0
matplotlib-base=3.2.2=py36hef1b27d_0
mistune=0.8.4=py36h7b6447c_0
mkl=2020.1=217
mkl-service=2.3.0=py36he904b0f_0
mkl_fft=1.1.0=py36h23d657b_0
mkl_random=1.1.1=py36h0573a6f_0
mock=4.0.2=py_0
more-itertools=8.4.0=py_0
mypy_extensions=0.4.3=py36_0
nbconvert=5.6.1=py36_0
nbdime=2.0.0=py_1
nbformat=5.0.7=py_0
ncurses=6.2=he6710b0_1
netcdf4=1.5.3=py36hbf33ddf_0
ninja=1.10.0=py36hfd86e86_0
nodeenv=1.4.0=pyh9f0ad1d_0
notebook=6.0.3=py36_0
numexpr=2.7.1=py36h423224d_0
numpy=1.19.1=py36hbc911f0_0
numpy-base=1.19.1=py36hfa32c7d_0
oauth2client=4.1.3=py_0
oauthlib=3.1.0=py_0
openssl=1.1.1g=h516909a_1
opt-einsum=3.0.0=py_0
packaging=20.4=py_0
pandas=1.0.5=py36h0573a6f_0
pandoc=2.10=0
pandocfilters=1.4.2=py36_1
pango=1.42.4=h049681c_0
parso=0.7.0=py_0
pathspec=0.8.0=pyh9f0ad1d_0
patsy=0.5.1=py36_0
pcre=8.44=he6710b0_0
pexpect=4.8.0=py36_0
pickleshare=0.7.5=py36_0
pip=20.2.2=py36_0
pixman=0.40.0=h7b6447c_0
plotly=4.8.1=py_0
pluggy=0.13.1=py36_0
pre-commit=2.5.1=py36h9f0ad1d_0
prometheus_client=0.5.0=py36_0
prompt-toolkit=3.0.5=py_0
prompt_toolkit=3.0.5=0
ptyprocess=0.6.0=py36_0
py=1.8.2=py_0
pyasn1=0.4.8=py_0
pyasn1-modules=0.2.7=py_0
pycparser=2.20=py_2
pygments=2.6.1=py_0
pygpu=0.7.6=py36heb32a55_0
pyjwt=1.7.1=py36_0
pymc3=3.9.1=py_0
pyopenssl=19.1.0=py_1
pyparsing=2.4.7=py_0
pyqt=5.9.2=py36h05f1152_2
pyro4=4.80=pyh9f0ad1d_0
pyrsistent=0.16.0=py36h7b6447c_0
pysocks=1.7.1=py36_0
pytables=3.6.1=py36h71ec239_0
pytest=5.4.3=py36_0
python=3.6.10=hcf32534_1
python-daemon=2.2.4=py36_1
python-dateutil=2.8.1=py_0
python_abi=3.6=1_cp36m
pytorch=1.4.0=cuda101py36h02f0884_0
pytz=2020.1=py_0
pyyaml=5.3.1=py36h7b6447c_1
pyzmq=19.0.1=py36he6710b0_1
qt=5.9.7=h5867ecd_1
qtconsole=4.7.4=py_0
qtpy=1.9.0=py_0
readline=8.0=h7b6447c_0
regex=2020.6.8=py36h7b6447c_0
requests=2.24.0=py_0
requests-oauthlib=1.3.0=py_0
retrying=1.3.3=py36_2
rsa=4.4=pyh9f0ad1d_0
scipy=1.5.0=py36h0b6359f_0
seaborn=0.10.1=py_0
send2trash=1.5.0=py36_0
serpent=1.30.2=py_0
setuptools=49.2.1=py36_0
sip=4.19.8=py36hf484d3e_0
six=1.15.0=py_0
smmap=3.0.4=py_0
snappy=1.1.8=he6710b0_0
sqlite=3.32.3=h62c20be_0
terminado=0.8.3=py36_0
testpath=0.4.4=py_0
theano=1.0.4=py36hfd86e86_0
tk=8.6.10=hbc83047_0
toml=0.10.1=py_0
tornado=5.1.1=py36h7b6447c_0
tqdm=4.46.1=py_0
traitlets=4.3.3=py36_0
typed-ast=1.4.1=py36h7b6447c_0
typing-extensions=3.7.4.2=0
typing_extensions=3.7.4.2=py_0
unidecode=1.1.1=py_0
urllib3=1.22=py36hbe7ace6_0
virtualenv=20.0.20=py36h9f0ad1d_1
wcwidth=0.2.4=py_0
webencodings=0.5.1=py36_1
wheel=0.34.2=py36_0
widgetsnbextension=3.5.1=py36_0
xarray=0.15.1=py_0
xz=5.2.5=h7b6447c_0
yaml=0.2.5=h7b6447c_0
zeromq=4.3.2=he6710b0_2
zipp=3.1.0=py_0
zlib=1.2.11=h7b6447c_3
zstd=1.4.5=h9ceee32_0
27 changes: 27 additions & 0 deletions data-pipeline/Dockerfile.conda
@@ -0,0 +1,27 @@
FROM continuumio/anaconda3:2020.07

WORKDIR /app

RUN apt-get update && apt-get install -y \
        g++ \
    && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

ENV ENV_NAME=covid

COPY conda-enviroment.yml ./
RUN conda env create -f conda-enviroment.yml

ENV PATH /opt/conda/envs/${ENV_NAME}/bin:$PATH
RUN /bin/bash -c "source activate ${ENV_NAME}"


RUN curl -sSL https://sdk.cloud.google.com | bash
ENV PATH $PATH:/root/google-cloud-sdk/bin

COPY epimodel epimodel
COPY data-dir data-dir
COPY run_luigi luigi.cfg logging.conf scripts/run_model.sh ./

ENTRYPOINT ["/bin/bash", "run_model.sh"]
File renamed without changes.