OxCGRT data merge, npi model computation docker deployment (#523)

* 504 - Extending the NPI model data
* added a dummy feature for every intervention, which is turned on when an intervention is turned off for the first time and turned off when the original intervention is put in place again
* extended the data for another month by using the last known countermeasures in each region and data from Johns Hopkins
* fixed workflow yaml
* fixed workflow yml
* maybe resolving failing dependency installation in GitHub Actions by updating pip and setuptools
* fixed the extension of data, removed the cancelled columns from intervention json
* Introduce new intervention icons
* Fix linting
* Show NPI model chart only for model channel
* extrapolating data
* fixed data preprocessing - removing deaths from countermeasures
* short-term model improvements
* extrapolation
* added extrapolation date
* Merge OxCGRT countermeasure data
* poetry update
* Merged NPI data from OxCGRT. Prepared a docker container which can be run on GCP compute instances from GitHub pipelines. Defined a conda environment to accelerate the computation of the npi model
* fixed invalid github workflow yml
* add upload data step, removed decrypt secrets to workflow
* fixed upload data step
* moved GCP setup in workflow, added extrapolation period to the model
* fixed extrapolation period argument
* removed steps from compute npi workflow
* don't run previous steps in workflows, instead download the latest r_estimates.csv inside docker. The other steps are quick. Created a script which deletes the instance when docker exits. This script is copied to the GCP console, but I included it in the repo for consistency
* fixed create-with-container command in workflow
* added env file in compute-npi-model to hopefully fix the gcloud command
* fixed workflow
* fixed workflow
* set region for compute
* use different service account
* update gcloud in workflow
* changed order of arguments
* screw instance templates, it just refuses to work - defining the instance in the command
* set gcp project in workflow
* fixed typo
* pass foretold channel env variable directly
* fixed parameter name
* appending newline to env file
* debugging workflow
* redefine the machine
* the cpu has to also be specified
* dropped the machine type, added vm type instead
* added scopes to the vm instance, so that it can pull the docker image
* fixed syntax
* building docker container, debugging startup script
* use url to pass the startup script to the instance
* extracting branch name, refactoring workflow
* reformatted workflow yml
* fixed workflow step
* make sure the startup script won't block the model
* trying it without the startup script
* another try to not block the npi model by the startup script
* more disk space (conda image is large), debugging startup script
* fixed preprocessing of countermeasures, debugging startup script
* removed the startup script, killing the instance from the docker container, reformatted code
* fixed linting after black update
* Filter out subregions from OxCGRT data
* OxCGRT added data for subregions (e.g. US states), which broke the pipeline. The fix is to filter them out, but we might use them in the future
* fixed key passing to the container
* Strip the quotes from the key - they are necessary when passing it in the env file
* fixed run model script
* fixed extrapolation date, changed channel, run on 40 countries
* small fixes
* Triggering the npi-model computing workflow manually
* More tune interactions of the model (to hopefully shrink the confidence interval)
* Made sure that each NUTS sampling process created by pymc3 only uses one thread - the parallelization doesn't work and this greatly speeds up the computation
* fixed linting
* pre-PR clean-up
* lgtm based fixes

Co-authored-by: Marek Pukaj <[email protected]>
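The data-preparation steps described in the message (the dummy feature for lifted interventions, the extension with the last known countermeasures, and the subregion filter) can be sketched in plain Python. This is an illustrative reading of the commit message, not the code from this commit: the function names are hypothetical, the dummy feature is interpreted as "on whenever a previously active intervention is currently lifted", and the `RegionCode` column name for OxCGRT rows is an assumption.

```python
from typing import Dict, List


def cancelled_feature(active: List[int]) -> List[int]:
    """Return 1 on days when an intervention that was active earlier is lifted."""
    seen_active = False
    flags = []
    for value in active:
        if value:
            # Intervention (re)instated: the dummy feature is switched off.
            seen_active = True
            flags.append(0)
        else:
            # Only flag a lifted intervention if it had been active before.
            flags.append(1 if seen_active else 0)
    return flags


def extend_with_last_known(series: List[int], extra_days: int) -> List[int]:
    """Append `extra_days` copies of the last observed value."""
    if not series:
        return []
    return series + [series[-1]] * extra_days


def drop_subregions(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep national rows only (assumed to have an empty RegionCode)."""
    return [row for row in rows if not row.get("RegionCode")]


if __name__ == "__main__":
    school_closure = [0, 1, 1, 0, 0, 1]
    print(cancelled_feature(school_closure))          # [0, 0, 0, 1, 1, 0]
    print(extend_with_last_known(school_closure, 3))  # [0, 1, 1, 0, 0, 1, 1, 1, 1]
```

Whether the dummy feature should fire again after a second lifting is not stated in the message; the sketch above does fire again, which is one possible choice.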
epidemics · Sep 11, 2020 · e397f2e
1 parent 78ff0b6
commit e397f2e
Showing 23 changed files with 6,628 additions and 4,066 deletions.
diff --git a/.github/workflows/compute-npi-model.yml b/.github/workflows/compute-npi-model.yml
@@ -0,0 +1,73 @@
+name: Compute npi-model and upload
+
+on:
+  workflow_dispatch:
+    inputs:
+      channel_name:
+        description: 'Name of the channel to which the results of the model will be uploaded'
+        required: true
+        default: 'model'
+
+jobs:
+  NPI-model-computation:
+    runs-on: ubuntu-latest
+    env:
+      RUN_REGION: us-west1-c
+      IMAGE_NAME: npi-model
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@master
+
+      - name: Checkout data repo
+        uses: actions/checkout@v2
+        with:
+          repository: epidemics/epimodel-covid-data
+          path: data-pipeline/data
+
+      - uses: GoogleCloudPlatform/github-actions/setup-gcloud@master
+        name: Setup Google Cloud Platform
+        with:
+          version: '290.0.1'
+          service_account_email: ${{ secrets.COMPUTE_SA_EMAIL }}
+          service_account_key: ${{ secrets.GOOGLE_COMPUTE_CREDENTIALS }}
+
+      # Configure docker to use the gcloud command-line tool as a credential helper
+      - run: |
+          gcloud auth configure-docker
+
+      # Build the Docker image
+      - name: Build Docker
+        working-directory: data-pipeline
+        run: |
+          docker build -t gcr.io/${{ secrets.GKE_PROJECT }}/$IMAGE_NAME:$GITHUB_SHA -f Dockerfile.conda .
+
+      # Push the Docker image to Google Container Registry
+      - name: Publish Docker
+        run: |
+          docker push gcr.io/${{ secrets.GKE_PROJECT }}/$IMAGE_NAME:$GITHUB_SHA
+
+      - name: Google cloud run setup
+        env:
+          GCP_KEY: ${{ secrets.NPI_MODEL_SERVICE_ACCOUNT_KEY }}
+        run: |
+          echo -E "GCP_KEY='${GCP_KEY}'" > .env
+          echo "FORETOLD_CHANNEL=${{ secrets.FORETOLD_CHANNEL }}" >> .env
+          echo "INSTANCE_NAME=$IMAGE_NAME" >> .env
+          echo "PROJECT_NAME=${{ secrets.GKE_PROJECT }}" >> .env
+          gcloud config set compute/zone $RUN_REGION
+          gcloud config set project ${{ secrets.GKE_PROJECT }}
+
+      - name: Run model and upload results
+        run: |
+          gcloud compute instances create-with-container $IMAGE_NAME \
+            --zone $RUN_REGION \
+            --image-project cos-cloud \
+            --image-family cos-stable \
+            --boot-disk-size 15GB \
+            --machine-type n2-custom-2-20480-ext \
+            --scopes default \
+            --container-restart-policy never \
+            --container-image "gcr.io/${{ secrets.GKE_PROJECT }}/${IMAGE_NAME}:${GITHUB_SHA}" \
+            --container-env-file .env \
+            --container-arg ${{ github.event.inputs.channel_name }}
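The workflow writes `GCP_KEY` into `.env` wrapped in single quotes, and the commit message notes that the quotes are stripped again before the key is used. Where that stripping happens is not shown in this diff; the following is a minimal sketch of what it could look like on the Python side. The helper names are hypothetical, and treating the key as a JSON document is an assumption.

```python
import json


def strip_wrapping_quotes(value: str) -> str:
    """Remove one matching pair of single or double quotes around `value`."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        return value[1:-1]
    return value


def load_service_account_key(raw: str) -> dict:
    """Parse a key that arrived quoted via the env file (assumed to be JSON)."""
    return json.loads(strip_wrapping_quotes(raw.strip()))


if __name__ == "__main__":
    # Stand-in for the value of the GCP_KEY environment variable.
    raw_key = "'{\"type\": \"service_account\"}'"
    print(load_service_account_key(raw_key)["type"])  # service_account
```

Only a matching outer pair is removed, so quotes that are part of the key itself are left alone.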
diff --git a/conda-requirements.txt b/conda-requirements.txt
@@ -0,0 +1,228 @@
+# This file may be used to create an environment using:
+# $ conda create --name <env> --file <this file>
+# platform: linux-64
+_libgcc_mutex=0.1=main
+_pytorch_select=0.2=gpu_0
+appdirs=1.4.4=py_0
+arviz=0.8.3=py_0
+attrs=19.3.0=py_0
+backcall=0.2.0=py_0
+binutils_impl_linux-64=2.33.1=he6710b0_7
+binutils_linux-64=2.33.1=h9595d00_15
+black=19.10b0=py_0
+blas=1.0=mkl
+bleach=3.1.5=py_0
+blinker=1.4=py36_0
+blosc=1.19.0=hd408876_0
+bzip2=1.0.8=h7b6447c_0
+ca-certificates=2020.6.20=hecda079_0
+cachetools=3.1.1=py_0
+cairo=1.14.12=h8948797_3
+certifi=2020.6.20=py36h9f0ad1d_0
+cffi=1.14.0=py36h2e261b9_0
+cfgv=3.1.0=py_0
+cftime=1.1.3=py36h785e9b2_0
+chardet=3.0.4=py36_1003
+click=7.1.2=py_0
+colorama=0.4.3=py_0
+coloredlogs=14.0=py36h9f0ad1d_1
+contextvars=2.4=py_0
+cryptography=2.9.2=py36h1ba5d50_0
+cudatoolkit=10.1.243=h6bb024c_0
+cudnn=7.6.5=cuda10.1_0
+curl=7.71.1=hbc83047_1
+cycler=0.10.0=py36_0
+dbus=1.13.16=hb2f20db_0
+decorator=4.4.2=py_0
+defusedxml=0.6.0=py_0
+distlib=0.3.0=pyh9f0ad1d_0
+docutils=0.16=py36_1
+editdistance=0.5.3=py36h831f99a_1
+entrypoints=0.3=py36_0
+expat=2.2.9=he6710b0_2
+fastprogress=0.2.3=py_0
+filelock=3.0.12=py_0
+fontconfig=2.13.0=h9420a91_0
+freetype=2.10.2=h5ab3b9f_0
+fribidi=1.0.9=h7b6447c_0
+future=0.18.2=py36_1
+gcc_impl_linux-64=7.3.0=habb00fd_1
+gcc_linux-64=7.3.0=h553295d_15
+gitdb=4.0.5=py_0
+gitpython=3.1.3=py_1
+glib=2.63.1=h5a9c865_0
+google-auth=1.17.2=py_0
+google-auth-oauthlib=0.4.1=py_2
+graphite2=1.3.14=h23475e2_0
+graphviz=2.40.1=h21bd128_2
+gspread=3.6.0=pyh9f0ad1d_0
+gst-plugins-base=1.14.0=hbbd80ab_1
+gstreamer=1.14.0=hb453b48_1
+gxx_impl_linux-64=7.3.0=hdf63c60_1
+gxx_linux-64=7.3.0=h553295d_15
+h5py=2.10.0=py36h7918eee_0
+harfbuzz=1.8.8=hffaf4a1_0
+hdf4=4.2.13=h3ca952b_2
+hdf5=1.10.4=hb1b8bf9_0
+httplib2=0.18.1=pyh9f0ad1d_0
+humanfriendly=8.2=py36_0
+icu=58.2=he6710b0_3
+identify=1.4.19=pyh9f0ad1d_0
+idna=2.9=py_1
+immutables=0.14=py36h8c4c3a4_0
+importlib-metadata=1.7.0=py36_0
+importlib_metadata=1.7.0=0
+importlib_resources=1.4.0=py36_0
+intel-openmp=2020.1=217
+ipykernel=5.3.0=py36h5ca1d4c_0
+ipython=7.15.0=py36_0
+ipython_genutils=0.2.0=py36_0
+ipywidgets=7.5.1=py_0
+jedi=0.17.0=py36_0
+jinja2=2.11.2=py_0
+jpeg=9b=h024ee3a_2
+json5=0.9.5=py_0
+jsonschema=3.2.0=py36_0
+jupyter=1.0.0=py36_7
+jupyter_client=6.1.6=py_0
+jupyter_console=6.1.0=py_0
+jupyter_core=4.6.3=py36_0
+jupyterlab=2.1.5=py_0
+jupyterlab_server=1.2.0=py_0
+kiwisolver=1.2.0=py36hfd86e86_0
+krb5=1.18.2=h173b8e3_0
+ld_impl_linux-64=2.33.1=h53a641e_7
+libcurl=7.71.1=h20c2e04_1
+libedit=3.1.20191231=h14c3975_1
+libffi=3.2.1=hd88cf55_4
+libgcc-ng=9.1.0=hdf63c60_0
+libgfortran-ng=7.3.0=hdf63c60_0
+libgpuarray=0.7.6=h14c3975_0
+libnetcdf=4.7.3=hb80b6cc_0
+libpng=1.6.37=hbc83047_0
+libsodium=1.0.18=h7b6447c_0
+libssh2=1.9.0=h1ba5d50_1
+libstdcxx-ng=9.1.0=hdf63c60_0
+libtiff=4.1.0=h2733197_1
+libuuid=1.0.3=h1bed415_2
+libxcb=1.14=h7b6447c_0
+libxml2=2.9.10=he19cac6_1
+lockfile=0.12.2=py36_0
+luigi=2.8.13=py36h9f0ad1d_0
+lz4-c=1.9.2=he6710b0_1
+lzo=2.10=h7b6447c_2
+mako=1.1.3=py_0
+markupsafe=1.1.1=py36h7b6447c_0
+matplotlib=3.2.2=0
+matplotlib-base=3.2.2=py36hef1b27d_0
+mistune=0.8.4=py36h7b6447c_0
+mkl=2020.1=217
+mkl-service=2.3.0=py36he904b0f_0
+mkl_fft=1.1.0=py36h23d657b_0
+mkl_random=1.1.1=py36h0573a6f_0
+mock=4.0.2=py_0
+more-itertools=8.4.0=py_0
+mypy_extensions=0.4.3=py36_0
+nbconvert=5.6.1=py36_0
+nbdime=2.0.0=py_1
+nbformat=5.0.7=py_0
+ncurses=6.2=he6710b0_1
+netcdf4=1.5.3=py36hbf33ddf_0
+ninja=1.10.0=py36hfd86e86_0
+nodeenv=1.4.0=pyh9f0ad1d_0
+notebook=6.0.3=py36_0
+numexpr=2.7.1=py36h423224d_0
+numpy=1.19.1=py36hbc911f0_0
+numpy-base=1.19.1=py36hfa32c7d_0
+oauth2client=4.1.3=py_0
+oauthlib=3.1.0=py_0
+openssl=1.1.1g=h516909a_1
+opt-einsum=3.0.0=py_0
+packaging=20.4=py_0
+pandas=1.0.5=py36h0573a6f_0
+pandoc=2.10=0
+pandocfilters=1.4.2=py36_1
+pango=1.42.4=h049681c_0
+parso=0.7.0=py_0
+pathspec=0.8.0=pyh9f0ad1d_0
+patsy=0.5.1=py36_0
+pcre=8.44=he6710b0_0
+pexpect=4.8.0=py36_0
+pickleshare=0.7.5=py36_0
+pip=20.2.2=py36_0
+pixman=0.40.0=h7b6447c_0
+plotly=4.8.1=py_0
+pluggy=0.13.1=py36_0
+pre-commit=2.5.1=py36h9f0ad1d_0
+prometheus_client=0.5.0=py36_0
+prompt-toolkit=3.0.5=py_0
+prompt_toolkit=3.0.5=0
+ptyprocess=0.6.0=py36_0
+py=1.8.2=py_0
+pyasn1=0.4.8=py_0
+pyasn1-modules=0.2.7=py_0
+pycparser=2.20=py_2
+pygments=2.6.1=py_0
+pygpu=0.7.6=py36heb32a55_0
+pyjwt=1.7.1=py36_0
+pymc3=3.9.1=py_0
+pyopenssl=19.1.0=py_1
+pyparsing=2.4.7=py_0
+pyqt=5.9.2=py36h05f1152_2
+pyro4=4.80=pyh9f0ad1d_0
+pyrsistent=0.16.0=py36h7b6447c_0
+pysocks=1.7.1=py36_0
+pytables=3.6.1=py36h71ec239_0
+pytest=5.4.3=py36_0
+python=3.6.10=hcf32534_1
+python-daemon=2.2.4=py36_1
+python-dateutil=2.8.1=py_0
+python_abi=3.6=1_cp36m
+pytorch=1.4.0=cuda101py36h02f0884_0
+pytz=2020.1=py_0
+pyyaml=5.3.1=py36h7b6447c_1
+pyzmq=19.0.1=py36he6710b0_1
+qt=5.9.7=h5867ecd_1
+qtconsole=4.7.4=py_0
+qtpy=1.9.0=py_0
+readline=8.0=h7b6447c_0
+regex=2020.6.8=py36h7b6447c_0
+requests=2.24.0=py_0
+requests-oauthlib=1.3.0=py_0
+retrying=1.3.3=py36_2
+rsa=4.4=pyh9f0ad1d_0
+scipy=1.5.0=py36h0b6359f_0
+seaborn=0.10.1=py_0
+send2trash=1.5.0=py36_0
+serpent=1.30.2=py_0
+setuptools=49.2.1=py36_0
+sip=4.19.8=py36hf484d3e_0
+six=1.15.0=py_0
+smmap=3.0.4=py_0
+snappy=1.1.8=he6710b0_0
+sqlite=3.32.3=h62c20be_0
+terminado=0.8.3=py36_0
+testpath=0.4.4=py_0
+theano=1.0.4=py36hfd86e86_0
+tk=8.6.10=hbc83047_0
+toml=0.10.1=py_0
+tornado=5.1.1=py36h7b6447c_0
+tqdm=4.46.1=py_0
+traitlets=4.3.3=py36_0
+typed-ast=1.4.1=py36h7b6447c_0
+typing-extensions=3.7.4.2=0
+typing_extensions=3.7.4.2=py_0
+unidecode=1.1.1=py_0
+urllib3=1.22=py36hbe7ace6_0
+virtualenv=20.0.20=py36h9f0ad1d_1
+wcwidth=0.2.4=py_0
+webencodings=0.5.1=py36_1
+wheel=0.34.2=py36_0
+widgetsnbextension=3.5.1=py36_0
+xarray=0.15.1=py_0
+xz=5.2.5=h7b6447c_0
+yaml=0.2.5=h7b6447c_0
+zeromq=4.3.2=he6710b0_2
+zipp=3.1.0=py_0
+zlib=1.2.11=h7b6447c_3
+zstd=1.4.5=h9ceee32_0
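Each non-comment line of the file above has the form `name=version=build`. A small sketch of how such a file could be inspected, for instance to list packages whose build string mentions a GPU variant, is given below. The `Pin` type and `parse_conda_export` are hypothetical helpers, and the input is the file content itself, without the leading `+` of the diff.

```python
from typing import List, NamedTuple


class Pin(NamedTuple):
    name: str
    version: str
    build: str


def parse_conda_export(text: str) -> List[Pin]:
    """Parse `name=version=build` lines, skipping blanks and `#` comments."""
    pins = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split("=")
        if len(parts) != 3:
            raise ValueError(f"unexpected spec line: {line!r}")
        pins.append(Pin(*parts))
    return pins


if __name__ == "__main__":
    sample = "# platform: linux-64\n_pytorch_select=0.2=gpu_0\npymc3=3.9.1=py_0\n"
    pins = parse_conda_export(sample)
    # Names of packages whose build string hints at a GPU/CUDA variant.
    print([p.name for p in pins if "gpu" in p.build or "cuda" in p.build])
```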
diff --git a/data-pipeline/Dockerfile.conda b/data-pipeline/Dockerfile.conda
@@ -0,0 +1,27 @@
+FROM continuumio/anaconda3:2020.07
+
+WORKDIR /app
+
+RUN apt-get update && apt-get install -y \
+    g++ \
+    && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+ENV ENV_NAME=covid
+
+COPY conda-enviroment.yml ./
+RUN conda env create -f conda-enviroment.yml
+
+ENV PATH /opt/conda/envs/${ENV_NAME}/bin:$PATH
+RUN /bin/bash -c "source activate ${ENV_NAME}"
+
+
+RUN curl -sSL https://sdk.cloud.google.com | bash
+ENV PATH $PATH:/root/google-cloud-sdk/bin
+
+COPY epimodel epimodel
+COPY data-dir data-dir
+COPY run_luigi luigi.cfg logging.conf scripts/run_model.sh  ./
+
+ENTRYPOINT ["/bin/bash", "run_model.sh"]
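The image installs the Google Cloud SDK, and the commit message mentions killing the instance from inside the container once the model has finished. The script that does this is not part of this excerpt; a minimal sketch of how the delete command could be assembled from the `INSTANCE_NAME` and `PROJECT_NAME` variables written to `.env` by the workflow follows. `build_delete_command` is a hypothetical helper, and passing the zone as an argument is an assumption, since the zone is not in the env file.

```python
import os
from typing import List, Mapping


def build_delete_command(env: Mapping[str, str], zone: str) -> List[str]:
    """Build the gcloud argv that deletes the VM the container runs on."""
    return [
        "gcloud", "compute", "instances", "delete", env["INSTANCE_NAME"],
        "--project", env["PROJECT_NAME"],
        "--zone", zone,
        "--quiet",  # skip the interactive confirmation prompt
    ]


if __name__ == "__main__":
    # Fall back to placeholder values when run outside the container.
    env = {
        "INSTANCE_NAME": os.environ.get("INSTANCE_NAME", "npi-model"),
        "PROJECT_NAME": os.environ.get("PROJECT_NAME", "example-project"),
    }
    # Only print the command here; running it would delete a real instance.
    print(" ".join(build_delete_command(env, "us-west1-c")))
```

Building the argument list separately from executing it keeps the destructive call easy to test; the caller could pass the list to `subprocess.run` after the model run, successful or not.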
diff --git a/data-pipeline/Dockerfile b/data-pipeline/Dockerfile.poetry