Add docs site for archives.getmyvax.org (#1570)
We’ve needed docs for https://archives.getmyvax.org for a while and never got around to writing them. Now that the API and its accompanying docs are disappearing, this adds docs that live directly on `archives.getmyvax.org` (so whatever else happens to other sites, the docs stay alongside the archive data). The docs page got quite long once I’d pasted in all the schemas, sources, ID systems, and so on, so I split it into a few pages and used MkDocs to generate a site from the Markdown files. All the source for the docs site is in the `archives` directory. There’s a workflow to build the site and upload it to the `/docs` folder in the `univaf-data-snapshots` S3 bucket where we serve the archives site from. Part of #1550.
Showing 16 changed files with 1,043 additions and 0 deletions.
@@ -0,0 +1,75 @@
name: Deploy Archives Docs

on:
  pull_request:
    paths:
      - "archives/**"
  push:
    branches:
      - main
    paths:
      - "archives/**"

  workflow_dispatch: {}

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
          cache: "pip"

      # No venv here: caching is easier, the environment is ephemeral anyway.
      - name: Install dependencies
        run: |
          cd archives
          pip install -r requirements.txt
      - name: Build Archive Docs
        run: |
          cd archives
          mkdocs build
          # Combine with index redirect page
          mkdir site
          mv dist site/docs
          cp index.html site/
      - uses: actions/upload-artifact@v3
        with:
          name: archives-docs
          path: archives/site/

  deploy:
    if: github.ref == 'refs/heads/main'
    needs:
      - build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: archives-docs

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.ARCHIVE_DOCS_AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.ARCHIVE_DOCS_AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2

      - name: Copy files to S3
        env:
          ARCHIVE_BUCKET: univaf-data-snapshots
        run: |
          aws s3 cp index.html "s3://${ARCHIVE_BUCKET}/index.html"
          aws s3 sync docs "s3://${ARCHIVE_BUCKET}/docs/"
@@ -7,6 +7,7 @@ lib-cov

# Dependency directory
node_modules
.venv

# Editors
.idea
@@ -0,0 +1,32 @@
# Historical Archives Documentation

This directory hosts the source code for public documentation about UNIVAF's historical archives, which live at https://archives.getmyvax.org/. The docs site is built with [MkDocs](https://www.mkdocs.org/) and the [Material Theme](https://squidfunk.github.io/mkdocs-material/), and gets published to `archives.getmyvax.org/docs/` (so the docs are clearly separated from the actual archive data).

There is also an `index.html` that redirects from `/` to `/docs/`, sending browsers that visit `archives.getmyvax.org/` to the docs.


## Setup

1. Make sure you have a recent version of Python 3 (MkDocs is Python-based).

2. Run `./setup.sh` to set up a Python virtual environment and install the dependencies.

    This will set up the virtual environment in a `.venv` folder inside this folder, then use `pip` to install the dependencies from `requirements.txt`.

3. To use MkDocs, run `./run-mkdocs.sh <your> <args> <here>`.

    - To run the development server: `./run-mkdocs.sh serve`
    - To create a static build: `./run-mkdocs.sh build`

    You can also activate the virtual environment and run mkdocs directly instead of using the helper:

    ```bash
    # Activate the Python virtual environment:
    source ./.venv/bin/activate

    # Run mkdocs:
    mkdocs serve

    # When you're done, deactivate the virtual environment:
    deactivate
    ```
@@ -0,0 +1,63 @@
# UNIVAF Historical Data Archives

This site hosts historical data from UNIVAF, U.S. Digital Response’s COVID-19 Vaccine Appointment Finder API, <https://getmyvax.org/>. **The API is no longer live and was shut down on June 15, 2023.**

The historical data in this archive includes:

- A daily copy of each of the three main tables in the database (starting on June 3, 2021 and ending on June 15, 2023).
- A copy of every update to a location’s availability, grouped into one file for each day (starting on May 19, 2021 and ending on June 15, 2023).
- A final backup of the database in Postgres SQL format (June 16, 2023).
- A final backup of the database in SQLite format (June 16, 2023).

Keep in mind that, since these are historical archives, the format of the data has changed over time, and data files from different dates may contain different fields. Historical service outages and incidents also affect the data on some days.

Also note that UNIVAF began operation in March 2021, but did not start archiving historical data until May 2021.

For an example of analyzing this data, see <https://github.com/usdigitalresponse/appointment-data-insights>.


## Loading Data Files

Except for the final backups, all files are stored as gzipped, [newline-delimited JSON (NDJSON)](http://ndjson.org/) files.
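
As a rough illustration of reading one of these files after downloading it, here is a minimal Python sketch that uses only the standard library. The function name `read_ndjson_gz` and the local file path are hypothetical and are not part of the archive or its tooling.

```python
import gzip
import json
from typing import Any, Iterator


def read_ndjson_gz(path: str) -> Iterator[dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a gzipped NDJSON file."""
    # "rt" makes gzip decompress and decode to text in one step.
    with gzip.open(path, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines, such as a trailing newline
                yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical local path: download a file from the archive first.
    for record in read_ndjson_gz("provider_locations-2021-10-01.ndjson.gz"):
        print(sorted(record.keys()))
        break  # only show the first record
```

Reading line by line keeps memory use low, which may matter for larger files. Because fields vary between dates, it is probably safer to use `record.get("field")` than to assume a field is present.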


### Database Copies

Daily copies of the `provider_locations`, `external_ids`, and `availability` tables are stored in a separate directory for each table, with a separate file for each day. Files are named like:

```
https://archives.getmyvax.org/<table>/<table>-<date>.ndjson.gz
```

For example, for the contents of the `provider_locations` table on October 1, 2021, download:

```
https://archives.getmyvax.org/provider_locations/provider_locations-2021-10-01.ndjson.gz
```

Each record in the table is a separate JSON line in the file.
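
The naming pattern above can be sketched as a small Python helper. The names `ARCHIVE_BASE_URL`, `DAILY_TABLES`, and `table_snapshot_url` are made up for this illustration, and the `YYYY-MM-DD` date format is inferred from the example URL rather than from a formal specification.

```python
from datetime import date

# Assumed constants for illustration; taken from the URL pattern described above.
ARCHIVE_BASE_URL = "https://archives.getmyvax.org"
DAILY_TABLES = ("provider_locations", "external_ids", "availability")


def table_snapshot_url(table: str, day: date) -> str:
    """Build the URL of the daily copy of `table` for `day`."""
    if table not in DAILY_TABLES:
        raise ValueError(f"Unknown table {table!r}; expected one of {DAILY_TABLES}")
    # date.isoformat() produces YYYY-MM-DD, matching the example file names.
    return f"{ARCHIVE_BASE_URL}/{table}/{table}-{day.isoformat()}.ndjson.gz"


if __name__ == "__main__":
    print(table_snapshot_url("external_ids", date(2022, 1, 15)))
```

The helper does not check that a file exists for the requested day, so a date outside the archived range would still produce a URL.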


### Availability Update Logs

In addition to daily copies of the database, you can access lists of every single update to a location’s availability in the `/availability_log` directory. Updates are grouped by day, and files are named like:

```
https://archives.getmyvax.org/availability_log/availability_log-<date>.ndjson.gz
```

For example, to get every update on October 1, 2021, download:

```
https://archives.getmyvax.org/availability_log/availability_log-2021-10-01.ndjson.gz
```

Each update is a separate line in the file. The schema of each record is the same as the `availability` table, but in most cases, _only the fields that changed in that update are filled in_. To get a complete picture of the availability of a location at a given time, you will need to scan backwards in time through the availability logs to find the last complete record for the given source and location ID.
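
One way to approximate that reconstruction is to replay updates oldest first and fold each partial update into the last known state. The Python sketch below rests on several assumptions that are not confirmed here: that the key fields are named `source` and `location_id`, that unchanged fields are either absent or `null`, and that the updates are supplied in chronological order, starting early enough to include a complete record for each location. The name `replay_availability` is hypothetical.

```python
from typing import Any, Iterable

Record = dict[str, Any]


def replay_availability(updates: Iterable[Record]) -> dict[tuple[str, str], Record]:
    """Fold partial updates (oldest first) into the latest known state per location.

    Assumes each update has "source" and "location_id" fields (hypothetical names)
    and that a missing or null field means "unchanged in this update".
    """
    state: dict[tuple[str, str], Record] = {}
    for update in updates:
        key = (update["source"], update["location_id"])
        merged = state.setdefault(key, {})
        # Only overwrite fields that this update actually filled in.
        merged.update({name: value for name, value in update.items() if value is not None})
    return state


if __name__ == "__main__":
    # Made-up records for illustration only; real records have more fields.
    log = [
        {"source": "example", "location_id": "1", "available": "YES", "checked_at": "t1"},
        {"source": "example", "location_id": "1", "checked_at": "t2"},
    ]
    print(replay_availability(log)[("example", "1")])
```

Treating `null` as "unchanged" would hide any update that deliberately cleared a field, so check the schema documentation before relying on this for analysis.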


### Final Database Backups

A final copy of the database after the service stopped updating is available in Postgres-compatible SQL format and as a SQLite 3 file. Both are gzipped:

- Postgres: `https://archives.getmyvax.org/sql/univaf_postgres_dump-2023-06-16.sql.gz`
- SQLite: `https://archives.getmyvax.org/sql/univaf_sqlite-2023-06-16.sqlite3.gz`
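
As a hedged sketch of working with the SQLite backup, the Python below decompresses an already-downloaded `.sqlite3.gz` file and lists the tables it contains, using only the standard library. The helper names `decompress_gz` and `list_tables` and the local file paths are hypothetical; the sketch makes no assumption about which tables the backup holds.

```python
import gzip
import shutil
import sqlite3


def decompress_gz(src_path: str, dest_path: str) -> None:
    """Stream-decompress a gzip file to dest_path without loading it all into memory."""
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite database file, sorted by name."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Hypothetical local paths: download the SQLite backup first.
    decompress_gz("univaf_sqlite-2023-06-16.sqlite3.gz", "univaf.sqlite3")
    print(list_tables("univaf.sqlite3"))
```

SQLite needs the decompressed file on disk, which is why the sketch writes it out instead of reading the gzip stream directly.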