Update contributor docs to reflect San Diego reference implementation
zstumgoren committed Apr 13, 2024
1 parent fdd4e8b commit 021b145
Showing 1 changed file with 7 additions and 9 deletions.
16 changes: 7 additions & 9 deletions docs/contributing.md
@@ -120,16 +120,14 @@ class Site:
     def scrape_meta(self, throttle=0):
         # 1. Scrape metadata about available files, making sure to download and save file
         #    artifacts such as HTML pages along the way (we recommend using Cache.download)
-        # 2. Generate a metadata CSV and store in the cache
-        # 3. Return the path to the metadata CSV
+        # 2. Generate a metadata JSON file and store in the cache
+        # 3. Return the path to the metadata JSON
         pass

-    def scrape(self, metadata_csv):
-        # 1. Use the metadata CSV generated by `scrape_meta` to download available files
-        #    to the cache directory (once again, check out Cache.download).
-        #    artifacts such as HTML "index" pages along the way in the cache
-        # 2. Generate a metadata CSV and store in the cache
-        # 3. Return the path to the metadata CSV
+    def scrape(self, throttle, filter):
+        # 1. Use the metadata JSON generated by `scrape_meta` to download available files
+        #    to the cache/assets directory (once again, check out Cache.download).
+        # 2. Return a list of paths to downloaded files
         pass
```
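The revised two-step contract can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's reference implementation: the `fetch` callable is a hypothetical stand-in for `Cache.download` (whose signature is not shown in this diff), the index URL is a placeholder, and the metadata file name, the `asset_url`/`name` fields, the substring behavior of `filter`, and the default argument values are all invented for the example.

```python
import json
import re
import time
from pathlib import Path
from typing import Callable, List

AGENCY_SLUG = "ca_san_diego_pd"  # slug taken from the docs' own example
INDEX_URL = "https://example.com/records/index.html"  # placeholder, not a real source


class Site:
    """Illustrative scraper skeleton; names and layout are assumptions."""

    def __init__(self, cache_dir, fetch: Callable[[str], bytes]):
        # `fetch` is a hypothetical hook standing in for Cache.download.
        self.site_dir = Path(cache_dir) / AGENCY_SLUG
        self.fetch = fetch
        # Assumed location of the metadata file inside the site cache folder.
        self.metadata_path = self.site_dir / f"{AGENCY_SLUG}.json"

    def scrape_meta(self, throttle=0) -> Path:
        # 1. Download the index page and save it to the cache unedited.
        #    (Only one request is made here, so `throttle` goes unused.)
        self.site_dir.mkdir(parents=True, exist_ok=True)
        raw = self.fetch(INDEX_URL)
        (self.site_dir / "index.html").write_bytes(raw)
        # 2. Build one metadata record per linked PDF and store it as JSON.
        urls = re.findall(r'href="([^"]+\.pdf)"', raw.decode("utf-8"))
        metadata = [{"asset_url": u, "name": u.rsplit("/", 1)[-1]} for u in urls]
        self.metadata_path.write_text(json.dumps(metadata, indent=2))
        # 3. Return the path to the metadata JSON.
        return self.metadata_path

    def scrape(self, throttle=0, filter="") -> List[Path]:
        # 1. Read the metadata JSON and download each file into <site>/assets.
        metadata = json.loads(self.metadata_path.read_text())
        assets_dir = self.site_dir / "assets"
        assets_dir.mkdir(parents=True, exist_ok=True)
        downloaded = []
        for entry in metadata:
            # Assumed semantics: keep entries whose name contains `filter`.
            if filter and filter not in entry["name"]:
                continue
            target = assets_dir / entry["name"]
            target.write_bytes(self.fetch(entry["asset_url"]))
            downloaded.append(target)
            time.sleep(throttle)  # pause between requests
        # 2. Return the list of paths to downloaded files.
        return downloaded
```

Because `scrape` no longer receives a metadata path, this sketch has it look the JSON file up at a fixed location in the site cache; the real project may resolve it differently.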

@@ -139,7 +137,7 @@ When creating a scraper, there are a few rules of thumb.
    should be saved to the cache unedited. We aim to store pristine
    versions of our source data.
 1. The metadata about source files should be stored in a single
-   CSV file. Any intermediate files generated during file/data processing should
+   JSON file. Any intermediate files generated during file/data processing should
    not be written to the data folder. Such files should be written to
    the cache directory.
 1. Files should be cached in a site-specific cache folder using the agency slug name: `ca_san_diego_pd`.
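
As a rough illustration of the caching rules above, the helpers below build a site-specific cache folder from the agency slug and write downloaded bytes without modification. Both function names (`site_cache_dir`, `save_pristine`) are hypothetical and are not part of the project's API; the project's own `Cache` class presumably covers the same ground.

```python
from pathlib import Path


def site_cache_dir(cache_root, agency_slug: str) -> Path:
    """Return (and create) the site-specific cache folder, e.g. <cache_root>/ca_san_diego_pd."""
    path = Path(cache_root) / agency_slug
    path.mkdir(parents=True, exist_ok=True)
    return path


def save_pristine(cache_root, agency_slug: str, filename: str, content: bytes) -> Path:
    """Write bytes exactly as received: no decoding, trimming, or cleanup."""
    target = site_cache_dir(cache_root, agency_slug) / filename
    target.write_bytes(content)
    return target
```

Writing raw bytes rather than decoded text keeps the cached copy identical to what the source served, which is the point of the "pristine versions" rule.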