docs: metadata spec #49

Merged
newsroomdev merged 3 commits into dev from docs/metadata-spec on Jul 30, 2024

Conversation

newsroomdev
Member

Description

  • Adds anchor links to specific examples for quicker reference over Slack or GitHub.
  • Adds docs/decisions to facilitate discussions around codebase changes and to memorialize them.

How to review

🔍 Feedback on deprecating the scrape method in favor of further scrape_meta test development

🤔 Open-ended Questions:

  • Additional opinions (pros/cons) on the docs/decisions format?
  • Other ideas to help with onboarding?

newsroomdev marked this pull request as ready for review on July 30, 2024 01:19
@newsroomdev
Member Author

Approved by Serdar earlier via Zoom.

newsroomdev merged commit f689f6b into dev on Jul 30, 2024
1 check passed
newsroomdev deleted the docs/metadata-spec branch on July 30, 2024 01:19
newsroomdev added a commit that referenced this pull request Jul 31, 2024
* docs: metadata spec

* docs: remove refs to scrape

---------

Co-authored-by: Gerald Rich <[email protected]>
newsroomdev added a commit that referenced this pull request Aug 1, 2024
* feat: sacramento pd scraper

* fix: isort

* scrape most child pages; todo: get sub-sub pages

* more recursively grab child pages

* inline comments

* fix: fn names, py type

* feat: collect zip & pdfs; todo: handle dupe assets

* chore: ci

* feat: download youtube videos & playlists; remove print stmts

* style: naming

* ops: clean-prefect import clean

* ops: fix runner test (#44)

* ops: fix runner test

* ops: avoid redundant gha runs on prs

---------

Co-authored-by: Gerald Rich <[email protected]>

* ops: current reqs

* naming

* refactor: move around methods

* refactor: add case_num

* Tiny typo fixs

* Ca 43 santa rosa scraper (#45)

* added santa rosa

* Added The scraper for Humboldt with successful pre-commit run (#48)

* Added The scraper for Humboldt with successful pre-commit run
* Required Changes done
* removed download page where identical

* docs: metadata spec (#49)

* docs: metadata spec

* docs: remove refs to scrape

---------

Co-authored-by: Gerald Rich <[email protected]>

* Update contributing.md

* fix: metadata dict types

* fix: import typing_extensions

---------

Co-authored-by: Gerald Rich <[email protected]>
Co-authored-by: Mike Stucka <[email protected]>
Co-authored-by: naumansharifwork <[email protected]>
naumansharifwork pushed a commit to naumansharifwork/clean-scraper that referenced this pull request Oct 23, 2024
naumansharifwork added a commit to naumansharifwork/clean-scraper that referenced this pull request Oct 23, 2024