This appears to stem from our handling of the `meeting_id` variable, which is used in `Asset.download` to generate the file name.
We need to debug this for this locale and/or adopt an alternate convention for standardizing file names in `CivicPlusSite` (and generally).
An ideal solution would name stored file artifacts using a combination of place, agency, date of meeting, committee type, document type and document format (i.e. the file suffix). For example:
# Note, place may need more careful handling
/tmp/civic_scraper/assets/ca_belvedere/20210604_city_council_agenda_packet.pdf
/tmp/civic_scraper/assets/ca_belvedere/20210604_city_council_agenda_packet.html
We likely won't have all of this information available for every platform, so we may need platform-specific solutions.
Or we can go in a totally different direction and generate unique names based on a file hash, then use asset metadata (e.g. stored in the metadata CSV) to link each file to its unique name.
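As a rough illustration of the naming convention proposed above, here is a minimal sketch. The function names (`slugify`, `standardized_asset_path`) and their parameters are hypothetical and not part of civic-scraper; the sketch also leaves out the agency component, matching the example paths, and assumes every field is already known for the asset.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def standardized_asset_path(
    base_dir: str,
    place: str,
    meeting_date: date,
    committee: str,
    doc_type: str,
    suffix: str,
) -> str:
    """Build a path like <base_dir>/<place>/<YYYYMMDD>_<committee>_<doc_type>.<suffix>.

    Hypothetical helper: all argument names are assumptions for illustration.
    """
    file_name = "_".join(
        [meeting_date.strftime("%Y%m%d"), slugify(committee), slugify(doc_type)]
    )
    extension = suffix.lstrip(".").lower()
    return str(PurePosixPath(base_dir) / slugify(place) / f"{file_name}.{extension}")


path = standardized_asset_path(
    "/tmp/civic_scraper/assets",
    "CA Belvedere",
    date(2021, 6, 4),
    "City Council",
    "Agenda Packet",
    "pdf",
)
print(path)
```

A platform that lacks one of these fields would need its own fallback (for example, a placeholder segment), which this sketch does not attempt.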
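The hash-based alternative could look roughly like the sketch below. It assumes SHA-256 over the downloaded bytes; the helper names and the metadata column names (`url`, `asset_name`, `file_name`) are hypothetical and do not reflect the actual metadata CSV layout.

```python
import hashlib


def hashed_file_name(content: bytes, suffix: str) -> str:
    """Return a name derived from the SHA-256 digest of the file content."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{digest}.{suffix.lstrip('.').lower()}"


def metadata_row(url: str, asset_name: str, content: bytes, suffix: str) -> dict:
    """Build one metadata record linking an asset to its hashed file name.

    The keys used here are assumptions for illustration only.
    """
    return {
        "url": url,
        "asset_name": asset_name,
        "file_name": hashed_file_name(content, suffix),
    }


row = metadata_row(
    "https://example.com/agenda.pdf",  # placeholder URL
    "City Council Agenda Packet",
    b"example file bytes",
    "pdf",
)
print(row["file_name"])
```

One trade-off of this design: identical files downloaded from different URLs map to the same name, so the metadata is the only place where the two assets remain distinguishable.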
On a test scrape for Belvedere, CA for roughly June through early August, the scrape generated less-than-helpful names for downloaded files: