Warn dataset owners when dataset resources are HTTP; replace with HTTPS if not 404 #2985

Open
3 tasks
mogul opened this issue Mar 11, 2021 · 3 comments
Labels
component/catalog Related to catalog component playbooks/roles H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0

Comments

@mogul
Contributor

mogul commented Mar 11, 2021

User Story

In order to maintain trust and accessibility of datasets we index (by preventing browser warnings), we want to ensure catalog.data.gov doesn't generate mixed http/https content.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN a harvest source includes an http:// link
    AND there is an equivalent https:// link that doesn't error
    WHEN a harvest of that data source happens
    THEN a warning about the HTTP link is generated in the harvest report
    AND the HTTPS link is the one that's recorded.
  • GIVEN a harvest source includes an http:// link
    AND there is NO equivalent https:// link that doesn't error
    WHEN a harvest of that data source happens
    THEN a warning about the HTTP link is generated in the harvest report
    AND the HTTP link is not recorded.
  • The next Nessus scan of our services should not present any findings about mixed content
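
A minimal sketch of the two harvest scenarios above, to make the expected behavior concrete. This is not the harvester's actual code: `LinkResult`, `https_responds`, and `check_resource_url` are hypothetical names, and the probe (a HEAD request that treats any failure as "errors") is an assumption about what "doesn't error" means here.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlsplit, urlunsplit
import urllib.error
import urllib.request


@dataclass
class LinkResult:
    url: Optional[str]      # URL to record, or None if the link is dropped
    warning: Optional[str]  # message for the harvest report, if any


def https_responds(url: str, timeout: float = 10.0) -> bool:
    """Hypothetical probe: True if a HEAD request to `url` succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors (404 etc.), TLS failures, and timeouts all count as "errors".
        return False


def check_resource_url(
    url: str, probe: Callable[[str], bool] = https_responds
) -> LinkResult:
    """Apply the acceptance criteria to a single resource link."""
    parts = urlsplit(url)
    if parts.scheme.lower() != "http":
        # Not an http:// link: record as-is, no warning.
        return LinkResult(url=url, warning=None)

    # Same URL with only the scheme swapped (an explicit :80 port is not handled).
    https_url = urlunsplit(parts._replace(scheme="https"))
    if probe(https_url):
        return LinkResult(
            url=https_url,
            warning=f"HTTP link {url} replaced with {https_url}",
        )
    return LinkResult(
        url=None,
        warning=f"HTTP link {url} has no working HTTPS equivalent; not recorded",
    )
```

Passing the probe as a parameter keeps the decision logic testable without network access, for example `check_resource_url(url, probe=lambda u: True)`.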

Background

https://blog.chromium.org/2020/02/protecting-users-from-insecure.html

Lynda reported this.
https://catalog.data.gov/dataset/fws-critical-habitat-for-threatened-and-endangered-species-datasetd55fc
This dataset page links to HTTP resources, and Chrome (and other modern web browsers) will block these downloads.

Example of the problem (at the time of issue creation):
http://ecos.fws.gov/docs/crithab/crithab_all/crithab_all_layers.zip
http://ecos.fws.gov/docs/crithab/crithab_all/crithab_all_shapefiles.zip

Security Considerations (required)

This change prevents catalog.data.gov from ever presenting mixed content.

Sketch

[Notes or a checklist reflecting our understanding of the selected approach]
Note the upstream issue we filed.

@hkdctol hkdctol moved this to Product Backlog in data.gov team board Aug 2, 2022
@jbrown-xentity
Contributor

Related to #3974 and #3476

@hkdctol
Contributor

hkdctol commented Dec 8, 2022

Archiving for now

@hkdctol hkdctol moved this from 📔 Product Backlog to 🧊 Icebox in data.gov team board May 30, 2023
@btylerburton btylerburton moved this from 🧊 Icebox to 📔 Product Backlog in data.gov team board Aug 12, 2024
@btylerburton btylerburton moved this from 📔 Product Backlog to 🧊 Icebox in data.gov team board Aug 12, 2024
@btylerburton btylerburton moved this from 🧊 Icebox to 📔 Product Backlog in data.gov team board Aug 12, 2024
@Bagesary Bagesary moved this to 📔 Product Backlog in data.gov team board Aug 15, 2024
@btylerburton btylerburton added the component/catalog Related to catalog component playbooks/roles label Oct 10, 2024
@btylerburton btylerburton moved this from 📔 Product Backlog to 📥 Queue in data.gov team board Oct 10, 2024
@jbrown-xentity jbrown-xentity added the H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0 label Nov 14, 2024
@jbrown-xentity
Contributor

Planning to implement this as a warning in the Harvesting 2.0 process. Will be an improvement for data providers.

Projects
Status: 📥 Queue
Development

No branches or pull requests

4 participants