SSL Verification Disabled in WebPageHelper #98

Open
2 tasks
rmcc3 opened this issue Jul 22, 2024 · 3 comments

Comments

@rmcc3
Contributor

rmcc3 commented Jul 22, 2024

Description

In the file utils.py, the WebPageHelper class disables SSL verification when making HTTP requests:

self.httpx_client = httpx.Client(verify=False)

This is a significant security issue that should be addressed.

Why this is problematic

  1. Man-in-the-Middle (MITM) Attacks: Disabling SSL verification makes the application vulnerable to MITM attacks. An attacker could intercept the communication between the application and the web servers it's querying, potentially injecting malicious content.

  2. Compromised Knowledge Integrity: For a knowledge curation system like STORM, the integrity of the information is important. If an attacker can intercept and modify the content being retrieved, they could inject false or misleading information into the knowledge base. This could lead to the generation of inaccurate or even harmful content.

  3. Violation of Security Best Practices: Disabling SSL verification goes against security best practices and could potentially violate compliance requirements if the system is handling any sensitive or regulated data.

  4. Propagation of Insecure Practices: If users or other developers see this in the codebase, they might assume it's an acceptable practice and replicate it elsewhere.

How it affects knowledge generation

  1. Unreliable Sources: The system may unknowingly use information from compromised or spoofed websites, leading to the generation of unreliable or false knowledge.

  2. Inconsistent Information: If the same query yields different results due to MITM attacks, it could lead to inconsistencies in the generated knowledge.

Proposed Solution

  1. Remove the verify=False parameter from the httpx.Client() initialization.
  2. Rely on proper SSL certificate validation; httpx verifies certificates by default when verify is left unset.
  3. If there are specific cases where self-signed certificates need to be handled, implement a more secure solution such as certificate pinning or providing a custom certificate authority.
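
The steps above could be sketched roughly as follows. This is a minimal illustration, not the actual WebPageHelper code: the names build_ssl_context, make_client, and extra_ca_file are hypothetical, and any other arguments the real httpx.Client call passes are not shown. It assumes an extra CA (for example, one that signed an internal self-signed certificate) should be trusted in addition to the system defaults rather than turning verification off.

```python
import ssl
from typing import Optional


def build_ssl_context(extra_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a verifying SSL context (hypothetical helper).

    ssl.create_default_context() enables certificate verification and
    hostname checking and loads the system's default CA certificates.
    """
    ctx = ssl.create_default_context()
    if extra_ca_file is not None:
        # Trust one additional CA bundle on top of the defaults,
        # instead of disabling verification for everything.
        ctx.load_verify_locations(cafile=extra_ca_file)
    return ctx


def make_client(extra_ca_file: Optional[str] = None):
    """Build an httpx client that keeps verification enabled (hypothetical helper)."""
    import httpx  # third-party dependency, imported lazily

    # httpx.Client accepts an ssl.SSLContext for the verify argument.
    return httpx.Client(verify=build_ssl_context(extra_ca_file))
```

With no extra CA needed, plain httpx.Client() gives verified connections as well; the explicit context is only useful when a custom certificate authority has to be trusted.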

Action Items

  • Remove verify=False from httpx.Client() initialization
  • Test the system with proper SSL verification enabled

@shaoyijia
Collaborator

This issue is reasonable. Any plan to help resolve it?

@rmcc3
Contributor Author

rmcc3 commented Aug 1, 2024

> This issue is reasonable. Any plan to help resolve it?

Are there any major changes planned to how networking will be done? If not, I can go ahead and see what I can do.

@shaoyijia
Collaborator

No, we won't touch the networking part right now.
