Export results transformation (subtext and hashed) #582

Merged
merged 30 commits into Samsung:main from subhashtext
Aug 14, 2024

Conversation

@babenek babenek commented Jul 14, 2024

Description

Please include a summary of the change and which issue is fixed.

  • Add --subtext to shrink long lines in the report and avoid extra memory usage when the JSON is loaded
  • Add --hashed to hide sensitive information in the report
  • Use absolute value positions for ML training (see CredData#160, "Absolute positions of value")
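
The two report transformations described above can be illustrated with a minimal sketch. This is not the code from this pull request: the helper names `shrink_line` and `hashed_value`, the `margin` parameter, and the choice of SHA-256 are all assumptions made for illustration only.

```python
import hashlib


def hashed_value(value: str) -> str:
    """Hypothetical helper: replace a raw value with its SHA-256 hex digest,
    so a report can be compared or shared without exposing the value itself."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def shrink_line(line: str, start: int, end: int, margin: int = 20) -> str:
    """Hypothetical helper: keep only the value at line[start:end] plus up to
    `margin` characters of context on each side, so that very long lines do
    not inflate the exported report."""
    left = max(0, start - margin)
    right = min(len(line), end + margin)
    return line[left:right]


# Example: a long line with a value in the middle (made-up data).
line = "x" * 100 + "password=hunter2" + "y" * 100
start = 100 + len("password=")  # absolute offset where the value begins
end = start + len("hunter2")    # absolute offset just past the value

snippet = shrink_line(line, start, end, margin=10)
digest = hashed_value(line[start:end])
```

In this sketch the snippet keeps the value and a small window of context, while the digest stands in for the value wherever the raw text must not appear. How the actual --subtext and --hashed options choose the window size or the hash function is not stated in this conversation.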

How has this been tested?

Please describe the tests that you ran to verify your changes.

@codecov-commenter

Codecov Report

Attention: Patch coverage is 85.36585% with 6 lines in your changes missing coverage. Please review.

Project coverage is 90.12%. Comparing base (31dcd1d) to head (31e74e2).
Report is 1 commit behind head on main.

Files                                  Patch %   Lines
credsweeper/credentials/line_data.py   73.33%    3 Missing and 1 partial ⚠️
credsweeper/credentials/candidate.py   75.00%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #582      +/-   ##
==========================================
- Coverage   90.15%   90.12%   -0.04%     
==========================================
  Files         129      131       +2     
  Lines        4642     4708      +66     
  Branches      752      759       +7     
==========================================
+ Hits         4185     4243      +58     
- Misses        304      310       +6     
- Partials      153      155       +2     


@babenek babenek changed the title from "Subhashtext" to "Export results transformation (subtext and hashed)" Jul 14, 2024
@babenek babenek marked this pull request as ready for review July 14, 2024 07:42
@babenek babenek requested a review from a team as a code owner July 14, 2024 07:42
@xDizzix xDizzix left a comment

Implementation is rather tangled. The solution for a quite rare task is not worth the overcomplication of this part of the code.

@babenek babenek marked this pull request as draft July 16, 2024 10:46
@babenek babenek force-pushed the subhashtext branch 2 times, most recently from f2344ae to d26ed82 Compare July 17, 2024 14:49
Review thread on tests/test_main.py (outdated, resolved)
@babenek babenek requested a review from xDizzix August 14, 2024 07:07
@babenek babenek marked this pull request as ready for review August 14, 2024 08:24
babenek commented Aug 14, 2024

> Implementation is rather tangled. The solution for a quite rare task is not worth the overcomplication of this part of the code.

--subtext reduces the JSON report size of unfiltered data for ML training.
--hashed allows a report on sensitive data to be used for BM without disclosing the data.

@babenek babenek merged commit 5e2bf59 into Samsung:main Aug 14, 2024
27 checks passed
@babenek babenek deleted the subhashtext branch August 14, 2024 11:09
4 participants