[processor/redaction] Add support for keys patterns and ability to specify mask string #35830

Open
krokwen-tftc opened this issue Oct 16, 2024 · 1 comment
Labels
enhancement New feature or request needs triage New item requiring triage processor/redaction Redaction processor

Comments

@krokwen-tftc

Component(s)

processor/redaction

Is your feature request related to a problem? Please describe.

I want to redact my HTTP access logs.
We log full request data, including POST data, cookies, and so on.
Many fields contain tokens that I want to hide, and their keys share common patterns such as 'token' or 'apiKey'.
Collecting every variation of these keys and their value formats would be complicated.
I also don't want to remove these keys from the log attributes, because it's important to see whether a field exists.

In addition, a hashing option could be useful: hashing the matched value instead of replacing it with a mask string would make it possible to correlate logs by identical hash values without exposing the actual value.

Describe the solution you'd like

Masking option

processors:
  redaction/nginx_access_redact_secrets:
    allow_all_keys: true
    blocked_keys_patterns:
      - ".*token.*"
      - ".*api_key.*"
      - ".*apiKey.*"
      - ".*password.*"
    mask_string: "<redacted>"

The resulting attributes would look like request_args.secret_client_token: <redacted>
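
To make the intended masking semantics concrete, here is a minimal Python sketch. It is not the redaction processor's implementation: the function name `mask_blocked_keys` is hypothetical, and it assumes that patterns are matched against attribute keys with an unanchored regex search and that matching keys are kept while their values are replaced.

```python
import re


def mask_blocked_keys(attributes, blocked_key_patterns, mask_string="<redacted>"):
    """Return a copy of attributes with values of matching keys masked.

    Hypothetical helper illustrating the proposed blocked_keys_patterns and
    mask_string options. Assumption: a key matches if any pattern is found
    anywhere in it (unanchored, case-sensitive search).
    """
    compiled = [re.compile(pattern) for pattern in blocked_key_patterns]
    masked = {}
    for key, value in attributes.items():
        if any(rx.search(key) for rx in compiled):
            # Keep the key so its presence stays visible; hide only the value.
            masked[key] = mask_string
        else:
            masked[key] = value
    return masked


patterns = [".*token.*", ".*api_key.*", ".*apiKey.*", ".*password.*"]
attrs = {
    "request_args.secret_client_token": "abc123",
    "request_args.page": "2",
}
masked = mask_blocked_keys(attrs, patterns)
```

In this sketch `request_args.secret_client_token` keeps its key but carries the mask string, while `request_args.page` passes through unchanged.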

Or hashing option

processors:
  redaction/nginx_access_hash_secrets:
    allow_all_keys: true
    blocked_keys_patterns:
      - ".*token.*"
      - ".*api_key.*"
      - ".*apiKey.*"
      - ".*password.*"
    hashing: sha1 # by default set 'none'

The resulting attributes would look like request_args.secret_client_token: <sha1 sum>
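
The hashing variant can be sketched the same way. Again, this is an illustration under assumptions, not the processor's code: `hash_blocked_keys` is a hypothetical name, values are assumed to be stringified and UTF-8 encoded before hashing, and the digest is assumed to be emitted as lowercase hex.

```python
import hashlib
import re


def hash_blocked_keys(attributes, blocked_key_patterns, hashing="sha1"):
    """Return a copy of attributes with values of matching keys hashed.

    Hypothetical helper illustrating the proposed hashing option.
    Assumptions: unanchored key matching, str() + UTF-8 encoding of the
    value, hex-encoded digest output.
    """
    compiled = [re.compile(pattern) for pattern in blocked_key_patterns]
    result = {}
    for key, value in attributes.items():
        if any(rx.search(key) for rx in compiled):
            data = str(value).encode("utf-8")
            # hashlib.new accepts an algorithm name such as "sha1" or "sha256".
            result[key] = hashlib.new(hashing, data).hexdigest()
        else:
            result[key] = value
    return result


patterns = [".*token.*", ".*api_key.*", ".*apiKey.*", ".*password.*"]
first = hash_blocked_keys({"request_args.secret_client_token": "abc123"}, patterns)
second = hash_blocked_keys({"cookies.session_token": "abc123"}, patterns)
```

Because equal inputs produce equal digests, the two records above can be correlated by their hashed values even though the original token never appears in the output. Note that an unsalted hash of a low-entropy value can be brute-forced, so this trades some secrecy for traceability.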

Describe alternatives you've considered

Using the transform processor:
But that is more complicated, and it only works because of a bug-feature in the replace_all_patterns OTTL function, for example:

...
statements:
# according to the docs this shouldn't work (it should replace keys, not values), but it does in practice (v0.103)
  - replace_all_patterns(attributes["request_args"]["query"], "key", ".*token.*", "redacted", SHA1, "redacted %s") where IsMap(attributes["request_args"]["query"])
  - replace_all_patterns(attributes["request_args"]["query"], "value", "^redacted.*", "<redacted>") where IsMap(attributes["request_args"]["query"])
# yes, there is a nested map, but it's merged into attributes root later
...

I also believe this would run faster as dedicated functionality in the redaction processor than as a statement pipeline in the transform processor.

Additional context

No response

@krokwen-tftc krokwen-tftc added enhancement New feature or request needs triage New item requiring triage labels Oct 16, 2024
@github-actions github-actions bot added the processor/redaction Redaction processor label Oct 16, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.
