Support for reading CloudWatch Logs JSON using json codec #5045

Closed
dlvenable opened this issue Oct 10, 2024 · 1 comment · Fixed by #5054
Labels
enhancement New feature or request

@dlvenable
Member

Is your feature request related to a problem? Please describe.

We want to be able to decode the JSON payloads that CloudWatch Logs subscription filters deliver.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html

{
    "owner": "111111111111",
    "logGroup": "CloudTrail/logs",
    "logStream": "111111111111_CloudTrail/logs_us-east-1",
    "subscriptionFilters": [
        "Destination"
    ],
    "messageType": "DATA_MESSAGE",
    "logEvents": [
        {
            "id": "31953106606966983378809025079804211143289615424298221568",
            "timestamp": 1432826855000,
            "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}"
        },
        {
            "id": "31953106606966983378809025079804211143289615424298221569",
            "timestamp": 1432826855000,
            "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}"
        },
        {
            "id": "31953106606966983378809025079804211143289615424298221570",
            "timestamp": 1432826855000,
            "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}"
        }
    ]
}

There are a few things the existing json codec will have trouble with here.

  1. It picks the first JSON array. In this case it will be subscriptionFilters rather than logEvents.
  2. There is useful metadata that never appears in the generated events. For example, logStream.
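
To make the first point concrete, here is a minimal sketch of the difference between "take the first array" and "take the array under a named key". It is not the codec's actual implementation: the parsed JSON is modeled as a plain `Map`, and the class and method names (`ArraySelectionSketch`, `firstArrayKey`, `arrayAt`, `samplePayload`) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: a hypothetical model, not Data Prepper's json codec. */
final class ArraySelectionSketch {

    /** Returns the name of the first root-level key whose value is an array. */
    static Optional<String> firstArrayKey(Map<String, Object> root) {
        for (Map.Entry<String, Object> entry : root.entrySet()) {
            if (entry.getValue() instanceof List) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    /** Returns the array stored under keyName, or an empty list if it is absent or not an array. */
    static List<?> arrayAt(Map<String, Object> root, String keyName) {
        Object value = root.get(keyName);
        if (value instanceof List) {
            return (List<?>) value;
        }
        return List.of();
    }

    /** Builds a trimmed-down copy of the sample payload (two events, message field omitted). */
    static Map<String, Object> samplePayload() {
        Map<String, Object> root = new LinkedHashMap<>(); // preserves document order
        root.put("owner", "111111111111");
        root.put("logGroup", "CloudTrail/logs");
        root.put("logStream", "111111111111_CloudTrail/logs_us-east-1");
        root.put("subscriptionFilters", List.of("Destination"));
        root.put("messageType", "DATA_MESSAGE");
        root.put("logEvents", List.of(
                Map.of("id", "31953106606966983378809025079804211143289615424298221568",
                        "timestamp", 1432826855000L),
                Map.of("id", "31953106606966983378809025079804211143289615424298221569",
                        "timestamp", 1432826855000L)));
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> root = samplePayload();
        // The first array in document order is not the one we want.
        System.out.println(firstArrayKey(root));               // Optional[subscriptionFilters]
        // Selecting by name reaches the log events directly.
        System.out.println(arrayAt(root, "logEvents").size()); // 2
    }
}
```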

Describe the solution you'd like

I'd like the existing json codec to support a few new features.

  1. Allow the pipeline author to choose the key within the JSON to parse.
codec:
  json:
    key_name: logEvents

The key_name option mirrors the equivalent setting in the json output codec:

@JsonProperty("key_name")
@Size(min = 1, max = 2048)
private String keyName = DEFAULT_KEY_NAME;

  2. Allow selecting data from the root of the JSON to include in each object.
codec:
  json:
    include_keys: ['owner', 'logGroup', 'logStream']

This would output events like the following:

{
  "id": "31953106606966983378809025079804211143289615424298221568",
  "timestamp": 1432826855000,
  "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}",
  "owner": "111111111111",
  "logGroup": "CloudTrail/logs",
  "logStream": "111111111111_CloudTrail/logs_us-east-1"
}
  3. Allow selecting data from the root of the JSON to include in the metadata for each object.
codec:
  json:
    include_keys_metadata: ['owner', 'logGroup', 'logStream']
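
As a rough sketch of how the three options could combine, the following models the parsed JSON as a `Map` and expands it into one event per array element. Everything here is hypothetical (`IncludeKeysSketch`, `SketchEvent`, `expand`, `copyKeys` are illustration-only names, not Data Prepper classes), and the rule that an event's own field wins over a root key with the same name is an assumption, since the request does not say how clashes should be handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical model of the proposed options, not Data Prepper code. */
final class IncludeKeysSketch {

    /** Hypothetical stand-in for a pipeline event: a data map plus a metadata map. */
    static final class SketchEvent {
        final Map<String, Object> data;
        final Map<String, Object> metadata;

        SketchEvent(Map<String, Object> data, Map<String, Object> metadata) {
            this.data = data;
            this.metadata = metadata;
        }
    }

    /** Emits one event per object in root[keyName], enriched with the selected root keys. */
    static List<SketchEvent> expand(Map<String, Object> root, String keyName,
                                    List<String> includeKeys, List<String> includeKeysMetadata) {
        List<SketchEvent> events = new ArrayList<>();
        Object array = root.get(keyName);
        if (!(array instanceof List)) {
            return events; // key missing or not an array: nothing to emit
        }
        for (Object element : (List<?>) array) {
            if (!(element instanceof Map)) {
                continue; // this sketch skips non-object array elements
            }
            Map<String, Object> data = new LinkedHashMap<>();
            for (Map.Entry<?, ?> field : ((Map<?, ?>) element).entrySet()) {
                data.put(String.valueOf(field.getKey()), field.getValue());
            }
            copyKeys(root, includeKeys, data);             // include_keys -> event data
            Map<String, Object> metadata = new LinkedHashMap<>();
            copyKeys(root, includeKeysMetadata, metadata); // include_keys_metadata -> metadata
            events.add(new SketchEvent(data, metadata));
        }
        return events;
    }

    /** Copies the named root keys into target; existing target fields are kept (assumption). */
    private static void copyKeys(Map<String, Object> root, List<String> keys,
                                 Map<String, Object> target) {
        for (String key : keys) {
            if (root.containsKey(key)) {
                target.putIfAbsent(key, root.get(key));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("owner", "111111111111");
        root.put("logGroup", "CloudTrail/logs");
        root.put("logStream", "111111111111_CloudTrail/logs_us-east-1");
        root.put("logEvents", List.of(Map.of("timestamp", 1432826855000L)));

        List<SketchEvent> events = expand(root, "logEvents",
                List.of("owner", "logGroup"), List.of("logStream"));
        System.out.println(events.get(0).data);
        System.out.println(events.get(0).metadata);
    }
}
```

In this model the same root key can be listed under both options, in which case it appears in the event data and in the metadata.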
@sb2k16
Member

sb2k16 commented Oct 10, 2024

I would like to work on this issue. Could you please assign it to me?

@dlvenable dlvenable added enhancement New feature or request and removed untriaged labels Oct 15, 2024
@dlvenable dlvenable added this to the v2.10 milestone Oct 15, 2024