
Kubernetes metadata no longer visible to plugin #50

Open
flynnecf2 opened this issue Jun 20, 2019 · 10 comments

Comments

@flynnecf2

Before upgrading to 1.4.1, we used to dynamically set our Sumo source/host/category based on the K8s metadata, as follows:

<match **>
  @type sumologic
  endpoint https://endpoint1.collection.us2.sumologic.com/receiver/v1/http/XXXX
  log_format json
  source_category ${record['kubernetes']['namespace_name']}
  source_name ${record['kubernetes']['container_name']}
  source_host ${record['kubernetes']['pod_name']}
  open_timeout 10
</match>

With 1.4.1, we're getting the above hardcoded string values (i.e. ...) instead of the dynamic K8s metadata. I've noticed that I'm still able to access the tag values by using something like ${tag[n]} (it used to be tag_parts[n], but that no longer works either). Is this intentional, expected, or am I doing something wrong?

@frankreno
Contributor

@flynnecf2 any reason you are not using our Kubernetes FluentD Plugin to send the data? It offers this same functionality.

https://github.com/SumoLogic/fluentd-kubernetes-sumologic

@frankreno
Contributor

Also, what version were you on before this, so we can chase it down?

@malcolmrebughini

I'm running into the same issue. @frankreno the fluentd plugin has now been deprecated. Is there any workaround for this? I cannot use https://github.com/SumoLogic/sumologic-kubernetes-collection

@frankreno
Contributor

@malcolmrebughini - Can you share why you cannot use the new collection you linked to?

@malcolmrebughini

@frankreno it is an existing fluentd configuration and I would prefer to keep changes on the cluster to a minimum.

@frankreno
Contributor

Can you please share your config? I can take a look and see what I can determine.

Unfortunately, we have no plans to modify this plugin to support this kind of dynamic generation at the moment. We would of course welcome a PR.

Our new collection process preserves the metadata of course, and sends the data via HTTP header instead of in the log line, which is also more cost-effective on the bytes ingested and has many other benefits. I would definitely recommend upgrading to it when you can, as it is now the supported solution.

@malcolmrebughini

Here's the config:

<system>
  log_level debug
</system>

<source>
  @type tail
  @label @containers
  path /var/log/containers/*.log
  exclude_path ["/var/log/containers/cloudwatch-agent*", "/var/log/containers/fluentd*", "/var/log/containers/kube*", "/var/log/containers/monitoring*", "/var/log/containers/calico*"]
  pos_file /var/log/td-agent/fluentd-docker.pos
  tag core.*
  read_from_head false
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

<label @containers>
  <filter core.**>
    @type kubernetes_metadata
    @log_level debug
    annotation_match [".*"]
    de_dot false
    tag_to_kubernetes_name_regexp ".+?\\.containers\\.(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\\.log$"
    container_name_to_kubernetes_regexp "^(?<name_prefix>[^_]+)_(?<container_name>[^\\._]+)(\\.(?<container_hash>[^_]+))?_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_[^_]+_[^_]+$"
  </filter>

  <match core.**>
    @type copy
    <store>
      @type sumologic
      @log_level debug
      endpoint "#{ENV['SUMOLOGIC_ENDPOINT']}"
      log_format json_merge
      log_key log
      source_category "#{ENV['ENV']}/core/${record['kubernetes']['container_name']}"
      source_name ${record['kubernetes']['labels']['app']}
      open_timeout 10
    </store>
    <store>
      ...s3 config
    </store>
  </match>
</label>

@malcolmrebughini

I've found the code change that caused this.

In 1.4.0 there was a function called expand_param that looked for record paths; this was replaced in 1.4.1 with extract_placeholders. I'm not very familiar with Ruby, so I'm not sure where that function comes from. It seems to be from Fluentd itself?
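For illustration, the pre-1.4.1 behaviour can be modelled as a per-record substitution of ${record['a']['b']} paths. The sketch below is a simplified Python model of that idea, with hypothetical names; it is not the plugin's actual Ruby code, and the newer extract_placeholders only resolves placeholders backed by buffer chunk keys, which is why these record paths stopped working:

```python
import re

def expand_record_placeholders(template, record):
    # Replace each ${record['a']['b']} occurrence with the nested
    # value looked up in `record` (simplified model, not Fluentd code).
    pattern = re.compile(r"\$\{record((?:\['[^']+'\])+)\}")

    def resolve(match):
        value = record
        for key in re.findall(r"\['([^']+)'\]", match.group(1)):
            value = value[key]
        return str(value)

    return pattern.sub(resolve, template)

record = {"kubernetes": {"namespace_name": "prod", "container_name": "api"}}
print(expand_record_placeholders("${record['kubernetes']['namespace_name']}", record))
# -> prod
```

Under this model the template is re-evaluated for every record, whereas extract_placeholders resolves placeholders once per buffer chunk, using only the fields declared as chunk keys.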

@malcolmrebughini

malcolmrebughini commented Feb 10, 2021

After digging a bit: in newer versions of the Fluentd API, the proper way to do this is to add a buffer and then reference the chunk key as $.path.to.something:

<match rewrite.**>
  @type copy
  <store>
    @type sumologic
    @log_level debug
    endpoint "#{ENV['SUMOLOGIC_ENDPOINT']}"
    log_key log
    source_category "#{ENV['ENV']}/core/${$.kubernetes.container_name}"
    source_name TESTING
    source_host ${$.kubernetes.pod_name}
    open_timeout 10

    <buffer $.kubernetes.container_name, $.kubernetes.pod_name>
      @type memory
    </buffer>
  </store>
</match>
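One caveat worth noting (this follows from Fluentd's buffer-placeholder behaviour, not from anything stated above): every record path used in a ${...} placeholder must also be listed as a chunk key in the <buffer> section, and the same applies if you want the tag itself. A hypothetical sketch that additionally uses the tag as the source name:

```
<match rewrite.**>
  @type sumologic
  endpoint "#{ENV['SUMOLOGIC_ENDPOINT']}"
  source_name ${tag}
  source_host ${$.kubernetes.pod_name}
  <buffer tag, $.kubernetes.pod_name>
    @type memory
  </buffer>
</match>
```

Keep in mind that chunking by many distinct key values creates more, smaller buffer chunks, so it is worth limiting chunk keys to the fields actually referenced in placeholders.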

I think this solves the issue. So feel free to close this. (And sorry for resurrecting such an old issue)

@frankreno
Contributor

No sorry needed, glad you were able to get this to work. I do hope you will look at updating to the newer supported collection method in the future.
