This repository has been archived by the owner on Mar 10, 2022. It is now read-only.

ES Indexing errors with this plugin #47

Open
TotalGriffLock opened this issue Aug 4, 2021 · 3 comments

TotalGriffLock commented Aug 4, 2021

I'm running Graylog 4.11 with version 3.0.0 of the metrics-reporter-gelf plugin to log metrics back into Graylog. The only plugin configuration I've done is setting

metrics_gelf_enabled = true

in server.conf.

Most metrics are logged every 15 seconds as expected, but some are evidently being dropped, as I have 100k indexing failures. I've narrowed it down to this plugin by routing all messages from my GELF input into a separate index. The only thing sending GELF messages to that input is this plugin. The input only listens on localhost, so it isn't outside interference.

Every 5 minutes I get these indexer failures:

| Timestamp | Index | Letter ID | Error message |
| --- | --- | --- | --- |
| a few seconds ago | gelf_0 | 0786ab1e-f535-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id '0786ab1e-f535-11eb-8a1b-00155d366e62'. Preview of field's value: 'Wed Aug 04 15:01:52 UTC 2021']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "Wed Aug 04 15:01:52 UTC 2021"]]; |
| a few seconds ago | gelf_0 | 0785c096-f535-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id '0785c096-f535-11eb-8a1b-00155d366e62'. Preview of field's value: '[]']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "[]"]]; |
| a few seconds ago | gelf_0 | fe953d41-f534-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id 'fe953d41-f534-11eb-8a1b-00155d366e62'. Preview of field's value: '[]']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "[]"]]; |
| a few seconds ago | gelf_0 | fe95b270-f534-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id 'fe95b270-f534-11eb-8a1b-00155d366e62'. Preview of field's value: 'Wed Aug 04 15:01:52 UTC 2021']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "Wed Aug 04 15:01:52 UTC 2021"]]; |
| a few seconds ago | gelf_0 | f5aafb66-f534-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id 'f5aafb66-f534-11eb-8a1b-00155d366e62'. Preview of field's value: 'Wed Aug 04 15:01:52 UTC 2021']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "Wed Aug 04 15:01:52 UTC 2021"]]; |
| a few seconds ago | gelf_0 | f5aa8648-f534-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id 'f5aa8648-f534-11eb-8a1b-00155d366e62'. Preview of field's value: '[]']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "[]"]]; |
| a minute ago | gelf_0 | ecb82e12-f534-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id 'ecb82e12-f534-11eb-8a1b-00155d366e62'. Preview of field's value: '[]']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "[]"]]; |
| a minute ago | gelf_0 | ecb87c4e-f534-11eb-8a1b-00155d366e62 | ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [value] of type [long] in document with id 'ecb87c4e-f534-11eb-8a1b-00155d366e62'. Preview of field's value: 'Wed Aug 04 15:01:52 UTC 2021']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "Wed Aug 04 15:01:52 UTC 2021"]]; |
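
With 100k failures it may help to group them by offending field and value rather than reading them one by one. The following is only a rough sketch: the regular expression is written against the message text quoted above, the sample strings are shortened copies of two of those messages, and the names `FAILURE_RE` and `summarize_failures` are made up for illustration.

```python
import re
from collections import Counter

# Matches the mapper_parsing_exception text as it appears in the failures above.
FAILURE_RE = re.compile(
    r"failed to parse field \[(?P<field>[^\]]+)\] of type \[(?P<type>[^\]]+)\]"
    r".*?Preview of field's value: '(?P<preview>.*?)'\]\]"
)


def summarize_failures(messages):
    """Count (field, mapped type, offending value) triples across failure messages."""
    counts = Counter()
    for msg in messages:
        match = FAILURE_RE.search(msg)
        if match:
            counts[(match["field"], match["type"], match["preview"])] += 1
    return counts


# Shortened copies of two error messages from this issue.
SAMPLE_MESSAGES = [
    "failed to parse field [value] of type [long] in document with id "
    "'0786ab1e-f535-11eb-8a1b-00155d366e62'. "
    "Preview of field's value: 'Wed Aug 04 15:01:52 UTC 2021']]",
    "failed to parse field [value] of type [long] in document with id "
    "'0785c096-f535-11eb-8a1b-00155d366e62'. "
    "Preview of field's value: '[]']]",
]

counts = summarize_failures(SAMPLE_MESSAGES)
for (field, mapped_type, preview), n in counts.items():
    print(f"{n}x field={field} type={mapped_type} value={preview!r}")
```

For the messages in this issue, every failure groups under the field `value` of type `long`, with either a date string or `[]` as the rejected value.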

My understanding is that Graylog will have worked out the field types for this input from the message content and set them as the index's template in ES. Field refresh on this index is set to 5 seconds. I assume that something is being logged with a timestamp in a field which ES has determined should be a long, and something else is being logged with [] in a field defined as a long. So I think this could be resolved with a static ES template for this index?
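
To make that assumption concrete, here is a toy model of the suspected behaviour: the first value seen for a field fixes its type, and later values that do not fit are rejected. This is a simplification for illustration only, not how Elasticsearch is implemented. `ToyIndex`, `MapperParsingError` and the metric name `example.counter` are hypothetical, and the model only distinguishes integers from everything else.

```python
class MapperParsingError(ValueError):
    """Stand-in for Elasticsearch's mapper_parsing_exception (hypothetical name)."""


class ToyIndex:
    """Toy model: the first value seen for a field decides that field's type."""

    def __init__(self):
        self.mapping = {}  # field name -> type fixed by the first document

    def index(self, doc):
        for field, value in doc.items():
            # Simplification: integers become "long", anything else "keyword".
            inferred = "long" if isinstance(value, int) else "keyword"
            field_type = self.mapping.setdefault(field, inferred)
            if field_type == "long":
                try:
                    int(value)
                except (TypeError, ValueError):
                    raise MapperParsingError(
                        f"failed to parse field [{field}] of type [long]: {value!r}"
                    ) from None


index = ToyIndex()
# A numeric metric arrives first, so "value" is fixed as long.
index.index({"name": "example.counter", "value": 42})

# Later documents with a date string or "[]" in "value" no longer fit.
for bad_value in ("Wed Aug 04 15:01:52 UTC 2021", "[]"):
    try:
        index.index({"name": "example.other", "value": bad_value})
    except MapperParsingError as err:
        print(err)
```

Under this model, a static template would only help if it mapped the offending field to a type that accepts both numbers and strings, at the cost of losing numeric handling for that field.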

Any suggestions as to how to resolve this gratefully received.

TotalGriffLock commented Aug 4, 2021

Here's the dynamic template generated for this index (and therefore for this plugin's messages, because nothing else logs to that input):

$ curl -X GET "localhost:9200/_template/gelf-template?pretty=true"

{
  "gelf-template" : {
    "order" : -1,
    "index_patterns" : [
      "gelf_*"
    ],
    "settings" : {
      "index" : {
        "analysis" : {
          "analyzer" : {
            "analyzer_keyword" : {
              "filter" : "lowercase",
              "tokenizer" : "keyword"
            }
          }
        }
      }
    },
    "mappings" : {
      "_source" : {
        "enabled" : true
      },
      "dynamic_templates" : [
        {
          "internal_fields" : {
            "mapping" : {
              "type" : "keyword"
            },
            "match_mapping_type" : "string",
            "match" : "gl2_*"
          }
        },
        {
          "store_generic" : {
            "mapping" : {
              "type" : "keyword"
            },
            "match_mapping_type" : "string"
          }
        }
      ],
      "properties" : {
        "gl2_processing_timestamp" : {
          "format" : "uuuu-MM-dd HH:mm:ss.SSS",
          "type" : "date"
        },
        "gl2_accounted_message_size" : {
          "type" : "long"
        },
        "gl2_receive_timestamp" : {
          "format" : "uuuu-MM-dd HH:mm:ss.SSS",
          "type" : "date"
        },
        "full_message" : {
          "fielddata" : false,
          "analyzer" : "standard",
          "type" : "text"
        },
        "streams" : {
          "type" : "keyword"
        },
        "source" : {
          "fielddata" : true,
          "analyzer" : "analyzer_keyword",
          "type" : "text"
        },
        "message" : {
          "fielddata" : false,
          "analyzer" : "standard",
          "type" : "text"
        },
        "timestamp" : {
          "format" : "uuuu-MM-dd HH:mm:ss.SSS",
          "type" : "date"
        }
      }
    },
    "aliases" : { }
  }
}

The only field which is a [long] is gl2_accounted_message_size. So is this plugin causing that field to sometimes contain the timestamp or a null value?
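
One way to double-check this reading of the template is to list its explicitly mapped fields by type. The sketch below works on a trimmed copy of the template above (only four of its properties are included, and the helper name `fields_of_type` is made up). It also shows that the field named in the error messages, `value`, is not among the template's properties, so its type presumably comes from dynamic mapping on the live index rather than from this template.

```python
import json

# Trimmed copy of the template shown above; only a few properties are kept.
TEMPLATE_JSON = """
{
  "gelf-template": {
    "mappings": {
      "properties": {
        "gl2_accounted_message_size": {"type": "long"},
        "timestamp": {"type": "date", "format": "uuuu-MM-dd HH:mm:ss.SSS"},
        "message": {"type": "text", "analyzer": "standard", "fielddata": false},
        "streams": {"type": "keyword"}
      }
    }
  }
}
"""


def fields_of_type(template, wanted_type):
    """Return the sorted names of explicitly mapped fields with the given type."""
    fields = []
    for body in template.values():
        props = body.get("mappings", {}).get("properties", {})
        fields += [name for name, spec in props.items() if spec.get("type") == wanted_type]
    return sorted(fields)


template = json.loads(TEMPLATE_JSON)
long_fields = fields_of_type(template, "long")
value_is_mapped = "value" in template["gelf-template"]["mappings"]["properties"]

print("long fields in template:", long_fields)
print("'value' explicitly mapped:", value_is_mapped)
```

The mapping actually in effect for the index, including dynamically added fields, can be inspected with a request to `localhost:9200/gelf_0/_mapping`.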

TotalGriffLock commented Aug 4, 2021

I have resolved this myself, via https://community.graylog.org/t/graylog-metrics-plugin-feeding-data-via-gelf-to-graylog-causing-parsing-errors/16356/3

Most metric values are numbers, so Graylog/ES correctly decides to store the "value" field as a [long]. However, there are 2 metrics (at the time of writing):
org.graylog2.journal.oldest-segment
jvm.threads.deadlocks
where the value is either a string (a timestamp) or a collection/array. This data cannot be indexed into a field of type long. The Graylog community URL above provides a solution, but only for one specific metric. I've put the GELF metrics input through a pipeline with the following rule, which has resolved the errors for me and should also work for any non-numeric metrics added in future:

Rule "Cleanup: Non-numeric metrics value field"
when
  has_field("value") AND
  not is_long("value")
then
  rename_field(
    old_field: "value",
    new_field: "value_string"
  );
end

@TotalGriffLock

The rule above didn't appear to work either, but the one below does. I can't spend any more time on it right now, but if anyone else has the same problem, this will fix it.

Rule "Cleanup: Non-numeric metrics value field"
when
  has_field("name") AND
  has_field("value") AND
  (to_string($message.name) == "org.graylog2.journal.oldest-segment" OR
   to_string($message.name) == "jvm.threads.deadlocks")
then
  let value_string = to_string($message.value);
  set_field("value_string", value_string);
  remove_field("value");
end
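
For anyone who wants to reason about what this rule does before deploying it, the same logic can be sketched outside Graylog. The Python below is an approximation for illustration, not Graylog code: messages are modelled as plain dicts, and `cleanup_metric` and `NON_NUMERIC_METRICS` are hypothetical names.

```python
# The two metric names singled out by the rule above.
NON_NUMERIC_METRICS = {
    "org.graylog2.journal.oldest-segment",
    "jvm.threads.deadlocks",
}


def cleanup_metric(message):
    """Return a copy of message with a non-numeric 'value' moved to 'value_string'."""
    # Mirrors the rule's "when" clause: both fields present, name in the list.
    if "name" not in message or "value" not in message:
        return dict(message)
    if str(message["name"]) not in NON_NUMERIC_METRICS:
        return dict(message)
    # Mirrors the "then" clause: set value_string, then remove value.
    cleaned = {key: val for key, val in message.items() if key != "value"}
    cleaned["value_string"] = str(message["value"])
    return cleaned


deadlocks = cleanup_metric({"name": "jvm.threads.deadlocks", "value": "[]"})
numeric = cleanup_metric({"name": "example.counter", "value": 42})
print(deadlocks)
print(numeric)
```

Because the rule matches on metric names rather than on the type of the value, any new non-numeric metric would need to be added to the list by hand.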
