Multiple wildcards in filter patterns (namepass etc) give inconsistent output #9265

Open
hackery opened this issue May 12, 2021 · 1 comment
Labels
bug unexpected problem or unintended behavior

hackery commented May 12, 2021

Relevant telegraf.conf:

[[inputs.exec]]
  data_format = "influx"

  commands = [
    "echo cpu.process32:testcase.exe:caserunner.2222 value=42",
    "echo process32:testcase.exe:caserunner.2222 value=42",
    "echo otherWantedMetric value=0",
    "echo unwantedMetric value=0",
  ]

  namepass = [
    '*process32:*.exe:*.*',
    'otherWantedMetric*',
  ]

System info:

Telegraf 1.13.3 (git: HEAD da36455)

Steps to reproduce:

  1. Add multiple wildcard patterns to any filter clause
  2. Feed matching and non-matching lines into input

Expected behavior:

Metrics are correctly filtered:

$ /usr/bin/telegraf -config etc.testcase.filter/telegraf.conf --test
2021-05-12T11:09:19Z I! Starting Telegraf 1.13.3
> cpu.process32:testcase.exe:caserunner.2222 value=42 1620817760000000000
> process32:testcase.exe:caserunner.2222 value=42 1620817760000000000
> otherWantedMetric value=0 1620817760000000000

Actual behavior:

Some metrics are dropped when they should be passed (or vice versa for "drop" rules):

2021-05-12T11:10:46Z I! Starting Telegraf 1.13.3
> otherWantedMetric value=0 1620817847000000000

Additional info:

Removing otherWantedMetric* from the filter list lets the "process32" metrics pass:

2021-05-12T11:11:49Z I! Starting Telegraf 1.13.3
> cpu.process32:testcase.exe:caserunner.2222 value=42 1620817910000000000
> process32:testcase.exe:caserunner.2222 value=42 1620817910000000000

Depending on the specific wildcards used in a set of patterns, there is sometimes also an ordering dependency, where swapping the order of two patterns makes the filter behave correctly.
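
A filter over a set of patterns should not care about the order in which the patterns are listed. The sketch below is one way to state that expectation as a check. It uses Python's standard `fnmatch` module as a stand-in matcher (an assumption for illustration only; it is not gobwas/glob and not Telegraf's filter code) and tests each pattern separately for every ordering of the pattern list. The helper name `passed` is hypothetical.

```python
from fnmatch import fnmatchcase
from itertools import permutations

# Metric names and patterns taken from the telegraf.conf in this report.
NAMES = [
    "cpu.process32:testcase.exe:caserunner.2222",
    "process32:testcase.exe:caserunner.2222",
    "otherWantedMetric",
    "unwantedMetric",
]
PATTERNS = ["*process32:*.exe:*.*", "otherWantedMetric*"]


def passed(names, patterns):
    """Return the names that match at least one pattern (case-sensitive)."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


# Collect the distinct results over every ordering of the pattern list.
# An order-independent filter yields exactly one distinct result.
results = {tuple(passed(NAMES, list(order))) for order in permutations(PATTERNS)}

for result in results:
    print(list(result))
```

With this stand-in matcher the set `results` holds a single entry containing the three wanted metrics, which is the output listed under "Expected behavior". Only `*` is exercised here; `fnmatch` has no `{}` alternates, so this says nothing about how gobwas/glob handles them.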

Yes, I know the naming here is an antipattern and they should be tagged like process32,exe=testcase.exe,activity=caserunner,act_id=2222 cpu=42 ..., but these metrics come from a legacy system and we have to ingest them under their existing names "for historical reasons".

I believe this behaviour is due to bugs in the gobwas/glob library (several cases of unexpected pattern behaviour have been reported over its lifetime) and I've created issue gobwas/glob#50 there, but there may also be mitigations or changes to make in Telegraf:

  • Telegraf currently composes the individual patterns into a single glob "alternates" construct, e.g. {*process32:*.exe:*.*,otherWantedMetric*}, which seems to be a case where gobwas/glob fails.
  • It might be worth looking at alternative invocations, like explicit checks on each pattern, if the performance of that isn't dreadful.
  • I'd suggest trying another glob library, but I've had a look and can't find anything suitable, i.e. supporting *, ?, [] and {}, which existing Telegraf users may depend on (although, if they're not reliable, cough).
  • How much work would it be to implement a suitable/compatible glob function internal to Telegraf vs. contributing fixes to the gobwas library?
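
To give a rough sense of scale for the "explicit checks on each pattern" and "internal glob function" ideas above, here is a minimal sketch in Python. It is not Telegraf or gobwas/glob code: the names `wildcard_match` and `passes` are hypothetical, and the matcher handles only `*` and `?`. Support for `[]` classes and `{}` alternates, which existing users may depend on, is deliberately left out.

```python
def wildcard_match(pattern: str, name: str) -> bool:
    """Match name against pattern: '*' is any run of characters, '?' is exactly one."""
    p = n = 0
    star_p = -1  # pattern index just after the most recent '*', or -1 if none seen
    star_n = 0   # name index from which that '*' currently resumes matching
    while n < len(name):
        if p < len(pattern) and pattern[p] == "*":
            # Record the star and first try letting it match nothing.
            star_p = p + 1
            star_n = n
            p += 1
        elif p < len(pattern) and pattern[p] in ("?", name[n]):
            p += 1
            n += 1
        elif star_p != -1:
            # Mismatch: backtrack and let the last '*' absorb one more character.
            star_n += 1
            n = star_n
            p = star_p
        else:
            return False
    # Name is consumed; any pattern left over must consist only of '*'.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)


def passes(name: str, patterns: list[str]) -> bool:
    """Check each pattern on its own instead of building one {a,b} alternates glob."""
    return any(wildcard_match(pat, name) for pat in patterns)


PATTERNS = ["*process32:*.exe:*.*", "otherWantedMetric*"]
NAMES = [
    "cpu.process32:testcase.exe:caserunner.2222",
    "process32:testcase.exe:caserunner.2222",
    "otherWantedMetric",
    "unwantedMetric",
]

for name in NAMES:
    print(name, "pass" if passes(name, PATTERNS) else "drop")
```

Checking patterns one at a time costs one matcher call per pattern per metric in the worst case, so whether the performance is acceptable would need measuring against real filter lists.
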
@hackery hackery added the bug unexpected problem or unintended behavior label May 12, 2021

hackery commented May 12, 2021

I realise 1.13 is fairly old now, so I've just repeated the test using a 1.18.2 binary download, and with a local build from nearly-current master branch:

$ ./telegraf-1.18.2/usr/bin/telegraf --config etc.testcase.filter/telegraf.conf --test
2021-05-12T17:31:29Z I! Starting Telegraf 1.18.2
> otherWantedMetric value=0 1620840690000000000

$ ../src/influxdata/telegraf/telegraf --version
Telegraf unknown (git: master b56ffdc4)
$ ../src/influxdata/telegraf/telegraf --config etc.testcase.filter/telegraf.conf --test
2021-05-12T17:32:48Z I! Starting Telegraf 
> otherWantedMetric value=0 1620840769000000000

@sspaink sspaink removed their assignment Nov 17, 2022