Buffer is not flushed when buffer_strategy = "disk" #15868

Closed
zumby opened this issue Sep 11, 2024 · 5 comments · Fixed by #15969
Labels: bug (unexpected problem or unintended behavior)

Comments


zumby commented Sep 11, 2024

Relevant telegraf.conf

[agent]
    interval = "1s"
    round_interval = true
    metric_batch_size = 1000
    collection_jitter = "1s"
    flush_interval = "5s"
    flush_jitter = "5s"
    precision = ""
    debug = true
    quiet = false
    logfile = ""
    omit_hostname = true
    metric_buffer_limit = 10000
    buffer_strategy = "disk"
    buffer_directory = "/etc/telegraf/buffer"

  [[inputs.http_listener_v2]]
    service_address = ":8080"
    data_source = "body"
    paths = ["/telemetry"]
    path_tag = true
    methods = ["POST", "GET"]
    data_format = "json"

    json_string_fields = ["*"]

  [[processors.converter]]
    [processors.converter.fields]
      timestamp = ["timestamp"]
      timestamp_format = "unix_ms"
      tag = ["display_name", "uuid"]

  [[outputs.file]]
    files = ["stdout"]
    data_format = "json"

  [[outputs.postgresql]]
    name_override = "telemetry"
    connection = "postgres://redacted"
    timestamp_column_name = "timestamp"
    fields_as_jsonb = true
    tagexclude = ["http_listener_v2_path"]
    [outputs.postgresql.tagpass]
      http_listener_v2_path = ["/telemetry"]
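
One way to check whether the on-disk buffer is actually drained while traffic flows is to watch the contents of buffer_directory inside the pod. This is only a sketch: the deployment name (telegraf) and the exact file layout under /etc/telegraf/buffer are assumptions.

  # Inspect the on-disk buffer from inside the pod; if flushing works,
  # the buffer files should stop growing once batches are written.
  # Deployment name and file layout are assumptions.
  kubectl exec -n temp-telegraf deploy/telegraf -- ls -lR /etc/telegraf/buffer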

Logs from Telegraf

k logs -f -n temp-telegraf telegraf-56c9f5675d-w8d52
2024-09-11T09:00:18Z I! Loading config: /etc/telegraf/telegraf.conf
2024-09-11T09:00:18Z W! Using disk buffer strategy for plugin outputs.file, this is an experimental feature
2024-09-11T09:00:18Z W! Using disk buffer strategy for plugin outputs.postgresql, this is an experimental feature
2024-09-11T09:00:18Z W! Using disk buffer strategy for plugin outputs.postgresql, this is an experimental feature
2024-09-11T09:00:18Z I! Starting Telegraf 1.32.0 brought to you by InfluxData the makers of InfluxDB
2024-09-11T09:00:18Z I! Available plugins: 235 inputs, 9 aggregators, 32 processors, 26 parsers, 62 outputs, 6 secret-stores
2024-09-11T09:00:18Z I! Loaded inputs: http_listener_v2
2024-09-11T09:00:18Z I! Loaded aggregators:
2024-09-11T09:00:18Z I! Loaded processors: converter
2024-09-11T09:00:18Z I! Loaded secretstores:
2024-09-11T09:00:18Z I! Loaded outputs: file postgresql (2x)
2024-09-11T09:00:18Z I! Tags enabled:
2024-09-11T09:00:18Z I! [agent] Config: Interval:1s, Quiet:false, Hostname:"", Flush Interval:5s
2024-09-11T09:00:18Z D! [agent] Initializing plugins
2024-09-11T09:00:18Z D! [agent] Connecting outputs
2024-09-11T09:00:18Z D! [agent] Attempting connection to [outputs.file]
2024-09-11T09:00:18Z D! [agent] Successfully connected to outputs.file
2024-09-11T09:00:18Z D! [agent] Attempting connection to [outputs.postgresql]
2024-09-11T09:00:18Z D! [agent] Successfully connected to outputs.postgresql
2024-09-11T09:00:18Z D! [agent] Attempting connection to [outputs.postgresql]
2024-09-11T09:00:18Z D! [agent] Successfully connected to outputs.postgresql
2024-09-11T09:00:18Z D! [agent] Starting service inputs
2024-09-11T09:00:18Z I! [inputs.http_listener_v2] Listening on [::]:8080
2024-09-11T09:00:18Z D! [inputs.http_listener_v2] No metrics were created from a message. Verify your parser settings. This message is only printed once.
2024-09-11T09:00:25Z D! [outputs.postgresql] Wrote batch of 1 metrics in 30.685764ms
2024-09-11T09:00:25Z D! [outputs.postgresql] Buffer fullness: 1 metrics
2024-09-11T09:00:26Z D! [outputs.postgresql] Wrote batch of 1 metrics in 22.168378ms
2024-09-11T09:00:26Z D! [outputs.postgresql] Buffer fullness: 1 metrics
2024-09-11T09:00:28Z D! [outputs.file] Wrote batch of 2 metrics in 63.535µs
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:00:28Z D! [outputs.file] Buffer fullness: 2 metrics
2024-09-11T09:00:33Z D! [outputs.postgresql] Wrote batch of 1 metrics in 6.196823ms
2024-09-11T09:00:33Z D! [outputs.postgresql] Buffer fullness: 1 metrics
2024-09-11T09:00:34Z D! [outputs.postgresql] Wrote batch of 1 metrics in 4.288579ms
2024-09-11T09:00:34Z D! [outputs.postgresql] Buffer fullness: 1 metrics
2024-09-11T09:00:36Z D! [outputs.file] Wrote batch of 2 metrics in 74.988µs
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:00:36Z D! [outputs.file] Buffer fullness: 2 metrics
2024-09-11T09:00:40Z D! [outputs.postgresql] Wrote batch of 1 metrics in 5.891064ms
2024-09-11T09:00:40Z D! [outputs.postgresql] Buffer fullness: 1 metrics
2024-09-11T09:00:42Z D! [outputs.postgresql] Wrote batch of 1 metrics in 4.470253ms
2024-09-11T09:00:42Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:00:45Z D! [outputs.file] Wrote batch of 3 metrics in 85.644µs
2024-09-11T09:00:45Z D! [outputs.file] Buffer fullness: 3 metrics
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:00:49Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.185766ms
2024-09-11T09:00:49Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:00:49Z D! [outputs.postgresql] Wrote batch of 2 metrics in 5.102989ms
2024-09-11T09:00:49Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:00:50Z D! [outputs.file] Wrote batch of 3 metrics in 71.809µs
2024-09-11T09:00:50Z D! [outputs.file] Buffer fullness: 3 metrics
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:00:57Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.627152ms
2024-09-11T09:00:57Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:00:57Z D! [outputs.postgresql] Wrote batch of 2 metrics in 5.116924ms
2024-09-11T09:00:57Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:00:57Z D! [outputs.file] Wrote batch of 3 metrics in 70.53µs
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:00:57Z D! [outputs.file] Buffer fullness: 3 metrics
2024-09-11T09:01:05Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.597189ms
2024-09-11T09:01:05Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:01:06Z D! [outputs.postgresql] Wrote batch of 2 metrics in 5.342177ms
2024-09-11T09:01:06Z D! [outputs.postgresql] Buffer fullness: 2 metrics
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:01:07Z D! [outputs.file] Wrote batch of 3 metrics in 83.017µs
2024-09-11T09:01:07Z D! [outputs.file] Buffer fullness: 3 metrics
2024-09-11T09:01:11Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.334252ms
2024-09-11T09:01:11Z D! [outputs.postgresql] Buffer fullness: 2 metrics
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:01:13Z D! [outputs.file] Wrote batch of 3 metrics in 105.662µs
2024-09-11T09:01:13Z D! [outputs.file] Buffer fullness: 3 metrics
2024-09-11T09:01:16Z D! [outputs.postgresql] Wrote batch of 2 metrics in 5.896565ms
2024-09-11T09:01:16Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:01:18Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.738697ms
2024-09-11T09:01:18Z D! [outputs.postgresql] Buffer fullness: 2 metrics
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:01:19Z D! [outputs.file] Wrote batch of 3 metrics in 68.483µs
2024-09-11T09:01:19Z D! [outputs.file] Buffer fullness: 3 metrics
2024-09-11T09:01:25Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.192269ms
2024-09-11T09:01:25Z D! [outputs.postgresql] Buffer fullness: 2 metrics
2024-09-11T09:01:26Z D! [outputs.postgresql] Wrote batch of 2 metrics in 4.906111ms
2024-09-11T09:01:26Z D! [outputs.postgresql] Buffer fullness: 2 metrics
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
{"fields":{"active_state":"ready","water_level":-1},"name":"http_listener_v2","tags":{"display_name":"redacted","http_listener_v2_path":"/telemetry","uuid":"1234-redacted"},"timestamp":1723883479}
2024-09-11T09:01:26Z D! [outputs.file] Wrote batch of 3 metrics in 83.013µs
2024-09-11T09:01:26Z D! [outputs.file] Buffer fullness: 3 metrics
2024-09-11T09:01:31Z D! [outputs.postgresql] Wrote batch of 2 metrics in 6.573357ms
2024-09-11T09:01:31Z D! [outputs.postgresql] Buffer fullness: 2 metrics
...

System info

1.32-alpine from Docker Hub (https://hub.docker.com/_/telegraf)

Docker

No response

Steps to reproduce

  1. Deploy Telegraf into AWS EKS as a pod with the config above
  2. Send a single request to Telegraf with the metrics in JSON (input = http_listener_v2); see the example request after this list
  3. Observe duplicated records in the output destination, TimescaleDB (PostgreSQL)
  4. Observe that the buffer is never flushed (emptied)
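
The snippet below is a hedged reconstruction of the single JSON request from step 2, based on the fields visible in the logged output; the pod address and field values are assumptions.

  # Hypothetical single test request; the timestamp is in milliseconds because
  # the converter processor uses timestamp_format = "unix_ms".
  curl -X POST "http://<pod-ip>:8080/telemetry" \
    -H "Content-Type: application/json" \
    -d '{"timestamp": 1723883479000, "display_name": "redacted", "uuid": "1234-redacted", "active_state": "ready", "water_level": -1}'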

Expected behavior

Metrics are received by the input and written to the outputs without duplication.

Actual behavior

Telegraf is deployed in Kubernetes (AWS EKS).
We tried buffer_strategy = "disk" and started receiving large numbers of duplicated records in TimescaleDB (PostgreSQL).

It might be a bug or a problem with our config, although we have tried changing various properties at the [agent] level.
For now it looks like we will stick with the memory strategy until this is figured out.
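
The duplication can be confirmed on the TimescaleDB side with a query like the one below. This is only a sketch, assuming the table name telemetry (from name_override) and that the uuid tag and the timestamp column are stored as regular columns.

  # Count rows per (timestamp, uuid); a count above 1 indicates duplicates.
  # Table and column names are assumptions based on the config above.
  psql "postgres://redacted" -c \
    'SELECT "timestamp", uuid, COUNT(*) FROM telemetry GROUP BY 1, 2 HAVING COUNT(*) > 1;'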

Additional info

No response

zumby added the bug (unexpected problem or unintended behavior) label on Sep 11, 2024

zumby commented Sep 11, 2024

Related PR: #15564
fyi @powersj @srebhan @DStrand1

DStrand1 (Member) commented

Thanks for trying out the feature! As far as I can tell, this doesn't seem like a configuration issue on your end. I'm looking into it now and will get back to you once I've found a solution.


DStrand1 commented Oct 2, 2024

@zumby Thanks for your patience! I have a potential fix in #15969, are you able to give it a test? Thanks!


zumby commented Oct 4, 2024

@DStrand1 yeah, thanks.
It will take some time though; I will try to test it next week. I'll need to build a Docker image out of the artifacts in the PR - unless you have it pushed to Docker Hub already with some tag.
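
A minimal sketch of how such a test image could be built, assuming the PR build artifact is a linux/amd64 telegraf binary in the current directory; the base image, registry and tag below are placeholders.

  # Write a minimal Dockerfile wrapping the PR's telegraf binary
  # (artifact name and base image are assumptions).
  printf '%s\n' \
    'FROM alpine:3.20' \
    'COPY telegraf /usr/bin/telegraf' \
    'ENTRYPOINT ["telegraf", "--config", "/etc/telegraf/telegraf.conf"]' \
    > Dockerfile

  # Build and push a throwaway image; registry and tag are placeholders.
  docker build -t my-registry/telegraf:pr-15969 .
  docker push my-registry/telegraf:pr-15969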


zumby commented Nov 6, 2024

I can confirm the issue is fully resolved. Thank you @DStrand1
