diff --git a/docs/content/en/docs/concepts/metrics.md b/docs/content/en/docs/concepts/metrics.md
deleted file mode 100644
index 6628c2f3dae..00000000000
--- a/docs/content/en/docs/concepts/metrics.md
+++ /dev/null
@@ -1,102 +0,0 @@
----
-title: "Metrics"
-weight: 2
-description: "Documentation for Tetragon metrics"
----
-
-Tetragon's metrics are exposed to the system through an HTTP endpoint. These
-are used to expose event summaries and information about the state of the
-Tetragon agent.
-
-## Kubernetes
-
-Tetragon pods exposes a metrics endpoint by default. The chart also creates a
-service named `tetragon` that exposes metrics on the specified port.
-
-### Getting metrics port
-
-Check if the `tetragon` service exists:
-
-```shell
-kubectl get services tetragon -n kube-system
-```
-
-The output should be similar to:
-```
-NAME       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
-tetragon   ClusterIP   10.96.54.218                 2112/TCP   3m
-```
-
-{{< note >}}
-In the previous output it shows, 2112 is the port on which the service is
-listening. It is also the port on which the Tetragon metrics server listens
-with the default Helm values.
-{{< /note >}}
-
-### Port Forwarding
-
-To forward the metrics port locally, use `kubectl port forward`:
-
-```shell
-kubectl -n kube-system port-forward service/tetragon 2112:2112
-```
-
-## Local Package Install
-
-By default, metrics are disabled when using release packages to install locally. The
-metrics can be enabled using `--metrics-server` flag to specify the address.
-
-Alternatively, the [examples/configuration/tetragon.yaml](https://github.com/cilium/tetragon/blob/main/examples/configuration/tetragon.yaml)
-file contains example entries showing the defaults for the address of
-metrics-server. Local overrides can be created by editing and copying this file
-into `/etc/tetragon/tetragon.yaml`, or by editing and copying "drop-ins" from
-the [examples/configuration/tetragon.conf.d](https://github.com/cilium/tetragon/tree/main/examples/configuration/tetragon.conf.d)
-directory into the `/etc/tetragon/tetragon.conf.d/` subdirectory. The latter is
-generally recommended.
-
-### Set Metrics Address
-
-Run `sudo tetragon --metrics-server localhost:2112` to set metrics address to `localhost:2112` and export metrics.
-
-```shell
-sudo tetragon --metrics-server localhost:2112
-```
-
-The output should be similar to this:
-
-```
-time="2023-09-21T13:17:08+05:30" level=info msg="Starting tetragon"
-version=v0.11.0
-time="2023-09-21T13:17:08+05:30" level=info msg="config settings"
-config="mapeased
-time="2023-09-22T23:16:24+05:30" level=info msg="Starting metrics server"
-addr="localhost:2112"
-[...]
-time="2023-09-21T13:17:08+05:30" level=info msg="Listening for events..."
-```
-
-Alternatively, a file named `server-address` can be created in `etc/tetragon/tetragon.conf.d/metrics-server` with content specifying
-a port like this `localhost:2112`, or any port of your choice as mentioned
-above.
-
-## Fetch the Metrics
-
-After the metrics are exposed, either by port forwarding in case of
-Kubernetes installation or by setting metrics address in case of Package
-installation, the metrics can be fetched using
-`curl` on `localhost:2112/metrics`:
-
-```shell
-curl localhost:2112/metrics
-```
-
-The output should be similar to this:
-```
-# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
-# TYPE promhttp_metric_handler_errors_total counter
-promhttp_metric_handler_errors_total{cause="encoding"} 0
-promhttp_metric_handler_errors_total{cause="gathering"} 0
-# HELP tetragon_errors_total The total number of Tetragon errors. For internal use only.
-# TYPE tetragon_errors_total counter
-[...]
-```
diff --git a/docs/content/en/docs/installation/configuration.md b/docs/content/en/docs/installation/configuration.md
index 6002dbf3cdf..2c860b8e0e1 100644
--- a/docs/content/en/docs/installation/configuration.md
+++ b/docs/content/en/docs/installation/configuration.md
@@ -1,7 +1,7 @@
 ---
 title: "Configure Tetragon"
 linkTitle: "Configuration"
-weight: 5
+weight: 6
 ---
 
 Depending on your deployment mode, Tetragon configuration can be changed by:
diff --git a/docs/content/en/docs/installation/metrics.md b/docs/content/en/docs/installation/metrics.md
new file mode 100644
index 00000000000..7e4b4c61c71
--- /dev/null
+++ b/docs/content/en/docs/installation/metrics.md
@@ -0,0 +1,118 @@
+---
+title: "Metrics"
+weight: 7
+description: "Learn how to configure and access Prometheus metrics."
+aliases: ["/docs/concepts/metrics"]
+---
+
+Tetragon exposes a number of Prometheus metrics that can be used for two main purposes:
+
+1. Monitoring the health of Tetragon itself
+2. Monitoring the activity of processes observed by Tetragon
+
+For the full list, refer to the [metrics reference]({{< ref "/docs/reference/metrics" >}}).
+
+## Enable/Disable Metrics
+
+### Kubernetes
+
+In a [Kubernetes installation]({{< ref "/docs/installation/kubernetes" >}}), metrics are enabled by default and exposed
+via the `tetragon` service at the `/metrics` endpoint on port `2112`.
+
+You can change the port via Helm values:
+
+```yaml
+tetragon:
+  prometheus:
+    port: 2222 # default is 2112
+```
+
+Or entirely disable the metrics server:
+
+```yaml
+tetragon:
+  prometheus:
+    enabled: false # default is true
+```
+
+### Non-Kubernetes
+
+In a non-Kubernetes installation, metrics are disabled by default. You can enable them by setting the metrics server
+address, for example `:2112`, via the `--metrics-server` flag.
+
+If using [systemd]({{< ref "/docs/installation/package" >}}), set the `metrics-address` entry in a file under the
+`/etc/tetragon/tetragon.conf.d/` directory.
+
+## Verify that metrics are exposed
+
+To verify that the metrics server has started, check the logs of the Tetragon Agent.
+In Kubernetes, run:
+
+```shell
+kubectl -n kube-system logs ds/tetragon
+```
+
+The logs should contain a line similar to the following:
+```
+time="2023-09-22T23:16:24+05:30" level=info msg="Starting metrics server" addr="localhost:2112"
+```
+
+To see what metrics are exposed, you can access the metrics endpoint directly.
+In Kubernetes, forward the metrics port:
+
+```shell
+kubectl -n kube-system port-forward svc/tetragon 2112:2112
+```
+
+Access the `localhost:2112/metrics` endpoint either in a browser or, for example, using `curl`.
+You should see a list of metrics similar to the following:
+```
+# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
+# TYPE promhttp_metric_handler_errors_total counter
+promhttp_metric_handler_errors_total{cause="encoding"} 0
+promhttp_metric_handler_errors_total{cause="gathering"} 0
+# HELP tetragon_errors_total The total number of Tetragon errors. For internal use only.
+# TYPE tetragon_errors_total counter
+[...]
+```
+
+## Configure labels on events metrics
+
+Depending on the workloads running in the environment, [Events Metrics]({{< ref "/docs/reference/metrics#tetragon-events-metrics" >}})
+may have very high cardinality. This is particularly likely in Kubernetes environments, where each pod creates
+a separate timeseries. To avoid overwhelming Prometheus, Tetragon provides an option to choose which labels are
+populated in these metrics.
+
+You can configure the labels via Helm values or the `--metrics-label-filter` flag. Set the value to a comma-separated
+list of enabled labels:
+
+```yaml
+tetragon:
+  prometheus:
+    metricsLabelFilter: "namespace,workload,binary" # "pod" label is disabled
+```
+
+## Scrape metrics
+
+Typically, metrics are scraped by Prometheus or another compatible agent (for example the OpenTelemetry Collector), stored
+in Prometheus or another compatible database, then queried and visualized, for example using Grafana.
+
+In Kubernetes, you can install Prometheus and Grafana using the `kube-prometheus-stack` Helm chart:
+
+```shell
+helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
+  --namespace monitoring --create-namespace \
+  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
+```
+
+The `kube-prometheus-stack` Helm chart includes the [Prometheus Operator](https://prometheus-operator.dev/), which allows
+you to configure Prometheus via Kubernetes custom resources. Tetragon comes with a default `ServiceMonitor` resource
+containing the scrape configuration. You can enable it via Helm values:
+
+```yaml
+tetragon:
+  prometheus:
+    serviceMonitor:
+      enabled: true
+```
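Once the new page's setup is in place, the endpoint can be spot-checked from the command line. A minimal sketch, assuming the metrics server is reachable on `localhost:2112` as in the examples above (via the port-forward in Kubernetes, or `--metrics-server localhost:2112` on a standalone host):

```shell
# Make the endpoint reachable first, e.g.:
#   kubectl -n kube-system port-forward svc/tetragon 2112:2112   # Kubernetes
#   sudo tetragon --metrics-server localhost:2112                # standalone host
# Then list a handful of Tetragon-specific series:
curl -s localhost:2112/metrics | grep '^tetragon_' | head -n 20
```

The `tetragon_`-prefixed series, such as `tetragon_errors_total` from the sample output above, should appear in the result.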