Updating monitoring components (#422)
* Updating monitoring components

* Updated CHANGELOG with version changes

* Updating monitoring components

* Upgrading monitoring components

* Upgrading monitoring components

* Delete grafana-values-4.11.yaml

Experimental file that was accidentally added. Removing.

* Upgrading monitoring components

* Add logic to remove old DaemonSet during upgrade process

* Added CHANGELOG message to address security issue

* Including fix to address Openshift error

* Upgrade ghostunnel TLS proxy sidecar

* Upgrading ghostunnel from 1.6.1 to 1.7.0

Co-authored-by: gsmith-sas <[email protected]>
cumcke and gsmith-sas authored Nov 5, 2022
1 parent 30df998 commit 66d2564
Showing 22 changed files with 71 additions and 42 deletions.
14 changes: 10 additions & 4 deletions CHANGELOG.md
@@ -1,11 +1,17 @@
# SAS Viya Monitoring for Kubernetes

## UNRELEASED

* **Overall**
## Version 1.2.5 (04NOV22)

* **Metrics**
* [SECURITY] Upgraded metrics monitoring components to address CVE-2022-37434
* [DEPRECATION] For security reasons, access to Prometheus and AlertManager via NodePort is no longer enabled by default. Set the environment variable PROM_NODEPORT_ENABLE=true to replicate previous behavior.
* [UPGRADE] - Kube-prometheus-stack has been upgraded from version 36.6.1 to 41.7.3
* [UPGRADE] - Prometheus has been upgraded from version 2.36.2 to 2.39.0
* [UPGRADE] - Prometheus Operator has been upgraded from version 0.57.0 to 0.60.0
* [UPGRADE] - Grafana has been upgraded from version 9.0.3 to 9.2.3
* [UPGRADE] - Kube State Metrics has been upgraded from version 2.5.0 to 2.6.0
* [UPGRADE] - K8s-sidecar used with Grafana has been upgraded from 1.19.2 to 1.19.5
* [UPGRADE] - TLS Proxy sidecar (ghostunnel) for monitoring components has been upgraded from 1.6.1 to 1.7.0

* **Logging**

@@ -35,7 +41,7 @@
* [UPGRADE] - Prometheus has been upgraded from version 2.33.1 to 2.36.2
* [UPGRADE] - Prometheus Operator has been upgraded from version 0.54.0 to 0.57.0
* [UPGRADE] - Grafana has been upgraded from version 8.4.1 to 9.0.3
* [UPGRADE] - AlertManager has been upgraded from version to 0.24.0
* [UPGRADE] - AlertManager has been upgraded from version 0.23.0 to 0.24.0
* [UPGRADE] - Kube State Metrics has been upgraded from version 2.3.0 to 2.5.0
* [UPGRADE] - PushGateway has been upgraded from version 1.4.2 to 1.4.3
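
The 1.2.5 deprecation note above says NodePort access to Prometheus and AlertManager is now off by default and is re-enabled with `PROM_NODEPORT_ENABLE=true`. A minimal shell sketch of that opt-in follows. It assumes the flag is read with the same `${VAR:-default}` idiom the deploy scripts use for other settings; the helper `prom_service_type` is hypothetical and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): maps the PROM_NODEPORT_ENABLE
# flag to the Kubernetes service type we would expect to be deployed.
prom_service_type() {
  # Anything other than the literal string "true" keeps the new default.
  if [ "${1:-false}" = "true" ]; then
    echo "NodePort"
  else
    echo "ClusterIP"
  fi
}

# Default to "false" when the variable is unset or empty.
PROM_NODEPORT_ENABLE=${PROM_NODEPORT_ENABLE:-false}
echo "Prometheus/AlertManager service type: $(prom_service_type "$PROM_NODEPORT_ENABLE")"
```

To replicate the earlier behavior, set `PROM_NODEPORT_ENABLE=true` in the environment before running the deployment script.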

7 changes: 5 additions & 2 deletions monitoring/bin/deploy_monitoring_cluster.sh
@@ -64,7 +64,7 @@ fi

# Check if Prometheus Operator CRDs are already installed
PROM_OPERATOR_CRD_UPDATE=${PROM_OPERATOR_CRD_UPDATE:-true}
PROM_OPERATOR_CRD_VERSION=${PROM_OPERATOR_CRD_VERSION:-v0.57.0}
PROM_OPERATOR_CRD_VERSION=${PROM_OPERATOR_CRD_VERSION:-v0.60.0}
if [ "$PROM_OPERATOR_CRD_UPDATE" == "true" ]; then
log_verbose "Updating Prometheus Operator custom resource definitions"
crds=( alertmanagerconfigs alertmanagers prometheuses prometheusrules podmonitors servicemonitors thanosrulers probes )
@@ -80,6 +80,9 @@ else
log_debug "Prometheus Operator CRD update disabled"
fi

# Remove existing DaemonSets in case of an upgrade-in-place
kubectl delete daemonset -n $MON_NS -l app=prometheus-node-exporter --ignore-not-found

# Optional workload node placement support
MON_NODE_PLACEMENT_ENABLE=${MON_NODE_PLACEMENT_ENABLE:-${NODE_PLACEMENT_ENABLE:-false}}
if [ "$MON_NODE_PLACEMENT_ENABLE" == "true" ]; then
@@ -151,7 +154,7 @@ if [ "$V4M_CURRENT_VERSION_MAJOR" == "1" ] && [[ "$V4M_CURRENT_VERSION_MINOR" =~
-l app.kubernetes.io/instance=v4m-prometheus-operator,app.kubernetes.io/name=kube-state-metrics
fi

KUBE_PROM_STACK_CHART_VERSION=${KUBE_PROM_STACK_CHART_VERSION:-36.6.1}
KUBE_PROM_STACK_CHART_VERSION=${KUBE_PROM_STACK_CHART_VERSION:-41.7.3}
helm $helmDebug upgrade --install $promRelease \
--namespace $MON_NS \
-f monitoring/values-prom-operator.yaml \
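
The DaemonSet cleanup added in this file runs unconditionally and leaves `$MON_NS` unquoted. The following is a sketch of a more defensive variant, not the committed code: the function name `remove_node_exporter_ds` and the `DRY_RUN` switch are assumptions introduced for illustration, while the `kubectl delete daemonset` arguments are copied from the hunk above.

```shell
#!/bin/sh
# Sketch only: wraps the cleanup command from deploy_monitoring_cluster.sh.
# remove_node_exporter_ds and DRY_RUN are hypothetical names.
remove_node_exporter_ds() {
  ns="$1"
  # Refuse to run without a namespace rather than pass an empty -n value.
  if [ -z "$ns" ]; then
    echo "namespace is required" >&2
    return 1
  fi
  if [ "${DRY_RUN:-false}" = "true" ]; then
    # Print the command instead of running it.
    echo "kubectl delete daemonset -n $ns -l app=prometheus-node-exporter --ignore-not-found"
  else
    kubectl delete daemonset -n "$ns" -l app=prometheus-node-exporter --ignore-not-found
  fi
}
```

Calling `DRY_RUN=true remove_node_exporter_ds "$MON_NS"` prints the command that would run, which may be useful for checking an upgrade-in-place before touching the cluster.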
10 changes: 9 additions & 1 deletion monitoring/bin/deploy_monitoring_openshift.sh
@@ -73,6 +73,14 @@ if [ -z "$(kubectl get serviceAccount -n $MON_NS grafana-serviceaccount -o name
log_info "Creating Grafana serviceAccount..."
kubectl create serviceaccount -n $MON_NS grafana-serviceaccount
fi

# OCP 4.11: We need to patch service account to add API Token
if [ "$OSHIFT_MAJOR_VERSION" -eq "4" ] && [ "$OSHIFT_MINOR_VERSION" -gt "10" ]; then
token=$(kubectl describe -n $MON_NS serviceaccount grafana-serviceaccount |grep "Tokens:"|awk '{print $2}')
log_debug "Patching serviceAccount to link to token...[$token]"
kubectl -n $MON_NS patch serviceaccount grafana-serviceaccount --type=json -p='[{"op":"add","path":"/secrets/1","value":{"name":"'$token'"}}]'
fi

log_debug "Adding cluster role..."
oc adm policy add-cluster-role-to-user cluster-monitoring-view -z grafana-serviceaccount -n $MON_NS
log_debug "Obtaining token..."
@@ -138,7 +146,7 @@ else
fi

log_info "Deploying Grafana..."
OPENSHIFT_GRAFANA_CHART_VERSION=${OPENSHIFT_GRAFANA_CHART_VERSION:-6.32.6}
OPENSHIFT_GRAFANA_CHART_VERSION=${OPENSHIFT_GRAFANA_CHART_VERSION:-6.43.3}
helm upgrade --install $helmDebug \
-n "$MON_NS" \
-f "$wnpValuesFile" \
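
The OCP 4.11 hunk above gates the service-account patch on the OpenShift version and scrapes the token secret name from `kubectl describe` output. The sketch below isolates those two steps as functions so they can be exercised without a cluster. The function names are hypothetical, and the sample input in the usage note only imitates the `Tokens:` line the script greps for.

```shell
#!/bin/sh
# Hypothetical helpers; the committed script inlines this logic.

# Succeeds only for OpenShift 4.x with a minor version above 10,
# matching the numeric test in the hunk above.
needs_token_patch() {
  [ "$1" -eq 4 ] && [ "$2" -gt 10 ]
}

# Reads `kubectl describe serviceaccount` text on stdin and prints the
# second field of the first "Tokens:" line (replaces grep | awk).
extract_token_name() {
  awk '$1 == "Tokens:" { print $2; exit }'
}
```

For example, `printf 'Tokens:   grafana-serviceaccount-token-abcde\n' | extract_token_name` prints `grafana-serviceaccount-token-abcde`. Callers may want to check for an empty result before building the JSON patch.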
2 changes: 1 addition & 1 deletion monitoring/bin/deploy_monitoring_tenant.sh
@@ -154,7 +154,7 @@ else
fi

# Deploy Grafana using Helm
GRAFANA_CHART_VERSION_TENANT=${GRAFANA_CHART_VERSION_TENANT:-6.32.6}
GRAFANA_CHART_VERSION_TENANT=${GRAFANA_CHART_VERSION_TENANT:-6.43.3}
helm upgrade --install $helmDebug \
-n "$VIYA_NS" \
-f "$wnpGrafanaValuesFile" \
2 changes: 1 addition & 1 deletion monitoring/bin/deploy_monitoring_tenant_openshift.sh
@@ -156,7 +156,7 @@ fi

log_info "Deploying Grafana..."
grafanaYAML=$tenantDir/openshift/mt-grafana-openshift-values.yaml
OPENSHIFT_GRAFANA_CHART_VERSION=${OPENSHIFT_GRAFANA_CHART_VERSION:-6.32.6}
OPENSHIFT_GRAFANA_CHART_VERSION=${OPENSHIFT_GRAFANA_CHART_VERSION:-6.43.3}
helm upgrade --install $helmDebug \
-n "$VIYA_NS" \
-f "$wnpValuesFile" \
5 changes: 4 additions & 1 deletion monitoring/multitenant/mt-grafana-values.yaml
@@ -1,5 +1,5 @@
image:
tag: "9.0.3"
tag: "9.2.3"
extraLabels:
v4m.sas.com/tenant: __TENANT__
readinessProbe: null
@@ -11,6 +11,9 @@ sidecar:
datasources:
enabled: true
label: grafana_datasource-__TENANT__
image:
repository: quay.io/kiwigrid/k8s-sidecar
tag: 1.19.5
deploymentStrategy:
type: Recreate
persistence:
4 changes: 2 additions & 2 deletions monitoring/multitenant/mt-prometheus.yaml
@@ -48,7 +48,7 @@ spec:
additionalScrapeConfigs:
name: prometheus-federate-__TENANT__
key: cluster-federate-job
image: quay.io/prometheus/prometheus:v2.36.2
image: quay.io/prometheus/prometheus:v2.39.0
enableAdminAPI: false
listenLocal: false
logFormat: json
@@ -73,4 +73,4 @@ spec:
ruleSelector:
matchLabels:
v4m.sas.com/tenant: __TENANT__
version: v2.36.2
version: v2.39.0
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
image:
tag: "9.0.3"
tag: "9.2.3"
extraLabels:
v4m.sas.com/tenant: __TENANT__
readinessProbe: null
@@ -13,6 +13,9 @@ sidecar:
datasources:
enabled: true
label: grafana_datasource-__TENANT__
image:
repository: quay.io/kiwigrid/k8s-sidecar
tag: 1.19.5
deploymentStrategy:
type: Recreate
persistence:
6 changes: 3 additions & 3 deletions monitoring/multitenant/openshift/mt-prometheus-openshift.yaml
@@ -35,7 +35,7 @@ spec:
- --key=/cert/tls.key
- --cert=/cert/tls.crt
- --disable-authentication
image: ghostunnel/ghostunnel:v1.6.1
image: ghostunnel/ghostunnel:v1.7.0
imagePullPolicy: IfNotPresent
ports:
- name: https
@@ -66,7 +66,7 @@ spec:
additionalScrapeConfigs:
name: prometheus-federate-__TENANT__
key: cluster-federate-job
image: quay.io/prometheus/prometheus:v2.36.2
image: quay.io/prometheus/prometheus:v2.39.0
enableAdminAPI: false
logFormat: json
logLevel: info
@@ -88,4 +88,4 @@ spec:
ruleSelector:
matchLabels:
v4m.sas.com/tenant: __TENANT__
version: v2.36.2
version: v2.39.0
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ metadata:
service.beta.openshift.io/serving-cert-secret-name: v4m-grafana-__TENANT__-tls-secret
labels:
app.kubernetes.io/name: grafana
app.kubernetes.io/version: 9.0.3
app.kubernetes.io/version: 9.2.3
v4m.sas.com/tenant: __TENANT__
spec:
ports:
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@ metadata:
service.beta.openshift.io/serving-cert-secret-name: v4m-prometheus-__TENANT__-tls-secret
labels:
app.kubernetes.io/name: prometheus
app.kubernetes.io/version: 2.36.2
app.kubernetes.io/version: 2.39.0
v4m.sas.com/tenant: __TENANT__
spec:
ports:
2 changes: 1 addition & 1 deletion monitoring/multitenant/tls/mt-grafana-tls-values.yaml
@@ -16,7 +16,7 @@ extraContainers: |
- --key=/cert/tls.key
- --cert=/cert/tls.crt
- --disable-authentication
image: ghostunnel/ghostunnel:v1.6.1
image: ghostunnel/ghostunnel:v1.7.0
imagePullPolicy: IfNotPresent
ports:
- name: https
6 changes: 3 additions & 3 deletions monitoring/multitenant/tls/mt-prometheus-tls.yaml
@@ -53,7 +53,7 @@ spec:
- --key=/cert/tls.key
- --cert=/cert/tls.crt
- --disable-authentication
image: ghostunnel/ghostunnel:v1.6.1
image: ghostunnel/ghostunnel:v1.7.0
imagePullPolicy: IfNotPresent
ports:
- name: https
@@ -74,7 +74,7 @@ spec:
additionalScrapeConfigs:
name: prometheus-federate-__TENANT__
key: cluster-federate-job
image: quay.io/prometheus/prometheus:v2.36.2
image: quay.io/prometheus/prometheus:v2.39.0
# alerting:
# alertmanagers:
# - apiVersion: v2
@@ -108,4 +108,4 @@ spec:
ruleSelector:
matchLabels:
v4m.sas.com/tenant: __TENANT__
version: v2.36.2
version: v2.39.0
5 changes: 4 additions & 1 deletion monitoring/openshift/grafana-values.yaml
@@ -1,5 +1,5 @@
image:
tag: "9.0.3"
tag: "9.2.3"
readinessProbe: null
livenessProbe: null
sidecar:
@@ -9,6 +9,9 @@ sidecar:
datasources:
enabled: true
label: grafana_datasource
image:
repository: quay.io/kiwigrid/k8s-sidecar
tag: 1.19.5
deploymentStrategy:
type: Recreate
persistence:
2 changes: 1 addition & 1 deletion monitoring/openshift/v4m-grafana-svc.yaml
@@ -7,7 +7,7 @@ metadata:
service.beta.openshift.io/serving-cert-secret-name: v4m-grafana-tls-secret
labels:
app.kubernetes.io/name: grafana
app.kubernetes.io/version: 9.0.3
app.kubernetes.io/version: 9.2.3
spec:
ports:
- name: service
6 changes: 3 additions & 3 deletions monitoring/tls/values-prom-operator-tls.yaml
@@ -11,7 +11,7 @@ prometheus:
- --key=/cert/tls.key
- --cert=/cert/tls.crt
- --disable-authentication
image: ghostunnel/ghostunnel:v1.6.1
image: ghostunnel/ghostunnel:v1.7.0
imagePullPolicy: IfNotPresent
ports:
- name: https
@@ -58,7 +58,7 @@ prometheus:
# - --key=cert/tls.key
# - --cert=cert/tls.crt
# - --disable-authentication
# image: ghostunnel/ghostunnel:v1.6.1
# image: ghostunnel/ghostunnel:v1.7.0
# imagePullPolicy: IfNotPresent
# ports:
# - containerPort: 443
@@ -115,7 +115,7 @@ grafana:
- --key=/cert/tls.key
- --cert=/cert/tls.crt
- --disable-authentication
image: ghostunnel/ghostunnel:v1.6.1
image: ghostunnel/ghostunnel:v1.7.0
imagePullPolicy: IfNotPresent
ports:
- name: https
4 changes: 2 additions & 2 deletions monitoring/user.env
@@ -35,10 +35,10 @@
# match the value of prometheusOperator.image.tag in the helm YAML
# if changed from the default.
# See https://github.com/prometheus-operator/prometheus-operator/releases
# PROM_OPERATOR_CRD_VERSION=v0.57.0
# PROM_OPERATOR_CRD_VERSION=v0.60.0

# Version of the kube-prometheus-stack helm chart to use
# KUBE_PROM_STACK_CHART_VERSION=36.6.1
# KUBE_PROM_STACK_CHART_VERSION=41.7.3

# Initial password of the Grafana admin user
# GRAFANA_ADMIN_PASSWORD=yourPasswordHere
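
The commented-out settings above are overrides: the deploy scripts assign each version with `${VAR:-default}`, so a non-empty value in the environment wins and the new default applies otherwise. A small sketch of that resolution follows, assuming `user.env` is loaded into the environment before the assignment runs. The `pick_version` helper is hypothetical and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: returns the override when it is non-empty,
# otherwise the default, like ${VAR:-default} in the deploy scripts.
pick_version() {
  printf '%s\n' "${1:-$2}"
}

# With no override set, the default introduced by this commit is used.
KUBE_PROM_STACK_CHART_VERSION=$(pick_version "${KUBE_PROM_STACK_CHART_VERSION:-}" "41.7.3")
echo "kube-prometheus-stack chart version: $KUBE_PROM_STACK_CHART_VERSION"
```

Uncommenting `KUBE_PROM_STACK_CHART_VERSION=36.6.1` would pin the previous chart. Per the comment in this file, `PROM_OPERATOR_CRD_VERSION` should then be kept in step with `prometheusOperator.image.tag`.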
15 changes: 9 additions & 6 deletions monitoring/values-prom-operator.yaml
@@ -18,7 +18,7 @@ commonLabels:
# https://github.com/coreos/prometheus-operator
prometheusOperator:
image:
tag: v0.57.0
tag: v0.60.0
logFormat: json
logLevel: info
createCustomResource: false
@@ -37,7 +37,7 @@ prometheusOperator:
prometheusConfigReloader:
image:
repository: quay.io/prometheus-operator/prometheus-config-reloader
tag: v0.57.0
tag: v0.60.0

# ======================
# kubelet ServiceMonitor
@@ -62,7 +62,7 @@ kubeStateMetrics:
# https://github.com/helm/charts/tree/master/stable/kube-state-metrics
kube-state-metrics:
image:
tag: v2.5.0
tag: v2.6.0
resources:
requests:
cpu: "25m"
@@ -83,7 +83,7 @@ prometheus:
nodePort: null
prometheusSpec:
image:
tag: v2.36.2
tag: v2.39.0
logLevel: info
logFormat: json
podAntiAffinity: soft
@@ -117,7 +117,7 @@ prometheus:
alertmanager:
service:
type: ClusterIP
nodePort: null
nodePort: null
alertmanagerSpec:
image:
tag: v0.24.0
@@ -175,7 +175,7 @@ prometheus-node-exporter:
# https://github.com/grafana/helm-charts/tree/main/charts/grafana
grafana:
image:
tag: "9.0.3"
tag: "9.2.3"
"grafana.ini":
analytics:
check_for_updates: false
@@ -207,6 +207,9 @@ grafana:
requests:
cpu: "50m"
memory: "100Mi"
image:
repository: quay.io/kiwigrid/k8s-sidecar
tag: 1.19.5
deploymentStrategy:
type: Recreate
persistence:
4 changes: 2 additions & 2 deletions samples/generic-base/monitoring/user.env
@@ -26,10 +26,10 @@ MON_TLS_PATH_INGRESS=false
# match the value of prometheusOperator.image.tag in the helm YAML
# if changed from the default.
# See https://github.com/prometheus-operator/prometheus-operator/releases
# PROM_OPERATOR_CRD_VERSION=v0.57.0
# PROM_OPERATOR_CRD_VERSION=v0.60.0

# Version of the kube-prometheus-stack helm chart to use
# KUBE_PROM_STACK_CHART_VERSION=36.6.1
# KUBE_PROM_STACK_CHART_VERSION=41.7.3

# Set a specific password for the Grafana admin user
# Default is to generate a random password
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
image:
tag: "9.0.3"
tag: "9.2.3"
service:
type: ClusterIP
sidecar:
Original file line number Diff line number Diff line change
@@ -69,7 +69,7 @@ metadata:
app: prometheus
name: prometheus-viya
spec:
image: quay.io/prometheus/prometheus:v2.36.2
image: quay.io/prometheus/prometheus:v2.39.0
alerting:
alertmanagers:
- apiVersion: v2
@@ -110,7 +110,7 @@ spec:
sas.com/viya-namespace: viya-one
ruleSelector: {}

version: v2.36.2
version: v2.39.0
---
apiVersion: v1
kind: Service