@lilic Can you please help?
This is the pod log
Defaulted container "prometheus-server-configmap-reload" out of: prometheus-server-configmap-reload, prometheus-server
level=info ts=2024-11-11T05:34:37.595571992Z caller=main.go:137 msg="Starting prometheus-config-reloader" version="(version=0.70.0, branch=refs/tags/v0.70.0, revision=c2c673f7123f3745a2a982b4a2bdc43a11f50fad)"
level=info ts=2024-11-11T05:34:37.595624649Z caller=main.go:138 build_context="(go=go1.21.4, platform=linux/amd64, user=Action-Run-ID-7048794395, date=20231130-15:42:49, tags=unknown)"
level=info ts=2024-11-11T05:34:37.595943074Z caller=reloader.go:246 msg="reloading via HTTP"
level=info ts=2024-11-11T05:34:37.596019966Z caller=reloader.go:282 msg="started watching config file and directories for changes" cfg= out= dirs=/etc/config
level=error ts=2024-11-11T05:37:37.596706711Z caller=runutil.go:100 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://127.0.0.1:9090/-/reload\": dial tcp 127.0.0.1:9090: connect: connection refused"
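The `connection refused` in the last log line means the TCP connection to 127.0.0.1:9090 was rejected, which is what happens when no process is listening on that port inside the pod. A minimal sketch of the same check in Python, assuming it is run from somewhere that shares the pod's network namespace (the function name is made up for illustration):

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError (nothing listening) and timeouts are both
        # subclasses of OSError, so either case ends up here.
        return False


if __name__ == "__main__":
    # 127.0.0.1:9090 is the address taken from the --reload-url in the log above.
    print(port_accepts_connections("127.0.0.1", 9090))
```

A `False` result here only says that nothing is listening; it does not say why the Prometheus container is down.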
What happened?
I upgraded the EKS cluster from v1.28 to v1.29, and afterwards the prometheus-server pod went into the CrashLoopBackOff state.
Did you expect to see something different?
How to reproduce it (as minimally and precisely as possible):
Environment
Prometheus Operator version:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "7"
    meta.helm.sh/release-name: prometheus
    meta.helm.sh/release-namespace: prometheus
  creationTimestamp: "2024-11-08T07:03:40Z"
  generation: 9
  labels:
    app.kubernetes.io/component: server
    app.kubernetes.io/instance: prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: prometheus
    app.kubernetes.io/version: v2.48.1
    helm.sh/chart: prometheus-25.8.2
  name: prometheus-server
  namespace: prometheus
  resourceVersion: "156438078"
  uid: 6b46deb7-981a-427a-b30e-08ddfd593fee
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: server
      app.kubernetes.io/instance: prometheus
      app.kubernetes.io/name: prometheus
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/restartedAt: "2024-08-21T12:10:15Z"
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: server
        app.kubernetes.io/instance: prometheus
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: prometheus
        app.kubernetes.io/part-of: prometheus
        app.kubernetes.io/version: v2.48.1
        helm.sh/chart: prometheus-25.8.2
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: hubble.nodeType.Nats
                operator: In
                values:
                - Allowed
            topologyKey: failure-domain.beta.kubernetes.io/zone
      containers:
      - args:
        - --watched-dir=/etc/config
        - --reload-url=http://127.0.0.1:9090/-/reload
        image: quay.io/prometheus-operator/prometheus-config-reloader:v0.70.0
        imagePullPolicy: IfNotPresent
        name: prometheus-server-configmap-reload
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/config
          name: config-volume
          readOnly: true
      - args:
        - --storage.tsdb.retention.time=15d
        - --config.file=/etc/config/prometheus.yml
        - --storage.tsdb.wal-compression
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --web.console.templates=/etc/prometheus/consoles
        - --web.enable-lifecycle
        image: quay.io/prometheus/prometheus:v2.51.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /-/healthy
            port: 9090
            scheme: HTTP
          initialDelaySeconds: 90
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 10
        name: prometheus-server
        ports:
        - containerPort: 9090
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /-/ready
            port: 9090
            scheme: HTTP
          initialDelaySeconds: 90
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 4
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/config
          name: config-volume
        - mountPath: /data
          name: storage-volume
      dnsPolicy: ClusterFirst
      enableServiceLinks: true
      nodeSelector:
        hubble.nodeType.Nats: Allowed
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 65534
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccount: prometheus-server
      serviceAccountName: prometheus-server
      terminationGracePeriodSeconds: 300
      volumes:
      - configMap:
          defaultMode: 420
          name: prometheus-server
        name: config-volume
      - name: storage-volume
        persistentVolumeClaim:
          claimName: prometheus-server
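One thing that can be read off the manifest is how long the `prometheus-server` container gets to come up before the liveness probe restarts it. The sketch below is a rough approximation only: it assumes every probe fails immediately (as a refused connection would) and ignores kubelet jitter and `timeoutSeconds`. Nothing in this report establishes that slow startup is the cause of the crash loop; the calculation only shows the window the probe settings allow.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Subset of the Kubernetes probe fields used in the manifest above."""
    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int


def earliest_restart_seconds(probe: Probe) -> int:
    """Approximate time after container start at which the last allowed
    consecutive probe failure occurs, assuming each probe fails at once."""
    return probe.initial_delay_seconds + (probe.failure_threshold - 1) * probe.period_seconds


# Values copied from the livenessProbe of the prometheus-server container.
liveness = Probe(initial_delay_seconds=90, period_seconds=15, failure_threshold=3)
print(earliest_restart_seconds(liveness))  # 120
```

If Prometheus needs longer than roughly two minutes to start serving `/-/healthy`, the kubelet would restart it under these settings; the container's own logs are needed to confirm or rule that out.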
Kubernetes version information:
kubectl version
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.8-eks-a737599
kind: Deployment
Prometheus Logs:
level=error ts=2024-11-08T06:19:48.406640681Z caller=runutil.go:100 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://127.0.0.1:9090/-/reload\": dial tcp 127.0.0.1:9090: connect: connection refused"
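Both log excerpts in this report were written by the `prometheus-server-configmap-reload` sidecar (kubectl defaulted to it because it is the first container), so they only show that the reload request could not reach port 9090. The reason for the crash loop would normally be in the logs of the `prometheus-server` container from before its last restart. A small sketch that assembles the corresponding `kubectl logs` command, using the namespace, deployment and container names from the manifest above (the helper function itself is hypothetical):

```python
import shlex


def previous_logs_command(namespace: str, deployment: str, container: str) -> list[str]:
    """Build the kubectl argv that prints the logs of the previous
    (crashed) instance of one container in a deployment's pod."""
    return [
        "kubectl", "logs",
        "--namespace", namespace,
        f"deployment/{deployment}",
        "--container", container,
        "--previous",  # logs of the terminated instance, not the current one
    ]


if __name__ == "__main__":
    cmd = previous_logs_command("prometheus", "prometheus-server", "prometheus-server")
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(cmd))
```

Running the printed command, together with `kubectl describe pod` on the crashing pod, should show the exit reason that the sidecar log cannot.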