Failed to deploy StackStorm HA with kubeadm #285

Open
simonli866 opened this issue Feb 18, 2022 · 7 comments
Labels
question Further information is requested

Comments

@simonli866

simonli866 commented Feb 18, 2022

image
image
image

Two pods fail to start and the web interface cannot be accessed, even though the console output reports that the installation succeeded.
image

The error logs are below:

[root@centos-master ~]# kubectl logs stackstorm-redis-node-0
error: a container name must be specified for pod stackstorm-redis-node-0, choose one of: [redis sentinel]
[root@centos-master ~]# kubectl logs stackstorm-redis-node-0 -c redis
I am master
redis 02:25:38.52 INFO  ==> ** Starting Redis **
1:C 18 Feb 2022 02:25:38.548 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 18 Feb 2022 02:25:38.548 # Redis version=6.0.9, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 18 Feb 2022 02:25:38.548 # Configuration loaded
1:M 18 Feb 2022 02:25:38.554 * Running mode=standalone, port=6379.
1:M 18 Feb 2022 02:25:38.554 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 18 Feb 2022 02:25:38.554 # Server initialized
1:M 18 Feb 2022 02:25:38.554 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
1:M 18 Feb 2022 02:25:38.560 * Reading RDB preamble from AOF file...
1:M 18 Feb 2022 02:25:38.560 * Loading RDB produced by version 6.0.9
1:M 18 Feb 2022 02:25:38.560 * RDB age 881 seconds
1:M 18 Feb 2022 02:25:38.560 * RDB memory usage when created 1.87 Mb
1:M 18 Feb 2022 02:25:38.560 * RDB has an AOF tail
1:M 18 Feb 2022 02:25:38.560 * Reading the remaining AOF tail...
1:M 18 Feb 2022 02:25:38.561 * DB loaded from append only file: 0.007 seconds
1:M 18 Feb 2022 02:25:38.561 * Ready to accept connections
1:M 18 Feb 2022 02:26:19.887 * Replica 192.168.170.253:6379 asks for synchronization
1:M 18 Feb 2022 02:26:19.887 * Full resync requested by replica 192.168.170.253:6379
1:M 18 Feb 2022 02:26:19.887 * Replication backlog created, my new replication IDs are '356f8e0eaf71f966ffce779720a7be37b39e79f9' and '0000000000000000000000000000000000000000'
1:M 18 Feb 2022 02:26:19.887 * Starting BGSAVE for SYNC with target: disk
1:M 18 Feb 2022 02:26:19.888 * Background saving started by pid 58
58:C 18 Feb 2022 02:26:19.893 * DB saved on disk
58:C 18 Feb 2022 02:26:19.894 * RDB: 0 MB of memory used by copy-on-write
1:M 18 Feb 2022 02:26:19.984 * Background saving terminated with success
1:M 18 Feb 2022 02:26:19.985 * Synchronization with replica 192.168.170.253:6379 succeeded
1:M 18 Feb 2022 02:27:17.687 * Replica 192.168.170.212:6379 asks for synchronization
1:M 18 Feb 2022 02:27:17.687 * Full resync requested by replica 192.168.170.212:6379
1:M 18 Feb 2022 02:27:17.687 * Starting BGSAVE for SYNC with target: disk
1:M 18 Feb 2022 02:27:17.688 * Background saving started by pid 169
169:C 18 Feb 2022 02:27:18.486 * DB saved on disk
169:C 18 Feb 2022 02:27:18.490 * RDB: 0 MB of memory used by copy-on-write
1:M 18 Feb 2022 02:27:18.566 * Background saving terminated with success
1:M 18 Feb 2022 02:27:18.576 * Synchronization with replica 192.168.170.212:6379 succeeded
[root@centos-master ~]# kubectl logs stackstorm-redis-node-0 -c sentinel
Could not connect to Redis at 192.168.170.195:26379: Connection refused
@arm4b
Member

arm4b commented Feb 18, 2022

Instead of kubectl logs stackstorm-redis-node-0 -c sentinel, use kubectl logs --previous stackstorm-redis-node-0 -c sentinel. I suspect the output above is missing the most important messages from the failing container.

It's interesting that redis-node-2 eventually came up and is running, while the others are down.
Can you compare them for differences and anomalies, including the logs from the other pods?

Please also post the full kubectl describe output for the failing pods. The output of kubectl get pv,pvc,sc would help too.

Could you describe the resources (memory/cpu/storage) you have on that K8s cluster?
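
For reference, the diagnostics requested above could be gathered in one pass with something like the sketch below. It only uses the kubectl subcommands already named in this comment, plus kubectl describe nodes for the cluster resources. The function name collect_pod_diagnostics is hypothetical, and the fallback to the current log is an assumption about what is useful when a container has not restarted yet.

```shell
#!/usr/bin/env bash
# Sketch only: collect_pod_diagnostics is a hypothetical helper, not part of kubectl.
# Usage: collect_pod_diagnostics <pod> <container> [<container> ...]
collect_pod_diagnostics() {
  local pod="$1"
  shift
  local container
  for container in "$@"; do
    echo "=== logs (previous instance): ${pod} / ${container} ==="
    # --previous prints the log of the last terminated instance of the container.
    # It fails when the container has never restarted, so fall back to the current log.
    kubectl logs --previous "${pod}" -c "${container}" \
      || kubectl logs "${pod}" -c "${container}"
  done
  echo "=== describe: ${pod} ==="
  kubectl describe pod "${pod}"
  echo "=== storage ==="
  kubectl get pv,pvc,sc
  echo "=== nodes (capacity and allocated resources) ==="
  kubectl describe nodes
}

# Example (pod and container names taken from the log output earlier in this issue):
#   collect_pod_diagnostics stackstorm-redis-node-0 redis sentinel > redis-node-0.txt 2>&1
```

Running it once per failing pod and once for the healthy redis-node-2 pod gives two files that can be compared side by side.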

arm4b added the question label Feb 18, 2022
@simonli866
Author

Sorry, this problem cannot be reproduced every time. I will update this issue the next time it happens.

@simonli866
Author

image
The Redis problem has occurred again.

@simonli866
Author

@armab

@arms11
Contributor

arms11 commented Mar 4, 2022

@shiminglee Is the behavior different or better when Redis is deployed directly from the Bitnami chart and disabled in the stackstorm-ha values.yaml? You may have to provide the connection string in st2.conf (via a ConfigMap).
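
A minimal sketch of what that suggestion could look like, under several assumptions that are not confirmed in this thread: that the stackstorm-ha chart exposes a redis.enabled flag and an st2.config value for custom st2.conf content, that the chart repository was added under the alias stackstorm, and that the external Redis is reachable as a plain (non-Sentinel) endpoint on port 6379. The helper name write_external_redis_values and all host and password values are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: write_external_redis_values is a hypothetical helper.
# Usage: write_external_redis_values <values-file> <redis-host> <redis-password>
write_external_redis_values() {
  local out_file="${1:?values file path required}"
  local redis_host="${2:?redis host required}"
  local redis_password="${3:?redis password required}"
  # The heredoc is written at column 0 so the generated YAML keeps its indentation.
  cat > "${out_file}" <<EOF
# Assumption: the stackstorm-ha chart honors redis.enabled and st2.config.
redis:
  enabled: false
st2:
  config: |
    [coordination]
    url = redis://:${redis_password}@${redis_host}:6379
EOF
}

# Example (release name and repo alias are assumptions):
#   write_external_redis_values external-redis-values.yaml <redis-host> <redis-password>
#   helm upgrade --install stackstorm stackstorm/stackstorm-ha -f external-redis-values.yaml
```

The [coordination] url setting is the st2.conf option that points StackStorm at its coordination backend. A password containing URL-reserved characters would need to be percent-encoded before it is placed in the URL.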

@simonli866
Author

simonli866 commented Mar 4, 2022

@arms11 I am using the Bitnami chart directly. Why not just use values.yaml? Why do I need to configure connection strings?

@arm4b
Member

arm4b commented Mar 5, 2022

Another advantage of what @arms11 suggested is that trying the Redis chart in isolation could help pinpoint the root cause: you would not need to redeploy the whole st2 cluster every time and could deal with the Redis issue alone.

BTW could you provide more info about your K8s environment and resources?
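
Trying the Redis chart in isolation, as suggested above, could be sketched roughly as follows. The helper name try_redis_in_isolation, the release name, and the namespace are hypothetical placeholders. Enabling Sentinel is an assumption made to mirror the redis and sentinel containers seen in the failing pod, and the chart version is not pinned, so it may differ from the one bundled with stackstorm-ha.

```shell
#!/usr/bin/env bash
# Sketch only: try_redis_in_isolation is a hypothetical helper.
# Usage: try_redis_in_isolation [<release>] [<namespace>]
try_redis_in_isolation() {
  local release="${1:-redis-test}"
  local namespace="${2:-redis-test}"
  helm repo add bitnami https://charts.bitnami.com/bitnami
  helm repo update
  # --wait blocks until the release's pods are Ready or the timeout expires.
  helm install "${release}" bitnami/redis \
    --namespace "${namespace}" --create-namespace \
    --set sentinel.enabled=true \
    --wait --timeout 5m \
    || echo "release ${release} did not become ready within the timeout"
  # List the pods of this release either way, so failing ones can be inspected.
  kubectl get pods --namespace "${namespace}" \
    --selector "app.kubernetes.io/instance=${release}"
}

# Clean up afterwards:
#   helm uninstall redis-test --namespace redis-test
```

If the standalone release shows the same Sentinel connection failures, the problem is more likely in the cluster environment (networking, storage, or resources) than in the StackStorm chart itself.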
