mongooseimctl ipv6 support #4127
Hi @andywhite37, reach out if you like. I spent a lot of time getting MongooseIM going in k8s, in particular with Helm. There are a lot of little gotchas to overcome: the base Helm deployments don't really work too well, and I had to tear apart the charts and the start script. I can't share my Helm charts here, as they are wrapped up with our deployment and a bit of a PITA to unravel without pulling the usefulness out.
Hey @adamramage - I'm glad to know you're out there and have worked through similar things, so we're not alone! I think we're still fiddling/debugging to see if we can narrow down the problem, but I will reach out if we get totally stuck. As a caveat to all this, I'm not an Erlang developer, so I'm having to learn some of this stuff as I go. The thing we looked at today was related to the Erlang dist communication on ports 4369 and 9100. We had a theory that we didn't have the right ports open in our AWS security group, so we made sure 4369 and 9100 were open, which we think they are. We noticed on the mongooseim machines, if we run I tried setting
# mongooseimctl fails with eaddrinuse
root@mongooseim-0:/# mongooseimctl
Protocol 'inet6_tcp': register/listen error: eaddrinuse
# Trying to run mongooseim ping gives more error info
root@mongooseim-0:/# ./usr/lib/mongooseim/bin/mongooseim ping
=INFO REPORT==== 20-Sep-2023::21:16:44.606961 ===
Protocol 'inet6_tcp': register/listen error: eaddrinuse
=SUPERVISOR REPORT==== 20-Sep-2023::21:16:44.607088 ===
supervisor: {local,net_sup}
errorContext: start_error
reason: {'EXIT',nodistribution}
offender: [{pid,undefined},
{id,net_kernel},
{mfargs,{net_kernel,start_link,
[#{clean_halt => false,
name =>
'mongooseim_maint_154@mongooseim-0.mongooseim.my-identifier.svc.cluster.local',
name_domain => longnames,
net_tickintensity => 4,
net_ticktime => 60,
supervisor => net_sup_dynamic}]}},
{restart_type,permanent},
{significant,false},
{shutdown,2000},
{child_type,worker}]
=CRASH REPORT==== 20-Sep-2023::21:16:44.607221 ===
crasher:
initial call: net_kernel:init/1
pid: <0.82.0>
registered_name: []
exception exit: {error,badarg}
in function gen_server:init_it/6 (gen_server.erl, line 835)
ancestors: [net_sup,kernel_sup,<0.47.0>]
message_queue_len: 0
messages: []
links: [<0.79.0>]
dictionary: [{longnames,true}]
trap_exit: true
status: running
heap_size: 2586
stack_size: 28
reductions: 3171
neighbours:
escript: exception error: no match of right hand side value
{error,
{{shutdown,
{failed_to_start_child,net_kernel,
{'EXIT',nodistribution}}},
{child,undefined,net_sup_dynamic,
{erl_distribution,start_link,
[#{clean_halt => false,
name =>
'mongooseim_maint_154@mongooseim-0.mongooseim.my-identifier.svc.cluster.local',
name_domain => longnames,
net_tickintensity => 4,net_ticktime => 60,
supervisor => net_sup_dynamic}]},
permanent,false,1000,supervisor,
[erl_distribution]}}}
root@mongooseim-0:/#
The annoyance here is that we are not able to use mongooseimctl, even though the server appears to be working normally (I can successfully connect to it via an XMPP client, the GraphQL endpoints are working, etc.).
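For anyone else reading who is not fluent in Erlang: eaddrinuse in the log above is the OS-level "address already in use" error, raised when something tries to listen on an address and port that another socket already holds. The sketch below reproduces just that condition in Python on an ephemeral loopback port. It is not MongooseIM code, the helper names are made up for illustration, and it does not tell you which port collides in this deployment.

```python
import errno
import socket


def listen_on(family, host, port):
    """Open a listening TCP socket, closing it again if bind/listen fails."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def second_listener_error():
    """Start one listener, then try a second on the same port.

    Returns the symbolic errno name of the second attempt's failure,
    or None if the second listener unexpectedly succeeds.
    """
    # Prefer IPv6 loopback (the thread is about an IPv6-only cluster),
    # but fall back to IPv4 if this machine has no IPv6 support.
    try:
        family, host = socket.AF_INET6, "::1"
        first = listen_on(family, host, 0)  # port 0 = let the OS pick a free port
    except OSError:
        family, host = socket.AF_INET, "127.0.0.1"
        first = listen_on(family, host, 0)

    port = first.getsockname()[1]
    try:
        second = listen_on(family, host, port)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    else:
        second.close()
        return None
    finally:
        first.close()
```

On a typical Linux host I would expect second_listener_error() to come back with EADDRINUSE. One possible reading of the log, and it is only an assumption, is that the short-lived maintenance node started by mongooseimctl tries to listen on a port the running server already holds.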
It's 11pm here, so bear with me. There are a few problems to address.
The error you posted shows the problem lies with the mongooseim exec trying to connect to the epmd process. It does this with the nodetool exec, but it will go and look at your vm.args and call your hostname to figure out where to talk to. Remember that the start script will eat this config file up and rewrite the -name / -sname fields you specify. Check that your vm.args is correctly set up, and check you can do something like
Our naming syntax is like mongooseim-ctrl-0.ctrl.mongooseim.svc.cluster.local
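To make the vm.args check a bit more concrete, here is a rough Python sketch that pulls the node name and distribution protocol out of vm.args-style text. The sample content is hypothetical (the thread never shows the real file), the function names are invented for this example, and -proto_dist is included only because the error log mentions inet6_tcp. The real file may be rewritten by the start script, so inspect what actually ends up on the pod.

```python
def read_vm_args(text):
    """Collect '-flag value' pairs from vm.args-style text, skipping comments.

    If a flag appears on several lines (e.g. repeated -kernel settings),
    only the last occurrence is kept; that is good enough for this check.
    """
    flags = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0].startswith("-"):
            flags[parts[0]] = " ".join(parts[1:])
    return flags


def describe_node(text):
    """Summarise the node name, its host part, and the distribution protocol."""
    flags = read_vm_args(text)
    name = flags.get("-name") or flags.get("-sname")
    host = name.split("@", 1)[1] if name and "@" in name else None
    return {
        "node": name,
        "host": host,
        # Erlang falls back to inet_tcp (IPv4) when -proto_dist is not given.
        "proto_dist": flags.get("-proto_dist", "inet_tcp"),
    }


# Hypothetical file content, modelled on the hostnames quoted in this issue.
SAMPLE_VM_ARGS = """\
## hypothetical vm.args
-name mongooseim@mongooseim-0.mongooseim.my-identifier.svc.cluster.local
-proto_dist inet6_tcp
"""
```

Calling describe_node(SAMPLE_VM_ARGS) gives you the host part to test name resolution against, and shows whether the file asks for IPv6 distribution at all.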
Check this out https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-sethostnameasfqdn-field
service.yaml
statefulset.yaml
Give this a whirl. It was used to get an MVP going for us and it does work; your mileage may vary. Any issues, let me know. Feel free to contribute to it. I'll continue to maintain it, as a bit of the templating is incomplete.
@adamramage this is amazing info, thanks so much for taking the time to put all this out there. I'll check all of this out and report back!
No dramas. Keep me posted on how you go. Keen to see if you get it going.
I'm becoming pretty convinced that my problems are more related to ipv6 support as opposed to anything in Kubernetes/etc. I posted this other issue #4132 to see if anyone can provide any other clues.
@adamramage We appreciate PRs or feedback. So, if there is a PR with a clear explanation and some way to test it, that sounds great.
No worries. The repo I forked was adding changes specific to our uses and requirements; merging them back in might break features for other users. We're still investigating / tuning the charts and deployment, so a PR might follow.
MongooseIM version: 6.1.0
Installed from: Kubernetes+MongooseHelm (exported the helm templates to k8s manifests using default values)
Erlang/OTP version: version from helm chart
Background
I'm running MongooseIM 6.1.0 in a Kubernetes cluster in AWS, using the Helm charts from esl/MongooseHelm. (We are not actually using the helm charts as is, but we exported the manifests from the charts with the default values, and have been updating them from there.) I have it running with two replicas, and I believe the server is up and running correctly - it is listening on the desired ports (5222, 5280, etc.).
Our cluster is configured for ipv6 only. I have configured all the listeners to use ip_version = 6, and it all appears to be working.
The hostnames are something like: mongooseim-0.mongooseim.my-identifier.svc.cluster.local and mongooseim-1... for the second replica. I'm able to telnet -6 mongooseim-{n}.mongooseim.my-identifier.svc.cluster.local 5222 from either pod, connecting to itself and the other pod. The telnet connection responds with an XML stream error, which I believe indicates the servers are running. I can also ping6 either host.
Problem
The problem I'm having is mongooseimctl reports `Failed RPC connection to the node 'mongooseim@mongooseim-0.mongooseim.my-identifier.svc.custer.local': nodedown` when I try to run any command.
I've poked around a bit, and my theory is that mongooseimctl is not working correctly with hostnames that resolve to ipv6 addresses. Below are some example commands I've tried running:
Question
Am I on the right track - is it expected that mongooseimctl wouldn't work with ipv6?
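For reference, the telnet -6 check described in the Background can be repeated from a script against the distribution ports mentioned in the comments (4369 and 9100). This is only a sketch of that idea in Python: can_connect and check_node are names made up here, and a successful TCP connect only shows the port is reachable over IPv6, not that Erlang distribution itself is healthy.

```python
import socket


def can_connect(host, port, family=socket.AF_INET6, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds over `family`.

    Resolution is restricted to the given address family, so with the
    default AF_INET6 a hostname with no IPv6 address simply yields False.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False

    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next resolved address, if any
    return False


def check_node(host, ports=(4369, 9100, 5222)):
    """Map each port to whether it accepted an IPv6 TCP connection."""
    return {port: can_connect(host, port) for port in ports}
```

From inside a pod, something like check_node("mongooseim-0.mongooseim.my-identifier.svc.cluster.local") would show which of those ports answer over IPv6.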