-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reordered net context connect operation #69145 #69146
Reordered net context connect operation #69145 #69146
Conversation
Hello @matt-wood-ct, and thank you very much for your first pull request to the Zephyr project! |
5ece74b
to
ff7dd99
Compare
Renamed commit to follow guidelines |
Moved socket offload connect ahead of non offload connect implementation to fix issue with multiple network interfaces Signed-off-by: Matt Wood <[email protected]>
ff7dd99
to
3a302ff
Compare
Replace the 4 space scheme with tab for compliance Signed-off-by: Matt Wood <[email protected]>
4173e11
to
2b3fb93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of comments:
- please squash the patches together, we normally do not do fix up commits in the same PR
- could you elaborate in the commit message why/how the change solves the issue, otherwise it is difficult at some later date to figure out from the commit why this change fixed the issue
Overall, I am not convinced about the change. The checks in the original code were done before the offloaded call in order to check the addresses etc. Now we are not doing those checks for offloaded interface.
Where/why exactly the crash happens, could we fix things there instead of this shortcut?
Forgot to mention earlier, you should add |
Agreed, plus when it comes to NET_OFFLOAD and mutliple interfaces it's not really supported well, there's a more serious issue on the table: |
Thanks for the feedback, I think you are both right. I've been doing more testing and my "fix" does indeed just make other things worse, I am surprised it worked at all. I will continue my testing and see if I can find a better fix. Either way I must find a workable solution as the product I am working on does have all 3 interfaces and we do need to switch between the in runtime 😢 |
Check this comment where the user describes how they handle this limitation - as a workaround, they call |
Hi Robert,
So currently we are getting crashes with the default configured correctly, any suggestions? (I am still intently debugging of course) |
It would be really helpful if you could pin point the exact line where the crash happens, without this info it is difficult to suggest anything. |
Network ifaces (to show interface numbering):
Log when wifi is enabled and comes up first
So in the above log the wifi is up and tested (testing consists of creating a socket, opening it then closing it to 1.1.1.1). Hope this log helps to explain the series of events more clearly Edit:
|
Hmm, but we access the data behind conn pointer in previous line and it does not crash there? |
I don't think it does, that's just how the PC is rendered by the debugger, I interpret that to mean it crashed on the line before. As a confirmation I added in a null trap and it did indeed stop the crash (but of course the network didn't work) |
Ok, after another hour or so of bashing my head against this I've discovered the context inside
snippet of code from the function to show where those prints are:
so |
I've been investigating the case and was going to ask about this, it seems that with current implementation if Could you check if the following patch helps? diff --git a/subsys/net/ip/net_context.c b/subsys/net/ip/net_context.c
index ae7a4a4e848..fb6e1a51237 100644
--- a/subsys/net/ip/net_context.c
+++ b/subsys/net/ip/net_context.c
@@ -542,7 +542,7 @@ int net_context_get(sa_family_t family, enum net_sock_type type, uint16_t proto,
return ret;
}
- net_context_set_iface(*context, net_if_get_default());
+ net_context_bind_iface(*context, net_if_get_default());
}
return 0; The change just makes sure that the current net_context is permanently bound to the interface, which should be the case of offloaded contexts. |
Hi Robert,
|
However I did just test that patch with ethernet and plugging that in stops PPP from working and ethernet does not work:
Looks like it is trying to bind PPP when it tries to use ETH, so similar issue but slightly different |
Does the IP address remain registered on the PPP interface when it goes down? As before, on binding net stack will try to select best interface based on remote/local addresses, so if the "best match" remains on the PPP interface, it'd be kept being selected. Or, you can also enforce that a specific "native" (non-offloaded) socket is bound to a specific interface by calling
That sounds quite fishy, I don't see any obvious reason why would that be happening... |
Now this is interesting, thank you so much for pointing out this socket option, setting this does seem to make the setup work even without the patch you described. Still testing but early signs are promising... |
Can confirm setting the default interface and using the socket option |
You are welcome, I am glad the issue is sorted. If everything is working ok, please close the issue and the PR. |
Moved socket offload connect ahead of non offload connect implementation to fix issue with multiple network interfaces
See #69145