This repository has been archived by the owner on Feb 8, 2024. It is now read-only.

Kernel 4.1? What about Ethernet hw bug ? #148

Open
PaoloBi opened this issue Dec 13, 2018 · 15 comments

Comments

@PaoloBi

PaoloBi commented Dec 13, 2018

No description provided.

@PaoloBi
Author

PaoloBi commented Dec 13, 2018

I am migrating to the RC branch because of an annoying bug in the path planner that hangs my printer with 2.0.8. The new Ubuntu-based distro uses kernel 4..., but the BBB has an annoying and well-known hardware problem: when it is powered up from the expansion connector, Ethernet fails to come up 30-40% of the time, and you need a hardware reset to fix it. I found out about this and, thanks to the following link,
https://wp.josh.com/2018/06/04/a-software-only-solution-to-the-vexing-beagle-bone-black-phy-issue/#more-7355
I managed to detect a PHY malfunction and to hardware-reset my BBB with my (modified) Replicape.
So far, so good... but kernel 4.x broke this fix! Now I have to choose between a 2.0.8 Kamikaze/Debian image that hangs on certain G-codes and a 2.1 Ubuntu version where Ethernet mostly doesn't work... gosh.
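
For readers who want to experiment with detecting the missing PHY from user space, here is a minimal Python sketch. It is not the method from the linked post and not the author's modified Replicape code. It assumes the kernel lists detected PHYs under `/sys/bus/mdio_bus/devices` with names of the form `<bus-id>:<addr>`, that a healthy BBB PHY appears at address 0 (an assumption to verify on your own board), and that you supply your own `reset_board` hook (hypothetical) to trigger the hardware reset.

```python
from pathlib import Path

# Standard Linux sysfs directory listing devices on MDIO buses.
MDIO_DEVICES = Path("/sys/bus/mdio_bus/devices")


def phy_addresses(devices_dir=MDIO_DEVICES):
    """Return the sorted MDIO addresses of the PHYs the kernel registered."""
    devices_dir = Path(devices_dir)
    if not devices_dir.is_dir():
        return []
    addresses = []
    for entry in devices_dir.iterdir():
        # Entry names look like "<bus-id>:<addr>", with <addr> in hex.
        _, sep, addr = entry.name.rpartition(":")
        if not sep:
            continue
        try:
            addresses.append(int(addr, 16))
        except ValueError:
            continue
    return sorted(addresses)


def phy_needs_reset(devices_dir=MDIO_DEVICES, expected_addr=0):
    """True when no PHY shows up at the expected address (assumed to be 0)."""
    return expected_addr not in phy_addresses(devices_dir)


def check_and_recover(reset_board, devices_dir=MDIO_DEVICES, expected_addr=0):
    """Call the caller-supplied reset_board() hook if the PHY looks absent.

    reset_board is hypothetical: it stands in for whatever mechanism
    your hardware offers to force a reset.
    """
    if phy_needs_reset(devices_dir, expected_addr):
        reset_board()
        return True
    return False
```

A check like this would typically run once, early at boot, so that a bad power-up is caught before the printer starts a job.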

PaoloBi changed the title from "Kernel 4.1?" to "Kernel 4.1? What about Ethernet hw bug ?" on Dec 13, 2018
@ThatWileyGuy
Member

Would you like a development image to try? If so, do you use Toggle on a touchscreen?

@PaoloBi
Author

PaoloBi commented Dec 13, 2018

I tried 2.1.0 yesterday: the PHY problem is still there and my workaround no longer works. My display is a custom project (1024x600, based on Qt); I mostly talk to OctoPrint through its API.

@ThatWileyGuy
Member

If you're feeling adventurous, give https://wiley.pub/umikaze/Umikaze-2.2.1-1804test4.img.xz a try.

I run three BBBs on ethernet and I haven't experienced the bug you seem to be hitting, but I've also been running much more recent kernels. That image has a 4.14 kernel.

@PaoloBi
Author

PaoloBi commented Dec 14, 2018

Thank you, I will check it during the holidays. In the meantime, I see that you made a "New path planner" pull request some time ago. As my current version is older (2017-03-24), do you think your commit would fix my path planner hang problems?

@ThatWileyGuy
Member

Yes. I believe you're hitting a deadlock that could occur when a sync event was queued to wait for the currently queued paths to complete. If the queue only had a single path in it, the sync event would sometimes never fire. I fixed it as part of rewriting the path queue.
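
To illustrate the kind of race described here, the following is a minimal Python sketch of a path queue whose sync cannot miss the "queue drained" event. It is not Redeem's actual path planner; the `PathQueue` class and its method names are made up for illustration. The idea is that the sync waits on a predicate (`in_flight == 0`) checked under the same lock that protects the counter, so it works whether the queue holds many paths, one path, or none.

```python
import threading
from collections import deque


class PathQueue:
    """Toy queue: sync() returns once every pushed path is marked done."""

    def __init__(self):
        self._paths = deque()
        self._in_flight = 0  # paths pushed but not yet completed
        self._cond = threading.Condition()

    def push(self, path):
        with self._cond:
            self._paths.append(path)
            self._in_flight += 1
            self._cond.notify_all()

    def pop(self, timeout=None):
        """Take the next path, or return None if none arrives in time."""
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._paths) > 0, timeout):
                return None
            return self._paths.popleft()

    def done(self):
        """Mark one previously popped path as fully executed."""
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def sync(self, timeout=None):
        """Block until all pushed paths are done; False on timeout."""
        with self._cond:
            # The predicate is evaluated before sleeping, so a queue that
            # drained before sync() was called still returns immediately.
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)
```

Because `wait_for` re-checks the predicate rather than relying on a one-shot notification, a completion that happens just before the sync starts waiting is not lost.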

@PaoloBi
Author

PaoloBi commented Dec 14, 2018

Hi Andrew... Yes! The deadlock has now gone away (at least with the two "killer" files I found). Thank you very much. I think this pull request should be merged into the "master" branch as soon as possible, because otherwise this bug (a major one in my opinion, as it freezes your printer forever) will surely show up sooner or later. I will try the test4 image as soon as I can. Thanks again.

@PaoloBi
Author

PaoloBi commented Dec 15, 2018

Anyway, according to
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/Documentation/devicetree/bindings/net/mdio.txt?id=69226896ad636b94f6d2e55d75ff21a29c4de83b

Some boards [1] leave the PHYs at an invalid state
during system power-up or reset thus causing unreliability
issues with the PHY which manifests as PHY not being detected
or link not functional. To fix this, these PHYs need to be RESET
via a GPIO connected to the PHY's RESET pin.

The proposed fix is broken with kernel 4.x, so there is a chance that a printer will start up with Ethernet not working.

@ThatWileyGuy
Member

The blog post you linked earlier is a usermode fix that's broken in 4.x.
The kernel patch you linked shows that the kernel is now automatically resetting the entire MDIO bus to get the ethernet PHY to reset properly.

Are you sure this is still an issue on any kernel made in the last year?

@goeland86
Member

@PaoloBi I've never encountered this issue since 2.1.0 was released. I may have accidentally hit on it during the release cycles for 2.1.0, but we were already releasing with a 4.4.x series in 2.1.1. The latest dev images have kernels that have fixed this problem.

@goeland86
Member

@PaoloBi can you confirm you have managed to work past this issue yet?

@PaoloBi
Author

PaoloBi commented Jan 27, 2019

Yes, the deadlock has gone away. My old kernel still suffers from the "dead on start" Ethernet problem: just yesterday my printer restarted itself once because of it, but the fix always works. Have the kernels in the latest dev images really fixed this problem? If I were absolutely sure, I'd start migrating to the new kernel (but not for now). Thanks to all for helping!

@goeland86
Member

@PaoloBi you're more than welcome to try and flash an SD card with the newer image and edit the uEnv to boot from the SD instead of flashing it to the eMMC.

@PaoloBi
Author

PaoloBi commented Jan 27, 2019

Thanks... the problem is that I have modified my Redeem sources quite a lot to add new sensors/probes/actuators (I talk to a custom-made Replicape board with an STM32F1xx) and to talk to my GUI program, so I would have to reapply all these changes on the new image. I hope to find the time and inclination to do so.

@goeland86
Member

You can use the RCN-provided scripts to update the kernel on your running image. The script is /opt/scripts/tools/update_kernel.sh


3 participants