FRU-Device does not work well with 16bit eeproms #1

feistjj · 2019-07-19T16:26:13Z

Known issue. Unfortunately, I don't have any 16-bit EEPROMs in my system to test with.

Start of solution here: https://gerrit.openbmc-project.xyz/c/openbmc/entity-manager/+/18783

feistjj · 2019-08-15T21:17:27Z

Adding @pstrinkle, @amithash and @vijaykhemka, as they are working on or have worked on this issue.

vijaykhemka · 2019-08-16T16:11:59Z

The main issue is that it is hard to tell an 8-bit device from a 16-bit device just by reading it. The current implementation assumes the device comes up with its index pointer at offset 0. If the pointer is at a different offset or page, the header can't be read without first writing to the device.
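To illustrate the point about the index pointer, here is a minimal sketch that simulates a device in memory. It is not the FruDevice code: `SimEeprom`, `probeWithoutWrite` and `makeDevice` are hypothetical names invented for this example, and the only outside fact it relies on is that an IPMI FRU common header is 8 bytes, begins with format version 0x01, and sums to zero modulo 256.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for an EEPROM with an internal address pointer.
struct SimEeprom
{
    std::vector<std::uint8_t> data;
    std::size_t pointer = 0; // where a read without a prior offset write starts

    // Return the byte at the current pointer and advance, wrapping at the end.
    std::uint8_t readNext()
    {
        std::uint8_t value = data[pointer];
        pointer = (pointer + 1) % data.size();
        return value;
    }
};

// IPMI FRU common header check: byte 0 is the format version (0x01)
// and all 8 bytes must sum to zero modulo 256.
bool looksLikeFruHeader(const std::array<std::uint8_t, 8>& header)
{
    if (header[0] != 0x01)
    {
        return false;
    }
    std::uint8_t sum = 0;
    for (std::uint8_t byte : header)
    {
        sum = static_cast<std::uint8_t>(sum + byte);
    }
    return sum == 0;
}

// Read 8 bytes from wherever the pointer happens to be, with no offset write.
bool probeWithoutWrite(SimEeprom& dev)
{
    std::array<std::uint8_t, 8> header{};
    for (auto& byte : header)
    {
        byte = dev.readNext();
    }
    return looksLikeFruHeader(header);
}

// Build a 32-byte image that starts with a valid header, then set the pointer.
SimEeprom makeDevice(std::size_t startPointer)
{
    SimEeprom dev;
    dev.data.assign(32, 0xFF);
    const std::array<std::uint8_t, 8> header = {0x01, 0x00, 0x00, 0x01,
                                                0x00, 0x00, 0x00, 0xFE};
    for (std::size_t i = 0; i < header.size(); ++i)
    {
        dev.data[i] = header[i];
    }
    dev.pointer = startPointer;
    return dev;
}
```

With the pointer at 0 the read-only probe finds the header. With the pointer anywhere else the same device looks like it holds no FRU data, which is why a read alone is not a reliable probe.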

pstrinkle · 2019-08-16T16:17:07Z

Yup. That's the primary difficulty. I have a device that is 16-bit addressed, but every other boot of the BMC, FruDevice changes its mind about it. So I implemented a quick hint lookup that checks whether a device is "hard-coded" to be one or the other. However, this requires a lot of board knowledge, and we mix 8-bit and 16-bit devices at the same SMBus address. We do know that if a device is on bus 6 or 7 (for example), then it must be 16-bit, so I make those hints available to the code. With the hint in place, it always works for me.
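A hint lookup of this kind might look roughly like the sketch below. It is not the code from the comment above: `AddressWidth`, `busHints` and `resolveWidth` are hypothetical names, and keying the table by bus number alone is an assumption based on the "bus 6 or 7" example.

```cpp
#include <map>

enum class AddressWidth
{
    Bits8,
    Bits16
};

// Hypothetical board-specific hint table keyed by I2C bus number.
// Buses 6 and 7 are the examples given in the comment; a real table
// would come from board knowledge.
const std::map<int, AddressWidth> busHints = {
    {6, AddressWidth::Bits16},
    {7, AddressWidth::Bits16},
};

// Prefer a hard-coded hint when one exists for the bus; otherwise fall
// back to whatever the (possibly unreliable) auto-detection reported.
AddressWidth resolveWidth(int bus, AddressWidth detected)
{
    auto hint = busHints.find(bus);
    if (hint != busHints.end())
    {
        return hint->second;
    }
    return detected;
}
```

Because the hint takes priority, a device on a hinted bus gets the same answer on every boot regardless of what detection returns. Buses without a hint behave as before.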

When using multiple dbus-probe types, we were seeing:

Program received signal SIGBUS, Bus error.
0x00475c6c in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() ()
(gdb) bt
#0 0x00475c6c in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() ()
#1 0x00477820 in std::vector<std::shared_ptr<PerformProbe>, std::allocator<std::shared_ptr<PerformProbe> > >::clear() ()
#2 0x0046d594 in ?? ()
#3 0x0046e14c in ?? ()
#4 0x76f60bd0 in ?? () from /lib/libsystemd.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The logic in this was quite bad, by moving the storage of PerformProbe shared_ptrs into the captures, we don't need to worry about calling clear ever, so we won't run into this problem. This was reordered to fix the issue.

Tested: On system that frequently saw the crash, it went away, all sensors still available.

Change-Id: Icacb8861466816df64b24efe940e5a732102345a
Signed-off-by: James Feist <[email protected]>

iwoloschin mentioned this issue Sep 8, 2021

isDevice16bit function works incorrectly for the 24LC128 EEPROM #15

Closed
