Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround for broken Eaton (and others?) string language ID descriptor #2604

Merged
merged 12 commits into from
Sep 5, 2024

Conversation

tormodvolden
Copy link
Contributor

@tormodvolden tormodvolden commented Aug 26, 2024

@masterwishx has tested this with his Eaton device, both on libusb0 and libusb1. Thanks!

The first two commits revert something that was an unsuccessful attempt at fixing something that turned out to be a buggy device. A more effective workaround is suggested in commits 4-5, which also tidies up and unifies things. Commit 3 reverts "retrials" for easier refactoring in 4-5 and similar functionality is added back in a more unified way in commit 4. Commits 6-7-8 are just tidying up things for future work, not strictly needed.

Fixes #1925
References #414

TODOs fixed in rework:

  • retries should be in nut_get_string and not in API wrapper
  • usb1 *_simple should just be kept as simple libusb0/1 API wrapper (if needed at all)

Ideas for future work in other PRs:

  • all drivers should use "nut_get_string" instead of *_simple and *_ascii
  • langid should be cached somewhere (in USBDevice_t or a new USBHandle_t ?)

@masterwishx
Copy link
Contributor

@masterwishx Can you please test this with your Eaton device?

Wow , Thanks a lot , sure i will test it ...

@masterwishx
Copy link
Contributor

I will try to compile without unraid nut plugin as author of plugin don't have time for it...

@tormodvolden tormodvolden force-pushed the eaton_broken_strings branch 2 times, most recently from 7d61ea3 to a3d2773 Compare August 27, 2024 08:16
@masterwishx
Copy link
Contributor

Can I test now?

@tormodvolden
Copy link
Contributor Author

Yes. I had only done minor/optical refinements.

@masterwishx
Copy link
Contributor

masterwishx commented Aug 27, 2024

@tormodvolden Thanks a lot , YES, after compiled i have all info :

@jimklimov just battery.voltage and battery.voltage.nominal is missing now :(

battery.capacity: 7.00
battery.charge: 100
battery.charge.low: 20
battery.charge.restart: 0
battery.protection: yes
battery.runtime: 3399
battery.runtime.low: 180
battery.type: PbA                         <  ----   wasnt avalible befor !!!
device.mfr: EATO                           <  ----   wasnt avalible befor !!!  EATON should be ? 
device.model: Eaton 9 2000      <  ----   wasnt avalible befor !!!
device.serial:   xxxxxxxxxxx        #REDACTED by me # !!! serial is shown now !!! 
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: auto
driver.state: updateinfo
driver.version: 2.8.2.1
driver.version.data: MGE HID 1.48
driver.version.internal: 0.56
driver.version.usb: libusb-1.0.26 (API: 0x1000108)
input.bypass.frequency: 50.0
input.bypass.voltage: 232.0
input.frequency: 50.0
input.frequency.nominal: 50
input.transfer.high: 300
input.transfer.low: 100
input.voltage: 232.0
input.voltage.nominal: 233
outlet.1.status: on
outlet.desc: Main Outlet
outlet.id: 0
outlet.switchable: no
output.current: 1.60
output.frequency: 50.0
output.frequency.nominal: 50
output.voltage: 230.0
output.voltage.nominal: 230
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.delay.start: 30
ups.firmware: 00.02.00    <  ----   wasnt avalible befor !!!
ups.load: 20
ups.load.high: 105
ups.mfr: EATO                   <  ----   wasnt avalible befor !!!    EATON should be ?
ups.model: Eaton 9 2000         <  ----   wasnt avalible befor !!!    
ups.power: 398
ups.power.nominal: 2000
ups.productid: ffff
ups.realpower: 265
ups.realpower.nominal: 1600
ups.serial: xxxxxxxxxxxx                   #REDACTED by me # !!! serial is shown now !!! 
ups.shutdown: enabled
ups.start.auto: yes
ups.start.battery: yes
ups.start.reboot: yes
ups.status: OL
ups.temperature: 24.9
ups.test.interval: 604800
ups.test.result: Done and passed
ups.timer.shutdown: -1
ups.timer.start: -1
ups.type: online
ups.vendorid: 0463 

@masterwishx
Copy link
Contributor

Also maybe Eaton 9E 2000i should be instead of Eaton 9 2000 ... and PbAc instead of PbA ...

it seems like the last character is missing in values, also in serial number

@tormodvolden
Copy link
Contributor Author

Thanks for your testing. I probably made an off-by-one error in the string decoding. I'll fix that later.

@tormodvolden
Copy link
Contributor Author

The missing last character should be fixed now.

@masterwishx
Copy link
Contributor

The missing last character should be fixed now.

Now its fine and all values avaliable :

battery.capacity: 7.00
battery.charge: 100
battery.charge.low: 20
battery.charge.restart: 0
battery.protection: yes
battery.runtime: 3444
battery.runtime.low: 180
battery.type: PbAc
battery.voltage: 83.0
battery.voltage.nominal: 72
device.mfr: EATON
device.model: 9E2000i
device.serial:  xxxxxxxxxxxx                   #REDACTED by me # !!! serial is shown now !!!
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: auto
driver.state: updateinfo
driver.version: 2.8.2.1
driver.version.data: MGE HID 1.48
driver.version.internal: 0.56
driver.version.usb: libusb-1.0.26 (API: 0x1000108)
input.bypass.frequency: 49.9
input.bypass.voltage: 232.0
input.frequency: 49.9
input.frequency.nominal: 50
input.transfer.high: 300
input.transfer.low: 100
input.voltage: 232.0
input.voltage.nominal: 231
outlet.1.status: on
outlet.desc: Main Outlet
outlet.id: 0
outlet.switchable: no
output.current: 1.60
output.frequency: 49.9
output.frequency.nominal: 50
output.voltage: 229.0
output.voltage.nominal: 230
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.delay.start: 30
ups.firmware: 00.02.000
ups.load: 19
ups.load.high: 105
ups.mfr: EATON
ups.model: 9E2000i
ups.power: 393
ups.power.nominal: 2000
ups.productid: ffff
ups.realpower: 261
ups.realpower.nominal: 1600
ups.serial:  xxxxxxxxxxxx                   #REDACTED by me # !!! serial is shown now !!!
ups.shutdown: enabled
ups.start.auto: yes
ups.start.battery: yes
ups.start.reboot: yes
ups.status: OL
ups.temperature: 24.9
ups.test.interval: 604800
ups.test.result: Done and passed
ups.timer.shutdown: -1
ups.timer.start: -1
ups.type: online
ups.vendorid: 0463

@masterwishx
Copy link
Contributor

@jimklimov when you will have time, please check it, the issue is solved :)

@jimklimov
Copy link
Member

Sounds great, thanks a lot! I'll take a closer look later though.

Interesting how battery.voltage(.nominal) was missing and came back after the off-by-one error fix. Was it supposed to be impacted? Not a string, right?..

@jimklimov
Copy link
Member

@tormodvolden : do you expect this fix is generic enough to drop the need for older *langid* options?

@masterwishx
Copy link
Contributor

Sounds great, thanks a lot! I'll take a closer look later though.

Interesting how battery.voltage(.nominal) was missing and came back after the off-by-one error fix. Was it supposed to be impacted? Not a string, right?..

I'm not sure but maybe battery.voltage(.nominal) related to your fix for Eaton 9E names, so when name was not fully then it was not catched?

@masterwishx
Copy link
Contributor

@tormodvolden do you need to finish also TODO befor merging to main?

@tormodvolden
Copy link
Contributor Author

@masterwishx I have reworked it to address the most important TODO items. It should now work equally well with libsb-1.0 and libusb0 so I would appreciate if you can test both.

@jimklimov The USB code in nut seems complicated by the burden of supporting both libusb0 and libusb1. What are the reasons for supporting libusb0?

There seems to be drivers using the libusb0.c and libusb1.c through the "API" and there seems to be some drivers that do their own stuff. And an outlier tools/nut-scanner/scan_usb.c which does its own dlopen() fun and supports both. Then there are wrappers trying to unify the APIs but not being consistently used. Unifying all this and tidying up would probably help in the long run but could be a lot of work.

I would suggest as a next step to get rid of all direct use of usb_get_string_simple() and libusb_get_string_descriptor_ascii() and instead use the "API" functions or the new common nut_usb_get_string().

@tormodvolden
Copy link
Contributor Author

tormodvolden commented Aug 28, 2024

@tormodvolden : do you expect this fix is generic enough to drop the need for older *langid* options?

The fix in this PR doesn't honour the "langid_fix" option, but will always try to auto-detect by reading out the language ID. Having a way to override the fallback value of en_US could possibly be useful. This option should be a generic one, not limited to a single driver. I don't know if anyone used this option to set anything else than en_US (0x0409). If it only was used for setting 0x0409 and the affected devices (only seen in nutdrv_qx and blazer_usb) won't get hosed up by a langid request (but only return an invalid descriptor) then this option can be dropped now. EDIT: One device in data/driver.list.in uses 0x4095.

I am not sure the "noscanlangid" option adds much value, unless some broken device get hosed up by such requests. Other code is already "scanning" it (reading out the language ID), like is being done here in nut_usb_get_string() as well. The option might be just leftovers from previous work. Please check with those who use it (only seen in nutdrv_qx/hunnox). EDIT: The "hunnox" protocol seems to use a port-knocking scheme involving a sequence of non-standard string descriptors retrievals to unlock the device communication, so it has to do precise low-level string descriptor fetching, incompatible with the USB standard. Maybe nut_usb_get_string() will work fine after the unlocking.

@masterwishx
Copy link
Contributor

I have reworked it to address the most important TODO items. It should now work equally well with libsb-1.0 and libusb0 so I would appreciate if you can test both.

Sure I can test it, but don't know how to test both?

@jimklimov
Copy link
Member

Regarding libusb-0.1, on one hand this code was with NUT much longer (1.0 only getting into mainline a couple of years back with 2.8.0, so much less exposed and tested). On the other, NUT aims to please older platforms (and machines) where it once ran, and those may lack a build/package of libusb-1.0, I suppose.

@tormodvolden
Copy link
Contributor Author

Sure I can test it, but don't know how to test both?

After a fresh checkout or running "make distclean":

mkdir build0 && cd build0 && ../configure --with-usb=libusb-0.1 && make
mkdir build1 && cd build1 && ../configure --with-usb=libusb-1.0 && make

@jimklimov
Copy link
Member

As for testing, if the libusb-0.1-dev or similar is available, you can configure --with-libusb=0.1 && make

@tormodvolden
Copy link
Contributor Author

Let me know if there is any platform where nut builds but there is no libusb-1.0.

@masterwishx
Copy link
Contributor

Sure I can test it, but don't know how to test both?

After a fresh checkout or running "make distclean":

mkdir build0 && cd build0 && ../configure --with-usb=libusb-0.1 && make
mkdir build1 && cd build1 && ../configure --with-usb=libusb-1.0 && make

What I compiled in Slackware VM last time for test was nut.SlackBuild from Unraid NUT plugin modified by me. So it was libusb 1.0.26 that my Unraid version has.

So I can test latest PR tomorrow same way, but for libusb 0.1 I can add libusb-0.1 to .. /configure but will it work?

@jimklimov
Copy link
Member

Let me know if there is any platform where nut builds but there is no libusb-1.0.

The oldest occasionally tested that I know of, because there still is practical interest, is Solaris 8. Gotta check if that involves USB, and which, though.

@masterwishx
Copy link
Contributor

I have reworked it to address the most important TODO items. It should now work equally well with libsb-1.0 and libusb0 so I would appreciate if you can test both.

Compiled with libusb 1.0 , all working fine as before , all values available .

will check with libusb 0.1 as Unraid have libusb-compat 0.1.7

jimklimov added a commit to tormodvolden/nut that referenced this pull request Sep 3, 2024
@jimklimov
Copy link
Member

jimklimov commented Sep 3, 2024

Bumped the PR with a NEWS entry and resync with current master.

@tormodvolden : if you wish to squash some of the commits and rebase - feel free, just re-fetch to have this NEWS change attached. Or I can just merge as soon as CI does not bork on anything relevant. Law of big numbers: something random does hiccup eventually.

And thanks again for all the hard work on analysis and implementation, and bearing with us :D

@AppVeyorBot
Copy link

Build nut 2.8.2.2108-master failed (commit a87ba77d3c by @jimklimov)

@jimklimov
Copy link
Member

make[1]: *** [Makefile:1644: install-win-bundle-thirdparty] Error 1

make: *** [Makefile:1638: install-win-bundle] Error 2

Not sure what Appveyor disliked here, build/test were ok. So this fault is unrelated to this PR.

FWIW, it listed some DLLs that were not in mingw (FOSS) locations but that's normal (e.g. OS own libs). Not sure if the list is same as before, or is why the step failed. Maybe libexec/sockdebug.exe troubleshooting helper being not covered by DLL accompaniment?..

tormodvolden and others added 9 commits September 3, 2024 18:21
…Vendor, Product, Serial if NULL, after claiming it by other criteria [networkupstools#2562]"

This reverts commit ad83b70.

The reverted commit attempted to workaround a string descriptor
retrieval issue by retrying after claiming an interface on the device.
However, this did not help, and it was later found out that a broken
langid descriptor was the root issue all along.

Signed-off-by: Tormod Volden <[email protected]>
…ce->Vendor, Product, Serial if NULL, after claiming it by other criteria [networkupstools#2562]"

This reverts commit ea99f96.

The reverted commit attempted to workaround a string descriptor
retrieval issue by retrying after claiming an interface on the device.
However, this did not help, and it was later found out that a broken
langid descriptor was the root issue all along.

Signed-off-by: Tormod Volden <[email protected]>
…ial a few times if failed on the first"

This reverts commit 0fa5687.

The reverted commit introduced retries when failing to retrieve various
device string descriptors. In the following commits we will attempt this
in a more centralized way, and for the purpose of clarity it is easiest
to retract this and start with cleaner sheets.

Signed-off-by: Tormod Volden <[email protected]>
This function doesn't use usb_get_string_simple() from libusb0 or
libusb_get_string_descriptor_ascii() from libusb1 but replaces them to
be able to add a workaround for devices with broken langid descriptors.

If the langid descriptor is invalid, the en_US language is assumed.
(See networkupstools#1925)

It also adds retries to the string descriptor fetching, which may help
in some cases. (See networkupstools#414)
A small delay is introduced between the retries.

Signed-off-by: Tormod Volden <[email protected]>
The string index is a number between 1 and 255, simply because it is the
lower byte of wValue in the GET_DESCRIPTOR request. String index 0 is
used for retrieving the langid array, so it is not valid for this
function.

There is no reason to impose a limit to the length of the buffer that
the user is lending us, certainly not above the range of the involved
types, so just a minimal sanity check is enough. Since in the minimal
case we would return an empty zero-terminated string, it must at least
have room for this.

Signed-off-by: Tormod Volden <[email protected]>
The original names could also be confused with libusb-1.0 API
functions.

Also rename nut_libusb_strerror() although not a nut API function.

Signed-off-by: Tormod Volden <[email protected]>
@masterwishx
Copy link
Contributor

make[1]: *** [Makefile:1644: install-win-bundle-thirdparty] Error 1

make: *** [Makefile:1638: install-win-bundle] Error 2

Not sure what Appveyor disliked here, build/test were ok. So this fault is unrelated to this PR.

FWIW, it listed some DLLs that were not in mingw (FOSS) locations but that's normal (e.g. OS own libs). Not sure if the list is same as before, or is why the step failed. Maybe libexec/sockdebug.exe troubleshooting helper being not covered by DLL accompaniment?..

Also I saw it was fine in test befor...

jimklimov added a commit to tormodvolden/nut that referenced this pull request Sep 3, 2024
…t_libusb_set_altinterface() to maintain a consistent code style [networkupstools#2604]

Signed-off-by: Jim Klimov <[email protected]>
jimklimov added a commit to tormodvolden/nut that referenced this pull request Sep 3, 2024
jimklimov added a commit to tormodvolden/nut that referenced this pull request Sep 3, 2024
@AppVeyorBot
Copy link

Build nut 2.8.2.2109-master completed (commit 0511491169 by @jimklimov)

@masterwishx
Copy link
Contributor

Finally merge is possible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Eaton enhancement impacts-release-2.8.2 Issues reported against NUT release 2.8.2 (maybe vanilla or with minor packaging tweaks) Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) USB
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

Eaton 9E2000I unknown device and ups model
6 participants