Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APC Back-UPS BX1600MI spurious LOWBATT/REPLACEBATT events #2347

Closed
grifferz opened this issue Mar 2, 2024 · 59 comments · Fixed by #2565
Closed

APC Back-UPS BX1600MI spurious LOWBATT/REPLACEBATT events #2347

grifferz opened this issue Mar 2, 2024 · 59 comments · Fixed by #2565
Labels
APC Connection stability issues Issues about driver<->device and/or networked connections (upsd<->upsmon...) going AWOL over time impacts-release-2.7.4 Issues reported against NUT release 2.7.4 (maybe vanilla or with minor packaging tweaks) impacts-release-2.8.1 Issues reported against NUT release 2.8.1 (maybe vanilla or with minor packaging tweaks) Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) USB
Milestone

Comments

@grifferz
Copy link

grifferz commented Mar 2, 2024

I have an APC Back-UPS BX1600MI connected by its supplied USB cable. With the version of nut in Debian 10 (2.7.4-8) this was unusable as the usbhid-ups driver kept disconnecting every few seconds.

I upgraded to the git HEAD of nut and communication is now stable, but spurious events come in every so often (once or twice an hour at the moment). When I call a notify script that calls upsc at the time of those spurious events, I can see that the ups.status does bear that out, but other values do not. Example:

battery.charge: 100
battery.charge.low: 10
battery.mfr.date: 2001/01/01
battery.runtime: 659
battery.runtime.low: 120
battery.type: PbAc
battery.voltage: 27.3
battery.voltage.nominal: 24.0
device.mfr: American Power Conversion
device.model: Back-UPS BX1600MI
device.serial: 9B2339A15993
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.poll freq: 30
driver.parameter.poll interval: 2
driver.parameter.port : auto
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1-884-gf2fc47067
driver.version.data: APC HID 0.100
driver.version.internal: 0.52
driver.version.usb: libusb-1.0.22 (API: 0x1000106)
input.sensitivity: medium
input.transfer.high: 295
input.transfer.low: 145
input.voltage: 234.0
input.voltage.nominal: 230
ups.beeper.status: disabled
ups.delay.shutdown: 20
ups.load: 30
ups.mfr: American Power Conversion
ups.mfr.date: 2023/10/05
ups.model: Back-UPS BX1600MI
ups.productid: 0002
ups.realpower.nominal: 900
ups.serial: 9B2339A15993
ups.status: OL LB RB
ups.test.result: Done and passed
ups.timer.reboot: 0
ups.timer.shutdown: -1
ups.vendorid: 051d

As you can see, the status does include LB and RB, but the charge is still 100 and the runtime is as expected. The bad status lasts less than 2 seconds before returning to simply OL.

I went to the effort of installing a Windows VM, passing USB through to it and trying APC's own Powerchute Serial Shutdown software in there. This does not report any spurious events. Both Powerchute and nut do report true events that I induce. A self-test of the device passed. I am unable to demonstrate any behaviour that APC consider incorrect.

I returned the UPS to the vendor as faulty and they sent a replacement. The replacement behaves the same.

I installed apcupsd just to see how it behaved. It maintained a connection but its rate of spurious events was even worse: every couple of minutes. Another apcupsd user reports the same symptoms as me:

https://sourceforge.net/p/apcupsd/mailman/message/58740970/

Interestingly, they had a BX1600MI working with apcupsd, replaced it with another BX1600MI and now they see what I see, implying that newer models of Back-UPS have something different about them even though they are the same model number.

Is there any way to work around this sort of thing? It's almost like I need a way to not believe such statuses unless they persist for at least 5 seconds, or something.

@ShiroDN
Copy link

ShiroDN commented Mar 4, 2024

I have a new APC Back-UPS BX1200MI (basically the same model, just with a smaller battery) connected to TrueNAS that is using NUT 2.8.0, and there is some weird behavior. Sometimes, 1-2 times an hour, it triggers LOWBATT/REPLACEBATT events. In a debug log, I can see "[D2] parse_status: [OL DISCHRG CHRG LB RB]".

When it works and the battery is charged to 100%, it seems to be switching between "OL" and "OL CHRG" every few seconds. I am not sure if that is a problem too, or just normal behavior, though.

Here is a debug log from the driver at the time of an event:

22097.933876    [D1] upsdrv_updateinfo...
22098.033355    [D2] file_report_buffer: expected 5 bytes, but got 512 instead
22098.033486    [D3] Report[err]: (512 bytes) => 16 63 82 42 fd 7f 00 00 01 00 00 00 00 00 00 00
22098.033549    [D3]  00 00 00 00 00 00 00 00 18 69 e6 c0 77 55 00 00 d5 ad e7 c0 77 55 00 00 80
22098.033611    [D3]  b0 82 42 fd 7f 00 00 16 00 00 00 00 00 00 00 35 61 e6 c0 77 55 00 00 00 00
22098.033671    [D3]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 60 b2 82 42 fd 7f 00 00 90 b1 82
22098.033731    [D3]  42 fd 7f 00 00 2d 82 e5 65 00 00 00 00 b2 6c 00 00 00 00 00 00 02 00 00 00
22098.033792    [D3]  00 00 00 00 00 ba 75 9f c5 fe 19 5f ff ff ff ff 77 55 00 00 e0 b0 82 42 fd
22098.033856    [D3]  7f 00 00 00 00 00 00 00 00 00 00 e0 d2 1c 0f 3c 7f 00 00 00 00 00 00 00 00
22098.033917    [D3]  00 00 73 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ba 75 9f c5 fe 19
22098.033978    [D3]  5f 01 00 00 00 00 00 00 00 00 ba 75 9f c5 fe 19 5f 7b e7 e6 c0 77 55 00 00
22098.034038    [D3]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
22098.034099    [D3]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
22098.034159    [D3]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
22098.034219    [D3]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
22098.034280    [D3]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
22098.034340    [D3]  00 00 00 00 00 00 00 00 00 00 00 50 4f 4e 47 0a 00 00 00 00 ba 75 9f c5 fe
22098.034399    [D3]  19 5f 6d 6f 64 65 6c 73 20 28 d1 e4 e6 c0 77 55 00 00 da de 5d 00 00 00 00
22098.034459    [D3]  00 17 00 00 00 00 00 00 00 e0 af 82 42 fd 7f 00 00 00 00 00 00 00 00 00 00
22098.034519    [D3]  01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 48 4a e8 c0 77 55 00 00 01
22098.034578    [D3]  00 00 00 00 00 00 00 1d 05 00 00 00 00 00 00 99 b0 e5 c0 77 55 00 00 50 4f
22098.034638    [D3]  4e 47 0a 00 00 00 97 94 e6 c0 77 55 00 00 6d 6f 64 65 00 00 00 00 70 b2 82
22098.034696    [D3]  42 fd 7f 00 00 98 e4 e6 c0 77 55 00 00 01 00 00 00 00 00 00 00
22098.034751    [D1] Got 9 HID objects...
22098.034807    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.034862    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.034915    [D5] Unit = 00000000, UnitExp = 0
22098.034968    [D5] Exponent = 0
22098.035022    [D5] hid_lookup_path: 00840004 -> UPS
22098.035076    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.035130    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.035185    [D5] hid_lookup_path: 00850044 -> Charging
22098.035243    [D2] Path: UPS.PowerSummary.PresentStatus.Charging, Type: Input, ReportID: 0x16, Offset: 0, Size: 1, Value: 1
22098.035299    [D5] hu_find_infoval: found chrg (value: 1)
22098.035352    [D5] process_boolean_info: chrg
22098.035408    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.035462    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.035515    [D5] Unit = 00000000, UnitExp = 0
22098.035568    [D5] Exponent = 0
22098.035621    [D5] hid_lookup_path: 00840004 -> UPS
22098.035674    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.035727    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.035781    [D5] hid_lookup_path: 00850045 -> Discharging
22098.035836    [D2] Path: UPS.PowerSummary.PresentStatus.Discharging, Type: Input, ReportID: 0x16, Offset: 1, Size: 1, Value: 1
22098.035891    [D5] hu_find_infoval: found dischrg (value: 1)
22098.035946    [D5] process_boolean_info: dischrg
22098.036001    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.036054    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.036107    [D5] Unit = 00000000, UnitExp = 0
22098.036159    [D5] Exponent = 0
22098.036212    [D5] hid_lookup_path: 00840004 -> UPS
22098.036265    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.036318    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.036372    [D5] hid_lookup_path: 008500d0 -> ACPresent
22098.036427    [D2] Path: UPS.PowerSummary.PresentStatus.ACPresent, Type: Input, ReportID: 0x16, Offset: 2, Size: 1, Value: 0
22098.036482    [D5] hu_find_infoval: found !online (value: 0)
22098.036534    [D5] process_boolean_info: !online
22098.036589    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.036642    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.036695    [D5] Unit = 00000000, UnitExp = 0
22098.036747    [D5] Exponent = 0
22098.036800    [D5] hid_lookup_path: 00840004 -> UPS
22098.036853    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.036906    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.036960    [D5] hid_lookup_path: 008500d1 -> BatteryPresent
22098.037014    [D2] Path: UPS.PowerSummary.PresentStatus.BatteryPresent, Type: Input, ReportID: 0x16, Offset: 3, Size: 1, Value: 0
22098.037067    [D5] hu_find_infoval: found nobattery (value: 0)
22098.037120    [D5] process_boolean_info: nobattery
22098.037238    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.037302    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.037354    [D5] Unit = 00000000, UnitExp = 0
22098.037407    [D5] Exponent = 0
22098.037460    [D5] hid_lookup_path: 00840004 -> UPS
22098.037513    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.037565    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.037618    [D5] hid_lookup_path: 00840069 -> ShutdownImminent
22098.037674    [D2] Path: UPS.PowerSummary.PresentStatus.ShutdownImminent, Type: Input, ReportID: 0x16, Offset: 4, Size: 1, Value: 0
22098.037728    [D5] hu_find_infoval: found !shutdownimm (value: 0)
22098.037780    [D5] process_boolean_info: !shutdownimm
22098.037835    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.037887    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.037939    [D5] Unit = 00000000, UnitExp = 0
22098.037991    [D5] Exponent = 0
22098.038043    [D5] hid_lookup_path: 00840004 -> UPS
22098.038095    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.038148    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.038201    [D5] hid_lookup_path: 00850043 -> RemainingTimeLimitExpired
22098.038255    [D2] Path: UPS.PowerSummary.PresentStatus.RemainingTimeLimitExpired, Type: Input, ReportID: 0x16, Offset: 5, Size: 1, Value: 1
22098.038309    [D5] hu_find_infoval: found timelimitexp (value: 1)
22098.038362    [D5] process_boolean_info: timelimitexp
22098.038416    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.038469    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.038522    [D5] Unit = 00000000, UnitExp = 0
22098.038573    [D5] Exponent = 0
22098.038625    [D5] hid_lookup_path: 00840004 -> UPS
22098.038679    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.038732    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.038785    [D5] hid_lookup_path: 0085004b -> NeedReplacement
22098.038844    [D2] Path: UPS.PowerSummary.PresentStatus.NeedReplacement, Type: Input, ReportID: 0x16, Offset: 6, Size: 1, Value: 1
22098.038855    [D5] hu_find_infoval: found replacebatt (value: 1)
22098.038867    [D5] process_boolean_info: replacebatt
22098.038879    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.038890    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.038902    [D5] Unit = 00000000, UnitExp = 0
22098.038916    [D5] Exponent = 0
22098.038928    [D5] hid_lookup_path: 00840004 -> UPS
22098.038939    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.038951    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.038962    [D5] hid_lookup_path: 00840065 -> Overload
22098.038974    [D2] Path: UPS.PowerSummary.PresentStatus.Overload, Type: Input, ReportID: 0x16, Offset: 7, Size: 1, Value: 0
22098.038986    [D5] hu_find_infoval: found !overload (value: 0)
22098.038998    [D5] process_boolean_info: !overload
22098.039010    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.039021    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.039033    [D5] Unit = 00000000, UnitExp = 0
22098.039044    [D5] Exponent = 0
22098.039056    [D5] hid_lookup_path: 00840004 -> UPS
22098.039067    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.039079    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.039090    [D5] hid_lookup_path: 008500db -> VoltageNotRegulated
22098.039105    [D2] Path: UPS.PowerSummary.PresentStatus.VoltageNotRegulated, Type: Input, ReportID: 0x16, Offset: 8, Size: 1, Value: 0
22098.039117    [D3] NUT doesn't use this HID object
22098.039130    [D1] Quick update...
22098.039141    [D4] Entering libusb_get_report
22098.039801    [D3] Report[get]: (3 bytes) => 15 ff ff
22098.039844    [D5] PhyMax = 0, PhyMin = 0, LogMax = 32767, LogMin = -1
22098.039857    [D5] Unit = 00001001, UnitExp = 0
22098.039869    [D5] Exponent = 0
22098.039882    [D2] Path: UPS.PowerSummary.DelayBeforeShutdown, Type: Feature, ReportID: 0x15, Offset: 0, Size: 16, Value: -1
22098.039896    [D4] Entering libusb_get_report
22098.040607    [D3] Report[get]: (2 bytes) => 40 00
22098.040650    [D5] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
22098.040662    [D5] Unit = 00f0d121, UnitExp = 7
22098.040674    [D5] Exponent = 0
22098.040687    [D2] Path: UPS.APCGeneralCollection.APCDelayBeforeReboot, Type: Feature, ReportID: 0x40, Offset: 0, Size: 8, Value: 0
22098.040701    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.040716    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.040728    [D5] Unit = 00000000, UnitExp = 0
22098.040739    [D5] Exponent = 0
22098.040751    [D2] Path: UPS.PowerSummary.PresentStatus.ACPresent, Type: Feature, ReportID: 0x16, Offset: 2, Size: 1, Value: 0
22098.040763    [D5] hu_find_infoval: found !online (value: 0)
22098.040775    [D5] process_boolean_info: !online
22098.040787    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.040799    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.040810    [D5] Unit = 00000000, UnitExp = 0
22098.040822    [D5] Exponent = 0
22098.040834    [D2] Path: UPS.PowerSummary.PresentStatus.Discharging, Type: Feature, ReportID: 0x16, Offset: 1, Size: 1, Value: 1
22098.040845    [D5] hu_find_infoval: found dischrg (value: 1)
22098.040857    [D5] process_boolean_info: dischrg
22098.040869    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.040880    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.040892    [D5] Unit = 00000000, UnitExp = 0
22098.040904    [D5] Exponent = 0
22098.040916    [D2] Path: UPS.PowerSummary.PresentStatus.Charging, Type: Feature, ReportID: 0x16, Offset: 0, Size: 1, Value: 1
22098.040927    [D5] hu_find_infoval: found chrg (value: 1)
22098.040939    [D5] process_boolean_info: chrg
22098.040951    [D4] Entering libusb_get_report
22098.041601    [D3] Report[get]: (2 bytes) => 13 01
22098.041645    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.041658    [D5] Unit = 00001001, UnitExp = 0
22098.041670    [D5] Exponent = 0
22098.041683    [D2] Path: UPS.PowerSummary.ACPresent, Type: Feature, ReportID: 0x13, Offset: 0, Size: 8, Value: 1
22098.041695    [D5] hu_find_infoval: found online (value: 1)
22098.041707    [D5] process_boolean_info: online
22098.041719    [D4] Entering libusb_get_report
22098.042521    [D3] Report[get]: (2 bytes) => 14 00
22098.042563    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.042576    [D5] Unit = 00001001, UnitExp = 0
22098.042591    [D5] Exponent = 0
22098.042604    [D2] Path: UPS.PowerSummary.BelowRemainingCapacityLimit, Type: Feature, ReportID: 0x14, Offset: 0, Size: 8, Value: 0
22098.042616    [D5] hu_find_infoval: found !lowbatt (value: 0)
22098.042628    [D5] process_boolean_info: !lowbatt
22098.042641    ups_status_set: seems that UPS [apc] is in OL+DISCHRG state now. Is it calibrating or do you perhaps want to set 'onlinedischarge' option? Some UPS models (e.g. CyberPower UT series) emit OL+DISCHRG when offline.
22098.042656    [D5] send_to_all: SETINFO ups.status "OL DISCHRG CHRG LB RB"
22098.042679    [D6] send_to_all: write 43 bytes to socket 11 succeeded (ret=43): SETINFO ups.status "OL DISCHRG CHRG LB RB"

@jimklimov jimklimov added APC USB impacts-release-2.7.4 Issues reported against NUT release 2.7.4 (maybe vanilla or with minor packaging tweaks) Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) Connection stability issues Issues about driver<->device and/or networked connections (upsd<->upsmon...) going AWOL over time impacts-release-2.8.1 Issues reported against NUT release 2.8.1 (maybe vanilla or with minor packaging tweaks) labels Mar 14, 2024
@jimklimov
Copy link
Member

CC @desertwitch : I think some of your earlier investigations were about similar behavior; any ideas here?

Notably, in NUT master (after release 2.8.1) we had PR #2216 to address such notifications with new configuration options. From research and educated guesswork there, this may be just part of battery charge management which happens under the hood on all UPS devices that became visibly exposed on some firmwares.

@desertwitch
Copy link
Contributor

desertwitch commented Mar 17, 2024

It seems to be driver related, something that the UPS is sending byte-wise that's being parsed wrongly into RB (or LB)

22098.038416    [D3] Report[buf]: (5 bytes) => 16 63 82 42 fd
22098.038469    [D5] PhyMax = 0, PhyMin = 0, LogMax = 1, LogMin = 0
22098.038522    [D5] Unit = 00000000, UnitExp = 0
22098.038573    [D5] Exponent = 0
22098.038625    [D5] hid_lookup_path: 00840004 -> UPS
22098.038679    [D5] hid_lookup_path: 00840024 -> PowerSummary
22098.038732    [D5] hid_lookup_path: 00840002 -> PresentStatus
22098.038785    [D5] hid_lookup_path: 0085004b -> NeedReplacement
22098.038844    [D2] Path: UPS.PowerSummary.PresentStatus.NeedReplacement, Type: Input, ReportID: 0x16, Offset: 6, Size: 1, Value: 1
22098.038855    [D5] hu_find_infoval: found replacebatt (value: 1)
22098.038867    [D5] process_boolean_info: replacebatt

I couldn't find an issue in the upsmon handling or the later processes. I've since had another report from a user with an APC BX1200MI-GR also experiencing intermediate OB and LB events, so it seems to be that series. I'm in touch with the other user if it works on an older version of NUT so hopefully we can narrow it down by comparison.

@desertwitch
Copy link
Contributor

Got another report of an affected Back-UPS BX750MI with spurious RB events.
I'm hoping to get my hands on at least a BX750MI this week so I can mess around with the driver a bit.

@PilaScat
Copy link

Same thing for me, I've the same problem with BX1200MI-GR

@desertwitch
Copy link
Contributor

Unfortunately my contact didn't come through with the UPS so we're back to square one.

@bitmario
Copy link

I am seeing the same behavior on a newly purchased BX1600MI-GR, manufactured October 2023, on Debian 12's nut 2.8.0-7.

@desertwitch let me know if I can help by collecting logs or trying any patches

@marcgarciamarti
Copy link

Hello there!
I'm using the containerized version of nut available in home assistant through the official add-on https://github.com/hassio-addons/addon-nut/tree/main

I can confirm this behavior is also happening with that version.

Screenshot 2024-04-03 at 00 03 02

The pattern seems to be random. It sometimes lasts a few seconds or a few minutes.

@desertwitch
Copy link
Contributor

desertwitch commented Apr 3, 2024

I am seeing the same behavior on a newly purchased BX1600MI-GR, manufactured October 2023, on Debian 12's nut 2.8.0-7.

@desertwitch let me know if I can help by collecting logs or trying any patches

Any chance you could try if the problem persists putting this flag in your UPS configuration in UPS.CONF:

maxreport

So an example configuration would look like this:

[ups]
        driver = usbhid-ups
        port = auto
        maxreport

Not sure if it will help or stop UPS recognition at all, but it's worth an attempt since we don't have an UPS at hand.
I'm not an expert with the UPS drivers but if I can get my hands on an affected UPS eventually I hope to investigate more.

@jimklimov : Any ideas from the logs provided above? It's a relatively popular entry-level line of APC so would be lovely if we can get this addressed one way or another, I'm sure many people would be appreciative.

@KillianMelsen
Copy link

KillianMelsen commented Apr 3, 2024

Same (or similar) issue here on a brand-new BX1600MI-GR. I'm not able to try NUT or get any logs right now, but the Unraid webgui/apcupsd shows the following:

info
log

@jimklimov
Copy link
Member

The only thing that comes to mind is to add a yet another throttle for such reports - so to not expose the status if RB appears and dissipates quickly (I'd be wary of tweaking LB like that... maybe optionally-throttle tied to known OL/OB/BYPASS status?)

@bitmario
Copy link

bitmario commented Apr 3, 2024

@desertwitch I just tried maxreport but it doesn't seem to help - I got two instances of "battery is low"+"battery needs to be replaced" within 3 hours.

@desertwitch
Copy link
Contributor

desertwitch commented Apr 3, 2024

@desertwitch I just tried maxreport but it doesn't seem to help - I got two instances of "battery is low"+"battery needs to be replaced" within 3 hours.

Thanks for the attempt, it was a long shot. Is there any logic to the statuses, as in do you notice a rapid succession or cycling through of certain statuses (LB, RB, LB, OL or OL, LB, RB, OL) in the logs? In general, do you have any logs you can provide us from NUT itself with timestamps so we can try to investigate this some more? Particularly interesting would be if the LB (low battery state) is happening at the same time the UPS is also OL or if the UPS goes OB before/after.

The only thing that comes to mind is to add a yet another throttle for such reports - so to not expose the status if RB appears and dissipates quickly (I'd be wary of tweaking LB like that... maybe optionally-throttle tied to known OL/OB/BYPASS status?)

I'm curious though if this is something else entirely, which the driver misinterprets, or the UPS is actually sending these bogus statuses. I'm thinking it probably doesn't happen on their own APC software (if there is one for that series) or there would be more complaints on the APC forums, but considering that both APCUPSD and NUT are affected in a similar fashion it does seem to be some change in the firmware when compared to other functioning APC devices or even older models of the BX series which seem to work (firmware bug in newer models?). I agree with a LB throttle on its own being less than ideal, unless tied to another status or succession of states (if we can figure out any logic behind what's happening here).

@bitmario
Copy link

bitmario commented Apr 3, 2024

The general pattern is that it will usually be in a normal status like OL CHRG and then every hour or two I'll see this (on Debian 12, nut 2.8.0-7):

Apr 03 14:04:43 nut-ups-test usbhid-ups[556]: ups_status_set: seems that UPS [apc1] is in OL+DISCHRG state now. Is it calibrating or do you perhaps want to set 'onlinedischarge' option? Some UPS models (e.g. CyberPower UT series) emit OL+DISCHRG when offline.
Apr 03 14:04:44 nut-ups-test nut-monitor[560]: UPS apc1@localhost battery is low
Apr 03 14:04:44 nut-ups-test nut-monitor[560]: UPS apc1@localhost battery needs to be replaced

Afterwards it goes back to a normal status.

I've been running a patched apcupsd so don't have any long-running logs at the moment, but I'm happy to run NUT as long as needed to help figure this out. Should I enable any debug logging options? Do you want me to try a fresh master build or should I keep using 2.8.0?

@KillianMelsen
Copy link

@desertwitch I just tried maxreport but it doesn't seem to help - I got two instances of "battery is low"+"battery needs to be replaced" within 3 hours.

Thanks for the attempt, it was a long shot. Is there any logic to the statuses, as in do you notice a rapid succession or cycling through of certain statuses (LB, RB, LB, OL or OL, LB, RB, OL) in the logs? In general, do you have any logs you can provide us from NUT itself with timestamps so we can try to investigate this some more? Particularly interesting would be if the LB (low battery state) is happening at the same time the UPS is also OL or if the UPS goes OB before/after.

The only thing that comes to mind is to add a yet another throttle for such reports - so to not expose the status if RB appears and dissipates quickly (I'd be wary of tweaking LB like that... maybe optionally-throttle tied to known OL/OB/BYPASS status?)

I'm curious though if this is something else entirely, which the driver misinterprets, or the UPS is actually sending these bogus statuses. I'm thinking it probably doesn't happen on their own APC software (if there is one for that series) or there would be more complaints on the APC forums, but considering that both APCUPSD and NUT are affected in a similar fashion it does seem to be some change in the firmware when compared to other functioning APC devices or even older models of the BX series which seem to work (firmware bug in newer models?). I agree with a LB throttle on its own being less than ideal, unless tied to another status or succession of states (if we can figure out any logic behind what's happening here).

I just got home and installed NUT through the unraid apps. I'd be happy to provide logs but not sure how.

@desertwitch
Copy link
Contributor

desertwitch commented Apr 3, 2024

The general pattern is that it will usually be in a normal status like OL CHRG and then every hour or two I'll see this (on Debian 12, nut 2.8.0-7):

Apr 03 14:04:43 nut-ups-test usbhid-ups[556]: ups_status_set: seems that UPS [apc1] is in OL+DISCHRG state now. Is it calibrating or do you perhaps want to set 'onlinedischarge' option? Some UPS models (e.g. CyberPower UT series) emit OL+DISCHRG when offline.
Apr 03 14:04:44 nut-ups-test nut-monitor[560]: UPS apc1@localhost battery is low
Apr 03 14:04:44 nut-ups-test nut-monitor[560]: UPS apc1@localhost battery needs to be replaced

Afterwards it goes back to a normal status.

I've been running a patched apcupsd so don't have any long-running logs at the moment, but I'm happy to run NUT as long as needed to help figure this out. Should I enable any debug logging options? Do you want me to try a fresh master build or should I keep using 2.8.0?

This is very valuable, thanks, so we can also see here that the UPS is doing some kind of - perhaps - calibration before these LB and RB states. Maybe it is cycling through those states as part of some calibration mechanism. Can you confirm you're never having a "is on battery" status and you're also not getting any shutdowns from these intermittent low battery occurrences?

The other user's log above has shown a status of OL DISCHRG CHRG LB RB which seems to be also what's happening on your side (CHRG does not get reported in the logs as an event afaik). @jimklimov perhaps if users use the onlinedischarge_calibration to turn OL DISCHRG into CAL an additional toggle to basically ignore every other status while CAL is present could be a solution here (effectively suppressing CHRG LB RB in such cases and only reporting CAL)? We've already seen other UPS cycling through statuses while doing calibration, perhaps such a toggle could also be beneficial to other users? silent_calibration? But then the first user's status only shows OL LB RB, so that'd probably be useless in his case.

@grifferz
Copy link
Author

grifferz commented Apr 3, 2024

I upgraded to the git HEAD of nut and communication is now stable, but spurious events come in every so often (once or twice an hour at the moment). When I call a notify script that calls upsc at the time of those spurious events, I can see that the ups.status does bear that out, but other values do not. Example:

battery.charge: 100
battery.charge.low: 10
battery.mfr.date: 2001/01/01
battery.runtime: 659
battery.runtime.low: 120
battery.type: PbAc
battery.voltage: 27.3
battery.voltage.nominal: 24.0
device.mfr: American Power Conversion
device.model: Back-UPS BX1600MI
device.serial: 9B2339A15993
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.poll freq: 30
driver.parameter.poll interval: 2
driver.parameter.port : auto
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1-884-gf2fc47067
driver.version.data: APC HID 0.100
driver.version.internal: 0.52
driver.version.usb: libusb-1.0.22 (API: 0x1000106)
input.sensitivity: medium
input.transfer.high: 295
input.transfer.low: 145
input.voltage: 234.0
input.voltage.nominal: 230
ups.beeper.status: disabled
ups.delay.shutdown: 20
ups.load: 30
ups.mfr: American Power Conversion
ups.mfr.date: 2023/10/05
ups.model: Back-UPS BX1600MI
ups.productid: 0002
ups.realpower.nominal: 900
ups.serial: 9B2339A15993
ups.status: OL LB RB
ups.test.result: Done and passed
ups.timer.reboot: 0
ups.timer.shutdown: -1
ups.vendorid: 051d

I don't know whether it is of any use, but I did not change anything (kept running git HEAD of nut) and the above strange behaviour stopped after a few days. It has been behaving normally for weeks now.

Thanks,
Andy

@desertwitch
Copy link
Contributor

@desertwitch I just tried maxreport but it doesn't seem to help - I got two instances of "battery is low"+"battery needs to be replaced" within 3 hours.

Thanks for the attempt, it was a long shot. Is there any logic to the statuses, as in do you notice a rapid succession or cycling through of certain statuses (LB, RB, LB, OL or OL, LB, RB, OL) in the logs? In general, do you have any logs you can provide us from NUT itself with timestamps so we can try to investigate this some more? Particularly interesting would be if the LB (low battery state) is happening at the same time the UPS is also OL or if the UPS goes OB before/after.

The only thing that comes to mind is to add a yet another throttle for such reports - so to not expose the status if RB appears and dissipates quickly (I'd be wary of tweaking LB like that... maybe optionally-throttle tied to known OL/OB/BYPASS status?)

I'm curious though if this is something else entirely, which the driver misinterprets, or the UPS is actually sending these bogus statuses. I'm thinking it probably doesn't happen on their own APC software (if there is one for that series) or there would be more complaints on the APC forums, but considering that both APCUPSD and NUT are affected in a similar fashion it does seem to be some change in the firmware when compared to other functioning APC devices or even older models of the BX series which seem to work (firmware bug in newer models?). I agree with a LB throttle on its own being less than ideal, unless tied to another status or succession of states (if we can figure out any logic behind what's happening here).

I just got home and installed NUT through the unraid apps. I'd be happy to provide logs but not sure how.

Just set NUT up normally through the GUI as you want to have it and watch the SYSLOG (Tools->System Log) for any such strange events being reported while NUT is started, they should pop up in the log by themselves if they occur on your system.

I upgraded to the git HEAD of nut and communication is now stable, but spurious events come in every so often (once or twice an hour at the moment). When I call a notify script that calls upsc at the time of those spurious events, I can see that the ups.status does bear that out, but other values do not. Example:

battery.charge: 100
battery.charge.low: 10
battery.mfr.date: 2001/01/01
battery.runtime: 659
battery.runtime.low: 120
battery.type: PbAc
battery.voltage: 27.3
battery.voltage.nominal: 24.0
device.mfr: American Power Conversion
device.model: Back-UPS BX1600MI
device.serial: 9B2339A15993
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.poll freq: 30
driver.parameter.poll interval: 2
driver.parameter.port : auto
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1-884-gf2fc47067
driver.version.data: APC HID 0.100
driver.version.internal: 0.52
driver.version.usb: libusb-1.0.22 (API: 0x1000106)
input.sensitivity: medium
input.transfer.high: 295
input.transfer.low: 145
input.voltage: 234.0
input.voltage.nominal: 230
ups.beeper.status: disabled
ups.delay.shutdown: 20
ups.load: 30
ups.mfr: American Power Conversion
ups.mfr.date: 2023/10/05
ups.model: Back-UPS BX1600MI
ups.productid: 0002
ups.realpower.nominal: 900
ups.serial: 9B2339A15993
ups.status: OL LB RB
ups.test.result: Done and passed
ups.timer.reboot: 0
ups.timer.shutdown: -1
ups.vendorid: 051d

I don't know whether it is of any use, but I did not change anything (kept running git HEAD of nut) and the above strange behaviour stopped after a few days. It has been behaving normally for weeks now.

Thanks, Andy

That is extremely curious, makes me wonder even more what the UPS is doing there - thanks though!

@bitmario
Copy link

bitmario commented Apr 3, 2024

Maybe it is cycling through those states as part of some calibration mechanism. Can you confirm you're never having a "is on battery" status and you're also not getting any shutdowns from these intermittent low battery occurrences?

Correct - I only get "is low" and "needs to be replaced".

As a sanity check, I just tried pulling the plug for a few seconds and this is what I see in that case:

Apr 03 15:56:06 nut-ups-test nut-monitor[579]: UPS apc1@localhost on battery
Apr 03 15:56:06 nut-ups-test nut-monitor[643]: Network UPS Tools upsmon 2.8.0
Apr 03 15:56:23 nut-ups-test usbhid-ups[574]: ups_status_set: seems that UPS [apc1] is in OL+DISCHRG state now. Is it calibrating or do you perhaps want to set 'onlinedischarge' option? Som>
Apr 03 15:56:26 nut-ups-test nut-monitor[579]: UPS apc1@localhost on line power
Apr 03 15:56:26 nut-ups-test nut-monitor[646]: Network UPS Tools upsmon 2.8.0

Thanks for taking the time to look into this :)

@KillianMelsen
Copy link

Just set NUT up normally through the GUI as you want to have it and watch the SYSLOG (Tools->System Log) for any such strange events being reported while NUT is started, they should pop up in the log by themselves if they occur on your system.

Ah got it. They system log does show some low battery indications as seen below. I also pulled the plug and that all seems to be working fine. UPS switches to battery, I get an unraid notification, and it goes back online when plugged back in.

Apr  3 17:06:09 Tower upsmon[13245]: Startup successful
Apr  3 17:06:09 Tower upsmon[13245]: Warning: running as one big root process by request (upsmon -p)
Apr  3 17:06:09 Tower upsd[13239]: User [email protected] logged into UPS [ups]
Apr  3 17:07:14 Tower ool www[12635]: /usr/local/emhttp/plugins/nut-dw/scripts/write_config
Apr  3 17:07:14 Tower rc.nut: Writing NUT configuration...
Apr  3 17:07:15 Tower rc.nut: Updating permissions for NUT...
Apr  3 17:07:15 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
Apr  3 17:07:15 Tower rc.nut: Disabling the NUT Runtime Statistics Module...
...
Apr  3 17:14:09 Tower upsmon[13245]: UPS [email protected] battery is low
...
Apr  3 17:48:09 Tower upsmon[13245]: UPS [email protected] battery is low
...
Apr  3 18:09:34 Tower upsmon[13245]: UPS [email protected] on battery
Apr  3 18:09:34 Tower nut-notify: [ups] UPS is on battery. The system will shutdown when the UPS battery charge reaches 25%.
Apr  3 18:13:09 Tower upsmon[13245]: UPS [email protected] on line power
Apr  3 18:13:09 Tower nut-notify: [ups] UPS is online, any planned shutdowns were cancelled.

@desertwitch
Copy link
Contributor

Thanks for the logs everyone - just to double-check back here, has anyone gotten a shutdown from this problem?
Just to make sure that this is for the most part a log and notification problem and not an unwarranted shutdown one.

@KillianMelsen
Copy link

Thanks for the logs everyone - just to double-check back here, has anyone gotten a shutdown from this problem? Just to make sure that this is for the most part a log and notification problem and not an unwarranted shutdown one.

My unraid was running fine until I hooked it up to the UPS yesterday evening. This morning it was off, and it started a parity check after turning it on even though nothing was scheduled.

I can't find anything that logged the reason for the shutdown though. That was also using the built-in UPS utility, so I'll keep running NUT and see if I get any other unexpected shutdowns.

@KillianMelsen
Copy link

So I just checked the system log again and found a whole bunch of errors. Don't know if it's related. I left all configuration options on the default values except for the setting the shutdown rules to 25% battery left. I'm also running powertop --auto-tune in case that's relevant.

Apr  3 18:27:50 Tower usbhid-ups[13220]: nut_libusb_get_report: Input/Output Error
Apr  3 18:27:52 Tower usbhid-ups[13220]: #012Reconnecting. If you saw "nut_libusb_get_interrupt: Input/Output Error" or similar message in the log above, try setting "pollonly" flag in "ups.conf" options section for this driver!
Apr  3 18:41:45 Tower usbhid-ups[13220]: nut_libusb_get_report: Input/Output Error
Apr  3 18:41:47 Tower usbhid-ups[13220]: #012Reconnecting. If you saw "nut_libusb_get_interrupt: Input/Output Error" or similar message in the log above, try setting "pollonly" flag in "ups.conf" options section for this driver!
Apr  3 19:06:52 Tower usbhid-ups[13220]: nut_libusb_get_report: Input/Output Error
Apr  3 19:06:54 Tower usbhid-ups[13220]: #012Reconnecting. If you saw "nut_libusb_get_interrupt: Input/Output Error" or similar message in the log above, try setting "pollonly" flag in "ups.conf" options section for this driver!
Apr  3 19:29:52 Tower usbhid-ups[13220]: nut_libusb_get_report: Input/Output Error
Apr  3 19:29:54 Tower usbhid-ups[13220]: #012Reconnecting. If you saw "nut_libusb_get_interrupt: Input/Output Error" or similar message in the log above, try setting "pollonly" flag in "ups.conf" options section for this driver!
Apr  3 19:37:19 Tower upsmon[13245]: UPS [email protected] battery is low
Apr  3 19:38:58 Tower usbhid-ups[13220]: nut_libusb_get_report: Input/Output Error
Apr  3 19:39:00 Tower usbhid-ups[13220]: #012Reconnecting. If you saw "nut_libusb_get_interrupt: Input/Output Error" or similar message in the log above, try setting "pollonly" flag in "ups.conf" options section for this driver!

@statte
Copy link

statte commented Jun 7, 2024

I'm still on 2.8.0 (no update due to using truenas scale).
After adding pollonly the main issue went away. Getting flooded by alerts about battery status. Eg. Replace battery or battery low. Once "pollonly" was added 👌.

I did still notice in the kernel log that every 3 minutes (exactly 3 min) the ups's usb connection was lost and restored. I've changed usb cable , usb port but this didn't go away untill I disconnected the ups from the server, stopped the ups service. Plugged in the ups again and started the service afterwards. Now I haven't had the message for 6h+ .

Hope this helps someone.

@Sanderluc5
Copy link

Sanderluc5 commented Jun 11, 2024

I'm still on 2.8.0 (no update due to using truenas scale). After adding pollonly the main issue went away. Getting flooded by alerts about battery status. Eg. Replace battery or battery low. Once "pollonly" was added 👌.

I did still notice in the kernel log that every 3 minutes (exactly 3 min) the ups's usb connection was lost and restored. I've changed usb cable , usb port but this didn't go away untill I disconnected the ups from the server, stopped the ups service. Plugged in the ups again and started the service afterwards. Now I haven't had the message for 6h+ .

Hope this helps someone.

I was playing with my server and changed some settings in the bios and reconnected everything and the problem was gone. I don't know what the cause was.

@ivanjx
Copy link

ivanjx commented Jun 11, 2024

might be a bit unrelated since i use apcupsd directly but i tried compiling apcupsd with ./configure --enable-usb --with-generic-usb and it seems the battery disconnected/reattached events are gone now. the only downside is that the syslog for power failure and power restored are gone too now but the apcaccess is still able to detect the on battery status when i unplugged the ups from the power socket.

@gabrieleancora
Copy link

Got another report of an affected Back-UPS BX750MI with spurious RB events. I'm hoping to get my hands on at least a BX750MI this week so I can mess around with the driver a bit.

Hello, I do have the same exact model and I'm having the same problem. Let me know if you need any log/report and I'll provide them!

@JoshNansoz1
Copy link

I’m getting a bunch of the same logs here with a BX950MI.

Is there anything we can do to help fix this?

Provide some logs or perform any tests for you?

@rioachim
Copy link

rioachim commented Jul 18, 2024

Same issues here with BX1600MI exchanged under warranty. The old one was fine, the new one is unusable with apcupsd on freshtomato router firmware. The log is flooded by "battery disconnected/UPS battery must be replaced/Battery reattached" followed after some time by "hid-core.c: control queue full". I have tried the patch in the following thread (that ignores 'no battery' / 'replace battery' if not reported twice in a row), but it behaves the same: https://sourceforge.net/p/apcupsd/mailman/apcupsd-users/thread/9c51020c-9d8b-4eda-b4d4-3ff8dc08749c%40okazoo.eu/#msg58741334

I am aware that apcupsd's \drivers\usb\usb.c is totally different that nut's drivers/usbhid-ups.c but in the end it would be a similar approach on the misbehaving UPS status events info.

@jimklimov
Copy link
Member

jimklimov commented Jul 18, 2024

Thanks for the different reports and ideas and the recent link to apcupsd seeing the same problem - so at least it is more likely an APC's HW/FW regression in roughly 2024's makes of their devices, than NUT's, right?

Not really sure which logs to ask for (thanks for offering), but thought of one point that can impact development of a fix: when the status flips about "battery disconnected -> must replace -> was reattached", how long does such cycle take? Is it always a matter of a few seconds with a (relatively) long quiet phase of normal work in between?

It may make sense to add an option to suppress such messages if the issue gets resolved within a specified timeframe... (and/or following the apcupsd patch - immediately report repeated offenses that are not resolved at the moment).

Would anyone here be brave enough to try coding such a change and making a PR? :)

@JoshNansoz1
Copy link

JoshNansoz1 commented Jul 19, 2024

@jimklimov

Here are the event logs from my unraid server.

Jul 18 00:03:39 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:03:39 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:03:40 Garage-Server apcupsd[3492]: Battery reattached.
Jul 18 00:03:56 Garage-Server apcupsd[3492]: Power failure.
Jul 18 00:03:57 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:03:57 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:03:58 Garage-Server apcupsd[3492]: Battery reattached.
Jul 18 00:05:34 Garage-Server apcupsd[3492]: Power failure.
Jul 18 00:05:34 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:05:34 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:05:35 Garage-Server apcupsd[3492]: Battery reattached.
Jul 18 00:05:55 Garage-Server apcupsd[3492]: Power failure.
Jul 18 00:05:56 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:05:56 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:05:57 Garage-Server apcupsd[3492]: Battery reattached.
Jul 18 00:07:39 Garage-Server apcupsd[3492]: Power failure.
Jul 18 00:07:39 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:07:39 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:07:40 Garage-Server apcupsd[3492]: Battery reattached.
Jul 18 00:09:49 Garage-Server apcupsd[3492]: Power failure.
Jul 18 00:09:49 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:09:49 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:09:50 Garage-Server apcupsd[3492]: Battery reattached.
Jul 18 00:10:22 Garage-Server apcupsd[3492]: Power failure.
Jul 18 00:10:23 Garage-Server apcupsd[3492]: Power is back. UPS running on mains.
Jul 18 00:10:23 Garage-Server apcupsd[3492]: Battery disconnected.
Jul 18 00:10:24 Garage-Server apcupsd[3492]: Battery reattached.

@jimklimov
Copy link
Member

jimklimov commented Jul 19, 2024

Thanks. So it is fairly frequent, but each situation takes a couple of seconds to resolve. No particular rhythm or rhyme to frequency of the situations, though. (Thought if their comms chip goes to sleep if not poked every 30-120 sec like some other similar cases did)

@desertwitch
Copy link
Contributor

Sorry I haven't been able to get much further on this issue either, as there's no apparent logic to these conditions.
I'm just hoping that this will be fixed in future devices from the affected series and manufacturer.

@grifferz
Copy link
Author

I'm just hoping that this will be fixed in future devices from the affected series and manufacturer.

As the original reporter of this issue, I did go through this with APC support and they are not prepared to accept any software reports unless they are from their own Powerchute software on Windows. As these spurious events do not happen there, they closed my issue as not reproducible.

So in short, APC do not accept there is any kind of bug to fix I'm afraid.

Strangely enough as mentioned in earlier comment:

#2347 (comment)

the behaviour seems to have stopped for me by itself. It's been months and it hasn't happened. I have no idea why as all I did was change to what was at the time the HEAD of main.

@owenperkins111
Copy link

@jimklimov

Here are the event logs from my unraid server.

Jul 18 00:03:39 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:03:39 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:03:40 Garage-Server apcupsd[3492]: Battery reattached. Jul 18 00:03:56 Garage-Server apcupsd[3492]: Power failure. Jul 18 00:03:57 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:03:57 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:03:58 Garage-Server apcupsd[3492]: Battery reattached. Jul 18 00:05:34 Garage-Server apcupsd[3492]: Power failure. Jul 18 00:05:34 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:05:34 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:05:35 Garage-Server apcupsd[3492]: Battery reattached. Jul 18 00:05:55 Garage-Server apcupsd[3492]: Power failure. Jul 18 00:05:56 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:05:56 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:05:57 Garage-Server apcupsd[3492]: Battery reattached. Jul 18 00:07:39 Garage-Server apcupsd[3492]: Power failure. Jul 18 00:07:39 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:07:39 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:07:40 Garage-Server apcupsd[3492]: Battery reattached. Jul 18 00:09:49 Garage-Server apcupsd[3492]: Power failure. Jul 18 00:09:49 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:09:49 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:09:50 Garage-Server apcupsd[3492]: Battery reattached. Jul 18 00:10:22 Garage-Server apcupsd[3492]: Power failure. Jul 18 00:10:23 Garage-Server apcupsd[3492]: Power is back. UPS running on mains. Jul 18 00:10:23 Garage-Server apcupsd[3492]: Battery disconnected. Jul 18 00:10:24 Garage-Server apcupsd[3492]: Battery reattached.

I am getting the exact same on my proxmox server with apcupsd and a APC Back UPS BX950

@ivanjx
Copy link

ivanjx commented Jul 24, 2024

anyone else also getting lots of self test switch messages from apcupsd? i have BX750MI

jimklimov added a commit to jimklimov/nut that referenced this issue Jul 29, 2024
jimklimov added a commit to jimklimov/nut that referenced this issue Jul 29, 2024
jimklimov added a commit to jimklimov/nut that referenced this issue Jul 29, 2024
jimklimov added a commit to jimklimov/nut that referenced this issue Jul 29, 2024
@jimklimov jimklimov added this to the 2.8.3 milestone Jul 29, 2024
@jimklimov
Copy link
Member

Hello all, thank you for all the reports and logs. I've posted a prospective PR to address the situation by delaying LB/RB status propagation on impacted devices (user-configurable) and so hiding it from upsmon etc. clients which emit the warnings. Testing would be welcome, see #2565 for details.

jimklimov added a commit to jimklimov/nut that referenced this issue Aug 6, 2024
…o setting the lbrb_log_delay_without_calibrating flag [networkupstools#2347]

Signed-off-by: Jim Klimov <[email protected]>
jimklimov added a commit to jimklimov/nut that referenced this issue Aug 6, 2024
…o setting the lbrb_log_delay_without_calibrating flag [networkupstools#2347]

Signed-off-by: Jim Klimov <[email protected]>
@desertwitch
Copy link
Contributor

For anyone here on Unraid, I have rolled out a testing package with the latest plugin update today. Please read the instructions shown on the "NUT Settings" page on how to switch to the testing package and if possible do report back if the problems are resolved for you (or not) - thanks a lot for your support. 😉

@YahangWu
Copy link

YahangWu commented Aug 11, 2024

I’m currently on 6.12.11 and nut-2.8.2-x86_64-2master.ssl11 for about two days. Brand new Back-UPS BX1200MI, still getting the replacement battery error everyday, but the event count seems to be reduced.

jimklimov added a commit to jimklimov/nut that referenced this issue Aug 11, 2024
…r spurious LOWBATT/REPLACEBATT events on APC BXnnnnMI devices [networkupstools#2347]

Signed-off-by: Jim Klimov <[email protected]>
@jimklimov
Copy link
Member

As suggested in the PR discussion, try setting

lbrb_log_delay_without_calibrating
onlinedischarge_calibration

flags in your ups.conf section for the device, so the OL+DISCHRG combo would also be treated as a calibration and might tune into the change made here better.

jimklimov added a commit to jimklimov/nut that referenced this issue Aug 11, 2024
jimklimov added a commit to jimklimov/nut that referenced this issue Aug 12, 2024
…actly 0/1 [networkupstools#2347]

When building a complex text expression, we rely on maths in some spots.

Signed-off-by: Jim Klimov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
APC Connection stability issues Issues about driver<->device and/or networked connections (upsd<->upsmon...) going AWOL over time impacts-release-2.7.4 Issues reported against NUT release 2.7.4 (maybe vanilla or with minor packaging tweaks) impacts-release-2.8.1 Issues reported against NUT release 2.8.1 (maybe vanilla or with minor packaging tweaks) Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) USB
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.