Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data corruption after battery empty #3398

Open
pavelmachek opened this issue May 3, 2024 · 18 comments
Open

Data corruption after battery empty #3398

pavelmachek opened this issue May 3, 2024 · 18 comments
Labels

Comments

@pavelmachek
Copy link
Contributor

Affected hardware version

Bangle 2

Your firmware version

2v21

The bug

I was using watch for gps logging, and battery ran flat for first time in maybe half a year. And... data corruption after I charged it. So it seems like the old issue is still there. Should I try to create some kind of reproducer?

Installed apps

sixths (0.03) was used for gps logging, it seems to trigger corruption quite easily.

@bobrippling
Copy link
Collaborator

I was using watch for gps logging, and battery ran flat for first time in maybe half a year. And... data corruption after I charged it. So it seems like the old issue is still there. Should I try to create some kind of reproducer?

Yes please - do you have a link to the old issue too?

@pavelmachek
Copy link
Contributor Author

Old issue is: #3000

@pavelmachek
Copy link
Contributor Author

If you want to try to reproduce it in the meantime, install sixths and do "topright"- "bottomright" -"bottomleft" gesture. It will turn on GPS. I can reproduce it way too easily on horseback trips.

I'll try to hack up better reproducer.

@bobrippling
Copy link
Collaborator

Great, thank you (also tagging #2865 for reference too). I'll tag @gfwilliams just so you're aware, in case there was any storage changes you're working on in the meanwhile

@gfwilliams
Copy link
Member

sixths (0.03) was used for gps logging, it seems to trigger corruption quite easily.

Have you had corruption more than this one time in the last 6 months then?

At least now it should have popped up the recovery screen when you rebooted, and you have the choice to compact which may fix it, or you can connect via Bluetooth with the app loader at that point, do a backup, and then restore it and that'll definitely fix it too.

One issue is that storage is only scanned at boot, so we don't actually know whether the corruption did happen when the battery died, or whether it happened before and you only noticed when the watch rebooted.

Potential fixes for this might be:

  • Ensure sixths (and recorder app) automatically turn off at 10% battery
  • Make Espruino itself make Storage read-only if battery gets below 5-10%. That'll probably annoy some people but it would stop this kind of problem I guess.

@pavelmachek
Copy link
Contributor Author

So I'm trying to reproduce the corruption, and no success so far. No, I did not detect corruption in last 6 months.

I guess "storage is only scanned at boot" might be my problem. I suspect I did heavier-than-average use in the past. Is there way to force filesystem check on runtime -- maybe we should be doing that periodically...

@pavelmachek
Copy link
Contributor Author

(I guess I should try to simply reboot watch after "heavy" use, and see if I can find some kind of pattern? Is Settings|Utils|Turn off then turn it back on way to do that?)

@gfwilliams
Copy link
Member

No, I did not detect corruption in last 6 months.

Ahh, ok - it's just you said "sixths triggers corruption quite easily" so I thought maybe it had done it a few times.

Is there way to force filesystem check on runtime

Actually right now there isn't, but I guess that could be exposed.

Using turn off is probably nicer, but you could long-press the button (although I guess that risks potentially rebooting while writing to flash)

@pavelmachek
Copy link
Contributor Author

So, yes, I did have it few times, but it was more than 6 months ago :-).

So... what are the expected results of rebooting while writing to flash? I'm used to ext4 where there should be no ill results; recent writes may be lost but things should be consistent. How does BangleJS compare?

@pavelmachek
Copy link
Contributor Author

Ok, so I drained battery maybe 10 times this morning, then did some tweaks from App loader, and now App loader no longer connects. But watch still reboots ok.

@pavelmachek
Copy link
Contributor Author

For the record, may battery draining script was this:

function eat(on) {
Bangle.setOptions({backlightTimeout: 0})
Bangle.setGPSPower(on, "sixths");
Bangle.setCompassPower(on, "sixths");
Bangle.setBarometerPower(on, "sixths");
Bangle.setHRMPower(on, "sixths");
Bangle.setBacklight(on);
if (on) {
for (let i=0; i<10000; i++) ;
}
}

function logstamp(s) {
logfile.write("utime=" + getTime() + " " + s + "\n");
}

drain = 0;
cooldown = 0;

function periodic() {
bat = E.getBattery();

g.setColor(1, 1, 1);
g.fillRect(0, 24, 240, 240);
g.setFontAlign(-1, 1);
g.setColor(0, 0, 0);
g.setFont('Vector', 26);
let now = new Date();
msg = now.getHours() + ":" + now.getMinutes() + " " + bat + "% ";
msg2 = cooldown + "/" + drain;
g.drawString(msg, 10, 90);
g.drawString(msg2, 10, 130);

print(msg + msg2);
logstamp(msg + msg2)
if (bat < 2) {
if (drain)
drain--;
else {
if (cooldown) {
eat(0);
cooldown--;
} else {
cooldown = 760;
drain = 2
60;
require("Storage").compact(true);
eat(1);
}
}
} else {
eat(1);
drain = 2*60;
}
}

let logfile = require("Storage").open("crash.log", "a");
setInterval(periodic, 1000);

@gfwilliams
Copy link
Member

and now App loader no longer connects. But watch still reboots ok.

It might just be your PC's bluetooth connection is confused. I doubt it's a bangle issue.

If power fails while writing to flash I think it's similar, but the issue is really if it's writing a new file's header and those 32 bytes end up partially written. It'd be extremely unlikely for that to happen though

@pavelmachek
Copy link
Contributor Author

I have solid connection from the Web IDE, yet app loader has deterministic problems. Compaction still works:

require("Storage").compact(true);
=undefined

require("Storage").getStats(true)
={ totalBytes: 8388608, freeBytes: 2427696, fileBytes: 5960912, fileCount: 346,
trashBytes: 0, trashCount: 0 }

But I seem to read all the files okay,I'll try to debug it a bit more.

@pavelmachek
Copy link
Contributor Author

Ok, so now I have corruption, this time .boot0 contains 0xff's . I don't recall running battery flat, recently. (Is there easy way to fix it?)

00000ce0: 6c 3b 61 2e 6d 6f 76 65 6d 65 6e 74 2b 3d 65 2e l;a.movement+=e.
00000cf0: 63 68 61 72 43 6f 64 65 41 74 28 33 29 3b 61 2e charCodeAt(3);a.
00000d00: 6d 6f 76 43 6e 74 2b 2b 3b 0a 6c 26 26 61 2e 62 movCnt++;.l&&a.b
00000d10: 70 6d 43 6e 74 2b 2b 7d 68 2d 3d 34 7d 61 2e 62 pmCnt++}h-=4}a.b
00000d20: 70 6d 43 6e ff ff ff ff ff ff ff ff ff ff ff ff pmCn............
00000d30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
00000d40: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
00000d50: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
00000d60: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
00000d70

@gfwilliams
Copy link
Member

That's an odd one - if you look at the list of files in the Web IDE (click the storage icon towards the middle of the screen) are there any filenames that look corrupt?

If you just go to the Settings app then Utils and rewrite settings, and that should fix it?

It seems to me like somehow the bootupdate.js file got interrupted in the middle of writing .boot0 after the settings were changed. I've never seen that before, and I'm not quite sure I see how that could have happened unless it took a very long time to run, during which you were maybe holding the button?

@pavelmachek
Copy link
Contributor Author

Filenames look ok in the web ide. I had few files truncated to 0xffff bytes in the past. Let me try the "rewrite settings".

Is there documentation of filesystem properties somewhere? Knowing which actions can lead to corruption would be good. I now know "battery running empty" is dangerous. Is the "2 second press" dangerous? Is the "5 second press" dangerous?

@pavelmachek
Copy link
Contributor Author

(And yes, "rewrite settings" did the trick for me. Thanks!)

@gfwilliams
Copy link
Member

I now know "battery running empty" is dangerous.

In recent firmwares even that isn't bad - it just means that maybe if you're super unlucky a file header will be partially written, but you do have to be very unlucky for that.

Is the "2 second press" dangerous? Is the "5 second press" dangerous?

They shouldn't actually be very dangerous now either. I don't think it's very easy for a write to end up getting completely interrupted (apart from with the 5 second press) but even then it's just that a file ends up partially written like in this case.

All I can think happened for you is maybe you changed a setting and then the storage needed compacting when .boot0 was updated (it should display a message onscreen now), then maybe you held down the button to try and break out, and that caused Espruino to break out of execution. It seems very unlucky though.

Do you remember doing anything in particular right before you started noticing the error?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants