Questions about SPI NAND with ECC #727
Hi @Christian-Sander, I don't have all the answers but I can give some hopefully helpful comments:
This is not handled well by LittleFS at the moment. LittleFS tries to completely isolate reading and writing operations, so reading operations can not write. This avoids complicated surprises such as ENOMEM on reading a file, but creates problems for how to respond to read errors. At the moment if the block device returns LFS_ERR_CORRUPT on a read, LittleFS more-or-less just gives up and reports the error when it receives it. The best option right now, if possible, is to wait to report the error on the next erase operation, either by tracking the last read error in RAM or by checking for read errors before erasing if it's reproducible. Errors during erasing/programming force the data to be relocated to a better block. (It's also worth noting LittleFS also doesn't currently persist the knowledge of this failure, though the path to improve that is a bit more clear). Unfortunately this wouldn't cover cases where read errors need to be fixed ASAP due to decreased persistence. This is something I'm open to ideas on. Maybe a separate routine to explicitly check for read errors would help and could complement explicit garbage-collection routines.
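As a rough illustration of the "report the read error on the next erase" idea, here is a minimal block-device glue sketch. The nand_read()/nand_erase() calls and the corrected-bits output are assumptions about the underlying driver; only the struct lfs_config callback shapes and the LFS_ERR_* codes come from littlefs itself:

```c
#include "lfs.h"

// hypothetical NAND driver functions, assumed to exist elsewhere
extern int nand_read(lfs_block_t block, lfs_off_t off,
                     void *buffer, lfs_size_t size, int *corrected_bits);
extern int nand_erase(lfs_block_t block);

#define NO_BLOCK ((lfs_block_t)-1)

static lfs_block_t weak_block = NO_BLOCK; // last block that needed ECC correction

int bd_read(const struct lfs_config *c, lfs_block_t block,
            lfs_off_t off, void *buffer, lfs_size_t size) {
    (void)c;
    int corrected_bits;
    if (nand_read(block, off, buffer, size, &corrected_bits)) {
        return LFS_ERR_IO; // uncorrectable or bus error
    }
    if (corrected_bits > 0) {
        // the data is still good, but remember the block so it can be
        // retired the next time littlefs erases it
        weak_block = block;
    }
    return 0;
}

int bd_erase(const struct lfs_config *c, lfs_block_t block) {
    (void)c;
    if (block == weak_block) {
        weak_block = NO_BLOCK;
        // reporting an error here makes littlefs relocate the data elsewhere
        return LFS_ERR_CORRUPT;
    }
    return nand_erase(block) ? LFS_ERR_IO : 0;
}
```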
I would try to use the smaller program size, as that gives you a smaller write granularity and will likely improve the overall lifetime of the device. If LittleFS runs into any program errors it will evict the whole erase block. This is usually what you want: erasing is the destructive operation, while programming does no damage on its own, so if one sub-page starts reporting errors the others are likely close to their end-of-life.
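For reference, a hypothetical lfs_config fragment using a 512-byte sub-page program granularity might look like the following. The geometry (128 KiB erase blocks, 2 KiB pages programmable in 512-byte sub-pages, 1024 blocks) and the other values are illustrative assumptions, not datasheet numbers; bd_read/bd_prog/bd_erase/bd_sync are the driver glue:

```c
const struct lfs_config cfg = {
    .read  = bd_read,
    .prog  = bd_prog,
    .erase = bd_erase,
    .sync  = bd_sync,

    .read_size      = 512,
    .prog_size      = 512,        // sub-page granularity instead of 2048
    .block_size     = 128 * 1024, // one NAND erase block
    .block_count    = 1024,
    .block_cycles   = 100,
    .cache_size     = 512,
    .lookahead_size = 128,
};
```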
Yes, LittleFS uses blocks 0 and 1 as a sort of "anchor" to find the rest of the filesystem.
Ah, this is a really fun question if I'm understanding it right. It's actually not possible to persist data through power-loss with only a single erase block. At some point you need to erase that block to rewrite the data, and if you lose power while doing so the entire filesystem will be lost. You need at minimum 2 blocks to survive any power-loss, and this is true for any other updatable filesystem/kvstore out there. Winbond is likely being conservative with their specs, but let me know if you find a 1-block NAND chip because I think that would be hilariously impractical. That being said, a 1-block disk could be used as write-once memory. In that case you could prepare a LittleFS image offline using littlefs-fuse or mklittlefs, or use a RAM block device as an intermediary to write the disk. If you configure the filesystem's block size to 1/2 the 1-block erase block, it would be possible to mount read-only with LittleFS, as read-only LittleFS never calls erase. This is a fringe use-case, but could be useful if you already use LittleFS elsewhere in your codebase.
This is not uncommon, but LittleFS doesn't handle this as it's not the system in control in this configuration. I think the most common solution is adding an MBR to the block device. This sacrifices the first block as a table to describe the partitions on the block device. Mbed-OS contains an implementation which might be a useful reference, or you might be able to implement your own just based on the Wikipedia description. MBR is nice in that it's relatively simple and extremely common, and any PC OS has utilities to manipulate MBR partitions. If you want to sneak partition information into the device's bad block table that could also work, you would just need to make sure the information is passed to the block device in such a way that LittleFS sees block "0" as the first block it should use.
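A minimal sketch of how the partition offset could be hidden from littlefs, assuming hypothetical raw_read/raw_prog/raw_erase driver calls and a partition start block parsed from the MBR (or a private table) at startup:

```c
#include "lfs.h"

// hypothetical raw driver calls and partition bounds (filled in at startup)
extern int raw_read(lfs_block_t block, lfs_off_t off, void *buffer, lfs_size_t size);
extern int raw_prog(lfs_block_t block, lfs_off_t off, const void *buffer, lfs_size_t size);
extern int raw_erase(lfs_block_t block);

static lfs_block_t part_start_block; // first raw block of this partition

int part_read(const struct lfs_config *c, lfs_block_t block,
              lfs_off_t off, void *buffer, lfs_size_t size) {
    (void)c;
    return raw_read(part_start_block + block, off, buffer, size);
}

int part_prog(const struct lfs_config *c, lfs_block_t block,
              lfs_off_t off, const void *buffer, lfs_size_t size) {
    (void)c;
    return raw_prog(part_start_block + block, off, buffer, size);
}

int part_erase(const struct lfs_config *c, lfs_block_t block) {
    (void)c;
    return raw_erase(part_start_block + block);
}

// cfg.block_count is then set to the partition's block count so littlefs
// never reaches outside its partition
```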
Thanks for your detailed response. I will look into this in the next few weeks after I finish some other tasks.
Maybe it could be possible to report the first ECC error back to the application in the read call? Then the application could decide to call a function to fix it?
I think you may have misunderstood me somewhat. I meant that Winbond guarantees that the first block is definitely a good block, not that only one block in total is good. They have the usual 2% error rate afaik, but guarantee that the first block is a good one. So the question was related to what would happen if block 2 is a bad one. I could handle that in the driver if needed, as the flash has a look-up table to remap bad blocks to known good blocks.
This is sort of possible. If your block device implementation reports an error, e.g. LFS_ERR_CORRUPT, LittleFS halts and returns the error, unmodified, through whichever function is currently executing. The unfortunate thing is this doesn't give you access to the data that is there, so the options for recovering would be a bit limited. I suppose you could keep track of the most recent n read errors out-of-band of LittleFS. That might be one way to recover from a read error without losing data.
Ah, that makes sense. I was wondering if I misunderstood while writing the comment. Hmm, that's a tricky problem. The current approach in LittleFS is to exponentially back off writes to blocks {0,1}. If one of these blocks is bad there's not really a way to recover, as it's difficult to find the filesystem during mount. I was looking into making lfs_mount more flexible recently (#349); I suppose it could be possible to allow the superblocks to live in multiple places that could all be searched during mount. I'll have to think about this. For now the best option is either:
I'm currently implementing the glue logic between lfs and my device driver. Regarding the ECC during read operations, I now reserve 20 blocks (= the bad block lookup table size) at the end of the flash as replacement blocks that are not used by any partition directly. On correctable ECC errors the block is copied to one of the replacement blocks and an entry is added to the bad block lookup table, so further reads will read the replacement block instead. However, I'm still a bit unsure about the handling of the superblocks and the reliability. Also for my understanding: do you need to erase/write the superblock on each filesystem write? Wouldn't you get much more wear on these blocks than on the other blocks then? I think I'm probably missing something here.
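For illustration, the remapping described above might look roughly like this in the glue layer; the table layout, sizes, and names are assumptions about the driver rather than anything littlefs-specific:

```c
#include <stdint.h>

#define LUT_SIZE 20 // matches the 20 reserved replacement blocks

struct lut_entry {
    uint16_t logical;  // block the filesystem asked for
    uint16_t physical; // replacement block reserved at the end of the flash
};

static struct lut_entry lut[LUT_SIZE];
static int lut_used;

// every read/prog/erase in the glue layer goes through this translation
static uint16_t remap(uint16_t block) {
    for (int i = 0; i < lut_used; i++) {
        if (lut[i].logical == block) {
            return lut[i].physical;
        }
    }
    return block; // not remapped
}
```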
Regarding the last question, I've seen that this doesn't happen, so please ignore it. I've seen that there are many reads from the first four bytes of the superblocks. Could those be cached by littlefs? I'm also seeing that iterating over files in a directory and getting their sizes involves a lot of reads and is thus relatively slow. I suppose this is a design limitation of a wear-leveling file system, that you need to read all pages of a file to get its size.
And one more:
Unfortunately this is a difficult situation for littlefs. If this happens littlefs locks up and becomes read-only. littlefs just tries its best to avoid this situation by writing to the superblocks as little as possible. Though it may not appear this way at first.
It may be worth understanding how the superblock anchor works (re-skimming DESIGN.md it looks like this is not well documented; also sorry if the names are confusing, at some point the name "superblock" sort of lost its original meaning). littlefs actually has a linked-list of superblocks, each containing a copy of some read-only info and a pointer to the next superblock. The last superblock also contains the root directory, which is the root of the directory tree. Writes anywhere in the filesystem can propagate up the tree + linked-list as a part of wear-leveling, at a rate controlled by the block_cycles configuration. To keep writes to the anchor superblock as low as possible, littlefs increases the length of the superblock linked-list every block_cycles writes to the anchor. So say you set block_cycles=100.
After 100 anchor writes, the superblock linked-list gets expanded by one superblock, and after another 100 anchor writes (10,000 writes to the root directory) it gets expanded again. And so on. This grows very quickly. Assuming each block lasts a certain number, say conservatively 1000, of erase cycles before going bad (which isn't quite how it works), you would need an astronomically large number of writes to the root directory before the anchor ever wears out.
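A back-of-the-envelope sketch of that growth, assuming block_cycles=100 and 1000 erase cycles per block as above; the numbers are illustrative, not a precise model of littlefs:

```c
#include <stdio.h>

int main(void) {
    const int block_cycles = 100;  // anchor writes per linked-list expansion
    const int erase_cycles = 1000; // assumed block lifetime in erase cycles
    double root_writes_per_anchor_write = 1;
    double total_root_writes = 0;
    int anchor_writes = 0;

    for (int depth = 1; anchor_writes < erase_cycles; depth++) {
        // 100 anchor writes happen at each chain depth before expansion,
        // and each one corresponds to more and more root-directory writes
        total_root_writes += block_cycles * root_writes_per_anchor_write;
        anchor_writes += block_cycles;
        root_writes_per_anchor_write *= block_cycles;
        printf("depth %2d: %4d anchor writes, ~%.1e root writes so far\n",
               depth, anchor_writes, total_root_writes);
    }
    return 0;
}
```

By the time the anchor has accumulated 1000 erases, the filesystem has seen on the order of 1e20 root-directory writes, which is why anchor wear is not the limiting factor in practice.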
Ah yes, every path lookup goes through the root directory, which needs to read the 4-byte revision count to know which of the two blocks in the metadata-pair is the most recent. That would be good to cache. littlefs is still being developed, but the priority has been on performance without RAM, so there's probably a lot of low-hanging fruit for smarter caching. I've been thinking a general-purpose block-pair => metadata cache that avoids metadata fetches (which require this revision-count read and CRC calculation) could be quite powerful, it just hasn't been a priority to investigate yet. Also, being able to cache more than a block of memory would be good if you have the RAM/code size.
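Until littlefs grows such a cache, a small cache can live in the block-device glue. Here is a sketch that caches the first few bytes of recently-read blocks, assuming a small read_size so the revision-count read arrives as a short read at offset 0; bd_read_uncached() and the sizes are assumptions:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include "lfs.h"

// the uncached driver read this wrapper sits on top of (hypothetical)
extern int bd_read_uncached(const struct lfs_config *c, lfs_block_t block,
                            lfs_off_t off, void *buffer, lfs_size_t size);

#define REV_CACHE_ENTRIES 4
#define REV_CACHE_BYTES   16 // cache only the first few bytes of a block

struct rev_cache_entry {
    bool valid;
    lfs_block_t block;
    uint8_t data[REV_CACHE_BYTES];
};

static struct rev_cache_entry rev_cache[REV_CACHE_ENTRIES];

int bd_read_cached(const struct lfs_config *c, lfs_block_t block,
                   lfs_off_t off, void *buffer, lfs_size_t size) {
    if (off == 0 && size <= REV_CACHE_BYTES) {
        struct rev_cache_entry *e = &rev_cache[block % REV_CACHE_ENTRIES];
        if (!e->valid || e->block != block) {
            int err = bd_read_uncached(c, block, 0, e->data, REV_CACHE_BYTES);
            if (err) {
                return err;
            }
            e->valid = true;
            e->block = block;
        }
        memcpy(buffer, e->data, size);
        return 0;
    }
    return bd_read_uncached(c, block, off, buffer, size);
}

// the matching prog/erase wrappers must invalidate the entry for the block
// they touch, otherwise littlefs sees stale revision counts
```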
The file size is stored with the other metadata, so that's not quite it. You're likely seeing the scanning as a part of the block allocator. Blocks are garbage-collected on the first write after mount and kept track of in a lookahead bitmap in RAM. Building this lookahead bitmap is expensive as we need to traverse the full filesystem tree. I'm hoping to improve this by moving to an on-disk free list of sorts, but that will take some time to get right.
To be honest the biggest value of this check is to catch integration mistakes. It's strictly unnecessary if you trust your drivers, bus, and storage to write correctly or report an error. Unfortunately, integration can be tricky to get right, and hard to debug if it goes wrong since you don't know if the write or the read is the problem. I think the original bug that led to this check was an SPI bus running at a higher frequency than the capacitance of the traces allowed. That was tricky to track down until you put the bus on a scope and see it going wiggle wiggle wiggle. Maybe we should make this check optional, but when you compare the overhead of reading/checksumming those bytes versus erasing and programming, it hasn't been worth removing. Hopefully this also frames the current state of littlefs well: it can be useful depending on the application, but it is still improving.
Thanks for the replies! It's definitely useful, I'm just looking for some ways to speed things up a little. Regarding the superblocks, I think one could use the block remapping feature of the W25N01GV flash. I attempted that last week but had issues implementing that specific command. I'm waiting for a reply from Winbond right now.
I do have improved cycle-detection over here: #746, though it only detects cycles faster if that is the issue. The stack trace looks like it's the first call to the garbage collector, which scans the filesystem to find free blocks to allocate. It's an expensive operation depending on how many blocks are in your filesystem. But the corrupted dir {1,0} errors are strange. That shouldn't happen in normal operation unless the filesystem is unformatted. Maybe writes aren't going through to disk for some reason?
I had it running for a few minutes. I have 1024 blocks in total, each with 64 pages of 2 kB. These blocks are spread across 3 filesystems and 20 spare blocks. I think the filesystem in question is approximately 900 blocks. As mentioned, I will try to disable my cache now and see if I can reproduce this problem. I'll also check how the new cycle detection algorithm behaves when this problem occurs and report back.
It was a bug in my caching code and it's fixed now. Erasing block 0 corrupted the contents of the block 1 cache. I'll keep my eyes on this to see if the loop occurs again.
It would have been nice for the corrected ECC errors to "bubble up" so they could be handled somewhere, but I guess this would have to be at the NAND flash level and not in littlefs.
Good question, it's currently the latter. littlefs doesn't keep track of which blocks have had an error, so it will eventually try to reuse them. BUT, in future (designed but not yet started) work, the plan is to add an optional on-disk block-map for tracking in-use/unused/erased/bad blocks. This should also speed up block allocation, and allow blocks to be pre-erased ahead of time. I'm also considering adding more advanced bd error codes (hypothetical, names TBD) which you could decide between based on the number of bit errors. A single bit-flip may not be a bad sign yet, but multiple may mean the block needs to be retired, for example. This is likely highly hardware dependent.
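A sketch of that decision using only the error codes littlefs has today; nand_read_with_ecc(), mark_block_for_retirement(), and the threshold are hypothetical driver helpers (retirement could, for example, mean failing the block's next erase as sketched earlier in the thread):

```c
#include "lfs.h"

// hypothetical driver helpers: a read that reports how many bits ECC fixed,
// and a queue that causes the block to fail its next erase
extern int nand_read_with_ecc(lfs_block_t block, lfs_off_t off,
                              void *buffer, lfs_size_t size, int *bitflips);
extern void mark_block_for_retirement(lfs_block_t block);

#define ECC_RETIRE_THRESHOLD 2 // corrected bit flips before retiring a block

int bd_read(const struct lfs_config *c, lfs_block_t block,
            lfs_off_t off, void *buffer, lfs_size_t size) {
    (void)c;
    int bitflips;
    if (nand_read_with_ecc(block, off, buffer, size, &bitflips)) {
        return LFS_ERR_IO; // uncorrectable or bus error
    }
    if (bitflips >= ECC_RETIRE_THRESHOLD) {
        mark_block_for_retirement(block);
    }
    return 0; // the data itself was corrected, so the read still succeeds
}
```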
I am using lfs in Zephyr on a Nordic nRF53 (using 3rd-party added code, not the one in the Zephyr branch of the Nordic SDK, as it lags behind). Since the API in Zephyr does not allow these bit errors to bubble up, I log them separately. I have noticed 1-3 bit errors now and then on quite new chips, and usually it does not re-occur on the same block upon the next re-use. I do however see a few erase faults on some blocks on a few chips, and these appear persistent. My thought there was to mark these blocks somehow so an initial check from lfs could determine the block to be not worth using.
This will need fleshing out when the hypothetical block map actually starts to take shape, but there we will probably also want something like a way to explicitly mark a block as bad so it is never allocated again.
My understanding is there are a couple of different sources of bit errors in NAND. These aren't a concern on their own, but you may need error-correction to prevent filesystem corruption.
To handle the factory bad blocks, littlefs really needs this hypothetical block-map. A possible workaround for now, though it is a bit involved: you could reserve the first page of each block to store your own bookkeeping and mark bad blocks somehow (either by relying on the factory marks, or by storing some big unique value in every good block). Then you could report LFS_ERR_CORRUPT if littlefs tries to erase such a block. This won't stop littlefs from trying to erase the block again, but on each erase failure littlefs will just move on to the next available block. Though long-term the hypothetical block-map is the correct solution to this.
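A minimal sketch of that erase-time check, assuming hypothetical is_marked_bad() and nand_erase() driver helpers:

```c
#include <stdbool.h>
#include "lfs.h"

// hypothetical helpers: is_marked_bad() checks the reserved first page (or
// the factory bad-block mark), nand_erase() is the raw erase
extern bool is_marked_bad(lfs_block_t block);
extern int nand_erase(lfs_block_t block);

int bd_erase(const struct lfs_config *c, lfs_block_t block) {
    (void)c;
    if (is_marked_bad(block)) {
        // littlefs treats this like any other erase failure and simply
        // moves on to the next available block
        return LFS_ERR_CORRUPT;
    }
    return nand_erase(block) ? LFS_ERR_IO : 0;
}
```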
Hello,
I'm currently looking into writing an implementation for a Winbond W25N01GV NAND flash for ST MCUs over QSPI.
As far as I've seen, some others have already done this (is any source available?).
Originally I wanted to use AzureRTOS FileX+LevelX but it seems that LevelX has architectural problems with supporting ECC.
I would like to use the hardware ECC feature of the memory, so I have some questions around that topic: