Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support get file path #976

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

crafcat7
Copy link

I added two new interfaces to obtain the path of the file/folder.

int lfs_dir_path(lfs_t *lfs, lfs_dir_t *dir, char *path, lfs_size_t size);
int lfs_file_path(lfs_t *lfs, lfs_file_t *file, char *path, lfs_size_t size);

Mainly rely on a new internal interface, to a recursive search target

static lfs_ssize_t lfs_dir_path_(lfs_t *lfs,
         lfs_mdir_t *dir, uint16_t id, char *path, lfs_size_t size)

@crafcat7 crafcat7 changed the title lfs.c:Support get file path Support get file path Apr 30, 2024
@geky-bot
Copy link
Collaborator

Tests passed ✓, Code: 17300 B (+1.4%), Stack: ∞ B (-100.0%), Structs: 812 B (+0.0%)
Code Stack Structs Coverage
Default 17300 B (+1.4%) ∞ B (-100.0%) 812 B (+0.0%) Lines 2394/2616 lines (-1.5%)
Readonly 6194 B (+0.0%) 448 B (+0.0%) 812 B (+0.0%) Branches 1245/1602 branches (-0.9%)
Threadsafe 18216 B (+1.6%) ∞ B (-100.0%) 820 B (+0.0%) Benchmarks
Multiversion 17360 B (+1.4%) ∞ B (-100.0%) 816 B (+0.0%) Readed 29369693876 B (+0.0%)
Migrate 18996 B (+1.3%) ∞ B (-100.0%) 816 B (+0.0%) Proged 1482874766 B (+0.0%)
Error-asserts 18012 B (+1.5%) ∞ B (-100.0%) 812 B (+0.0%) Erased 1568888832 B (+0.0%)

@geky geky added enhancement needs minor version new functionality only allowed in minor versions labels May 6, 2024
@geky
Copy link
Member

geky commented May 6, 2024

Hi @crafcat7, thanks for creating a PR for this.

This is a clever implementation, unfortunately I don't think it makes sense to add this API to littlefs. I need to default to rejecting features without a strong motivation in order to combat feature-creep/code cost, so sorry if this comes across a bit harsh.

The first issue is this seems relatively easy to emulate outside of the littlefs core. When a user opens a file, they must know the file path, so if they want access to the file path later, they can store the file path next to the lfs_file_t handle.

The second issue is the use of lfs_fs_parent. lfs_fs_parent is actually a very expensive function, requiring a $O(n)$ scan where $n$ is the number of files in the filesystem. For this reason lfs_fs_parent is currently limited to cases where you may expect heavy IO (mdir compaction, dir creation/remove, etc).

There is also work in progress to move away from lfs_fs_parent because it does not work well. And I'm not sure there is a practical implementation of these functions without lfs_fs_parent (at least not without just keeping the path in RAM)...

I think when PC-scale OSs provide this feature, they are just returning the path stored in RAM inside the kernel.

@yamt
Copy link
Contributor

yamt commented May 7, 2024

I think when PC-scale OSs provide this feature, they are just returning the path stored in RAM inside the kernel.

some OSes including linux, yes.

some others implement it by performing readdir on ".." as it would be done for a naive getcwd() implementation.
(such OSes usually don't support the operation for non-directory files.)
i suspect such an implementation is impossible with littlefs though, because littlefs doesn't seem to have user-visible file-id to match the correct directory entry.

@geky
Copy link
Member

geky commented May 7, 2024

some others implement it by performing readdir on ".." as it would be done for a naive getcwd() implementation.
(such OSes usually don't support the operation for non-directory files.)
i suspect such an implementation is impossible with littlefs though, because littlefs doesn't seem to have user-visible file-id to match the correct directory entry.

Oh interesting, that is clever.

It would also not work because littlefs doesn't really have ".." entries. They are just faked in the path parser.

I wonder if the lack of real ".." entries will become problematic when adding openat/*at functions...

@yamt
Copy link
Contributor

yamt commented May 8, 2024

I wonder if the lack of real ".." entries will become problematic when adding openat/*at functions...

yes. actually, it's problematic even for traditional chdir().
only way i can think of is something path-based, which is not simple to deal with renames.

btw, this get-path functionality "requirement" came from an attempt to implement F_SETLK style file lock. apache/nuttx#11724
i guess what's really necessary there is some kind of file identifier, not necessarily a path.
(actually a path is not so appropriate as it changes on a rename.)
does littlefs have something usable for the purpose?

@geky
Copy link
Member

geky commented May 8, 2024

i guess what's really necessary there is some kind of file identifier, not necessarily a path.
(actually a path is not so appropriate as it changes on a rename.)
does littlefs have something usable for the purpose?

Unfortunately, no. This is an area where littlefs diverges greatly from POSIX filesystems in that littlefs doesn't have inodes, so no unique ino, no hard-linking, etc...

This wasn't really an intentional design decision, just a side-effect of not having inodes as a requirement, unlike filesystem that start off in Linux, etc. It's in theory possible to extend littlefs to use inodes, but it would add a layer of indirection and complexity (and ino reclamation, bleh).

littlefs does map each file to an id internally, but this id can change if neighboring files are created/deleted. This is one reason we keep all open files in a linked-list, so we can keeps files up to date with any id changes. This id hasn't been exposed in lfs_stat because it could be confused for an ino without actually being useful...

btw, this get-path functionality "requirement" came from an attempt to implement F_SETLK style file lock. apache/nuttx#11724

This is an interesting puzzle. I'm not really sure what the best solution is. Some thoughts:

You could keep the file path around, though as you mention handling renames would be tricky. You'd also need to worry about messy path comparisons (a/b/c == /a/b/./c?)...

You could use a custom attribute to assign each file a fake ino when needed. Though making sure each ino is unique would be difficult (ino reclamation, bleh). This solution would be a bit easier if we had a way to iterate over all files in the filesystem, which isn't an unreasonable API...

If you keep track of each locked file handle, and we add some sort of comparison API (either lfs_file_handlecmp-like or by exposing the metadata-id), you could compare against all locked files to find conflicting locks. This would be less efficient than a hashtable, but littlefs's opened-linked-list already scales $O(n)$ with the number of open files so... I think this is my favorite solution so far.

There is some planned work to make custom attributes with multiple open file handles more intuitive, by "broadcasting" custom attribute updates on lfs_file_sync. This almost sounds like what you need, except you probably don't want the locked flag to actually be written to disk... Maybe we should allow custom attributes to not be written to disk? Maybe this is the wrong tool for this...

yes. actually, it's problematic even for traditional chdir().

This is probably something more familiar to OS-level developers, but outside of shell emulators and backwards compatibility, does chdir() really make sense for applications? It seems like you'd want to prefer openat/*at relative to an open directory handle these days.

@yamt
Copy link
Contributor

yamt commented May 9, 2024

There is some planned work to make custom attributes with multiple open file handles more intuitive, by "broadcasting" custom attribute updates on lfs_file_sync. This almost sounds like what you need, except you probably don't want the locked flag to actually be written to disk... Maybe we should allow custom attributes to not be written to disk? Maybe this is the wrong tool for this...

how do you plan to find the broadcast destinations?
it's what's exactly needed for F_SETLK. ie. find all open files (lfs_file_t) of a given on-disk file.

yes. actually, it's problematic even for traditional chdir().

This is probably something more familiar to OS-level developers, but outside of shell emulators and backwards compatibility, does chdir() really make sense for applications? It seems like you'd want to prefer openat/*at relative to an open directory handle these days.

in an ideal world, maybe.
but in reality, as far as i know, chdir is ubiquitous and far commonly used than openat family.

@xiaoxiang781216
Copy link

Hi @crafcat7, thanks for creating a PR for this.

This is a clever implementation, unfortunately I don't think it makes sense to add this API to littlefs. I need to default to rejecting features without a strong motivation in order to combat feature-creep/code cost, so sorry if this comes across a bit harsh.

This is a trend off, file name consume memory, saving file path for each open handle will increase the memory consumption if many files are opened.

@geky
Copy link
Member

geky commented May 9, 2024

how do you plan to find the broadcast destinations?
it's what's exactly needed for F_SETLK. ie. find all open files (lfs_file_t) of a given on-disk file.

There is an invasive linked-list going through all lfs_file_t and lfs_dir_t structs so we can keep these handles up-to-date.

I suppose we could expose this to the user, though the API would be complicated (we can't just say use the internals of lfs_file_t, switching to type-specific linked-lists was considered at one point for example).

Thinking about it, if you control the lfs_file_t structs, could you add an additional invasive linked-list outside of littlefs for the purpose of updating locks, etc?

in an ideal world, maybe.
but in reality, as far as i know, chdir is ubiquitous and far commonly used than openat family.

Fair enough, guess adding openat will be a learning experience.

file name consume memory, saving file path for each open handle will increase the memory consumption if many files are opened.

Fair point. But if lfs_fs_parent is not an option, the only option is saving the file path. In this case it's preferable to save the file path outside of littlefs so not everyone needs to pay the cost.

Maybe an option is to 1. add an API to iterate over all files, and 2. add file handle comparison (metadata-id), so the lfs_fs_parent solution could be implemented outside littlefs. Though I would warn this is still going to be very expensive ($O(n)$ all files).

@yamt
Copy link
Contributor

yamt commented May 9, 2024

how do you plan to find the broadcast destinations?
it's what's exactly needed for F_SETLK. ie. find all open files (lfs_file_t) of a given on-disk file.

There is an invasive linked-list going through all lfs_file_t and lfs_dir_t structs so we can keep these handles up-to-date.

are you talking about "lfs->mlist"?
to "broadcasting" custom attribute updates, you need to find out which entries on the list to apply updates,
don't you?

I suppose we could expose this to the user, though the API would be complicated (we can't just say use the internals of lfs_file_t, switching to type-specific linked-lists was considered at one point for example).

Thinking about it, if you control the lfs_file_t structs, could you add an additional invasive linked-list outside of littlefs for the purpose of updating locks, etc?

is there a way to determine if given two lfs_flie_t objects are referencing to the same on-disk file or not?

@geky
Copy link
Member

geky commented May 10, 2024

are you talking about "lfs->mlist"?
to "broadcasting" custom attribute updates, you need to find out which entries on the list to apply updates,
don't you?

Yep.

The exact logic to update ids/pairs, though messy, is here:
https://github.com/littlefs-project/littlefs/blob/d01280e64934a09ba16cac60cf9d3a37e228bb66/lfs.c#L2326

Extending this to also update custom attributes would not be too difficult. The only blocker is that WRONLY attributes are currently allowed to be const, which is a bit of a problem. So I've been waiting to address this until an eventual major version bump.

is there a way to determine if given two lfs_flie_t objects are referencing to the same on-disk file or not?

Not currently, but this isn't an unreasonable API to add. The file->m.pair + file->id fields uniquely determine each file (but may change on mutation).

Right now creating an API it is a bit tricky, since you need to both compare the block addresses with lfs_pair_cmp and compare the id. In some long-term ongoing work this is changing to a single uint32_t metadata-id which would be a bit easier to expose API-wise.

@yamt
Copy link
Contributor

yamt commented May 13, 2024

are you talking about "lfs->mlist"?
to "broadcasting" custom attribute updates, you need to find out which entries on the list to apply updates,
don't you?

Yep.

The exact logic to update ids/pairs, though messy, is here: https://github.com/littlefs-project/littlefs/blob/d01280e64934a09ba16cac60cf9d3a37e228bb66/lfs.c#L2326

Extending this to also update custom attributes would not be too difficult. The only blocker is that WRONLY attributes are currently allowed to be const, which is a bit of a problem. So I've been waiting to address this until an eventual major version bump.

is there a way to determine if given two lfs_flie_t objects are referencing to the same on-disk file or not?

Not currently, but this isn't an unreasonable API to add. The file->m.pair + file->id fields uniquely determine each file (but may change on mutation).

Right now creating an API it is a bit tricky, since you need to both compare the block addresses with lfs_pair_cmp and compare the id. In some long-term ongoing work this is changing to a single uint32_t metadata-id which would be a bit easier to expose API-wise.

ok.
so, at least this api can be provided today, right?
https://github.com/WebAssembly/wasi-filesystem/blob/e79b05803e9ffd3b0cfdc0a8af20ac743abbe36a/wit/types.wit#L576-L582

@geky
Copy link
Member

geky commented May 13, 2024

ok.
so, at least this api can be provided today, right?
https://github.com/WebAssembly/wasi-filesystem/blob/e79b05803e9ffd3b0cfdc0a8af20ac743abbe36a/wit/types.wit#L576-L582

Yes, if this API is added to littlefs. But it's not currently exposed.

That's interesting that WASI did not just expose ino, I wonder what other filesystems motivated this.

@yamt
Copy link
Contributor

yamt commented May 13, 2024

That's interesting that WASI did not just expose ino, I wonder what other filesystems motivated this.

iirc it's motivated by some windows filesystems (ReFS?) which have 128-bit file ID.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement needs minor version new functionality only allowed in minor versions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants