localfs: reduce stat calls during info #1659

Merged (1 commit) on Aug 12, 2024

skshetry
Contributor

For symlinks, `LocalFileSystem.info()` was making three stat calls: one for the symlink itself, one for the link target, and a third to get the size of the link target.

This is wasteful, since `size` can be read from the `stat` result of the link target.
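
To illustrate the idea, here is a minimal sketch of how the link flag, type, and size can be gathered with at most two stat calls. `info_sketch` is a hypothetical helper, not the actual `LocalFileSystem.info()` implementation, and reporting a broken link as size 0 is an assumption made for the example.

```python
import os
import stat


def info_sketch(path):
    """Hypothetical helper: gather basic info with at most two stat calls."""
    # Call 1: stat the entry itself without following symlinks.
    out = os.stat(path, follow_symlinks=False)
    link = stat.S_ISLNK(out.st_mode)
    size = out.st_size
    if link:
        try:
            # Call 2: stat the link target; size and type come from this result.
            out = os.stat(path, follow_symlinks=True)
            size = out.st_size
        except OSError:
            # Assumption: a broken link is reported with size 0.
            size = 0
    if stat.S_ISDIR(out.st_mode):
        t = "directory"
    elif stat.S_ISREG(out.st_mode):
        t = "file"
    else:
        t = "other"
    return {"name": path, "size": size, "type": t, "islink": link}
```

For a regular file only the first call runs; for a symlink, the second result replaces the first, so no separate call is needed just to read the size.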

skshetry changed the title from "localfs: reduce stats calls during info" to "localfs: reduce stat calls during info" on Aug 12, 2024
skshetry added a commit to iterative/dvc-data that referenced this pull request Aug 12, 2024
Reduces no. of stat calls and avoids _strip_protocol call which is slow
when we have large no. of files.

For reducing stat calls upstream, I have a PR in fsspec/filesystem_spec#1659.
"type": t,
"created": out.st_ctime,
"islink": link,
}
for field in ["mode", "uid", "gid", "mtime", "ino", "nlink"]:
    result[field] = getattr(out, f"st_{field}")
-if result["islink"]:
+if link:
    result["destination"] = os.readlink(path)
Member

Since we are doing micro-optimisation, might the destination path also be known?

Contributor Author

I don't think there is a way to get the destination path without doing a `readlink`. The stat struct does not contain a path or a target path.
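
As a small sketch of this point: the stat result identifies an entry as a symlink through its mode bits, but resolving the target still takes a separate `os.readlink` call. `link_destination` is a hypothetical helper written only for this illustration.

```python
import os
import stat


def link_destination(path):
    """Hypothetical helper: return the symlink target, or None for non-links."""
    st = os.lstat(path)
    # The stat result carries mode, size, timestamps and similar fields,
    # but no path, so the target has to be read with a separate call.
    if stat.S_ISLNK(st.st_mode):
        return os.readlink(path)
    return None
```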

martindurant merged commit 4b79654 into fsspec:master on Aug 12, 2024
11 checks passed
skshetry deleted the reduce-stat-calls branch on August 12, 2024 15:22