Large file support (64-bit offsets) #2025

Open
inglorion opened this issue Apr 21, 2023 · 4 comments

Comments

@inglorion
Contributor

A number of nix functions take file positions of type off_t (for example, ftruncate and pread). off_t is 64 bits wide on many systems (e.g. FreeBSD, 64-bit Linux), but on some systems (e.g. 32-bit Linux) it is only 32 bits wide, which limits the maximum file position that can be specified. Such systems may offer alternative interfaces that take 64-bit file positions; Linux, for example, provides ftruncate64 and pread64.

Nix currently offers only the base versions of these functions (the one exception is lseek, for which both lseek and lseek64 are provided). As a result, file positions beyond 32 bits cannot be passed on some systems, even when the system itself supports 64-bit positions. This is a feature request to support 64-bit file positions on systems where those are supported.
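To make the limit concrete, here is a minimal sketch that models a 32-bit off_t as a plain i32. The names `Off32`, `MAX_OFF32_POS`, and `to_off32` are made up for illustration and are not part of nix or libc.

```rust
use std::convert::TryFrom;

/// Stand-in for `off_t` on a platform where it is 32 bits wide
/// (an assumption for illustration; the real type comes from libc).
type Off32 = i32;

/// Largest file position a signed 32-bit offset can express (2 GiB - 1).
const MAX_OFF32_POS: i64 = Off32::MAX as i64;

/// Narrow a 64-bit file position to the 32-bit offset type.
/// Returns `None` when the position does not fit.
fn to_off32(pos: i64) -> Option<Off32> {
    Off32::try_from(pos).ok()
}
```

With these definitions, `to_off32(MAX_OFF32_POS)` succeeds, while any position of 2 GiB or more yields `None`, so it cannot be handed to a function whose offset parameter is a 32-bit off_t.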

@asomers
Member

asomers commented Apr 21, 2023

That's sensible. Patches welcome.

@inglorion
Contributor Author

Happy to contribute. One thing we will have to decide is what the API will look like.

I think there is a case to be made for having the regular functions accept 64-bit offsets on all platforms where 64-bit offsets are supported. This is similar to how read_at in std::os::unix::fs::FileExt is implemented using pread or pread64 as necessary to get 64-bit offset support. The downside I see here is that this changes the size of offsets on some platforms (e.g. the offset parameter to pread on 32-bit Linux), which might break some existing code. I'm not sure if this is unacceptable; after all, the offset can already vary in size today and we would actually be making it more uniform, but it is something to be considered. The major advantage is that 64-bit offsets would then just work on all platforms.
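As a rough sketch of what this first option looks like from the caller's side, the function below takes a 64-bit offset on every platform and simply forwards to read_at from std::os::unix::fs::FileExt. The wrapper name `read_at_u64` is hypothetical; it is not an existing or proposed nix function.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;

/// Hypothetical wrapper: the offset is always 64 bits wide, whatever the
/// width of the platform's `off_t`. The choice of underlying system call
/// is left to the standard library's `read_at`.
fn read_at_u64(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    file.read_at(buf, offset)
}
```

The caller never sees off_t at all, so the same code compiles unchanged on 32-bit and 64-bit targets. The cost is the one described above: on platforms where the parameter used to be 32-bit, the signature changes.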

Another option is to add *64() functions similar to what we already have with lseek64, where the *64() versions take 64-bit offsets. This keeps the API of the existing non-64 functions the same. The downside is that we end up adding several extra functions to the API that only differ from their non-64 counterparts on some platforms. This also comes with the pitfall that code that uses the non-64 versions will work fine with 64-bit offsets on many platforms, but not on others.
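A small model of this second option, using an in-memory byte slice instead of a real file descriptor so that it runs anywhere. `OffT`, `pread_model`, and `pread64_model` are illustrative names only, and `OffT` is fixed at 32 bits to mimic 32-bit Linux.

```rust
use std::convert::TryFrom;

/// Models `off_t` on a platform where it is 32 bits wide (assumption).
type OffT = i32;

/// Hypothetical base variant: the offset is limited to the platform `off_t`.
fn pread_model(data: &[u8], buf: &mut [u8], offset: OffT) -> usize {
    pread64_model(data, buf, i64::from(offset))
}

/// Hypothetical `*64` variant: always takes a 64-bit offset.
/// In this model, negative or past-the-end offsets read zero bytes
/// (a real pread would report an error for a negative offset).
fn pread64_model(data: &[u8], buf: &mut [u8], offset: i64) -> usize {
    let start = match usize::try_from(offset) {
        Ok(s) if s < data.len() => s,
        _ => return 0,
    };
    let n = buf.len().min(data.len() - start);
    buf[..n].copy_from_slice(&data[start..start + n]);
    n
}
```

Both variants behave identically for small offsets, which is exactly the pitfall: code written against `pread_model` looks correct until a position exceeds `OffT::MAX`, at which point it cannot even be passed in on this platform.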

Personally, I have a preference for making the offsets 64-bit on platforms that support 64-bit offsets, without adding a bunch of *64() functions, but I would like to have the input of some existing contributors.

@inglorion
Contributor Author

I should have a pull request for this sometime next week.

@inglorion
Contributor Author

PR #2032 implements this for most functions. Asynchronous I/O and a few others require some additional work.
