Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access right fixes #13

Closed
wants to merge 2 commits into from
Closed

Access right fixes #13

wants to merge 2 commits into from

Conversation

evgenyigumnov
Copy link
Contributor

Bug fix for this problem: huggingface/candle#416

@evgenyigumnov evgenyigumnov changed the title after review fixes Access right fixes Aug 15, 2023
@Narsil
Copy link
Collaborator

Narsil commented Aug 15, 2023

Thank you for this fix !

@Wauplin Is that something known for Windows ? Should we juste move instead of symlink on windows ?

I see from here that windows is duplicating model file ? ??

https://github.com/search?q=repo%3Ahuggingface%2Fhuggingface_hub%20symlink&type=code

@Wauplin
Copy link

Wauplin commented Aug 16, 2023

@Narsil Yes, the cache-system on Windows when not running on Admin/dev mode is sub-optimal. To get more details about the underlying decision, check-out this comment where I tried to summarize the pros/cons.

I see from here that windows is duplicating model file ? ??

The 2 cases where model files can be duplicated are:

  • if a model file is already in the blobs/ folder and we try to download a new revision => we duplicate instead of move to avoid breaking a potential existing reference
  • if a model is downloaded from multiple revisions => snapshots are independent, hence files are duplicated in both

The generic use case where the user downloads once the model to use it do not duplicate anything.

@evgenyigumnov
Copy link
Contributor Author

@Narsil Yes, the cache-system on Windows when not running on Admin/dev mode is sub-optimal. To get more details about the underlying decision, check-out this comment where I tried to summarize the pros/cons.

I see from here that windows is duplicating model file ? ??

The 2 cases where model files can be duplicated are:

  • if a model file is already in the blobs/ folder and we try to download a new revision => we duplicate instead of move to avoid breaking a potential existing reference
  • if a model is downloaded from multiple revisions => snapshots are independent, hence files are duplicated in both

The generic use case where the user downloads once the model to use it do not duplicate anything.

I understand correctly that we end up using symbolic links under Windows?

@Wauplin
Copy link

Wauplin commented Aug 16, 2023

I understand correctly that we end up using symbolic links under Windows?

Only when possible yes (i.e. either admin or developer mode enabled).

@Narsil
Copy link
Collaborator

Narsil commented Aug 16, 2023

@evgenyigumnov Hey, I modified your code to follow huggingface_hub method.
Sorry to push on your branch, usually I try to avoid that, but I felt that was easier here, I can remove my commit if you want.

Could you confirm/infirm that work ? I don't have an easy setup to check.

This means the download + cache will work even with User privileges only.

@Narsil
Copy link
Collaborator

Narsil commented Aug 16, 2023

The CI error just needs a rebase.

@evgenyigumnov
Copy link
Contributor Author

evgenyigumnov commented Aug 16, 2023

@evgenyigumnov Hey, I modified your code to follow huggingface_hub method. Sorry to push on your branch, usually I try to avoid that, but I felt that was easier here, I can remove my commit if you want.

Could you confirm/infirm that work ? I don't have an easy setup to check.

This means the download + cache will work even with User privileges only.

No. This code help user to make decition to run project from admin or developer mode.
If user do this then project launch sucessfully.
At that moment user get stupid error about access deny... and he dont know how to fix problem.

@Narsil
Copy link
Collaborator

Narsil commented Aug 16, 2023

This new code should work on any machine, without access deny, have you tried it ?

@Narsil
Copy link
Collaborator

Narsil commented Aug 16, 2023

Ok this PR should work: #15

I have successfully set myself up a reproduceable env. Unfortunately I couldn't find setup in GH action.

@evgenyigumnov
Copy link
Contributor Author

evgenyigumnov commented Aug 17, 2023

This new code should work on any machine, without access deny, have you tried it ?

When I try run test in main branch I have error:

cargo test
Finished test [unoptimized + debuginfo] target(s) in 0.09s
Running unittests src\lib.rs (target\debug\deps\hf_hub-082cd1143db02bea.exe)

running 5 tests
test tests::token_path ... ok
test api::sync::tests::info ... FAILED
test api::sync::tests::dataset ... FAILED
test api::sync::tests::detailed_info ... ok
test api::sync::tests::simple ... ok

failures:

---- api::sync::tests::info stdout ----
thread 'api::sync::tests::info' panicked at 'assertion failed: (left == right)
left: RepoInfo { siblings: [Siblings { rfilename: ".gitattributes" }, Siblings { rfilename: "wikitext-103-raw-v1/test/0000.parquet" }, Siblings { rfilename: "wikitext-103-raw-v1/train/0000.parquet" }, Siblings { rfilename: "wikitext-103-raw-v1/train/0001.parquet" }, Siblings { rfilename: "wikitext-103-raw-v1/validation/0000.parquet" }, Siblings { rfilename: "wikitext-103-v1/test/0000.parquet" }, Siblings { rfilename: "wikitext-103-v1/test/index.duckdb" }, Siblings { rfilename: "wikitext-103-v1/train/0000.parquet" }, Siblings { rfilename: "wikitext-103-v1/train/0001.parquet" }, Siblings { rfilename: "wikitext-103-v1/validation/0000.parquet" }, Siblings { rfilename: "wikitext-103-v1/validation/index.duckdb" }, Siblings { rfilename: "wikitext-2-raw-v1/test/0000.parquet" }, Siblings { rfilename: "wikitext-2-raw-v1/test/index.duckdb" }, Siblings { rfilename: "wikitext-2-raw-v1/train/0000.parquet" }, Siblings { rfilename: "wikitext-2-raw-v1/train/index.duckdb" }, Siblings { rfilename: "wikitext-2-raw-v1/validation/0000.parquet" }, Siblings { rfilename: "wikitext-2-raw-v1/validation/index.duckdb" }, Siblings { rfilename: "wikitext-2-v1/test/0000.parquet" }, Siblings { rfilename: "wikitext-2-v1/train/0000.parquet" }, Siblings { rfilename: "wikitext-2-v1/validation/0000.parquet" }], sha: "29104e21c58d6f33e93ee0d78dbf7141b40fe46c" },
right: RepoInfo { siblings: [Siblings { rfilename: ".gitattributes" }, Siblings { rfilename: "wikitext-103-raw-v1/wikitext-test.parquet" }, Siblings { rfilename: "wikitext-103-raw-v1/wikitext-train-00000-of-00002.parquet" }, Siblings { rfilename: "wikitext-103-raw-v1/wikitext-train-00001-of-00002.parquet" }, Siblings { rfilename: "wikitext-103-raw-v1/wikitext-validation.parquet" }, Siblings { rfilename: "wikitext-103-v1/test/index.duckdb" }, Siblings { rfilename: "wikitext-103-v1/validation/index.duckdb" }, Siblings { rfilename: "wikitext-103-v1/wikitext-test.parquet" }, Siblings { rfilename: "wikitext-103-v1/wikitext-train-00000-of-00002.parquet" }, Siblings { rfilename: "wikitext-103-v1/wikitext-train-00001-of-00002.parquet" }, Siblings { rfilename: "wikitext-103-v1/wikitext-validation.parquet" }, Siblings { rfilename: "wikitext-2-raw-v1/test/index.duckdb" }, Siblings { rfilename: "wikitext-2-raw-v1/train/index.duckdb" }, Siblings { rfilename: "wikitext-2-raw-v1/validation/index.duckdb" }, Siblings { rfilename: "wikitext-2-raw-v1/wikitext-test.parquet" }, Siblings { rfilename: "wikitext-2-raw-v1/wikitext-train.parquet" }, Siblings { rfilename: "wikitext-2-raw-v1/wikitext-validation.parquet" }, Siblings { rfilename: "wikitext-2-v1/wikitext-test.parquet" }, Siblings { rfilename: "wikitext-2-v1/wikitext-train.parquet" }, Siblings { rfilename: "wikitext-2-v1/wikitext-validation.parquet" }], sha: "06ac3f4b846ef171cae5a48a35c3e85f2b44f636" }', src\api\sync.rs:621:9
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

---- api::sync::tests::dataset stdout ----
thread 'api::sync::tests::dataset' panicked at 'called Result::unwrap() on an Err value: RequestError(Status(404, Response[status: 404, status_text: Not Found, url: https://huggingface.co/datasets/wikitext/resolve/refs%2Fconvert%2Fparquet/wikitext-103-v1/wikitext-test.parquet]))', src\api\sync.rs:598:14

failures:
api::sync::tests::dataset
api::sync::tests::info

test result: FAILED. 3 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.79s

How to fix it? After that I can test my pull request on Linux and Windows machine

@Narsil
Copy link
Collaborator

Narsil commented Aug 17, 2023

Those errors are linked to actual changes on the hub, so everything is working fine.

Closing this as superseeded by #15

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants