import pandas as pd
from upath import UPath

AWS_KEY = "AKIAxxxxxxx"
AWS_SECRET = "xxxxxxxxxxxxxxx"
bucket = "upathtest"
fkey = "folder1/folder2/test1.xlsx"

s3base = UPath(f"s3://{bucket}", key=AWS_KEY, secret=AWS_SECRET)
s3path = s3base / fkey

print(list(s3base.iterdir()))  # THIS WORKS!

with s3path.open("w") as ff:
    ff.write("test1,test2")  # THIS WORKS TOO!

df = pd.DataFrame()
df.to_excel(s3path)  # !! This fails
Traceback
Traceback (most recent call last):
File "/mypath/venv/lib/python3.11/site-packages/s3fs/core.py", line 113, in _error_wrapper
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/aiobotocore/client.py", line 411, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mypath/try_fss.py", line 57, in <module>
main()
File "/mypath/try_fss.py", line 53, in main
test03()
File "/mypath/try_fss.py", line 47, in test03
pd.read_csv(s3path)
File "/mypath/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
return _read(filepath_or_buffer, kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 620, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
self._engine = self._make_engine(f, self.engine)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
self.handles = get_handle(
^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/pandas/io/common.py", line 728, in get_handle
ioargs = _get_filepath_or_buffer(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/pandas/io/common.py", line 443, in _get_filepath_or_buffer
).open()
^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/fsspec/core.py", line 147, in open
return self.__enter__()
^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/fsspec/core.py", line 105, in __enter__
f = self.fs.open(self.path, mode=mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/fsspec/spec.py", line 1303, in open
f = self._open(
^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/s3fs/core.py", line 689, in _open
return S3File(
^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/s3fs/core.py", line 2183, in __init__
super().__init__(
File "/mypath/venv/lib/python3.11/site-packages/fsspec/spec.py", line 1742, in __init__
self.size = self.details["size"]
^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/fsspec/spec.py", line 1755, in details
self._details = self.fs.info(self.path)
^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/fsspec/asyn.py", line 118, in wrapper
return sync(self.loop, func, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File "/mypath/venv/lib/python3.11/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/s3fs/core.py", line 1375, in _info
out = await self._call_s3(
^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/s3fs/core.py", line 366, in _call_s3
return await _error_wrapper(
^^^^^^^^^^^^^^^^^^^^^
File "/mypath/venv/lib/python3.11/site-packages/s3fs/core.py", line 145, in _error_wrapper
raise err
PermissionError: Forbidden
I tried to use client_kwargs - this does not work either.
The implementation of _get_filepath_or_buffer in pandas.io.common converts the provided UPath instance into a string and drops its storage_options.
pandas then tries to open the resulting S3 URI without those storage options, so the request goes out without the credentials, hence the 403 Forbidden in the traceback.
This happens because UPath incorrectly pretends to be a local path. That will be fixed when we move to the correct base class, PathBase, which will no longer provide a __fspath__ dunder for non-local paths.
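To make the conversion described above concrete, here is a minimal sketch using only the standard library. `ToyRemotePath` is a hypothetical stand-in, not UPath itself, and the sketch only assumes that the consumer calls `os.fspath()` on the object, which is what any object exposing `__fspath__` invites.

```python
import os


class ToyRemotePath:
    """Hypothetical stand-in for a path object that carries credentials."""

    def __init__(self, url, **storage_options):
        self.url = url
        self.storage_options = storage_options

    def __fspath__(self):
        # Only the URL string is returned; the credentials stay on the object.
        return self.url


p = ToyRemotePath(
    "s3://upathtest/folder1/folder2/test1.xlsx",
    key="AKIAxxxxxxx",
    secret="xxxxxxxxxxxxxxx",
)

# A consumer that treats the object as a local path ends up with a plain str.
as_string = os.fspath(p)
print(type(as_string).__name__, as_string)
print(hasattr(as_string, "storage_options"))
```

Once the object has been reduced to a string, nothing downstream can recover the key and secret, so any later open of that URI has to find credentials somewhere else.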
In the future we could also try to add support for arbitrary PathBase subclasses in pandas. But at least for universal_pathlib the mentioned changes in UPath should happen first.
All that being said, you can either pass the file handle opened in the with block directly to .to_excel(), or pass the storage_options to pandas explicitly.
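The snippet that originally accompanied this reply is not preserved here. As a rough sketch of the two options, the helpers below are hypothetical names (not part of pandas or universal_pathlib). They assume the path object offers `open()`, `str()` returning the full URI, and a `storage_options` attribute, as UPath does, and that `to_excel` accepts a binary file handle or a `storage_options` keyword.

```python
def to_excel_via_handle(df, path, **kwargs):
    """Option 1: open the target through the path object and pass the handle.

    The path object opens the file itself, so its credentials are used.
    Excel files are binary, hence mode "wb".
    """
    with path.open("wb") as fh:
        df.to_excel(fh, **kwargs)


def to_excel_via_storage_options(df, path, **kwargs):
    """Option 2: pass the URI and the storage options to pandas explicitly."""
    df.to_excel(
        str(path),
        storage_options=dict(path.storage_options),
        **kwargs,
    )


# Usage with the objects from the report (not executed here):
#   to_excel_via_handle(df, s3path)
#   to_excel_via_storage_options(df, s3path)
```

Both variants require touching each call site, which is the trade-off discussed in the replies that follow.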
Thank you for the answer. However, this does not help much: the idea was to simply replace the Path objects with UPath, without changing every call site. I am refactoring a large piece of code and was hoping this would let it work transparently with any path object.
Given the current implementations of pandas and universal_pathlib, the way to achieve what you're asking for is to not pass credentials explicitly, but to set them via any of the methods s3fs supports, described here: https://s3fs.readthedocs.io/en/latest/#credentials
I also recommend subscribing to #193 to be notified once work starts on moving UPath to its correct base class, which will be available in future versions of stdlib pathlib (and is backported in pathlib-abc).
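One of the supported routes is the standard AWS environment variables. The sketch below is only an illustration: `export_aws_credentials` is a hypothetical helper, and in practice the variables are usually set in the shell or deployment environment rather than from Python.

```python
import os


def export_aws_credentials(key, secret):
    """Expose credentials through the standard AWS environment variables.

    They must be set before the first S3 filesystem is created in the process.
    """
    os.environ["AWS_ACCESS_KEY_ID"] = key
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret


# Usage following the advice above (not executed here):
#   export_aws_credentials("AKIAxxxxxxx", "xxxxxxxxxxxxxxx")
#   s3path = UPath("s3://upathtest/folder1/folder2/test1.xlsx")  # no key/secret
#   df.to_excel(s3path)
```

With the credentials available from the environment, it no longer matters that the storage options are dropped when the path is converted to a string.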
The AWS user has the AmazonS3FullAccess policy attached.