Merge pull request #477 from stac-utils/main
Merge v1.0.0-rc.2 changes into 1.0 release branch
Jon Duckworth authored Jun 25, 2021
2 parents 4b80218 + f742fa4 commit 7f9fef4
Showing 12 changed files with 369 additions and 121 deletions.
19 changes: 18 additions & 1 deletion CHANGELOG.md
@@ -12,6 +12,22 @@

### Deprecated

## [v1.0.0-rc.2]

### Added

- Add a `preserve_dict` parameter to `ItemCollection.from_dict` and set it to False when
using `ItemCollection.from_file`.
([#468](https://github.com/stac-utils/pystac/pull/468))
- `StacIO.json_dumps` and `StacIO.json_loads` methods for JSON
serialization/deserialization. These were "private" methods, but are now "public" and
documented ([#471](https://github.com/stac-utils/pystac/pull/471))
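
To illustrate the trade-off behind `preserve_dict`, here is a minimal, stdlib-only sketch. The function `items_from_dict` is a hypothetical stand-in, not PySTAC code, and the in-place removal of `"links"` is an assumption chosen only to mimic a parser that modifies its input:

```python
from copy import deepcopy
from typing import Any, Dict, List


def items_from_dict(
    d: Dict[str, Any], preserve_dict: bool = True
) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for a from_dict-style parser (not PySTAC code)."""
    if preserve_dict:
        # Work on a private copy so the caller's dict is left untouched.
        d = deepcopy(d)
    items = d.get("features", [])
    for item in items:
        # Assumed side effect: parsing consumes a field in place.
        item.pop("links", None)
    return items


source = {"type": "FeatureCollection", "features": [{"id": "a", "links": []}]}

items_from_dict(source)  # default: the input survives unchanged
still_has_links = "links" in source["features"][0]

items_from_dict(source, preserve_dict=False)  # faster, but mutates the input
lost_links = "links" not in source["features"][0]
```

Skipping the copy is safe when the caller has no further use for the dictionary, which is why `ItemCollection.from_file` can pass `preserve_dict=False` for a dictionary it has just read itself.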

### Changed

- `pystac.stac_io.DuplicateObjectKeyError` moved to `pystac.DuplicateObjectKeyError`
([#471](https://github.com/stac-utils/pystac/pull/471))
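
As a rough illustration of how a duplicate-key check can be done during JSON deserialization, the sketch below uses only the standard library. It is not the PySTAC implementation: the exception class is a local stand-in for `pystac.DuplicateObjectKeyError`, and `reject_duplicate_keys` and `loads_strict` are hypothetical names:

```python
import json
from typing import Any, Dict, List, Tuple


class DuplicateObjectKeyError(Exception):
    """Local stand-in for pystac.DuplicateObjectKeyError (illustration only)."""


def reject_duplicate_keys(pairs: List[Tuple[str, Any]]) -> Dict[str, Any]:
    """Build a dict from JSON key/value pairs, failing on a repeated key."""
    result: Dict[str, Any] = {}
    for key, value in pairs:
        if key in result:
            raise DuplicateObjectKeyError(f"Found duplicate object name {key!r}")
        result[key] = value
    return result


def loads_strict(txt: str) -> Dict[str, Any]:
    # json.loads calls object_pairs_hook for every JSON object it decodes,
    # passing the pairs in document order, so nested objects are checked too.
    return json.loads(txt, object_pairs_hook=reject_duplicate_keys)
```

Without such a hook, `json.loads` silently keeps the last value for a repeated key, so the duplication would go unnoticed.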

## [v1.0.0-rc.1]

### Added
@@ -381,7 +397,8 @@ use `Band.create`

Initial release.

[Unreleased]: <https://github.com/stac-utils/pystac/compare/v1.0.0-rc.1..main>
[Unreleased]: <https://github.com/stac-utils/pystac/compare/v1.0.0-rc.2..main>
[v1.0.0-rc.2]: <https://github.com/stac-utils/pystac/compare/v1.0.0-rc.1..v1.0.0-rc.2>
[v1.0.0-rc.1]: <https://github.com/stac-utils/pystac/compare/v1.0.0-beta.3..v1.0.0-rc.1>
[v1.0.0-beta.3]: <https://github.com/stac-utils/pystac/compare/v1.0.0-beta.2..v1.0.0-beta.3>
[v1.0.0-beta.2]: <https://github.com/stac-utils/pystac/compare/v1.0.0-beta.1..v1.0.0-beta.2>
50 changes: 50 additions & 0 deletions docs/api.rst
@@ -171,6 +171,20 @@ StacIO
   :members:
   :undoc-members:

DefaultStacIO
~~~~~~~~~~~~~

.. autoclass:: pystac.stac_io.DefaultStacIO
   :members:
   :show-inheritance:

DuplicateKeyReportingMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.stac_io.DuplicateKeyReportingMixin
   :members:
   :show-inheritance:

STAC_IO
~~~~~~~

@@ -213,11 +227,47 @@ STACError

.. autoclass:: pystac.STACError

STACTypeError
~~~~~~~~~~~~~

.. autoclass:: pystac.STACTypeError

DuplicateObjectKeyError
~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.DuplicateObjectKeyError

ExtensionAlreadyExistsError
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.ExtensionAlreadyExistsError

ExtensionTypeError
~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.ExtensionTypeError

ExtensionNotImplemented
~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.ExtensionNotImplemented

RequiredPropertyMissing
~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.RequiredPropertyMissing

STACValidationError
~~~~~~~~~~~~~~~~~~~

.. autoclass:: pystac.STACValidationError


Extensions
----------

130 changes: 77 additions & 53 deletions docs/concepts.rst
@@ -225,72 +225,96 @@ written (e.g. if you are working with self-contained catalogs).

.. _using stac_io:

Using STAC_IO
I/O in PySTAC
=============

The :class:`~pystac.STAC_IO` class is the way PySTAC reads and writes text from file
locations. Since PySTAC aims to be dependency-free, there are no default mechanisms to
read and write from anything but the local file system. However, users of PySTAC may
want to read and write from other file systems, such as HTTP or cloud object storage.
STAC_IO allows users to hook into PySTAC and define their own reading and writing
primitives to allow for those use cases.

To enable reading from other types of file systems, it is recommended that in the
`__init__.py` of the client module, or at the beginning of the script using PySTAC, you
overwrite the :func:`STAC_IO.read_text_method <pystac.STAC_IO.read_text_method>` and
:func:`STAC_IO.write_text_method <pystac.STAC_IO.write_text_method>` members of STAC_IO
with functions that read and write however you need. For example, this code will allow
The :class:`pystac.StacIO` class defines fundamental methods for I/O
operations within PySTAC, including serialization and deserialization to and from
JSON files and conversion to and from Python dictionaries. This is an abstract class
and should not be instantiated directly. However, PySTAC provides a
:class:`pystac.stac_io.DefaultStacIO` class with minimal implementations of these
methods. This default implementation supports reading and writing files on the
local filesystem as well as reading from HTTP URIs (using ``urllib``). An instance of
this class is created automatically by all of the object-specific I/O methods (e.g.
:meth:`pystac.Catalog.from_file`), so most users will not need to instantiate this
class themselves.

If you require custom logic for I/O operations or would like to use a third-party
library for I/O operations (e.g. ``requests``), you can create a subclass of
:class:`pystac.StacIO` (or :class:`pystac.stac_io.DefaultStacIO`) and customize the
methods as you see fit. You can then pass instances of this custom subclass into the
``stac_io`` argument of most object-specific I/O methods. You can also use
:meth:`pystac.StacIO.set_default` in your client's ``__init__.py`` file to make this
subclass the default :class:`pystac.StacIO` implementation throughout the library.

For example, this code will allow
for reading from AWS's S3 cloud object storage using `boto3
<https://boto3.amazonaws.com/v1/documentation/api/latest/index.html>`_:
<https://boto3.amazonaws.com/v1/documentation/api/latest/index.html>`__:

.. code-block:: python

    from urllib.parse import urlparse
    import boto3
    from pystac import STAC_IO

    def my_read_method(uri):
        parsed = urlparse(uri)
        if parsed.scheme == 's3':
            bucket = parsed.netloc
            key = parsed.path[1:]
            s3 = boto3.resource('s3')
            obj = s3.Object(bucket, key)
            return obj.get()['Body'].read().decode('utf-8')
        else:
            return STAC_IO.default_read_text_method(uri)

    def my_write_method(uri, txt):
        parsed = urlparse(uri)
        if parsed.scheme == 's3':
            bucket = parsed.netloc
            key = parsed.path[1:]
            s3 = boto3.resource("s3")
            s3.Object(bucket, key).put(Body=txt)
        else:
            STAC_IO.default_write_text_method(uri, txt)

    STAC_IO.read_text_method = my_read_method
    STAC_IO.write_text_method = my_write_method

If you are only going to read from another source, e.g. HTTP, you could only replace the
read method. For example, using the `requests library
<https://requests.kennethreitz.org/en/master>`_:
    from typing import Any, Union

    from pystac import Link
    from pystac.stac_io import DefaultStacIO, StacIO

    class CustomStacIO(DefaultStacIO):
        def __init__(self):
            self.s3 = boto3.resource("s3")

        def read_text(
            self, source: Union[str, Link], *args: Any, **kwargs: Any
        ) -> str:
            # Resolve the HREF whether a string or a Link was passed in
            href = source if isinstance(source, str) else source.get_absolute_href()
            parsed = urlparse(href)
            if parsed.scheme == "s3":
                bucket = parsed.netloc
                key = parsed.path[1:]
                obj = self.s3.Object(bucket, key)
                return obj.get()["Body"].read().decode("utf-8")
            else:
                return super().read_text(source, *args, **kwargs)

        def write_text(
            self, dest: Union[str, Link], txt: str, *args: Any, **kwargs: Any
        ) -> None:
            # Resolve the HREF whether a string or a Link was passed in
            href = dest if isinstance(dest, str) else dest.get_absolute_href()
            parsed = urlparse(href)
            if parsed.scheme == "s3":
                bucket = parsed.netloc
                key = parsed.path[1:]
                self.s3.Object(bucket, key).put(Body=txt, ContentEncoding="utf-8")
            else:
                super().write_text(dest, txt, *args, **kwargs)

    StacIO.set_default(CustomStacIO)

If you only need to customize read operations, you can inherit from
:class:`~pystac.stac_io.DefaultStacIO` and override only the read method. For example,
to take advantage of connection pooling using a `requests.Session
<https://requests.kennethreitz.org/en/master>`__:

.. code-block:: python

    from urllib.parse import urlparse
    import requests
    from pystac import STAC_IO

    def my_read_method(uri):
        parsed = urlparse(uri)
        if parsed.scheme.startswith('http'):
            return requests.get(uri).text
        else:
            return STAC_IO.default_read_text_method(uri)

    STAC_IO.read_text_method = my_read_method

    from typing import Any, Union

    from pystac import Link
    from pystac.stac_io import DefaultStacIO, StacIO

    class ConnectionPoolingIO(DefaultStacIO):
        def __init__(self):
            self.session = requests.Session()

        def read_text(
            self, source: Union[str, Link], *args: Any, **kwargs: Any
        ) -> str:
            # Resolve the HREF whether a string or a Link was passed in
            href = source if isinstance(source, str) else source.get_absolute_href()
            parsed = urlparse(href)
            if parsed.scheme.startswith("http"):
                return self.session.get(href).text
            else:
                return super().read_text(source, *args, **kwargs)

    StacIO.set_default(ConnectionPoolingIO)

Validation
==========
1 change: 1 addition & 0 deletions pystac/__init__.py
@@ -6,6 +6,7 @@
from pystac.errors import (
    STACError,
    STACTypeError,
    DuplicateObjectKeyError,
    ExtensionAlreadyExistsError,
    ExtensionNotImplemented,
    ExtensionTypeError,
6 changes: 6 additions & 0 deletions pystac/errors.py
@@ -19,6 +19,12 @@ class STACTypeError(Exception):
    pass


class DuplicateObjectKeyError(Exception):
    """Raised when deserializing a JSON object containing a duplicate key."""

    pass


class ExtensionTypeError(Exception):
    """An ExtensionTypeError is raised when an extension is used against
    an object that the extension does not apply to
16 changes: 13 additions & 3 deletions pystac/item_collection.py
@@ -134,16 +134,26 @@ def clone(self) -> "ItemCollection":
        )

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "ItemCollection":
    def from_dict(
        cls, d: Dict[str, Any], preserve_dict: bool = True
    ) -> "ItemCollection":
        """Creates a :class:`ItemCollection` instance from a dictionary.
        Arguments:
            d : The dictionary from which the :class:`~ItemCollection` will be created
            preserve_dict: If False, the dict parameter ``d`` may be modified
                during this method call. Otherwise the dict is not mutated.
                Defaults to True, which results in a deepcopy of the
                parameter. Set to False when possible to avoid the performance
                hit of a deepcopy.
        """
        if not cls.is_item_collection(d):
            raise STACTypeError("Dict is not a valid ItemCollection")

        items = [pystac.Item.from_dict(item) for item in d.get("features", [])]
        items = [
            pystac.Item.from_dict(item, preserve_dict=preserve_dict)
            for item in d.get("features", [])
        ]
        extra_fields = {k: v for k, v in d.items() if k not in ("features", "type")}

        return cls(items=items, extra_fields=extra_fields)
@@ -166,7 +176,7 @@ def from_file(

        d = stac_io.read_json(href)

        return cls.from_dict(d)
        return cls.from_dict(d, preserve_dict=False)

    def save_object(
        self,