From 75298e72ce26b15c6c3e327896bdc5e18f6ee687 Mon Sep 17 00:00:00 2001
From: christinestraub
Date: Tue, 3 Oct 2023 12:44:32 -0700
Subject: [PATCH] chore: update changelog & version

---
 CHANGELOG.md                | 5 ++---
 unstructured/__version__.py | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 23878a7c2f..2c78797a1c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,7 +1,8 @@
-## 0.10.19-dev8
+## 0.10.19-dev9
 
 ### Enhancements
 
+* **Align to top left when shrinking bounding boxes for `xy-curt` sorting:** Update `shrink_bbox()` to keep top left rather than center
 * **bump `unstructured-inference` to `0.6.6`** The updated version of `unstructured-inference` makes table extraction in `hi_res` mode configurable to fine tune table extraction performance; it also improves element detection by adding a deduplication post processing step in the `hi_res` partitioning of pdfs and images.
 * **Detect text in HTML Heading Tags as Titles** This will increase the accuracy of hierarchies in HTML documents and provide more accurate element categorization. If text is in an HTML heading tag and is not a list item, address, or narrative text, categorize it as a title.
 * **Update python-based docs** Refactor docs to use the actual unstructured code rather than using the subprocess library to run the cli command itself.
@@ -9,8 +10,6 @@
 * **Adds Table support for the `add_chunking_strategy` decorator to partition functions.** In addition to combining elements under Title elements, user's can now specify the `max_characters=` argument to chunk Table elements into TableChunk elements with `text` and `text_as_html` of length characters. This means partitioned Table results are ready for use in downstream applications without any post processing.
 * **Expose endpoint url for s3 connectors** By allowing for the endpoint url to be explicitly overwritten, this allows for any non-AWS data providers supporting the s3 protocol to be supported (i.e. minio).
 
-### Features
-
 ### Features
 
 ### Fixes
diff --git a/unstructured/__version__.py b/unstructured/__version__.py
index acf12be0ae..d71d465e92 100644
--- a/unstructured/__version__.py
+++ b/unstructured/__version__.py
@@ -1 +1 @@
-__version__ = "0.10.19-dev8" # pragma: no cover
+__version__ = "0.10.19-dev9" # pragma: no cover