From be093d2e66d1b08156727222bca30d298ed02d0e Mon Sep 17 00:00:00 2001
From: Newel H <37004249+newelh@users.noreply.github.com>
Date: Wed, 16 Aug 2023 10:43:37 -0400
Subject: [PATCH] chore: Update dead links to correct pages (#1127)

Summary

Closes #1124

Updates dead links in repository README
- Quick Start > Install for local development
- Learn more > Batch Processing)

Updates document dependencies to include tesseract-lang for additional language support (requirement for tests to pass)

Testing

All tests pass
---
 CHANGELOG.md                | 6 ++++++
 README.md                   | 6 +++---
 unstructured/__version__.py | 2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 2ab851ac67..fb93a120dc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,9 @@
+## 0.10.1-dev0
+
+### Fixes
+* Fix dead links in repository README (Quick Start > Install for local development, and Learn more > Batch Processing)
+* Update document dependencies to include tesseract-lang for additional language support (required for tests to pass)
+
 ## 0.10.0
 
 ### Enhancements
diff --git a/README.md b/README.md
index 5499901cbc..3f8397b8d0 100644
--- a/README.md
+++ b/README.md
@@ -55,7 +55,7 @@ There are several ways to use the `unstructured` library:
 * [Run the library in a container](https://github.com/Unstructured-IO/unstructured#using-the-library-in-a-container) or
 * Install the library
  1. [Install from PyPI](https://github.com/Unstructured-IO/unstructured#installing-the-library)
- 2. [Install for local development](https://github.com/Unstructured-IO/unstructured#coffee-installation-instructions-for-local-development)
+ 2. [Install for local development](https://github.com/Unstructured-IO/unstructured#installation-instructions-for-local-development)
 * For installation with `conda` on Windows system, please refer to the [documentation](https://unstructured-io.github.io/unstructured/installing.html#installation-with-conda-on-windows)
 
 ### Run the library in a container
@@ -117,7 +117,7 @@ installation.
 Depending on what document types you're parsing, you may not need all of these.
  - `libmagic-dev` (filetype detection)
  - `poppler-utils` (images and PDFs)
- - `tesseract-ocr` (images and PDFs)
+ - `tesseract-ocr` (images and PDFs, install `tesseract-lang` for additional language support)
  - `libreoffice` (MS Office docs)
  - `pandoc` (EPUBs, RTFs and Open Office docs)
 
@@ -244,4 +244,4 @@ Encountered a bug? Please create a new [GitHub issue](https://github.com/Unstruc
 |-|-|
 | [Company Website](https://unstructured.io) | Unstructured.io product and company info |
 | [Documentation](https://unstructured-io.github.io/unstructured) | Full API documentation |
-| [Batch Processing](Ingest.md) | Ingesting batches of documents through Unstructured |
+| [Batch Processing](unstructured/ingest/README.md) | Ingesting batches of documents through Unstructured |
diff --git a/unstructured/__version__.py b/unstructured/__version__.py
index 42437d967c..cc242ee053 100644
--- a/unstructured/__version__.py
+++ b/unstructured/__version__.py
@@ -1 +1 @@
-__version__ = "0.10.0" # pragma: no cover
+__version__ = "0.10.1-dev0" # pragma: no cover