Commit

Merge branch 'main' into cd.new-regions
cameron-dunn-sublime authored Oct 2, 2023
2 parents 8e45079 + 190d6dc commit b3c88ae
Showing 4 changed files with 4 additions and 5 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -77,7 +77,7 @@ Guidelines for contributing can be found [here](https://github.com/target/strelk

## Known Issues
There is currently a known issue with compilation on ARM based hosts (e.g., Apple M1). Attempting to compile the current version of Strelka will lead to the following issue:
-https://github.com/target/strelka/issues/188. You can bypass this compilation issue by removing `pymupdf` from the backend Python `requirements.txt` file and commenting out ScanPDF in the `backend.yml` file. Doing this will allow you to compile the current version of Strelka at the expense of being unable to scan PDF files.
+https://github.com/target/strelka/issues/188. You can bypass this compilation issue by removing `pymupdf` from the backend Python `requriements.txt` file and commenting out ScanPDF in the `backend.yml` file. Doing this will allow you to compile the current version of Strelka at the expense of being unable to scan PDF files.

## Related Projects
* [Laika BOSS](https://github.com/lmco/laikaboss)
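
The README workaround in the hunk above (removing `pymupdf` from the backend requirements file) can be sketched in a few lines of Python. This is a minimal illustration, not part of Strelka: `drop_requirement` is a hypothetical helper name, the sample text is made up from pins visible in this commit, and it only filters requirement lines; it does not touch `backend.yml`, where ScanPDF would still need to be commented out by hand.

```python
import re


def drop_requirement(requirements_text: str, package: str) -> str:
    """Return requirements text without the line that pins `package`.

    Hypothetical helper for illustration only. Matching is case-insensitive
    and requires the name to be followed by a version specifier, an extras
    bracket, an environment marker, a comment, or the end of the line, so
    similarly named packages (e.g. "pymupdf-fonts") are left alone.
    """
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(?:[=<>!~\[;#].*)?$",
        re.IGNORECASE,
    )
    kept = [line for line in requirements_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Sample input assembled from pins that appear in this commit.
sample = (
    "opencv-contrib-python==4.6.0.66\n"
    "PyMuPDF==1.19.6\n"
    "pefile==2019.4.18\n"
)

print(drop_requirement(sample, "pymupdf"))
```

Filtering whole lines rather than editing them in place keeps any trailing comment on the pin (such as an issue link) from being left behind as an orphan.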
2 changes: 1 addition & 1 deletion build/python/backend/Dockerfile
@@ -1,4 +1,4 @@
-FROM ubuntu:22.10
+FROM ubuntu:22.04
ARG DEBIAN_FRONTEND=noninteractive
LABEL maintainer="Target Brands, Inc. [email protected]"

2 changes: 1 addition & 1 deletion build/python/backend/requirements.txt
@@ -29,7 +29,7 @@ olefile==0.46
oletools==0.60.1
opencv-python==4.6.0.66
opencv-contrib-python==4.6.0.66
-PyMuPDF==1.22.5 # https://github.com/pymupdf/PyMuPDF/issues/2617
+PyMuPDF==1.19.6
pefile==2019.4.18
pgpdump3==1.5.2
pyelftools==0.27
3 changes: 1 addition & 2 deletions src/python/strelka/scanners/scan_ocr.py
@@ -22,9 +22,8 @@ def scan(self, data, file, options, expire_at):
pdf_to_png = options.get('pdf_to_png', False)

if pdf_to_png and 'application/pdf' in file.flavors.get('mime', []):
-# TODO: Use fitz builtin OCR support which also wraps tesseract
doc = fitz.open(stream=data, filetype='pdf')
-data = doc.get_page_pixmap(0, dpi=120).tobytes('png')
+data = doc.get_page_pixmap(0).tobytes('png')

with tempfile.NamedTemporaryFile(dir=tmp_directory) as tmp_data:
tmp_data.write(data)