forked from openpaperwork/pyocr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
ChangeLog
46 lines (38 loc) · 1.61 KB
/
ChangeLog
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
0.3.0 --> 0.3.1:
* tesseract.detect_orientation(): Use a temporary file instead of stdin
to transmit the image to Tesseract. Tesseract 3.04 doesn't support
stdin + "-psm 0" (regression ?)
* tesseract.detect_orientation(): Improve output parsing reliability
* optim: Avoid unnecessary convert to RGB and allow using image formats
different from PNG
* TextBuilder + Cuneiform: add extra settings for Cuneiform
(cuneiform_dotmatrix, cuneiform_fax=False, cuneiform_singlecolumn)
0.2.4 --> 0.3.0:
* New API: pyocr.<tool>.can_detect_orientation() and
pyocr.<tool>.detect_orientation()
0.2.3 --> 0.2.4:
* Tesseract : add digit-only support
* Tesseract : add support for Tesseract subsets of layout analysis (-psm)
0.2.2 --> 0.2.3:
* Strip the alpha channel from images before running the OCR. It's basically
useless and can prevent the tool from working correctly.
* Make hOCR parsing more resistant (handle extra data around box positions)
* Fix: Take into account that new versions of Tesseract uses the file
extension .hocr instead of .html
0.2.1 --> 0.2.2:
* Fix Python 3 support
* Add support for Tesseract on Heroku
0.2.0 --> 0.2.1:
* Make it possible to use 'import pyocr' instead of 'from pyocr import pyocr'.
'from pyocr import pyocr' still works but is obsolete.
* Fix dependency list: depends on Pillow (it's untested with PIL)
* Fix pyocr.VERSION
0.1.2 --> 0.2.0:
* Python 3.x support
0.1.1 --> 0.1.2:
* Tesseract: Fix version parsing
* Tesseract: Fix Tesseract 3.02.01's hOCR format support
0.1 --> 0.1.1:
* hOCR: Parse lines as well as words
* tesseract.get_available_languages() : Fix fedora support
* Fix UTF-8 support