Clarification for 4.14. The textangle property #101
Comments
(Note: I can also only guess at the meaning.) I agree that there is the image before OCR ("text in the original image") and the image after the first steps of the OCR process, where the boxes are overlaid. If the latter image is derived by a rotation, then we can measure this angle in one direction or the other (although, technically, the center of the rotation should also be known). This would also be my first idea for this However here is an example hOCR from Tesseract where this property occurs (with the value of 90). I haven't yet identified the corresponding area in the images... It seems also that Your hocr-proofreader looks great 🌟! We tried something similar with the ocr-gt-tools and there is @kba's hocrjs.
The layout analysis usually splits the page into 'blocks'. The "textual content" refers to a 'block' that contains text.
Tesseract can identify the rotation (0 / 90 / 180 / 270) of the page, and also the rotation of individual text blocks in the page. A text block can have a rotation that differs from that of other text blocks on the same page.
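For anyone who wants to check which blocks in their own hOCR output carry a rotation, here is a minimal sketch of reading `textangle` out of an element's `title` attribute. It assumes the usual hOCR layout of `name value value ...` properties separated by semicolons; the function names and the sample string are made up for illustration, and the naive split does not handle quoted values that contain a semicolon.

```python
def parse_hocr_title(title):
    """Split an hOCR title attribute into {property_name: [value, ...]}.

    Naive sketch: assumes properties are separated by ';' and that no
    quoted value contains a ';' itself.
    """
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if tokens:
            props[tokens[0]] = tokens[1:]
    return props


def textangle(title):
    """Return the textangle in degrees, or 0.0 when the property is absent."""
    values = parse_hocr_title(title).get("textangle")
    return float(values[0]) if values else 0.0


# Hypothetical title string, not taken from a real Tesseract run.
sample = "bbox 100 200 140 900; textangle 90"
print(textangle(sample))
```

Treating a missing `textangle` as zero is an assumption of this sketch; it keeps the per-block comparison simple when most blocks are not rotated.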
@zuphilip / @amitdo ah okay ... good to know, I also thought, @zuphilip Cool, I did some googling about hOCR GUI editors some time ago and didn't find them. I see you had the same problems and similar ideas ;-) I still have to try out "ocr-gt-tools" ... or is there an online demo? Maybe we can combine the projects or at least the ideas ... hocr-proofreader is just a <500 line JS prototype for now, so I am open to anything ;-) Greetings from Munich!
No, there is no online demo, but with the Dockerfile it should be easy to get it running locally, see https://github.com/UB-Mannheim/ocr-gt-tools/blob/master/INSTALL.md#docker-quickstart
I would love to collaborate on code and ideas.
It seems that there is currently no way to express page skew in the hOCR format. You can get this info with the Tesseract API.
@amitdo Oh yes, I forgot the licence. I added the MIT licence now. Okay, then I misunderstood the
Do you have this conversion as some script or XSLT file? We have some other transformations between different OCR file formats collected in ocr-fileformats...
Yes, it's currently an XSLT implementation with some PHP code ... but it's still a work in progress. I then started implementing the GUI first, to get a feeling for which information is really needed to display the results properly (and that's why I opened this ticket about page skew ;-)). I had a look at ocr-fileformat ... and yes, maybe it makes sense to integrate it there. Maybe I'm blind, but where are your stylesheets? The "xslt" folder just has an "alto2.0__alto3.0.xsl".
We would love to add it there, PRs are welcome.
Actually, most of the stylesheets/scripts for validation and transformation are maintained externally and are only pulled in during the installation process. See also https://github.com/UB-Mannheim/ocr-fileformat#license (further ideas and links can be found in the issues).
I'll have a deeper look into the project :-) Also into ocr-gt-tools ... I got the Docker container running, but when pasting a URL, there is a permission denied error (dragging a file into it gives a "Keine URL erkannt" ("no URL recognized") or similar error). I'll have a closer look at it when I have some more time.
It is still not clear to me whether "textual content" refers to the "text in the original image" or to "the text bboxes in the OCR result", where the rotation direction will be the opposite. If the text on the original page is rotated anti-clockwise, the page (and therefore the OCR result/bboxes) has been rotated clockwise to straighten it.
I guess the textangle refers to the rotation on the original page, right? To be more specific: if the lines in the original image run upwards, is this value positive?
By the way, in case it is of interest to someone: I recently started a "Web based JavaScript GUI library for proofreading/editing hOCR": https://github.com/not-implemented/hocr-proofreader ... the most helpful feature for me to find OCR errors is the switch between the original image and the hOCR text rendered at the same position. But it's still a prototype and there is a lot of work left to do ;-)
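To make the two possible readings concrete, here is a small sketch of the sign relationship. It assumes that a positive angle means counter-clockwise rotation, which is only one possible convention and is exactly what this question is about. The function names are hypothetical; coordinates are image pixels with y growing downward.

```python
import math


def baseline_angle_ccw(x0, y0, x1, y1):
    """Angle of the line (x0, y0) -> (x1, y1) in degrees.

    Assumed convention: counter-clockwise is positive. Image coordinates
    have y growing downward, so dy is negated before taking the angle.
    """
    return math.degrees(math.atan2(-(y1 - y0), x1 - x0))


def deskew_rotation(angle_ccw):
    """Rotation that straightens the page: the opposite of the text angle."""
    return -angle_ccw


# A text line that runs slightly upwards to the right in the original image.
angle = baseline_angle_ccw(0, 100, 100, 90)
print(angle)                   # positive under the assumed convention
print(deskew_rotation(angle))  # same magnitude, opposite sign
```

Under this assumption, a line that runs upwards in the original image gets a positive angle, and the rotation applied to the page during deskewing has the same magnitude with the opposite sign, which is the ambiguity described above.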