Clarification for 4.14. The textangle property #101

Open
not-implemented opened this issue Jan 22, 2017 · 13 comments

Comments

@not-implemented

The angle in degrees by which textual content has been rotated relative to the rest of the page (if not present, the angle is assumed to be zero); rotations are counter-clockwise, so an angle of 90 degrees is vertical text running from bottom to top in Latin script; note that this is different from reading order, which should be indicated using standard HTML properties

It is still not clear to me whether "textual content" refers to the text in the original image or to the text bboxes in the OCR result, where the rotation direction is the opposite. If the text on the original page is rotated anti-clockwise, the page (and therefore the OCR result and its bboxes) has been rotated clockwise to straighten it.

I guess textangle refers to the rotation on the original page, right? To be more specific: if the lines in the original image run upwards, is this value positive?
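To make the two readings concrete, here is a minimal sketch of the sign convention. It assumes the interpretation asked about above (textangle describes the text as it appears in the original image), which this thread does not confirm. The function names `baseline_direction` and `deskew_rotation` are made up for illustration and are not part of hOCR or any library.

```python
import math

def baseline_direction(textangle_deg):
    """Unit vector along the reading direction of a text line.

    Uses image coordinates (x to the right, y downwards) and assumes
    textangle is a counter-clockwise rotation as seen by the viewer.
    """
    theta = math.radians(textangle_deg)
    # The y axis points down in image coordinates, so a rotation that
    # looks counter-clockwise on screen flips the sign of the sine term.
    return (math.cos(theta), -math.sin(theta))

def deskew_rotation(textangle_deg):
    """Counter-clockwise angle that would bring the text back to horizontal."""
    # Undoing a counter-clockwise rotation means rotating clockwise by
    # the same amount, hence the opposite sign.
    return -textangle_deg

dx, dy = baseline_direction(90)
print(round(dx, 6), round(dy, 6))   # y decreases, i.e. text runs bottom to top
print(deskew_rotation(2.5))
```

Under this reading, lines that run upwards to the right have a positive textangle, and the rotation applied to straighten the page has the opposite sign.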

By the way, in case anyone is interested: I recently started a "Web based JavaScript GUI library for proofreading/editing hOCR": https://github.com/not-implemented/hocr-proofreader ... For me, the most helpful feature for finding OCR errors is switching between the original image and the hOCR text rendered at the same position. But it's still a prototype with a lot of work left to do ;-)

@zuphilip
Collaborator

zuphilip commented Jan 22, 2017

(Note: I can also only guess at the meaning.) I agree that there is the image before the OCR ("text in the original image") and the image after the first steps of the OCR process, on which the boxes are overlaid. If the latter image is derived by a rotation, then we can measure this angle in one direction or the other (technically, however, the center of the rotation should also be known). This would also be my first idea for this textangle property.
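The remark about the center of rotation can be illustrated with a small sketch: rotating the same bbox corner by the same angle about two different centers gives two different positions. The helper `rotate_about` and the sample coordinates are hypothetical and chosen only for illustration.

```python
import math

def rotate_about(point, center, angle_deg):
    """Rotate point about center, counter-clockwise as seen on screen.

    Coordinates are image coordinates (x to the right, y downwards).
    """
    theta = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # With y pointing down, a visually counter-clockwise rotation is
    # x' = dx*cos + dy*sin, y' = -dx*sin + dy*cos.
    return (center[0] + dx * cos_t + dy * sin_t,
            center[1] - dx * sin_t + dy * cos_t)

corner = (100, 0)
about_origin = rotate_about(corner, (0, 0), 90)
about_middle = rotate_about(corner, (50, 50), 90)
print(about_origin)
print(about_middle)
```

For a fixed angle, the two results differ only by a constant translation, so an angle alone fixes the orientation of the boxes but not their position.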

However, here is an example hOCR from Tesseract where this property occurs (with the value of 90). I haven't yet identified the corresponding area in the images... It also seems that textangle can be a property of elements other than the page, which seems strange to me.

Your hocr-proofreader looks great 🌟! We tried something similar with the ocr-gt-tools and there is @kba's hocrjs.

@amitdo
Collaborator

amitdo commented Jan 22, 2017

The layout analysis usually splits the page into 'blocks'.

The "textual content" refers to a 'block' that contains text.

@amitdo
Collaborator

amitdo commented Jan 22, 2017

Tesseract can identify the rotation (0 / 90 / 180 / 270) of the page, and also the rotation of individual text blocks in the page. A text block can have a rotation that is different from that of other text blocks in the page.
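Since the rotation can differ from one element to the next, a consumer has to read textangle per element from the hOCR title attribute. The following is a minimal sketch of that lookup. The helper names and the sample title strings are hypothetical, and the simple split does not handle quoted values that contain spaces or semicolons.

```python
def parse_hocr_title(title):
    """Split an hOCR title attribute into {property_name: [values, ...]}."""
    props = {}
    for clause in title.split(";"):
        parts = clause.strip().split()
        if parts:
            props[parts[0]] = parts[1:]
    return props

def get_textangle(title):
    """Return the textangle in degrees, or 0.0 when the property is absent."""
    values = parse_hocr_title(title).get("textangle")
    # Per the spec text quoted above, a missing textangle means zero.
    return float(values[0]) if values else 0.0

# Hypothetical title attributes of two elements on the same page.
rotated = "bbox 36 92 618 361; textangle 90"
upright = "bbox 36 400 618 450"
print(get_textangle(rotated))
print(get_textangle(upright))
```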

@not-implemented
Author

@zuphilip / @amitdo ah okay ... good to know. I also thought textangle only makes sense at page level, to straighten skewed scanned pages (and that's the only use case I need).

@zuphilip Cool, I did some googling about hOCR GUI editors some time ago and didn't find them. I see you had the same problems and similar ideas ;-) I still have to try out "ocr-gt-tools" ... or is there an online demo? Maybe we can combine the projects or at least the ideas ... hocr-proofreader is just a <500 line JS prototype for now, so I am open to anything ;-) Greetings from Munich!

@zuphilip
Collaborator

I still have to try out "ocr-gt-tools" ... or is there an online demo?

No, there is no online demo, but with the Dockerfile it should be easy to get it running locally, see https://github.com/UB-Mannheim/ocr-gt-tools/blob/master/INSTALL.md#docker-quickstart

Maybe we can combine the projects or at least the ideas ...

I would love to collaborate on code and ideas.

@amitdo
Collaborator

amitdo commented Jan 22, 2017

@amitdo
Collaborator

amitdo commented Jan 23, 2017

It seems that currently there is no way to express page skew with the hOCR format.

You can get this info with the Tesseract API.

@not-implemented
Author

@amitdo Oh yes, I forgot the licence. I added the MIT licence now.

Okay, then I misunderstood the textangle property. But then we need a pageskew property or something like that in the spec ;-) (I currently convert OmniPage XML files to hOCR. OmniPage XML also has a "skew" attribute, and of course this information is needed to display the results properly.)

@amitdo
Collaborator

amitdo commented Jan 23, 2017

@zuphilip
Collaborator

I currently convert OmniPage XML files to hOCR...

Do you have this conversion as some script or XSLT file? We have some other transformations between different OCR file formats collected in ocr-fileformat...

@not-implemented
Author

Do you have this conversion as some script or XSLT file?

Yes, it's currently an XSLT implementation with some PHP code ... but it's still a work in progress. I started implementing the GUI first to get a feeling for which information is really needed to display the results properly (and that's why I opened this ticket about page skew ;-)).

I had a look at ocr-fileformat ... and yes, maybe it makes sense to integrate it there. Maybe I'm blindfolded, but where are your stylesheets? The "xslt" folder just has an "alto2.0__alto3.0.xsl".

@zuphilip
Collaborator

maybe it makes sense to integrate it there

We would love to add it there, PRs are welcomed.

Maybe I'm blindfolded, but where are your stylesheets?

Actually, most of the stylesheets/scripts for validation and transformation are maintained outside the repository and are only integrated during the installation process. See also https://github.com/UB-Mannheim/ocr-fileformat#license (further ideas and links can be found in the issues).

@not-implemented
Author

We would love to add it there, PRs are welcomed.

I'll have a deeper look into the project :-)

Also into ocr-gt-tools ... I got the Docker container running, but when pasting a URL, I get a permission denied error (dragging a file into it gives a "Keine URL erkannt" ("No URL recognized") or similar error). I'll have a closer look at it when I have some more time.
