A user reported that he was getting poor results. It turned out he had used --text-equiv-level line on input that had no line-level text. Ways to improve dinglehopper's behavior here:
- Warn if no text is extracted (ideally in a smart way: "no text" can be valid on empty pages).
- Warn if there are grave inconsistencies between levels (harder, since line and region text can differ in small ways).
- Warn if there are grave differences between GT and OCR (e.g. no GT text but lots of OCR text; this needs more thought).
- Check whether the OCR-D libraries could be used here (I'm somewhat skeptical about changing anything, because the text extraction code is working and OCR-D changes a lot by comparison).
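A rough sketch of how the first two warnings could work, assuming PAGE XML input. This is not dinglehopper's actual extraction code: `texts_at_level`, `check_level_has_text`, `LEVEL_TO_ELEMENT` and `SAMPLE_PAGE` are hypothetical names made up for illustration, and accepting `region` as a level value is an assumption. It warns only when the requested level is empty while another level has text, so a genuinely empty page stays quiet.

```python
import warnings
import xml.etree.ElementTree as ET

# Assumed mapping from a --text-equiv-level value to the PAGE element name.
LEVEL_TO_ELEMENT = {"region": "TextRegion", "line": "TextLine"}

# Tiny example page: text at region level, none at line level.
SAMPLE_PAGE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page>
    <TextRegion id="r1">
      <TextLine id="l1"/>
      <TextEquiv><Unicode>Hello world</Unicode></TextEquiv>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag):
    """Strip the XML namespace so any PAGE schema version matches."""
    return tag.rsplit("}", 1)[-1]


def texts_at_level(root, level):
    """Return non-empty TextEquiv/Unicode strings directly under elements of `level`."""
    wanted = LEVEL_TO_ELEMENT[level]
    texts = []
    for elem in root.iter():
        if local_name(elem.tag) != wanted:
            continue
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            for uni in child:
                if local_name(uni.tag) == "Unicode" and (uni.text or "").strip():
                    texts.append(uni.text)
    return texts


def check_level_has_text(page_xml, level):
    """Return the texts found at `level`; warn if only other levels carry text."""
    root = ET.fromstring(page_xml)
    found = texts_at_level(root, level)
    if not found:
        others = [
            other for other in LEVEL_TO_ELEMENT
            if other != level and texts_at_level(root, other)
        ]
        if others:
            warnings.warn(
                f"No text found at level {level!r}, but text exists at "
                f"level(s) {others}. Is the text-equiv level set correctly?"
            )
        # No text at any level: the page may simply be empty, so no warning.
    return found


if __name__ == "__main__":
    check_level_has_text(SAMPLE_PAGE, "line")  # emits the warning
```

Comparing levels only for presence of text, not for exact content, sidesteps the problem that line and region text can legitimately differ in small ways.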