more geometry heuristics for validate/repair #5
Comments
BTW, the third could be achieved with ad-hoc binarization and some simple NumPy statistics like […]. And orientation checking could be done in a similar way to deskewing (i.e. entropy-based), but with some kind of confidence measure. |
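To make the entropy idea concrete, here is a minimal sketch, not the method used by any existing processor. It assumes an already binarized image given as a list of rows (1 = foreground) and compares the entropy of the row and column projection profiles: text lines produce a peaky profile (low entropy) across their reading direction. The names `profile_entropy` and `line_direction` are hypothetical, and plain lists are used only to keep the example dependency-free. Note that this only separates horizontal from vertical lines; it cannot tell upright from upside-down.

```python
import math

def profile_entropy(profile):
    """Shannon entropy (in bits) of a projection profile, read as a distribution."""
    total = sum(profile)
    if total == 0:
        return 0.0
    return -sum((v / total) * math.log2(v / total) for v in profile if v > 0)

def _normalised(profile):
    # Divide by the maximum possible entropy so that axes of
    # different length become comparable (result lies in [0, 1]).
    if len(profile) < 2:
        return 0.0
    return profile_entropy(profile) / math.log2(len(profile))

def line_direction(binary):
    """Guess the text line direction of a binarized image.

    Returns (direction, confidence), where confidence is the relative
    gap between the two normalised entropies (0 = undecidable).
    """
    row_profile = [sum(row) for row in binary]
    col_profile = [sum(col) for col in zip(*binary)]
    h_rows = _normalised(row_profile)
    h_cols = _normalised(col_profile)
    direction = "horizontal" if h_rows <= h_cols else "vertical"
    top = max(h_rows, h_cols)
    confidence = abs(h_rows - h_cols) / top if top > 0 else 0.0
    return direction, confidence

# Toy page with two horizontal "text lines"
page = [[0] * 6, [1] * 6, [0] * 6, [1] * 6, [0] * 6]
print(line_direction(page))
```

A validator could then flag a region only when the guessed direction contradicts the annotation and the confidence exceeds some threshold, which would have to be tuned on real data.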
The validation error classes in Aletheia (p. 118/119) are a good reference for additional checks. |
now renamed to https://github.com/OCR-D/ocrd_segment (there will be more processors) |
https://github.com/OCR-D/ocrd_segment is a better place for this. |
Moved the original issue from core here to have a better reminder of what is left to do. Out of the original list, we are still somewhere in the first item I think. (We do not yet check whether elements are properly contained within their parents' outline.) |
And the question then is, what does repair look like in that case? Shrink the element's polygon or extend the parent's polygon? |
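For the containment check itself, a rough dependency-free sketch could look like the following. It is an illustration under assumptions, not the code from any OCR-D module: polygons are lists of `(x, y)` tuples, both are simple (not self-intersecting), and the function name `is_contained` is hypothetical. A child counts as contained if all its vertices lie inside the parent and none of its edges properly crosses a parent edge; the second condition matters for concave parents. Vertices lying exactly on the parent's boundary are not handled specially.

```python
def _orient(a, b, c):
    """Twice the signed area of triangle abc (sign gives the turn direction)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _segments_cross(p1, p2, q1, q2):
    """True if the segments properly cross (merely touching does not count)."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def _edges(poly):
    """Consecutive vertex pairs, including the closing edge."""
    return list(zip(poly, poly[1:] + poly[:1]))

def _point_inside(pt, poly):
    """Even-odd ray casting towards +x."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in _edges(poly):
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def is_contained(child, parent):
    """Check that the child polygon lies within the parent polygon."""
    if not all(_point_inside(p, parent) for p in child):
        return False
    return not any(
        _segments_cross(a, b, c, d)
        for a, b in _edges(child)
        for c, d in _edges(parent)
    )

# L-shaped (concave) parent region
parent = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 10), (0, 10)]
print(is_contained([(1, 1), (3, 1), (3, 3), (1, 3)], parent))
print(is_contained([(8, 3), (2, 9), (2, 3)], parent))
```

Either repair strategy (shrinking the child or extending the parent) would additionally need polygon intersection or union, which is better left to a geometry library than hand-rolled.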
With #15 we now have covered the first item, except for repair. So far, we can only repair:
|
Partial Overlap of region
|
Yes, but for text regions we would need to bring in the concept of Allowable Merge (w.r.t. […]). A merge is allowed iff […]. And if a merge is not allowed between two overlapping text regions, then the intersecting foreground should somehow fall into the region with which it is most consistent (i.e. regarding its alignment and center of mass).
If […] BTW, do we want to go into the complexities of using PAGE-XML's […] |
I fear this implies drastic changes to
We have to distinguish here: Right now, we do not have any RO computation. It is more or less arbitrary! Maybe if the DFKI guys deliver, this will change. I think we should sanitize and fix the RO ad hoc. |
Agreed. (The way this is formalised in PAGE-XML, it would still be impossible to separate/suppress foreground automatically.)
I disagree. Even if we don't know the reading order, that's a separate problem. No RO equals default RO (i.e. XML element order), right? Whatever the RO in the document, the repair decision always depends on it.
Fixing RO is another problem/step. And especially when we have overlapping regions, this becomes circular if all we can do is heuristics. IMHO a good RO detection would have to be data-driven, and informed by the precise |
Actually, I think that indeed RO = default RO. But you're right, we should not base hacks on hacks. |
Well, or maybe just a little: Let's say we have a region segmentation like Tesseract that can output reading direction within regions (via orientation analysis), but is really bad on reading order between regions, creating XML elements more or less in random order. (The same could happen with a NN module without RO.) Strictly speaking, when repairing we would then be unable to merge or split most of the time (because 2 neighbouring/overlapping regions are XML successors only by chance). But we could still repair the unambiguous cases if we first added a new RO based on a top-down-left-to-right assumption (treating overlapping regions as neighbours), ... I think. At least as an extra option for the desperate. |
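As a rough illustration of that fallback, the sketch below orders regions by the top edge of their bounding box and then by the left edge. It is deliberately naive: the function name `fallback_reading_order`, the region IDs and the dict layout are all hypothetical, and multi-column layouts would defeat it, which is why it could only ever serve as an opt-in last resort.

```python
def fallback_reading_order(regions):
    """Order region IDs top-down, then left-to-right.

    regions: dict mapping a region ID to its polygon,
             given as a list of (x, y) points.
    """
    def top_left(item):
        _, polygon = item
        xs = [x for x, _ in polygon]
        ys = [y for _, y in polygon]
        # Sort key: top edge first, left edge as tie-breaker
        return (min(ys), min(xs))

    return [region_id for region_id, _ in sorted(regions.items(), key=top_left)]

# Regions listed in "random" XML order
regions = {
    "r_a": [(50, 200), (150, 200), (150, 260), (50, 260)],
    "r_b": [(300, 10), (400, 10), (400, 60), (300, 60)],
    "r_c": [(20, 10), (120, 10), (120, 60), (20, 60)],
}
print(fallback_reading_order(regions))
```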
We should have heuristics to check for
@orientation
Originally posted by @kba in OCR-D/assets#28 (comment)