better transparency handling for structure recognizer #2115
Some recognizable structures have "transparent" cells. However, the Aho-Corasick algorithm doesn't support "wildcards", which is essentially what transparent cells are for the purpose of matching. So, for a structure template that contains transparent cells to be recognized, the world must have "empty" cells in the same positions.
Despite this, we can accommodate certain instances of structure recognition when the transparent cells are "overlapped" (or "incurred upon") by some other entity. We do this by "masking out" entities from the world that don't participate in the set of structures we're trying to match against.
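The masking idea can be sketched on a single row of cells. This is a minimal illustration rather than the recognizer's actual code: `mask_row`, `find_exact`, and the entity names are hypothetical, `None` stands in for an empty or transparent cell, and a naive exact-match scan stands in for the Aho-Corasick pass.

```python
from typing import Optional, Sequence

Cell = Optional[str]  # None models an empty (or transparent) cell


def mask_row(row: Sequence[Cell], participants: set[str]) -> list[Cell]:
    """Replace every entity that is not a participant with an empty cell."""
    return [c if c in participants else None for c in row]


def find_exact(row: Sequence[Cell], pattern: Sequence[Cell]) -> list[int]:
    """Naive exact matcher (no wildcards), standing in for Aho-Corasick."""
    n, m = len(row), len(pattern)
    return [i for i in range(n - m + 1) if list(row[i:i + m]) == list(pattern)]


# Hypothetical template: two walls with a transparent cell between them.
template: list[Cell] = ["wall", None, "wall"]

# A flower sits on the transparent cell; trees surround the structure.
world: list[Cell] = ["tree", "wall", "flower", "wall", "tree"]

# Without masking, the flower blocks the exact match.
unmasked_hits = find_exact(world, template)

# Masking out everything except "wall" lets the template match at index 1.
masked_hits = find_exact(mask_row(world, {"wall"}), template)
```

Here the only participating entity is `"wall"`, so the flower and the trees are erased before matching and the transparent cell lines up with an empty one.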
This often works well, except when the set of structures to be recognized contains a structure S1 with transparent cells and a structure S2 with some additional type of entity E not participating in S1, and E occupies a transparent cell of S1. E can't be masked out, since we're trying to simultaneously recognize structures of type S2 in the same Aho-Corasick pass. So a legitimate instance of S1 will fail to be recognized.
Workaround
We can make multiple Aho-Corasick passes: one pass for each distinct set of "masked entities". Notably:
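A rough sketch of the multi-pass approach, under the assumption that each structure's mask is simply the set of entities appearing in its template. The helper names, the grouping rule, and the one-row world are all hypothetical, and a naive exact-match scan again stands in for Aho-Corasick.

```python
from collections import defaultdict
from typing import Optional, Sequence

Cell = Optional[str]  # None models an empty (or transparent) cell


def participants(pattern: Sequence[Cell]) -> frozenset:
    """Entities that take part in a template (assumed to define its mask)."""
    return frozenset(c for c in pattern if c is not None)


def mask_row(row: Sequence[Cell], keep: frozenset) -> list[Cell]:
    """Erase every entity that is not in the keep set."""
    return [c if c in keep else None for c in row]


def find_exact(row: Sequence[Cell], pattern: Sequence[Cell]) -> list[int]:
    """Naive exact matcher (no wildcards), standing in for Aho-Corasick."""
    n, m = len(row), len(pattern)
    return [i for i in range(n - m + 1) if list(row[i:i + m]) == list(pattern)]


def recognize(world: Sequence[Cell], templates: dict) -> dict:
    """Run one matching pass per distinct set of masked entities."""
    groups = defaultdict(list)
    for name, pattern in templates.items():
        groups[participants(pattern)].append(name)

    hits = {}
    for keep, names in groups.items():
        masked = mask_row(world, keep)  # one masked view per pass
        for name in names:
            hits[name] = find_exact(masked, templates[name])
    return hits


templates = {
    "S1": ["wall", None, "wall"],  # has a transparent cell
    "S2": ["E", "E"],              # uses an entity E that S1 does not
}
world = ["wall", "E", "wall", "E", "E"]

# Single shared pass: E must stay unmasked for S2, so S1 is missed.
shared_keep = participants(templates["S1"]) | participants(templates["S2"])
shared_view = mask_row(world, shared_keep)
single_pass = {name: find_exact(shared_view, pat) for name, pat in templates.items()}

# One pass per distinct mask: both structures are found.
multi_pass = recognize(world, templates)
```

Structures that share the same participant set end up in the same pass, so the number of passes grows with the number of distinct masks rather than with the number of structures.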
Caveats
Note that this fundamental limitation of Aho-Corasick still exists for "incurring entities" of the same type as the entity participants in the structure to be recognized.
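To make the caveat concrete, here is a hypothetical one-row illustration (not the example from this PR): when the entity sitting on the transparent cell is itself a participant, masking leaves it in place and the exact match still fails. The helpers and entity names are assumptions, with a naive scan standing in for Aho-Corasick.

```python
from typing import Optional, Sequence

Cell = Optional[str]  # None models an empty (or transparent) cell


def mask_row(row: Sequence[Cell], keep: set[str]) -> list[Cell]:
    """Erase every entity that is not in the keep set."""
    return [c if c in keep else None for c in row]


def find_exact(row: Sequence[Cell], pattern: Sequence[Cell]) -> list[int]:
    """Naive exact matcher (no wildcards), standing in for Aho-Corasick."""
    n, m = len(row), len(pattern)
    return [i for i in range(n - m + 1) if list(row[i:i + m]) == list(pattern)]


template: list[Cell] = ["wall", None, "wall"]

# A third wall occupies the transparent cell of the would-be structure.
world: list[Cell] = ["wall", "wall", "wall"]

masked = mask_row(world, {"wall"})   # "wall" participates, so nothing is erased
hits = find_exact(masked, template)  # the middle wall still blocks the match
```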
E.g., if we have a structure defined as:
then, depending on the order of entity placement, the following occurrence in the world will not be recognized:
Testing
Also in this PR