Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

markFeatureWriter mark class grouping is problematic #762

Open
khaledhosny opened this issue Jul 4, 2023 · 6 comments
Open

markFeatureWriter mark class grouping is problematic #762

khaledhosny opened this issue Jul 4, 2023 · 6 comments

Comments

@khaledhosny
Copy link
Collaborator

I was debugging a rather odd InDesign bug, where all my mark positioning gets lost, and I tracked it down to:

def _groupMarkClasses(self, markGlyphToMarkClasses):
# To compute the number of lookups that we need to build, we want
# the minimum number of lookups such that, whenever a mark glyph
# belongs to several mark classes, these classes are not in the same
# lookup. A trivial solution is to make 1 lookup per mark class
# but that's a bit wasteful, we might be able to do better by grouping
# mark classes that do not conflict.
# This is a graph coloring problem: the graph nodes are mark classes,
# edges are between classes that would conflict and the colors are
# the lookups in which they can go.
adjacency = {
# We'll get the same markClass several times in the dict
# comprehension below but it's ok, only one will be kept.
markClass: set()
for markClasses in markGlyphToMarkClasses.values()
for markClass in markClasses
}
for _markGlyph, markClasses in markGlyphToMarkClasses.items():
for markClass, other in itertools.combinations(markClasses, 2):
adjacency[markClass].add(other)
adjacency[other].add(markClass)
colorGroups = colorGraph(adjacency)
# Sort the groups, because the group that contains MC_top or MC_bottom
# needs to go to the end (as specified in self.anchorSortKey) so that
# they are applied last and "win" in case of conflict.
# We also sort alphabetically for reproducibility, both within each
# group and between groups.
return sorted(
[sorted(group) for group in colorGroups],
key=lambda group: (
# The first part sorts _top and _bottom at the end.
# There's a minus sign in front of the min because the original
# self.anchorSortKey was designed to put the _top and _bottom
# at the start (and now we want them at the end).
-min(
# Remove the MC prefix because that's how the mark classes
# are looking at this stage (the original
# self.anchorSortKey was applied at a different stage of
# the algorithm, on anchors instead of mark classes)
self.anchorSortKey.get(self._removeClassPrefix(markClass), 0)
for markClass in group
),
# Second part of the tuple sorts the groups lexicographically
group,
),
)

Usually I’d brush this as an InDesign bug and move on, but I noticed something with HarfBuzz. Try with this version of the font with mark class grouping:

$ hb-view Amiri-Regular.ttf "لَكم محَبر"

before

and now with this version without grouping:
after

The later output is the correct/expected one.

The file without grouping is smaller as well despite having more number of lookups (I noticed this with other fonts too), so I’m not sure what is the benefit of this mark class grouping over doing the simple thing?

@belluzj
Copy link
Collaborator

belluzj commented Jul 4, 2023

From what I remember, the point of grouping was to have fewer lookups because each lookup = one more loop over the input string to apply the lookup, if I understand correctly (so it was more a performance consideration than file size). That being said, I don't know how much the performance penalty would be, and whether some people would be ready to trade it for smaller file size, especially if the gains are not negligible.

I remember from discussing this issue some time ago that you used to use "ambiguous" anchor attachments, as per your comment here: #416 (comment) ; could this be such a case? Do you get the info message about ambiguous stuff?

@khaledhosny
Copy link
Collaborator Author

Yes, I get messages about ambiguous anchors, but I get that with pretty much all of my fonts that I build with fontmake.

@khaledhosny
Copy link
Collaborator Author

I’m using ufo2ft directly here, though, and I have some hacky code to force a particular lookup order. This probably gets broken when some anchors are merged together in the same lookup (This font has a long history, and for most of its life it was developed using FontForge where I had complete control on how the lookups are built and ordered, which I switched to UFO/ufo2ft I had to hack my way to get a matching output).

@belluzj
Copy link
Collaborator

belluzj commented Jul 4, 2023

Maybe you could use your own copy of the mark feature writer in which you disable grouping?

@anthrotype
Copy link
Member

maybe grouping should be smarter and avoid grouping ambiguous attachements

@khaledhosny
Copy link
Collaborator Author

I’m essentially using my own mark feature writer, it is just this is second time I get issues with this grouping logic (the other one was in a certain secret project, and I ended up forking the mark feature writer there as well, but I don’t remember the exact details of the issue).

I’m also not sure the number of lookups can be that a performance issue, it is not like you are going to end up with thousands of them. This font has too many anchor classes than most typical projects, and I got only 8 more lookups (that is out of 62 GPOS lookups).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants