Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GDEF definition to OpenType format #369

Merged
merged 4 commits into from
Jun 9, 2022
Merged

Add GDEF definition to OpenType format #369

merged 4 commits into from
Jun 9, 2022

Conversation

wezm
Copy link
Contributor

@wezm wezm commented Jun 8, 2022

Turns out there is a lot in GDEF. The weirdest/most interesting bit is probably u16_div_ceil where I need the ceiling of a division... so I implemented that with just integer ops. Unfortunately I've been unable to find a single font that exercises that path.

formats/opentype.fathom Outdated Show resolved Hide resolved
data <- match class_format {
1 => class_def_format_1,
2 => class_def_format_2,
_ => fail,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do some of these uses of fail need to be marked in a more forwards-compatible way, or are they closed for future extension?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of an open question in my mind, at least partially informed by the use case. I'm picturing two use-cases: exploring a font file, and parsing and using that data in a font file. I suppose I've been thinking more along the lines of the latter, which might actually be wrong for where we're at.

Where I've used fail in this PR there is no data to fall back on. For example, If someone was relying on getting class def information we'd be unable to provide it if it used a format we were unaware of. Presumably the alternative is return something in place of failing, do you have thoughts on what that could be?

In other places like the handling of the GDEF minor version we fall back on the newest version we know about as each of the revisions is additive so we can read up to the version we know.

Copy link
Member

@brendanzab brendanzab Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way I've been handling it is with the unknown_table format which is just a empty record format that always succeeds.

/// # Unknown table format
///
/// This is a placeholder for a table that has an unknown identifier (due to the
/// file conforming to a newer version of the specification), or for a table has
/// not yet been implemented.
let unknown_table : Format = {};

It used to be used as a fallback case to a match arm, before we switched to our current approach for font tables. This meant that the match arm would succeed if a new version was introduced that we didn't recognise, but no data would be read. Unfortunately this approach needs to be used with care… like you need to use it at the end of a substream, otherwise the format could continue to be read afterwards. I think this would only really occur in a poorly designed format I guess, though it might be useful to have some sort of way to skip to the end of the stream to protect against this kind of mistake.

Copy link
Member

@brendanzab brendanzab Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, we currently use unknown_table for the cmap subtables:

/// # Character Mapping subtable
let cmap_subtable = fun (platform : Repr platform_id) => {
/// The start of the character mapping sub-table
table_start <- stream_pos,
/// Format number of the subtable
format <- u16be,
data <- match format {
0 => cmap_subtable_format0 platform,
2 => cmap_subtable_format2 platform,
4 => cmap_subtable_format4 platform,
6 => cmap_subtable_format6 platform,
8 => cmap_subtable_format8 platform,
10 => cmap_subtable_format10 platform,
12 => cmap_subtable_format12 platform,
13 => cmap_subtable_format13 platform,
14 => cmap_subtable_format14 platform table_start,
_ => unknown_table,
},
};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve had a bit of a think about it, and don't think this is a huge issue right now, so feel free to ignore me. And yeah thanks for the consideration of the different use cases… seems that the desired behavior might be a bit application specific. Food for thought at the very least!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh I pondered it too and decided to change it to unknown_table:

  • Nothing else was using fail.
  • The doc comments on unknown_table are clear that it's for cases like this.
  • We can change it to something like { unknown <- {} } as you proposed in the future to make it show up in the output if we want.

formats/opentype.fathom Outdated Show resolved Hide resolved
Comment on lines +317 to +318
/// Curiously the table has two interpretations. The second interprtation appears to be have been
/// tacked on for variable fonts. The gist being that if the delta format is 0x8000 then the table
/// is a VariationIndex table, which names the fields differently and does not contain a delta
/// value array. E.g.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhhh, I get it now! It's not particularly elegant, but this does seems like a place where an overlap format could be used:

overlap {
    // Initial pass to figure out the table format
    init <- {
        _skipped <- array8 4 u8,
        table_format <- u16be,
    },
    // Device and VariationIndex Tables
    table <- match init.table_format {
        0x0001 => device_table,
        0x0002 => device_table,
        0x0003 => device_table,
        0x8000 => variation_index_table,
        0x7FFC => unknown_table, // not sure if this was meant to be “anything up to`0x7FFC`”?
        _ => fail,
    },
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    0x7FFC => unknown_table, // not sure if this was meant to be “anything up to`0x7FFC`”?

From the comment in the spec I think they mean that each of the bits in 0x7FFC (as opposed to that particular value) are reserved for future use (and should be set to zero):

For future use — set to 0

formats/opentype.fathom Outdated Show resolved Hide resolved
@wezm wezm force-pushed the gdef branch 3 times, most recently from b4acdf7 to f877562 Compare June 9, 2022 06:28
Copy link
Member

@brendanzab brendanzab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, looks good!

@wezm wezm merged commit 810e908 into main Jun 9, 2022
@wezm wezm deleted the gdef branch June 9, 2022 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants