make designer dirs ascii #375

Open
wants to merge 5 commits into
base: main

Conversation

m4rc1e
Collaborator

m4rc1e commented Jun 14, 2021

Fixes #371

@m4rc1e
Collaborator Author

m4rc1e commented Jun 14, 2021

This will still fail for certain combos such as

>>> strip_accents("Ǽplas").isascii()
False
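
For context, a minimal sketch of why this case fails. It assumes `strip_accents` works by Unicode decomposition followed by dropping combining marks; the PR's actual implementation isn't shown in this thread, so the function below is a hypothetical stand-in:

```python
import unicodedata


def strip_accents(text):
    """Hypothetical stand-in: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "é" decomposes into "e" + a combining acute, so the accent can be dropped.
print(strip_accents("José").isascii())

# "Ǽ" (U+01FC) decomposes into "Æ" (U+00C6) + a combining acute.
# "Æ" has no further decomposition, so it survives the stripping step
# and the result is still not ASCII.
print(strip_accents("Ǽplas"))
print(strip_accents("Ǽplas").isascii())
```

Under that assumption, any letter without a canonical or compatibility decomposition (Æ, Ø, ß and similar) would need separate handling.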

@simoncozens
Contributor

Is this designer directory exposed to the public anywhere? How is it used? I feel like doing something fragile like this will end up being a time bomb which will go off at some later date when a font gets added that's designed by 杉山誠 or युनिवर्सल थर्स्ट. I'd feel happier just using an MD5sum or some other kind of hash.
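
A rough sketch of what the hash idea could look like. The helper name `designer_dir_name` is hypothetical and not part of gftools; it only illustrates that a hex digest is ASCII regardless of the script the name is written in:

```python
import hashlib


def designer_dir_name(name):
    """Hypothetical helper: derive an ASCII directory name from any designer name."""
    # Hash the UTF-8 bytes so every script is handled the same way.
    return hashlib.md5(name.encode("utf-8")).hexdigest()


for designer in ["杉山誠", "युनिवर्सल थर्स्ट", "Ǽplas"]:
    dirname = designer_dir_name(designer)
    # hexdigest() only ever contains the characters 0-9 and a-f.
    print(designer, dirname, dirname.isascii())
```

The trade-off is that the resulting directory names are no longer human-readable.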

@m4rc1e
Collaborator Author

m4rc1e commented Jun 14, 2021

> Is this designer directory exposed to the public anywhere?

Designer dirs are here https://github.com/google/fonts/tree/main/catalog/designers

> How is it used?

Each directory contains the metadata for each font author. If a user runs gftools add-designer and the dir doesn't already exist, one will be created.

Thinking about it, I'm not sure why we need to ensure that these dirs are ascii only. AFAIK win, mac and linux all support non-ascii names. I guess we want it because the current dirs are all ascii.

@m4rc1e
Collaborator Author

m4rc1e commented Jun 15, 2021

@simoncozens After emailing the eng team, it seems like we're stuck with ascii folder names and we can only have Latin names. Due to these requirements, I'm going to do the following:

  • Raise an exception if a user enters a name which isn't Latinized
  • Drop accents and also convert æ to ae, etc. for dirs
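
The two steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the names `is_latinized`, `ascii_designer_name` and `SPECIAL_CASES` are hypothetical, the "Latinized" test is approximated by checking Unicode character names, and the special-case table is deliberately tiny:

```python
import unicodedata

# Hypothetical, deliberately small table for letters that have no Unicode
# decomposition; a real table would need many more entries.
SPECIAL_CASES = {
    "Æ": "AE", "æ": "ae",
    "Œ": "OE", "œ": "oe",
    "Ø": "O", "ø": "o",
    "ß": "ss",
}


def is_latinized(name):
    """True if every alphabetic character is a Latin-script letter."""
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in name
        if ch.isalpha()
    )


def ascii_designer_name(name):
    """Hypothetical helper: validate the name, then build an ASCII form of it."""
    if not is_latinized(name):
        raise ValueError(f"Designer name is not Latinized: {name!r}")
    # Decompose first so that "Ǽ" becomes "Æ" + a combining acute.
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Expand the remaining non-decomposable letters from the table.
    return "".join(SPECIAL_CASES.get(ch, ch) for ch in without_marks)


print(ascii_designer_name("Ǽplas"))
print(ascii_designer_name("José Ørsted"))
```

With this sketch, a name such as 杉山誠 raises `ValueError` instead of silently producing a non-ASCII directory.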

@felipesanches
Member

we were asked not to use unidecode due to its license. Please see fonttools/fontbakery#3316 and fonttools/fontbakery#3306

@felipesanches
Member

On that PR I included a very crude implementation of the bare minimum needed for ascii-fying the designer names:
https://github.com/googlefonts/fontbakery/pull/3318/files

It would be good if we used the same code snippet on both tools

@m4rc1e
Collaborator Author

m4rc1e commented Jun 15, 2021

Thanks Felipe! I'll drop the dependency.

Since we need this in both projects, perhaps we should roll our own with an Apache license?

cc @davelab6

@simoncozens
Contributor

The original Unidecode tables by Sean Burke are available under the Perl Artistic License. But perhaps a suitable workaround would be to use something like Felipe's bare minimum, and then raise an exception if the designer's name still isn't ASCII after bare-minimum processing, because it indicates that a name has got into the directory that shouldn't be in there.

(I still think you should just use a hash. But anyway.)

@moyogo
Contributor

moyogo commented Jun 16, 2021

There's https://pypi.org/project/text-unidecode/ under Artistic License (Perl) 1.0 as another option.

@m4rc1e
Collaborator Author

m4rc1e commented Jun 16, 2021

I've made a package which is basically just a large hashmap which covers the basic multilingual plane using the Sean Burke data. I'll bring it up on Friday.
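
As a rough illustration of the hashmap approach (not the actual package, whose name and contents aren't given in this thread), a lookup table keyed by code point can be applied with `str.translate`. The entries below are illustrative only and are not copied from the Sean Burke data:

```python
# Hypothetical excerpt: a full table would have an entry for most code points
# in the Basic Multilingual Plane (U+0000 to U+FFFF).
TABLE = {
    0x00C6: "AE",  # Æ
    0x00E6: "ae",  # æ
    0x00E9: "e",   # é
    0x00F8: "o",   # ø
    0x01FC: "AE",  # Ǽ
}


def to_ascii(text):
    """Replace each character via the table; unmapped characters pass through."""
    return text.translate(TABLE)


print(to_ascii("Ǽplas"))
print(to_ascii("Sébastien"))
```

Because the replacement is a plain dictionary lookup, no normalization step is needed and one character can expand to several.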

Development

Successfully merging this pull request may close these issues.

add-designer: make sure created directories are ascii only
4 participants