Language name starts with an upper case #17121

Open. 2 commits to be merged into main.

@Arthur-Milchior Arthur-Milchior commented Sep 23, 2024

Language name starts with an upper case

The menu that lets users select their language is not yet perfect.
It's hard to know for certain what the correct way to display each
language name is. The function provided by Android was buggy, so, a
year and a half ago, the language names were hard-coded in #13275.
However, nobody speaks all 93 languages in which we have translations
available.

In the long term, I hope that 92 translators will provide the 92
language strings to use. #17120 should start the process of eventually
fixing this error.

In the short term, there is one change we can make that will probably
be, on average, an improvement: using upper case for the first letter
of each name.

If I understand Brayan's comment correctly, this change would be
correct for Portuguese. I can confirm it's correct for French. I can't
promise this won't make things worse for some languages. But if we got
some right previously, it was by accident, and I still hope this is,
on average, an improvement.

The upper cases were obtained by using the "set first letter to upper
case" feature of Emacs on each language name.
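
For reference, the same transformation can be sketched in Kotlin instead of being done by hand in an editor. This is a minimal sketch, not the code in this PR: `capitalizeFirst` is a hypothetical helper, and the sample names are illustrative rather than copied from `LanguageUtil.kt`. It relies on Unicode case mapping, so a first letter outside the Latin alphabet is mapped to the upper case form of its own script.

```kotlin
import java.util.Locale

/**
 * Title-cases only the first character of [name] and leaves the rest untouched.
 * Hypothetical helper for illustration; it is not part of AnkiDroid.
 */
fun capitalizeFirst(name: String, locale: Locale = Locale.ROOT): String =
    name.replaceFirstChar { if (it.isLowerCase()) it.titlecase(locale) else it.toString() }

fun main() {
    // Illustrative sample only, not the full list of hard-coded names.
    val names = listOf("français", "português (Brasil)", "Tiếng Việt")
    names.forEach { println("$it -> ${capitalizeFirst(it)}") }
}
```

Names that already start with an upper case letter, or with a character that has no case, are returned unchanged.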


@david-allison david-allison left a comment


Two are incorrect (to my understanding). I used https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes

Do we have this data anywhere in CLDR?


⚠️ EDIT: https://en.wikipedia.org/wiki/IETF_language_tag disagrees with the above

"Venda" to "ve", // Venda
"Tiếng Việt" to "vi", // Vietnamese
"Wolof" to "wo", // Wolof
"isiXhosa" to "xh", // Xhosa
"IsiXhosa" to "xh", // Xhosa

⚠️ This is incorrect

Member Author


The IANA registry at https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry, which is referenced by your first Wikipedia link, only uses "Xhosa", while the second link does indeed use isiXhosa.

"հայերեն (Հայաստան)" to "hy-AM", // Armenian (Armenia)
"Hrvatski" to "hr", // Croatian
"Magyar" to "hu", // Hungarian
"Hայերեն (Հայաստան)" to "hy-AM", // Armenian (Armenia)

Maybe
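
If the concern with the `hy-AM` line is that the new first letter may be a Latin `H` placed in front of Armenian text, that kind of slip can be checked mechanically by testing whether the letters of a name come from more than one script. A rough sketch, assuming the JDK's `Character.UnicodeScript` data is adequate for this purpose; `letterScripts` and `mixesScripts` are hypothetical names, not part of AnkiDroid:

```kotlin
import java.lang.Character.UnicodeScript

/**
 * Collects the scripts of all letters in [text].
 * Spaces, digits and punctuation are skipped, as are letters whose script is
 * COMMON or INHERITED, since those are shared between writing systems.
 */
fun letterScripts(text: String): Set<UnicodeScript> {
    val scripts = mutableSetOf<UnicodeScript>()
    var i = 0
    while (i < text.length) {
        val cp = text.codePointAt(i)
        if (Character.isLetter(cp)) {
            val script = UnicodeScript.of(cp)
            if (script != UnicodeScript.COMMON && script != UnicodeScript.INHERITED) {
                scripts += script
            }
        }
        i += Character.charCount(cp)
    }
    return scripts
}

/** True when the letters of [name] belong to more than one script. */
fun mixesScripts(name: String): Boolean = letterScripts(name).size > 1

fun main() {
    // Latin "H" followed by Armenian letters, written with escapes to make the mix explicit.
    println(mixesScripts("H\u0561\u0575\u0565\u0580\u0565\u0576"))
    // Armenian capital Ho followed by the same Armenian letters.
    println(mixesScripts("\u0540\u0561\u0575\u0565\u0580\u0565\u0576"))
}
```

A name can legitimately combine scripts, so a hit from this check is a prompt for a human to look, not proof of an error.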

"اردو (پاکستان)" to "ur-PK", // Urdu (Pakistan)
"o‘zbek" to "uz", // Uzbek
"O‘zbek" to "uz", // Uzbek

O‘zbek is not listed on Wikipedia as an endonym

AnkiDroid/src/main/java/com/ichi2/utils/LanguageUtil.kt (outdated, resolved)
@Arthur-Milchior

I applied your comment.

Honestly, I'm not exactly sure where to get confirmation, apart from waiting for translators to provide feedback through Crowdin. I don't expect to easily find a native Uzbek speaker, for example.

@david-allison

I flagged potential issues. Stick with the CLDR-proposed names unless you've confirmed either way


@david-allison david-allison left a comment


I aim to approve after this mix of reverts and changes

AnkiDroid/src/main/java/com/ichi2/utils/LanguageUtil.kt (8 outdated, resolved review threads)
@david-allison

Note: only one comment was addressed


@david-allison david-allison left a comment


LGTM, cheers!

@david-allison david-allison added the squash-merge and Needs Second Approval labels Sep 26, 2024
Arthur-Milchior and others added 2 commits September 26, 2024 18:47
The menu that lets users select their language is not yet perfect.
It's hard to know for certain what the correct way to display each
language name is. The function provided by Android was buggy, so, a
year and a half ago, the language names were hard-coded in
ankidroid#13275. However, nobody speaks all 93 languages in which we
have translations available.

In the long term, I hope that 92 translators will provide the 92
language strings to use. ankidroid#17120 should start the process of
eventually fixing this error.

In the short term, there is one change we can make that will probably
be, on average, an improvement: using upper case for the first letter
of each name.

If I understand Brayan's comment correctly, this change would be
correct for Portuguese. I can confirm it's correct for French. I can't
promise this won't make things worse for some languages. But if we got
some right previously, it was by accident, and I still hope this is,
on average, an improvement.

The upper cases were obtained by using the "set first letter to upper
case" feature of Emacs on each language name.

íslenska, isiXhosa and isiZulu appear to start with lowercase letters
so these have not been updated

Fixed: ankidroid#17118

Co-authored-by: David Allison <[email protected]>
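
The exceptions named in this commit message could be encoded as an explicit allow-list, so that a later bulk capitalization does not undo them. A minimal sketch, assuming names are compared by exact match; `keepLowercase` and `displayName` are hypothetical and do not exist in `LanguageUtil.kt`:

```kotlin
import java.util.Locale

// Endonyms that the commit message lists as intentionally starting in lower case.
val keepLowercase = setOf("íslenska", "isiXhosa", "isiZulu")

/**
 * Capitalizes the first letter of [name] unless it is a known exception.
 * Assumption: exceptions are matched on the whole string, so a variant with a
 * region suffix would need its own entry.
 */
fun displayName(name: String): String =
    if (name in keepLowercase) {
        name
    } else {
        name.replaceFirstChar { if (it.isLowerCase()) it.titlecase(Locale.ROOT) else it.toString() }
    }

fun main() {
    listOf("isiZulu", "magyar", "íslenska").forEach { println("$it -> ${displayName(it)}") }
}
```

Keeping the exceptions in one named set also gives translators a single place to add or remove entries once they confirm the correct casing.
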
Santali and Sardinian were written in English, rather than their respective languages

These values were obtained using `ULocale`.

Sources (Unicode CLDR):

* `ᱥᱟᱱᱛᱟᱲᱤ` - https://github.com/unicode-org/cldr/blob/731f226f93f95635500bbbadccf96798c23e4c9a/common/main/sat.xml#L365C25-L365C32
* `sardu` - https://github.com/unicode-org/cldr/blob/731f226f93f95635500bbbadccf96798c23e4c9a/common/main/sc.xml#L369C24-L369C29
  * Casing rules do not appear in CLDR yet, but I assume that uppercasing the name is OK

Co-authored-by: David Allison <[email protected]>
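
The lookup described in this commit can be approximated on a plain JVM. The sketch below uses `java.util.Locale` as a stand-in, since ICU's `ULocale` is not part of the Kotlin/JVM standard library; `endonym` is a hypothetical helper. The result depends on the locale data shipped with the runtime, which may fall back to another language or to the bare code for less common tags, so no output is shown.

```kotlin
import java.util.Locale

/**
 * Returns the display name of the locale identified by [tag], written in that
 * same locale, according to the runtime's locale data.
 * Stand-in for the ULocale-based lookup mentioned in the commit message.
 */
fun endonym(tag: String): String {
    val locale = Locale.forLanguageTag(tag)
    return locale.getDisplayName(locale)
}

fun main() {
    // Tags taken from this PR; what gets printed varies with the JDK's locale data.
    for (tag in listOf("sat", "sc", "hy-AM", "uz")) {
        println("$tag -> ${endonym(tag)}")
    }
}
```

As the commit message notes, casing rules were not found in CLDR, so any capitalization applied to these names is a separate decision.
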
@david-allison david-allison removed the squash-merge label Sep 26, 2024
@david-allison

@Arthur-Milchior I've squashed the commits and force pushed.

I've added additional information to the first commit message, and written the second.

When feasible, could you review my changes and confirm you're happy with them, given I've set you as the author on both?

Labels: Needs Second Approval
Development

Successfully merging this pull request may close these issues.

All languages whose name uses the Latin alphabet should probably be upper case.
2 participants