Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KeyError raised in format_skeleton when using fuzzy matching #1084

Open
tomasr8 opened this issue May 12, 2024 · 5 comments
Open

KeyError raised in format_skeleton when using fuzzy matching #1084

tomasr8 opened this issue May 12, 2024 · 5 comments

Comments

@tomasr8
Copy link
Member

tomasr8 commented May 12, 2024

Overview Description

When using format_skeleton, a KeyError is raised despite using fuzzy=True.

Steps to Reproduce

from datetime import datetime

from babel.dates import format_skeleton


dt = datetime(2012, 1, 1, 14, 30, 59)
format_skeleton("G", dt, locale="cs_CZ", fuzzy=True)

From the docs:

fuzzy – If the skeleton is not found, allow choosing a skeleton that’s close enough to it.

The way I read it is that as long as I pass fuzzy=True (which is the default) a skeleton should always be found and I should not need to worry about KeyErrors in that case.

Even the example in the docs makes it seem a KeyError is only thrown with fuzzy=False:

format_skeleton('yMMd', t, fuzzy=False, locale='fi')  # yMMd is not in the Finnish locale, an error is thrown

If this is the expected behaviour, I think the docs should state that explicitly. If not, I would suggest adding allow_different_fields=True to the underlying match_skeleton() call.

Additional Information

Babel version: 2.15

@tomasr8
Copy link
Member Author

tomasr8 commented Jul 15, 2024

@akx What do you think? Should we fix this?

@akx
Copy link
Member

akx commented Jul 17, 2024

Yeah, I think so. My original commit for fuzzy skeletons (fuzzy skeletons, yikes what a horrible mental image) cd70395 refers to the ICU4J getBestSkeleton function as the reference implementation.

Apparently there's been at least one commit unicode-org/icu@edaebfa that has improved that function since – we should maybe look at whether that improvement would in fact fix this?

@akx akx added this to the Babel 2.16 milestone Jul 17, 2024
@tomasr8
Copy link
Member Author

tomasr8 commented Jul 18, 2024

Looks like the only thing that changed is more format specifier replacements which we could add as well:

    if 'k' in skeleton and not any('k' in option for option in options):
        skeleton = skeleton.replace('k', 'H')
    if 'K' in skeleton and not any('K' in option for option in options):
        skeleton = skeleton.replace('K', 'h')
    if 'a' in skeleton and not any('a' in option for option in options):
        skeleton = skeleton.replace('a', '')
    if 'b' in skeleton and not any('b' in option for option in options):
        skeleton = skeleton.replace('b', '')

This doesn't fix the KeyError issue though. I'm a bit unsure whether we should fix it or just make it clear in the docs that the function can in fact raise even when fuzzy=True. If we do want to fix this, I think we can implement the recommendation from the tr35 doc for missing fields: https://www.unicode.org/reports/tr35/tr35-72/tr35-dates.html#missing-skeleton-fields

@akx
Copy link
Member

akx commented Jul 19, 2024

For the time being, we could maybe just document that fuzzy=True may still raise, and fix that in a subsequent version using that recommendation?

@tomasr8
Copy link
Member Author

tomasr8 commented Jul 19, 2024

Yeah, that sounds like the best option

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants