xkbcli how-to-type: Enhance arguments parsing & doc #505

Merged: 2 commits merged into xkbcommon:master from tools/how-to-type-doc on Sep 23, 2024

Member

@wismill wismill commented Sep 11, 2024

Currently the positional parameter of the CLI is either a Unicode code
point or a keysym. However, their respective formats are not documented.

It turns out that there are multiple issues due to the use of strtol:

  • Code points can be parsed as octal, decimal and hexadecimal, while
    keysyms can only be parsed as hexadecimal. Some programs output
    keysyms in their decimal form (e.g. wev), so it is worth bringing
    symmetry with code points.
  • Octal format is unusual for both and is triggered by leading zeros,
    which is unintuitive in this context.
  • U+NNNN format is the standard format for Unicode code points but is
    not supported.
  • Plain characters are not supported, e.g.: a, é, ß, Æ, γ, 🦆, etc.,
    although this is probably the easiest format for most users.
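
To make the octal pitfall concrete, here is a minimal C sketch of base-0 parsing with strtol. The helper name `parse_base0` is made up for this illustration and is not the tool's actual code; it only shows how the standard function treats each input form.

```c
#include <stdlib.h>

/* Hypothetical helper: parse with base 0, so strtol picks the base from
 * the prefix ("0x" means hexadecimal, a leading "0" means octal).
 * Returns -1 if no digits were consumed or trailing characters remain. */
static long
parse_base0(const char *s)
{
    char *end = NULL;
    long value = strtol(s, &end, 0);

    if (end == s || *end != '\0')
        return -1;
    return value;
}
```

With this helper, `0061` yields 49 because it is read as octal, `0x61` yields 97, and `U+0061` is rejected because strtol consumes no characters.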

Fixed the issues above:

  • Allow the code point to be passed exactly in the following formats:
    • Literal character (requires UTF-8 character encoding of the terminal);
    • Decimal number;
    • Hexadecimal number: either 0xNNNN or U+NNNN.
  • Allow the keysym to be passed exactly in the following formats:
    • Decimal number;
    • Hexadecimal number: 0xNNNN;
    • Name.
  • Improve both --help message and manual.
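
The numeric code point formats listed above could be handled along the following lines. This is only a sketch under stated assumptions, not the code from this PR: the names `parse_digits` and `parse_codepoint` are hypothetical, prefixes are matched case-sensitively, literal characters are left out, and the base is always chosen explicitly so that leading zeros never select octal.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a non-empty string made only of digits valid
 * in the given base (10 or 16). Rejects signs, whitespace, trailing
 * characters and values above the Unicode maximum U+10FFFF. */
static bool
parse_digits(const char *s, int base, uint32_t *out)
{
    if (*s == '\0')
        return false;
    for (const char *p = s; *p != '\0'; p++) {
        const unsigned char c = (unsigned char) *p;
        if (base == 16 ? !isxdigit(c) : !isdigit(c))
            return false;
    }

    errno = 0;
    const unsigned long value = strtoul(s, NULL, base);
    if (errno == ERANGE || value > 0x10ffff)
        return false;

    *out = (uint32_t) value;
    return true;
}

/* Hypothetical entry point: "U+NNNN" and "0xNNNN" are hexadecimal,
 * anything else must be a plain decimal number. */
static bool
parse_codepoint(const char *arg, uint32_t *out)
{
    if (strncmp(arg, "U+", 2) == 0 || strncmp(arg, "0x", 2) == 0)
        return parse_digits(arg + 2, 16, out);
    return parse_digits(arg, 10, out);
}
```

In this sketch `U+0061` and `0x61` both give 97, while `0061` gives 61 because it is read as decimal rather than octal.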

@wismill wismill mentioned this pull request Sep 11, 2024
Member Author

wismill commented Sep 12, 2024

Hmm, the use of strtol with base 0 parses strings starting with 0 as octal numbers. We should probably strip leading zeros to parse only decimal and hexadecimal numbers. It’s too easy to copy a Unicode code point from the standard format (e.g. U+0061) and get an unexpected result.

@wismill wismill added the documentation label (Indicates a need for improvements or additions to documentation) Sep 12, 2024
@wismill wismill added this to the 1.8.0 milestone Sep 12, 2024
@wismill wismill changed the title from "xkbcli how-to-type: improve help message and manual" to "xkbcli how-to-type: Enhance arguments parsing & doc" Sep 12, 2024
@wismill wismill force-pushed the tools/how-to-type-doc branch 2 times, most recently from 51c102a to 71938d6 Compare September 12, 2024 17:02
Member Author

wismill commented Sep 12, 2024

A big part of the new code is about UTF-8 to UTF-32 decoding. It’s kept as an internal API for now. I added the related tests.
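
For readers unfamiliar with what such decoding involves, here is a minimal, self-contained sketch of decoding one UTF-8 sequence into a UTF-32 code point. It is not the internal API added by this PR: the function name `decode_utf8` and the sentinel `INVALID_CODEPOINT` are assumptions made for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel returned for malformed input. */
#define INVALID_CODEPOINT UINT32_MAX

/* Decode the single UTF-8 sequence at the start of s, with len bytes
 * available. On success, store the sequence length in *consumed and
 * return the code point; otherwise return INVALID_CODEPOINT. */
static uint32_t
decode_utf8(const unsigned char *s, size_t len, size_t *consumed)
{
    uint32_t cp, min;
    size_t need;

    if (len == 0)
        return INVALID_CODEPOINT;

    /* The lead byte gives the sequence length and the payload bits. */
    if (s[0] < 0x80) {
        cp = s[0]; need = 1; min = 0;
    } else if ((s[0] & 0xe0) == 0xc0) {
        cp = s[0] & 0x1f; need = 2; min = 0x80;
    } else if ((s[0] & 0xf0) == 0xe0) {
        cp = s[0] & 0x0f; need = 3; min = 0x800;
    } else if ((s[0] & 0xf8) == 0xf0) {
        cp = s[0] & 0x07; need = 4; min = 0x10000;
    } else {
        return INVALID_CODEPOINT; /* stray continuation or invalid byte */
    }

    if (len < need)
        return INVALID_CODEPOINT; /* truncated sequence */

    /* Each continuation byte must be 10xxxxxx and adds 6 payload bits. */
    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xc0) != 0x80)
            return INVALID_CODEPOINT;
        cp = (cp << 6) | (s[i] & 0x3f);
    }

    /* Reject overlong encodings, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
        return INVALID_CODEPOINT;

    *consumed = need;
    return cp;
}
```

A caller handling a literal character argument would presumably also check that the decoded sequence spans the whole argument, so that only a single character is accepted.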

@wismill wismill requested a review from whot September 12, 2024 17:05
@wismill wismill marked this pull request as ready for review September 12, 2024 17:05
Member Author

wismill commented Sep 12, 2024

Note that we could go further and try to parse a keysym name without the --keysym flag, if previous attempts to parse as a code point / plain character failed. Maybe a new flag --codepoint would be useful then for completeness.

Add internal functions to convert UTF-32 to UTF-8, with corresponding
tests.
Currently the positional parameter of the CLI is either a Unicode code
point or a keysym. However, their respective formats are not documented.

It turns out that there are multiple issues due to the use of `strtol`:
- Code points can be parsed as octal, decimal and hexadecimal, while
  keysyms can only be parsed as hexadecimal. Some programs output
  keysyms in their decimal form (e.g. `wev`), so it is worth bringing
  symmetry with code points.
- Octal format is unusual for both and is triggered by leading zeros,
  which is unintuitive in this context.
- `U+NNNN` format is the standard format for Unicode code points but is
  not supported.
- Plain characters are not supported, e.g.: a, é, ß, Æ, γ, 🦆, etc.,
  although this is probably the easiest format for most users.

Fixed the issues above:
- Allow the code point to be passed exactly in the following formats:
  - Literal character (requires UTF-8 character encoding of the terminal);
  - Decimal number;
  - Hexadecimal number: either `0xNNNN` or `U+NNNN` (any digit count).
- Allow the keysym to be passed exactly in the following formats:
  - Decimal number;
  - Hexadecimal number: `0xNNNN` (any digit count);
  - Name.
- Improve both `--help` message and manual page.
@wismill wismill merged commit 0cd1087 into xkbcommon:master Sep 23, 2024
4 checks passed
@wismill wismill deleted the tools/how-to-type-doc branch September 23, 2024 08:01
Labels: documentation (Indicates a need for improvements or additions to documentation), tools: how-to-type