Keep the original letter-casing of keywords during the parsing phase #72

Delapouite · 2023-03-23T12:34:10Z

Hi

Org syntax allows keywords to be written in either UPPERCASE or lowercase. Example:

#+PROPERTIES
…
#+END

versus

#+properties
…
#+end

Currently, the parser forces UPPERCASE output:

uniorg/packages/uniorg-parse/src/parser.ts

Line 1030 in a74a80b

const key = m[1].toUpperCase();

I understand that this normalization can help keep things uniform further down the processing pipeline.

But when lots of org documents have been authored in the lowercase style, a pipeline that reads org files → parses them → does stuff → stringifies → overwrites the files turns this cosmetic change into a lot of noise, especially in diffs if the org files are versioned with git, for example.

Do you think we could keep the current behavior by default but add a new option to keep the case as authored in the original doc?

Thanks!

rasendubi · 2023-03-23T17:45:36Z

Do you necessarily need to preserve the original spelling? Would having an option in uniorg-stringify to select uppercase/lowercase spelling work for you? I'm just worried that allowing any case would complicate the processing and plugins. Besides upper- and lowercase, any mix is allowed (#+Title, #+tiTLe), so all processors would have to take that into account.
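To illustrate the concern about mixed-case keys, here is a minimal, dependency-free sketch. The regular expression and the names `parseKeywordLine` and `findTitle` are hypothetical and much simpler than what uniorg-parse actually does; only the `toUpperCase()` step mirrors the parser line quoted above. Because the key is normalized once at parse time, a consumer can get away with a single exact comparison.

```typescript
// Simplified, hypothetical keyword pattern: "#+KEY: value".
// The real pattern in uniorg-parse is more involved.
const KEYWORD_RE = /^[ \t]*#\+(\S+):[ \t]*(.*)$/;

interface Keyword {
  key: string;
  value: string;
}

function parseKeywordLine(line: string): Keyword | null {
  const m = KEYWORD_RE.exec(line);
  if (m === null) return null;
  // Same normalization as the parser line quoted in this issue.
  return { key: m[1].toUpperCase(), value: m[2] };
}

// With normalized keys, one exact comparison covers
// #+TITLE, #+title, #+Title, #+tiTLe, and so on.
function findTitle(lines: string[]): string | undefined {
  for (const line of lines) {
    const kw = parseKeywordLine(line);
    if (kw !== null && kw.key === 'TITLE') return kw.value;
  }
  return undefined;
}
```

Without the normalization, every processor and plugin would need its own case-insensitive comparison in place of `kw.key === 'TITLE'`.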

Delapouite · 2023-03-23T20:04:25Z

I was not aware of the mixed-case possibilities. So I think you're right. Focusing on a choice between upper- and lowercase should already be a good enough option. Thanks

rasendubi · 2023-03-24T02:21:33Z

Just checked and the current behavior is also consistent with org-element (the reference parser, written in Emacs Lisp).

Given the following org document:

#+test: blah

it produces the following AST:

((section
  (:begin 1 :end 13 :mode first-section :granularity nil)
  (keyword
   (:key "TEST" :value "blah" :mode top-comment :granularity nil))))

The lower-casing can be implemented in two ways: as a unified plugin (that traverses all keywords and lower-cases keys) or as a configuration for uniorg-stringify.

The plugin could go like this:

unified()
  .use(uniorgParse)
  .use(otherPlugins)
  // Add this plugin immediately before uniorg-stringify
  // so that it does not interfere with other plugins.
  .use(() => (tree) => {
    // visit from unist-util-visit
    visit(tree, 'keyword', (keyword) => {
      keyword.key = keyword.key.toLowerCase();
    });
  })
  .use(uniorgStringify)

Adjusting uniorg-stringify is obviously more involved, especially because it currently lacks options handling. Though if we implement handlers as in uniorg-rehype, that would make it much more powerful, and I'm willing to accept a PR.

Delapouite mentioned this issue May 18, 2024

feat: add options to uniorgStringify to customize handlers #107

Merged
