Keep the original letter-casing of keywords during the parsing phase #72

Open
Delapouite opened this issue Mar 23, 2023 · 3 comments

@Delapouite
Contributor

Hi

Org syntax allows both UPPERCASE and lowercase keywords. For example:

#+PROPERTIES
…
#+END

versus

#+properties
…
#+end

Currently, the parser forces UPPERCASE output:

const key = m[1].toUpperCase();

I understand that this step can help by normalizing keys for the later stages of the processing pipeline.

But when many org documents have been authored in the lowercase style, a pipeline that reads org files → parses them → does stuff → stringifies → overwrites the files turns this cosmetic change into a lot of noise, especially in diffs when the org files are versioned with git, for example.

Do you think we could keep the current behavior by default but add a new option to keep the case as authored in the original doc?

Thanks!

@rasendubi
Owner

rasendubi commented Mar 23, 2023 via email

@Delapouite
Contributor Author

I was not aware of the mixed-case possibilities, so I think you're right. Offering a choice between uppercase and lowercase should already be a good enough option. Thanks

@rasendubi
Owner

rasendubi commented Mar 24, 2023

Just checked, and the current behavior is also consistent with org-element (the reference parser, written in Emacs Lisp).

Given the following org document:

#+test: blah

it produces the following AST:

((section
  (:begin 1 :end 13 :mode first-section :granularity nil)
  (keyword
   (:key "TEST" :value "blah" :mode top-comment :granularity nil))))

The lower-casing can be implemented in two ways: as a unified plugin (that traverses all keywords and lower-cases their keys) or as a configuration option for uniorg-stringify.

The plugin could go like this:

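import { unified } from 'unified';
import uniorgParse from 'uniorg-parse';
import uniorgStringify from 'uniorg-stringify';
import { visit } from 'unist-util-visit';

unified()
  .use(uniorgParse)
  // otherPlugins is a placeholder for whatever else runs in your pipeline.
  .use(otherPlugins)
  // This plugin should be added immediately before
  // uniorg-stringify so it does not interfere with other plugins,
  // which may expect upper-case keys.
  .use(() => (tree) => {
    // Lower-case the key of every keyword node in the tree.
    visit(tree, 'keyword', (keyword) => {
      keyword.key = keyword.key.toLowerCase();
    });
  })
  .use(uniorgStringify);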
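
For anyone who wants to try the transform in isolation, here is a minimal sketch of the same idea packaged as a reusable plugin with a case option. The name `normalizeKeywordCase` and the `style` option are hypothetical and not part of uniorg. The small recursive walker stands in for `visit` from unist-util-visit so the snippet runs without dependencies, and the tree is hand-built to loosely mimic the AST above rather than produced by uniorg-parse.

```javascript
// Hypothetical helper (not part of uniorg): returns a unified-style
// transformer that rewrites keyword keys to the requested case.
function normalizeKeywordCase({ style = 'lower' } = {}) {
  const convert = (key) =>
    style === 'upper' ? key.toUpperCase() : key.toLowerCase();

  // Minimal depth-first walk standing in for unist-util-visit.
  const walk = (node) => {
    if (node.type === 'keyword' && typeof node.key === 'string') {
      node.key = convert(node.key);
    }
    (node.children || []).forEach(walk);
  };

  return (tree) => {
    walk(tree);
    return tree;
  };
}

// Hand-built, simplified tree (an assumption about the node shape,
// based on the AST shown above).
const tree = {
  type: 'org-data',
  children: [
    {
      type: 'section',
      children: [{ type: 'keyword', key: 'TEST', value: 'blah' }],
    },
  ],
};

normalizeKeywordCase({ style: 'lower' })(tree);
console.log(tree.children[0].children[0].key); // prints "test"
```

In a real pipeline this would presumably be registered with `.use(normalizeKeywordCase, { style: 'lower' })` immediately before uniorg-stringify, for the same reason as the inline plugin above.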

Adjusting uniorg-stringify is obviously more involved, especially because it currently lacks options handling. That said, implementing handlers as in uniorg-rehype would make it much more powerful, and I'm willing to accept a PR.
