Clarify whitespace handling #46

Merged: 3 commits, Sep 23, 2024

Conversation

@cxw42 cxw42 commented Feb 11, 2024

  • Section names can have trailing whitespace (the core tests expressly check this).
  • When deciding what kind of line we have, all leading whitespace is ignored, and the trailing LF or CRLF is also ignored.
  • Blank lines do not have to include whitespace.
  • Fix some typos (change Markdown single-backticks to rst double-backticks).

Fixes editorconfig/editorconfig#500

@cxw42 cxw42 added the bug Something isn't working label Feb 11, 2024
@cxw42 cxw42 requested a review from xuhdev February 11, 2024 20:38
@cxw42 cxw42 self-assigned this Feb 11, 2024
@cxw42 cxw42 requested a review from ppalaga February 12, 2024 02:23

@xuhdev xuhdev left a comment

Only one question, otherwise LGTM

index.rst Outdated

-- Blank: contains only whitespace characters.
+- Blank: contains nothing.

Maybe

Suggested change
-- Blank: contains nothing.
+- Blank: contains nothing or only whitespace characters.

?

Indeed, I scratched my head a bit about that as well. The line just before the bullets says "once leading whitespace is removed", so there would be nothing left. I thought that was cleaner than adding "may contain leading whitespace" to each bullet.

How about moving blanks out of the bullets? Something like:

+When reading an EditorConfig file, the following are ignored:
+ - blank lines (empty, or nothing but whitespace)
+ - all beginning whitespace on each line.
+Each non-blank line must be one of the following, once leading whitespace is removed
+(and ignoring any trailing line separator):
- - Blank: contains only whitespace characters
  - Comment: ...

@Vampire
Vampire commented Jun 4, 2024

Any news here? :-)

@florianb

@cxw42 thanks for opening this. I have a question: is this meant to clarify that all empty lines (those containing only invisible characters) are ignored, and that all other lines should be interpreted after trimming (removing leading and trailing invisible characters)?

I am asking because I was unsure whether that is what is meant by "once leading whitespace is removed (and ignoring any trailing line separator)".

@Vampire
Vampire commented Jul 25, 2024

I'd say the only change in meaning here is that blank lines may now be empty, whereas before they had to contain some whitespace, unless you consider the line separator part of the line. In the latter case, I'd say everything here just uses clearer wording without semantic changes.

@florianb

Thanks @Vampire. I'm not a native speaker, so I wonder whether the whole description is just a difficult paraphrase of:

  • An EditorConfig file is consumed line by line
  • Each line has to be trimmed before interpretation
  • Empty lines should be ignored
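
The three steps in this list can be sketched in Python as follows. This is only an illustration and is not code from this PR or from any EditorConfig core: the name `significant_lines` is hypothetical, and the sketch assumes that Python's `str.strip()` and `str.splitlines()` are an acceptable stand-in for the spec's notion of whitespace and line separators (both recognize more characters than just space, tab, LF, and CRLF).

```python
def significant_lines(text):
    """Yield each non-empty line of `text` with surrounding whitespace removed.

    Hypothetical helper: illustrates "read line by line, trim, skip empty".
    """
    # splitlines() drops the LF or CRLF terminator (and some other separators).
    for raw in text.splitlines():
        stripped = raw.strip()  # trim leading and trailing whitespace
        if stripped:            # empty after trimming: ignore the line
            yield stripped


# A truly empty line and a whitespace-only line are both skipped.
sample = "root = true\r\n\r\n   \t\n  [*.py]  \n"
print(list(significant_lines(sample)))
```

With this reading, a line that holds only a line separator and a line that holds only spaces or tabs are treated identically, which is the point under discussion in this thread.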

@cxw42

cxw42 commented Sep 15, 2024

Thanks @Vampire for pointing out my confusion in the ticket.

@florianb I like the idea of specifying processing imperatively. I have just force-pushed a change that implements that idea. Thanks!

Let me know what you all think!

@Vampire
Vampire commented Sep 15, 2024

I would re-add an explicit mention of empty lines.
Currently it just says "if it is not empty, handle it according to its type below".
And below the list of line types it says "all other lines are invalid".
So you can interpret this as "skip to the next line if a line is empty after the trim", but that leaves it open to interpretation again.
I think if you explicitly say "if it is empty, ignore the line", there is no room for interpretation in that regard.

@cxw42

cxw42 commented Sep 15, 2024

Updated!

  • Re-added blank lines
  • Fixed a typo I noticed
  • Linked from the line types to relevant sections so that "process it this way" makes more sense.
Changes

$ git diff f4ebd49 5df4933
diff --git a/index.rst b/index.rst
index 2cf9e03..3ab63ce 100644
--- a/index.rst
+++ b/index.rst
@@ -79,18 +79,25 @@ EditorConfig files are in an INI-like file format.
 To read an EditorConfig file, take one line at a time.  For each line:
 
 #. Strip all leading and trailing whitespace
+#. If the remaining text is empty, ignore the line.
 #. If the remaining text is not empty, process the text as specified for its
    type below.
 
 The types of lines are:
 
-- Comment: starts with a ``;`` or a ``#``.
+- Comment: starts with a ``;`` or a ``#``.  Comment lines are ignored.
+
 - Section Header: starts with a ``[`` and ends with a ``]``.
+  These lines define globs; see :ref:`glob-expressions`.
+
    - May contain any characters between the square brackets (e.g.,
      ``[`` and ``]`` and even spaces and tabs are allowed).
    - Forward slashes (``/``) are used as path separators.
    - Backslashes (``\\``) are not allowed as path separators (even on Windows).
+
 - Key-Value Pair (or Pair): contains a key and a value, separated by an ``=``.
+  See :ref:`supported-pairs`.
+
    - Key: The part before the first ``=`` (trimmed of whitespace, but including
      any whitespace in the middle).
    - Value: The part after the first ``=`` (trimmed of whitespace, but including
@@ -119,7 +126,7 @@ This specification does not define any "escaping" mechanism for
 
 .. admonition :: Compatibility
 
-  The EditorConfig file format formerly allowed the use of ``;`` and ``;`` after the
+  The EditorConfig file format formerly allowed the use of ``;`` and ``#`` after the
   beginning of the line to mark the rest of a line as comment. This led to
   confusion how to parse values containing those characters. Old EditorConfig
   parsers may still allow inline comments.
@@ -135,6 +142,8 @@ The parts of an EditorConfig file are:
 - Section: the lines starting from a Section Header until the beginning of
   the next Section Header or the end of the file.
 
+.. _glob-expressions:
+
 Glob Expressions
 ================
 
@@ -187,6 +196,8 @@ precedence. If multiple EditorConfig files have matching sections, the rules
 from the closer EditorConfig file are read last, so pairs in closer
 files take precedence.
 
+.. _supported-pairs:
+
 Supported Pairs
 ===============
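
The numbered steps and line types in the diff above could be implemented roughly as below. This is a hedged sketch, not the behavior of any particular EditorConfig core: the function name `classify_line` and the tag strings are hypothetical, the order in which the line types are tested is an assumption, and glob validation (for example, rejecting backslashes as path separators) is deliberately left out.

```python
def classify_line(raw):
    """Classify one line following the numbered steps in the diff above.

    Returns a tuple whose first element is one of the hypothetical tags
    "blank", "comment", "section", "pair", or "invalid".
    """
    text = raw.strip()  # step 1: strip leading and trailing whitespace
    if not text:
        return ("blank",)  # step 2: nothing left, so the line is ignored
    if text[0] in ";#":
        return ("comment",)  # comment lines are ignored as well
    if text.startswith("[") and text.endswith("]"):
        # Everything between the outer brackets is kept verbatim as the glob,
        # including spaces, tabs, and nested brackets.
        return ("section", text[1:-1])
    if "=" in text:
        key, _, value = text.partition("=")  # split at the first "=" only
        return ("pair", key.strip(), value.strip())
    return ("invalid",)  # assumption: anything else is an invalid line


for line in ["  \r\n", "  # note", "[*.py]  \r\n", " indent_style = space \n"]:
    print(repr(line), "->", classify_line(line))
```

Note that `[*.py]  ` followed by CRLF is recognized as a section header because the trailing whitespace is stripped before the check for the closing `]`, which is the case the linked issue is about.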
 

@Vampire Vampire left a comment

lgtm

Change single-backticks to double-backticks as required by rst.
- Clarify that leading and trailing whitespace are ignored
  on all lines (so empty lines count as blank, which they technically
  did not before).
- Express the parsing logic imperatively to make it easier to read.
- Add links within the document from line types to relevant sections.

@xuhdev xuhdev left a comment

Perfect! 🎆

@xuhdev xuhdev merged commit 329077a into editorconfig:master Sep 23, 2024
1 check passed
@cxw42 cxw42 deleted the issue500-clarify-whitespace branch September 23, 2024 20:28
Labels
bug Something isn't working
Development

Successfully merging this pull request may close these issues.

Spec incorrectly prohibits spaces around section-name square brackets
4 participants