Skip to content
This repository has been archived by the owner on Jul 15, 2021. It is now read-only.

Feature Request: Add comment nodes to AST #30

Open
jdrew1303 opened this issue Oct 24, 2016 · 3 comments
Open

Feature Request: Add comment nodes to AST #30

jdrew1303 opened this issue Oct 24, 2016 · 3 comments

Comments

@jdrew1303
Copy link

Adding in nodes to handle comments is needed to support other tooling that uses the parser to apply transformations to the entire codebase.

If we were to apply a refactoring using the following pipeline parse current codebase -> kebab-case all table and column names -> generate code from AST then the user would lose all comments that they had on their code.

It would also allow documentation generators to leverage the parser.

(If the comment nodes were to be based off of the ES6 ast, allowing it to use something like Estraverse, then that would be perfect 😉 👍 )

@nwronski
Copy link
Contributor

nwronski commented Oct 26, 2016

I thought of a problem if we wanted to add comments to the AST:

/* comments outside of a statement might work alright */
SELECT pants -- But what about 
FROM -- these kinds of comments
  laundry -- what would the AST look like?
WHERE
  clean = true;

Any ideas? Could we put these in the AST in a way that they wouldn't be disruptive and could also be used to reconstruct the original query with the comments in the right places?

@jdrew1303
Copy link
Author

jdrew1303 commented Oct 27, 2016

I've been thinking about this for a little while now and looking around to see what others are doing. I think esprima has a reasonably good way of handling this. They tie the comment to the nearest node as either a leadingComment (before the node) or trailingComment (after the node). They are handled as follows:

// Life, Universe, and Everything
var answer /* this is still legal*/ = 6 * 7; // this is trailing
{
    "type": "Program",
    "body": [
        {
            "type": "VariableDeclaration",
            "declarations": [
                {
                    "type": "VariableDeclarator",
                    "id": {
                        "type": "Identifier",
                        "name": "answer",
                        "trailingComments": [
                            {
                                "type": "Block",
                                "value": " this is still legal",
                                "range": [
                                    45,
                                    69
                                ]
                            }
                        ]
                    },
                    "init": {
                        "type": "BinaryExpression",
                        "operator": "*",
                        "left": {
                            "type": "Literal",
                            "value": 6,
                            "raw": "6"
                        },
                        "right": {
                            "type": "Literal",
                            "value": 7,
                            "raw": "7"
                        },
                        "leadingComments": [
                            {
                                "type": "Block",
                                "value": " this is still legal",
                                "range": [
                                    45,
                                    69
                                ]
                            }
                        ]
                    }
                }
            ],
            "kind": "var",
            "leadingComments": [
                {
                    "type": "Line",
                    "value": " Life, Universe, and Everything",
                    "range": [
                        0,
                        33
                    ]
                }
            ],
            "trailingComments": [
                {
                    "type": "Line",
                    "value": " this is trailing",
                    "range": [
                        79,
                        98
                    ]
                }
            ]
        }
    ],
    "sourceType": "script"
}

The area where it gets a bit weird is inside a function declaration with no args:

function noArg(/* I have no node to attach to*/){};

This creates an innerComment like so:

{
    "type": "Program",
    "body": [
        {
            "type": "FunctionDeclaration",
            "id": {
                "type": "Identifier",
                "name": "noArg"
            },
            "params": [],
            "body": {
                "type": "BlockStatement",
                "body": [],
                "innerComments": [
                    {
                        "type": "Block",
                        "value": " I have no node to attach to",
                        "range": [
                            15,
                            47
                        ]
                    }
                ]
            },
            "generator": false,
            "expression": false
        },
        {
            "type": "EmptyStatement"
        }
    ],
    "sourceType": "script"
}

I'm not sure if this would be needed (or worth the effort initially)? It would still be valid to do this in SQL though. What do you think?

@nwronski
Copy link
Contributor

I want to do this, but it would take huge changes to the entire grammar for the parser. I started experimenting with how this might be done, and came up with something that worked for comments that are between statements, but to get all the inline comments would take grammar changes every time the o rule is used in the grammar, which is ~230 times currently.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants