Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow bracket pairs to share open tokens or close tokens in the colorizer #132504

Merged
merged 1 commit into from
Sep 20, 2021

Conversation

DragWx
Copy link
Contributor

@DragWx DragWx commented Sep 7, 2021

This addresses issue #132209.

This change modifies the bracket pair colorizer to allow opening brackets to share the same closing bracket, and closing brackets to share the same opening bracket, which is a feature in certain languages, such as Verilog and LaTeX.

To do this, a pairsWith: number[] property was added to the Token class (from tokenizer.ts), used by closing brackets only, to allow them to specify multiple category numbers (used by opening brackets) to which they can match.

When loading bracket pairs from languages, new category numbers are only given to unique opening brackets, and duplicate closing brackets get the associated opening bracket's category number added to their pairsWith property.

Finally, every instance of comparing opening bracket category numbers with closing bracket category numbers was replaced with appropriate logic to check if the opening bracket category number exists within the closing bracket's pairsWith property.

The existing test within tokenizer.test.ts was updated to account for the new pairsWith property, and a very basic test for the correct category and pairsWith properties (plus all the other properties too, but especially these) of tokens loaded from languages was added to a new file brackets.test.ts.

This pull request doesn't address any other issues, only bracket sharing.

image

@ghost
Copy link

ghost commented Sep 7, 2021

CLA assistant check
All CLA requirements met.

@@ -458,9 +458,11 @@ export class InvalidBracketAstNode extends BaseAstNode {

public readonly unopenedBrackets: SmallImmutableSet<number>;

constructor(category: number, length: Length, denseKeyProvider: DenseKeyProvider<number>) {
constructor(pairsWith: number[], length: Length, denseKeyProvider: DenseKeyProvider<number>) {
Copy link
Member

@hediet hediet Sep 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried using category: number instead?
All this code is highly performance critical, and it would be nice if we don't need an array here.

Basically, if a symbol begin1 is closed by end and begin2 is also closed by end, then begin1 and begin2 currently have distinct categories. Even more, end is registered twice with both the category of begin1 and of begin2. However, only the first registration of end is actually being matched, so begin2 and end never match.

I think a more performant solution would be to assign begin1 and begin2 the same category (which makes sense, as they are interchangeable from the perspective of bracket pair colorization). Then everything else would work out of the box.


Hmm, maybe this does not work.

Let's say begin11 and begin12 are both closed by end1.
Also, begin2 is closed by both end1 and end2.

Then, unfortunately, end1 and end2 aren't interchangeable.

begin2 ---------
    begin12    | These should match
end2 -----------

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least, pairsWith should also be an SmallImmutableSet I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, re: SmallImmutableSet. I can make that change if you'd like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, for the record, my first attempt was as you suggested, and it works for shared brackets which only have one-to-many relationships, but there are issues when brackets have many-to-many relationships. The plugin I'm using to test this scenario is LaTeX Workshop, which has brackets defined as follows:

["{", "}"],
["[", "]"],
["(", ")"],
["\\left(", "\\right)"],
["\\left(", "\\right."],
["\\left.", "\\right)"],
["\\left[", "\\right]"],
["\\left[", "\\right."],
["\\left.", "\\right]"],
["\\left\\{", "\\right\\}"],
["\\left\\{", "\\right."],
["\\left.", "\\right\\}"],
["\\left<", "\\right>"]

Trying, for example, to get \left( \right., \left. \right), but not \left. \right. to match is what made me jump to using a pairsWith list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be no performance impact for languages that just have { ... }, ( ... ) and [...].

Probably SmallImmutableSet can do that.

Comment on lines 51 to 73
let currIdx = idxOffset;

let existingOpeningBracket = categoryCache.get(pair[0]);
if (existingOpeningBracket === undefined) {
// Opening bracket is new, so create a new category for it and add it.
tokens.addBracket(languageId, pair[0], TokenKind.OpeningBracket, idxOffset, []);
categoryCache.set(pair[0], idxOffset);
idxOffset++;
} else {
// Opening bracket exists already, remember its category.
currIdx = existingOpeningBracket;
}

let existingClosingBracket = pairsWithCache.get(pair[1]);
if (existingClosingBracket === undefined) {
// Closing bracket is new, so tie it to the opening bracket's category and add it.
tokens.addBracket(languageId, pair[1], TokenKind.ClosingBracket, -1, [currIdx]);
pairsWithCache.set(pair[1], [currIdx]);
} else {
// Closing bracket exists already, so update its pairsWith property with the opening bracket's category.
existingClosingBracket.push(currIdx);
tokens.addBracket(languageId, pair[1], TokenKind.ClosingBracket, -1, existingClosingBracket);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code duplication is not so nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I can make it less redundant.

@DragWx
Copy link
Contributor Author

DragWx commented Sep 11, 2021

The Token class is changed such that category and pairsWith are now SmallImmutableSet<number> | null type.
They are only null when token is neither an opening nor closing bracket.
They are now directly compared with each other using intersects, which eliminated the need for DenseKeyProvider in parser.ts.
DenseKeyProvider was instead added to brackets.ts for use when scanning in brackets from language definition.
Tests updated to match updated Token class definition.

@hediet
Copy link
Member

hediet commented Sep 17, 2021

Thanks for the PR! I did some refactorings and will merge this PR soon.

I'd be very happy if you could fix the tests and do a git squash/rebase on the latest main!

I got rid of the "category" which tbh. was a design mistake I did. I made it very clear now that all bracket ids refer to the opening bracket.

Also, there should be no performance impact for languages that don't share brackets.

@DragWx
Copy link
Contributor Author

DragWx commented Sep 18, 2021

BPC tests updated, commits squashed and rebased with latest main. This is my first time doing a squash/rebase so I hope I did it correctly!

@hediet hediet merged commit 616310c into microsoft:main Sep 20, 2021
@hediet
Copy link
Member

hediet commented Sep 20, 2021

Thank you!

@github-actions github-actions bot locked and limited conversation to collaborators Nov 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Bracket Pair Colorization: Allow bracket pairs to share open tokens or close tokens
3 participants