Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong color for two import declarations with semicolon between #34

Open
willfaught opened this issue Jun 11, 2022 · 14 comments
Open

Wrong color for two import declarations with semicolon between #34

willfaught opened this issue Jun 11, 2022 · 14 comments

Comments

@willfaught
Copy link

(Originally reported at golang/vscode-go#2273.)

The first import path string is colored white instead of orange. If there are two import keywords, the second one is white too.

Steps to reproduce the behavior:

Start a new Go file and paste in this:

package p
import ("p";"q")
import "p";import "q"

Screenshot

Screen Shot 2022-06-01 at 4 59 08 PM

Versions

Found in VS Code, which apparently copies this repo for its Go syntax highlighting.

go version:

go version go1.18.2 darwin/amd64

code -v:

1.67.2
c3511e6c69bb39013c4a4b7b9566ec1ca73fc4d5
x64
@jeff-hykin
Copy link
Owner

jeff-hykin commented Jun 11, 2022

You seem knowledgeable about go. It might surprise you that I haven't run more than a hello world in Go, the reason I'm managing this repo is only because I understand textmate syntax highlighting, and am active enough to accept PR's. The previous repo VS code had quality PR's that were years old.

So if you want to fix some of the issues you're in luck, I updated the readme + documention/contributing.md should be up to date if you decide these issues are worth your time to fix.

@willfaught
Copy link
Author

@jeff-hykin Thanks. I looked at README, CONTRIBUTING, and main/, but I'm not clear on how this is all working. My understanding is that you've built a .tmLanguage.json file reader, AST, and writer in Ruby, and then have a pre-existing .tmLanguage.json file from the Atom editor that you override parts of as you do the read-transform-write process when generating the final .tmLanguage.json file. Do I have that right?

If so, is the recommended way to fix existing syntax by overriding the relevant pattern?

What's the diff between old.tmLanguage.json and modified.tmLanguage.json? Where is old used? Is one of those files the "final" "product" for the repo?

I see names/tags like "punctuation.separator.attribute" given all over the place. Is there a master list of these somewhere? Do these have special meaning in .tmLanguage.json files, or can you name them whatever you want?

@jeff-hykin
Copy link
Owner

jeff-hykin commented Jun 11, 2022

Yes that's pretty much right!

The old.tmLanguage.json is kept around as a reference incase something needs to be undone (or checked for regression). If it's easiest to modify the tmLanguage for a feature (like deleting a pattern) that's where the modified.tmLanguage.json comes in.

Sadly no... the names/tags are both free-form and have no true standard, which is a constant source of problems for all languages. Themes can match any free-form string, and grammars can add any free-form name. If they don't coordinate/cooperate, that's when stuff doesn't get highlighted correctly.

What we have instead is many half-standards.

The ruby library attempts to give warnings to really uncommon names, but it's not always right.

The old incomplete/broken standard that most all languages roughly follow is near the bottom of this page (ctrl-F for "Naming Conventions") https://macromates.com/manual/en/language_grammars

You can look at other languages to get an idea, for example here's all the C++ ones.
https://github.com/jeff-hykin/better-cpp-syntax/blob/master/autogenerated/cpp_scopes.txt

You can also run "Inspect TM Scopes" in the command pallet, and click on words in other languages to see what it's scoped/tagged as.

The best effort at creating a real standard is this:
Skip to "List of Syntax Classes"
atom/flight-manual.atom.io#564

However it's not backwards compatible, and I don't think any themes are aware of it.

@jeff-hykin
Copy link
Owner

I'll see if I can help you get started with overriding this import pattern

@jeff-hykin
Copy link
Owner

jeff-hykin commented Jun 12, 2022

Okay I converted the go-import pattern into the main.rb as an example/reference of the process.

You can make changes and see what happens

  • clone the repo
  • commands/start (takes a long time on the first run)
  • project build
  • Then launch "debugger: Start" through VS Code and open a go file in the new window
  • If the changes are good, run project test which will generate test case output. If the output is different, see why (either an improvement or a regression)

@willfaught
Copy link
Author

So is modified.tmLanguage.json copied into VSC verbatim, or are they running your transformer process to get the final .tmLanguage.json file? I don't see an obvious "plug this into your syntax highlighter" .tmLanguage.json artifact under main/ or /.

If names/tags are arbitrary, then how do TM themes color syntax items consistently? Does each TM theme know the particular names/tags used by each programming language .tmLanguage.json? Does renaming an existing name/tag break TM themes?

I don't see any matches in the VSC command palette for "Inspect TM Scopes". I see a "Developer: Inspect Editor Tokens and Scopes". Is that it?

Thanks for the steps. I'll take a look. So I shouldn't follow the setup.md steps to get set up? That document says not to clone.

@jeff-hykin
Copy link
Owner

jeff-hykin commented Jun 12, 2022

So is modified.tmLanguage.json copied into VSC verbatim, or are they running your transformer process to get the final .tmLanguage.json file?

Oh, VS Code (like the debugger) is using autogenerated/go.tmLanguage.json, that's the only file it needs for highlighting.

If names/tags are arbitrary, then how do TM themes color syntax items consistently?

Easy; they don't. They're just inconsistent 🙃

Does each TM theme know the particular names/tags used by each programming language .tmLanguage.json

Many of the good ones do this. Also many syntaxes piggyback off using tags similar to more popular languages (sometimes to a fault). But for example, regex in ruby and regex in JavaScript I believe do not use the same tags even though they're really the same thing. The whole world of syntax highlighting is a mess.

"Developer: Inspect Editor Tokens and Scopes". Is that it?

Oh yeah that's it. They must've renamed it, which is a good thing.

So I shouldn't follow the setup.md steps to get set up? That document says not to clone.

Oh definitely follow the setup, I was trying to give a high level overview, mostly of the project run and project test.
I should probably just add what I said to the setup.md document

@jeff-hykin
Copy link
Owner

Also if you want to do general work, I think there's a ton of low hanging fruit for the go lang syntax.

@willfaught
Copy link
Author

VS Code (like the debugger) is using autogenerated/go.tmLanguage.json, that's the only file it needs for highlighting.

Why not check that file in, and then provide the Ruby helpers for future modifications/fixes that operate on that file?

Based on https://macromates.com/manual/en/language_grammars, it seems that .tmLanguage is the actual TM format, which is just property list format. Why is this repo making .json artifacts instead? I would think .tmLanguage would be more portable/compatible. Is this to conform to some VSC requirement for an equivalent JSON representation?

if you want to do general work, I think there's a ton of low hanging fruit for the go lang syntax

Like what? Do you mean the current GitHub issues?

@jeff-hykin
Copy link
Owner

jeff-hykin commented Jun 12, 2022

Why not check that file in, and then provide the Ruby helpers for future modifications/fixes that operate on that file?

I'm not exactly sure how this is being imagined (idk what "checking in" would be) but maybe this example will help; many times keywords are needed as a dont-match list in other patterns. Patterns like vars, function names, package names, etc. In any serialized format, like .JSON, that means that keyword-list must be copied and pasted all over the place. And sometimes there are edgecases or modifications; so not only is it copy and paste, but copy-paste-modify-exclude-extend. That is only a tiny case of the massive amount of code duplication, like pattern reuse, or pattern escaping, that textmate files require. Even trivial things, like adding a comment after a tag, or unit tests, are impossible in the .JSON. The C++ syntax generates a 19,000 line JSON file, it has a single line regex pattern thats bigger than the entire go syntax without whitespace. The C++ syntax doesn't import an existing .tmLanguage, it purely generates one.

In general it's unmaintainable to merely modify the JSON, rather than generate it (which is partly why so many syntaxes are poorly maintained).

Why is this repo making .json artifacts instead?

I've never seen anyone use the original .tmLanguage format. It is very old. However I have seen people use XML/plist, also Sublime Text syntaxes use .tmLanguage.yml. However the vast majority of syntaxes use JSON and I believe VS Code and Atom only accept JSON and XML as valid formats.

Like what? Do you mean the current GitHub issues?

Nah, just whatever you look at and think "it would be nice if __ was highlighted".

@willfaught
Copy link
Author

Thanks for the explanation.

OK, I think my initial understanding wasn't right. I thought the input and output formats for your Ruby tool were the same, and the tool was a helper for making one-time modifications/transformations to the input and then writing the output back out and saving it (perhaps by overwriting modified.tmLanguage.json). I asked why you don't check in go.tmLanguage.json because it seemed to me that you could just replace modified.tmLanguage.json with it. Like, that would then be your starting point for future modifications, and in general the Ruby tool would be a no-op, since all the changes would be saved in the output file.

I understand what you mean about the tool helping with duplication, but I'm confused on how modified.tmLanguage.json and go.tmLanguage.json differ, since they seem (by their file extension) to have the same format, and therefore the same structure/meaning. Wouldn't both have the same duplication problem? What is the tool doing with modified.tmLanguage.json to manage the duplication that (I guess?) is already present in it?

@willfaught
Copy link
Author

Sorry, I just saw that autogenerated/go.tmLanguage.json is checked in. :) That part makes sense, now.

@willfaught
Copy link
Author

I'm not understanding e849293. You modified the old.*, modified.*, and go.* files. I thought old.* was just a reference? Can you explain the change process there?

@jeff-hykin
Copy link
Owner

jeff-hykin commented Jun 13, 2022

You modified the old.*

Actually I'm glad you caught that, I was comparing an auto-generated pattern to the original pattern within the file and must've forgot to undo the change before committing. Next time I have the codebase open I'll revert that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants