-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Default arguments support #24 #62
base: main
Are you sure you want to change the base?
Conversation
Macro extension is done in two phases, attaching arguments and expanding macros. Only in the first phase has a call to `parseArgspec`, so it seems natural to deal with default arguments in the first phase. We additionally attach default arguments to the AST during attachMacroArgs and then it will be picked up by existing macro expanders. This breaks an invariant that printRaw(attachArgs(X)) = printRaw(X).
Doing so required a rewrite of pegjs grammar use for parsing xparse argument specifications. Previously, a lot of things were modeled by an ArgSpec struct `Group`, but in the end those used in O,R,D and those for e,E,u required different behavior in many aspects, so the pegjs grammar now differentiate them via `group` and `collection`. Also, the ArgSpec does not have Group anymore, everything is parsed to an array of strings. This should fix siefkenj#46.
Windows Previously, the test was failing when run on Windows terminals due to how it treats quotes passed as arguments. We now use 'cross-spawn' module to get around of this issue.
On a second thought, I am uneasy about attaching default args to AST, I'll think about moving that under If we choose that way, I may leave |
This is exciting :-). I will review it soon. Just a comment, though. |
@@ -25,7 +25,7 @@ describe("unified-latex-util-argspec", () => { | |||
"o m o !o m", | |||
"!o r() m", | |||
"O{somedefault} m o", | |||
"m e{^}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this test removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was replaced with m e^
. Now the printRaw
function does something smarter, and it omits braces when there's a single embellishment token of single width (which is something that wasn't even accepted at pegjs grammar level before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to have it default to always using braces. You can add e^
as a separate test though, just to make sure we support the non-braced version.
Would you be able to make |
Sure, done at #63 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all this work!
I've made several comments. I think we can simplify the peggy grammar further. As well, we can move most logic about default arguments to the expander. If _renderInfo
of the macro is supplied with the parsed defaultArgs
, then the expander can have all the smarts to use them appropriately.
packages/unified-latex-util-pegjs/grammars/xparse-argspec.pegjs
Outdated
Show resolved
Hide resolved
= "{" content:($(!"}" !braced_group .) / braced_group)* "}" { | ||
return { type: "group", content: content }; | ||
= "{" content:(control_word_or_symbol / non_brace / braced_group)* "}" { | ||
return content; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signature is (Array | string)[]
? That makes me a little uncomfortable...I prefer the {type: "group", content: ...}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this comment was resolved. Did I miss the change?
packages/unified-latex-util-pegjs/grammars/xparse-argspec.pegjs
Outdated
Show resolved
Hide resolved
packages/unified-latex-util-pegjs/grammars/xparse-argspec.pegjs
Outdated
Show resolved
Hide resolved
embellishmentTokens: args.content.map(groupContent).flat(), | ||
defaultArg: g, | ||
embellishmentTokens: args, | ||
embellishmentDefaultArg: g |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this as defaultArg
packages/unified-latex-util-pegjs/grammars/xparse-argspec.pegjs
Outdated
Show resolved
Hide resolved
packages/unified-latex-util-arguments/libs/gobble-single-argument.ts
Outdated
Show resolved
Hide resolved
packages/unified-latex-util-arguments/libs/gobble-single-argument.ts
Outdated
Show resolved
Hide resolved
@@ -25,7 +25,7 @@ describe("unified-latex-util-argspec", () => { | |||
"o m o !o m", | |||
"!o r() m", | |||
"O{somedefault} m o", | |||
"m e{^}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to have it default to always using braces. You can add e^
as a separate test though, just to make sure we support the non-braced version.
I need to get this going, and I propose to ditch Typescript projects in favor of lerna. Typescript project not allowing circular dependency is the problem. Lerna allows us to do everything that Typescript project does. In the long term, I see benefits of disallowing circular dependencies, and in order to implement contextual macro expansion #43, I anyway see the needs for some architectural change. However, it'd be much better if we can do it incrementally, and introducing circular dependencies among projects in the meantime would be inevitable. There are many testimonies in microsoft/TypeScript#33685 that some had to abandon TS projects because there is no workaround for circular dependency issues. I can make a separate PR that does this, merge that here and then make this PR to a state that is reviewable. |
Initially we have assumed that 'until's token list can be parsed in a way similar to that of embellishments, but it turned out that there are some differences. Thus, I have simplified pegjs grammar for until and also worked out to make multi-token stop work.
@theseanl I am strongly against circular dependencies. I think refactoring things into their own package is the solution most of the time. Why do you think a circular dependency is required? |
I didn't say that it is required. What I think is that attempting such a refactoring you are talking about is likely beyond the scope of this PR, and will slow down the development process. The reason for this is that in order to parse default arguments, we need IMO, we anyway will need some major architectural change to implement contextual macro expansion, and it will require passing down some config or host objects all the way down to a lot of places, including I am not sure about your plans for tackling those problems, but I would first focus on default arguments support, and the then to contextual macro parsing which is likely to resolve the tight coupling issue in the meantime. This package is a nodejs module. IMO circular dependency is not something that is tabooed in Javascript environment, ES modules specification is carefully defined so that it handles circular dependencies in a graceful and predictable way. I don't see a reason to avoid circular dependencies. |
Frankly, I think the root cause is that we are re-parsing argspecs after stringifying it. |
It seems appropriate to split |
That indeed seems a quick and easy solution to our situation without getting my hands dirty with cleaning up all the mess :) |
Previously, typescript project references were used to wire imports between workspace packages. Now this is done by package.json export conditions. This nodejs feature has already been used to support consuming packages in both esm and cjs format. We add an artificial condition "_bundle", and use the respective bundler's export conditions feature https://esbuild.github.io/api/#conditions and https://www.typescriptlang.org/tsconfig#customConditions to achieve that IDE and TS compiler resolve any workspace package imports to their respective Typescript source file, not their build output .d.ts files. This has an additional benefit of reducing bundle size of cjs build. Previously, CJS build operated on ESM build output; They fed dist/index.js files to esbuild. This changes this so that esbuild directly operates on Typescript source files even in CJS build. As esbuild will have more information, it will be able to better tree-shake unused codes.
also used double quotes for interoparabilities with Windows
Until argspec's behavior was fixed, and while doing so, a support for multi-token stops was added. Also, now it properly supports macro delimiters (which was really just a by-product of applying uniform treatment to any logic related to finding braces). This fixes siefkenj#46.
e416cd8
to
2067555
Compare
Based on new observations above, I've simplified the pegjs part a bit and worked on gobbling until arguments. While doing so, I thought I could support multiple stop tokens, so I did it. It is currently 'dirty' in that it may introduce unnecessary string splits while doing stop tokens matching. Having extraneous string split should be harmless, but we may anyway 'clean' it if needed... It currently lacks support for default arguments referencing other arguments, such as It currently does have circular dependency. I'll try to see if it can be removed in next iterations, but I'm not sure if it will be done. The other PR should have removed any build-related issues that may arise from circular dependency. |
In order to deal with default arguments referencing other arguments, it seems to be better to move the logic from attachArgs to expandMacros.
Now expandMacros can correctly expand macros whose default arguments reference other arguments. A special care was taken in order to follow the behavior of xparse in case of circular references. In most of the cases, xparse throws a compilation error, but it works in case like `O{siefkenj#2} O{siefkenj#1}`. From this, we can reasonably guess that xparse treats default arguments that directly copies another argument's value in a special way. This behavior was implemented with a simple algorithm, and it turns out to be consistent with xparse's behavior in cases considered in unit tests.
Implemented support for default arguments referencing other arguments! It turned out to be quite tricky, but in the end it boiled down to a simple algorithm. Now I think this PR is feature complete, ready to be reviewed.
|
5dd6453
to
21a1733
Compare
37ae5c0
to
ebcf3aa
Compare
cb5bf6a
to
7d12d13
Compare
Since package.json is re-written before publishing, we are free to point `import` condition to `.ts` source files. However, CLI tests need to be able to resolve to `dist/*.js` files, so a custom condition `prebuilt` pointing to those are kept in every package.json.
Now that #67 is merged, this should be easier to review :-) |
@@ -83,6 +83,8 @@ export function printLatexAst( | |||
return printVerbatimEnvironment(path, print, options); | |||
case "whitespace": | |||
return line; | |||
case "hash_number": | |||
return "#" + node.number.toString(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not #${node.number}
?
packages/unified-latex-util-arguments/libs/gobble-single-argument.ts
Outdated
Show resolved
Hide resolved
packages/unified-latex-util-macros/tests/macro-expansion.test.ts
Outdated
Show resolved
Hide resolved
packages/unified-latex-util-macros/tests/macro-expansion.test.ts
Outdated
Show resolved
Hide resolved
= "{" content:($(!"}" !braced_group .) / braced_group)* "}" { | ||
return { type: "group", content: content }; | ||
= "{" content:(control_word_or_symbol / non_brace / braced_group)* "}" { | ||
return content; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this comment was resolved. Did I miss the change?
c705354
to
7848c5b
Compare
7b69a61
to
916e10f
Compare
The code path for optional argument without a default value wasn't being tested. Added a test and fixed it so that it properly expands to -NoValue-.
The goal of this PR is to add support for macros with default arguments in macro expansion, e.g. as used in CLI's
-e
option. This should fix #24.This PR mostly modifies
unified-latex-util-macros
package. Also the argspec parser was modified to correctly provide default arguments. In doing so, I have simplified the AST by removingGroup
and always returning an array ofstring
s. Some complexity were moved to the argspec serializer. Previously default arguments and embellishment tokens were encoded in the sameGroup
s, but it seemed to me that they need to behave differently in several aspects, so now in pegjs they are represented as different termsgroup
andcollection
s. This irons out several places which only had supported specifying a single character, but actually should have supported arbitrary token. Also #46 was fixed.