Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added parse_lark_grammar.py and the syntax parameter/registry #1019

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

MegaIng
Copy link
Member

@MegaIng MegaIng commented Oct 15, 2021

I think the core design here stands. I would like feedback on that already.

Primarily missing here is tests, examples and docs.

@erezsh
Copy link
Member

erezsh commented Oct 15, 2021

It needs a little tidying up but overall it looks alright.

A few things:

  • We already use '' as a magic constant. I think anything else shouldn't concern us.

  • RE_FLAGS isn't a good enough reason for having a recursive import.

  • I think the default for syntax should be "lark". And it shouldn't override the extension, if one exists. If anyone wants to augment the lark syntax, they can use "grammar.better_lark" or whatever.

    edit: So maybe it's better to rename it to default_syntax

@MegaIng
Copy link
Member Author

MegaIng commented Oct 15, 2021

We already use '' as a magic constant. I think anything else shouldn't concern us.

??

RE_FLAGS isn't a good enough reason for having a recursive import.

Fair enough. For some reason I thought it was more than just that.

Currently the order is (syntax_hint, extension, "lark"). You propose (extension, syntax_hint, "lark") or (extension, default_syntax)?

@erezsh
Copy link
Member

erezsh commented Oct 15, 2021

Oops sorry. I meant <string>

@erezsh
Copy link
Member

erezsh commented Oct 15, 2021

The last one. Unless you can think of a good reason to choose one of the others.

@MegaIng
Copy link
Member Author

MegaIng commented Oct 15, 2021

My idea was to always fall back to lark in case people have grammars named .grammar or .g or something like that.

@MegaIng
Copy link
Member Author

MegaIng commented Oct 15, 2021

That reminds me, we should also because of this start loading files with different extensions.

@erezsh
Copy link
Member

erezsh commented Oct 15, 2021

That's a good point. I know some users had those for Lark 0.x

I guess my thinking is that it's okay to force them to use .lark for 1.x?

If we do it your way, I have a few questions -

  • what happens when someone does %import foo (bar)? Do we look for foo.g? And how do we know how to do it? If it's only for Lark.open, we can say syntax only overwrites that.

P.s. another option is that they can register_syntax('g', lark_parse_grammar) or something of that sort

  • If we have foo.ABNF? Do we use lark and throw a confusing syntax error?

  • If we %import foo and have foo.abnf but syntax="lark", will it try to parse it as lark?

And a general question - what happens if we have both foo.lark and foo.abnf? Makes me wonder if perhaps we shouldn't try to guess file extensions automatically on import in the first place..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants